The Tangent FFT
The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just \(4n\lg n-6n+8\) arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was wi
- PDF / 450,256 Bytes
- 10 Pages / 430 x 660 pts Page_size
- 55 Downloads / 214 Views
Abstract. The split-radix FFT computes a size-n complex DFT, when n is a large power of 2, using just 4n lg n−6n+8 arithmetic operations on real numbers. This operation count was first announced in 1968, stood unchallenged for more than thirty years, and was widely believed to be best possible. Recently James Van Buskirk posted software demonstrating that the split-radix FFT is not optimal. Van Buskirk’s software computes a sizen complex DFT using only (34/9 + o(1))n lg n arithmetic operations on real numbers. There are now three papers attempting to explain the improvement from 4 to 34/9: Johnson and Frigo, IEEE Transactions on Signal Processing, 2007; Lundy and Van Buskirk, Computing, 2007; and this paper. This paper presents the “tangent FFT,” a straightforward in-place cache-friendly DFT algorithm having exactly the same operation counts as Van Buskirk’s algorithm. This paper expresses the tangent FFT as a sequence of standard polynomial operations, and pinpoints how the tangent FFT saves time compared to the split-radix FFT. This description is helpful not only for understanding and analyzing Van Buskirk’s improvement but also for minimizing the memory-access costs of the FFT. Keywords: Tangent FFT, split-radix FFT, modified split-radix FFT, scaled odd tail, DFT; convolution,polynomial multiplication, algebraic complexity, communication complexity.
1
Introduction
Consider the problem of computing the size-n complex DFT (“discrete Fourier transform”), where n is a power of 2; i.e., evaluating an n-coefficient univariate complex polynomial f at all of the nth roots of 1. The input is a sequence of n complex numbers f0 , f1 , . . . , fn−1 representing the polynomial f = f0 + f1 x + · · ·+fn−1 xn−1 . The output is the sequence f (1), f (ζn ), f (ζn2 ), . . . , f (ζnn−1 ) where ζn = exp(2πi/n). The size-n FFT (“fast Fourier transform”) is a well-known algorithm to compute the size-n DFT using (5+o(1))n lg n arithmetic operations on real numbers. One can remember the coefficient 5 as half the total cost of a complex addition
Permanent ID of this document: a9a77cef9a7b77f9b8b305e276d5fe25. Date of this document: 2007.09.19.
S. Bozta¸s and H.F. Lu (Eds.): AAECC 2007, LNCS 4851, pp. 291–300, 2007. c Springer-Verlag Berlin Heidelberg 2007
292
D.J. Bernstein
(2 real operations), a complex subtraction (2 real operations), and a complex multiplication (6 real operations). The FFT was used for astronomical calculations by Gauss in 1805; see, e.g., [6, pages 308–310], published in 1866. It was reinvented and republished on several subsequent occasions and was finally popularized in 1965 by Cooley and Tukey in [2]. The advent of high-speed computers meant that users in the 1960s were trying to handle large values of n in a wide variety of applications and could see large benefits from the FFT. The Cooley-Tukey paper spawned a torrent of FFT papers—showing, among other things, that Gauss had missed a trick. The original FFT is not the optimal way to compute the DFT. In 1968, Yavne stated that one could compute the DFT using
Data Loading...