About 40 years ago the staid world of numerical analysis was shaken by the announcement of an algorithm for taking the Fourier transform in about 4*n*log(n) steps. It had been thought to require 4n2 steps and was a common calculation despite the cost. The paper by Cooley & Tukey (Math. Comput. 19, 297−301 (1965)) described a strange calculation that produced the correct results although the reason was difficult for most to understand. About 20 years ago a very short Fortran program to perform the calculation was published. I do not have that reference but the program was so short that I transcribed it to PL/I at the time and just now (2001) from PL/I to the GNU dialect of C. This program must rank near the top according to the combined attributes of short, fast, useful and obscure. Technically the calculation is for the Discrete Fourier Transform, or DFT. Here is my most recent C version. (OCaml and Algol 68 too) Here is an MP Code for 2D FFT.

A recent slogan is “A program must not only work but must appear to work”. The above program works but alas it does not appear to me to work. There are two difficulties in understanding the short program:

It recently occurred to me that there is an easy and elementary proof that the program is correct even though it sheds no light on how the program works. Program proofs usually “explain” is some formal sense the logic of the program. Not so here. If you know a little vector space theory see this for a synopsis of the proof.

The routine “fft” takes a reference to an array of complex numbers and returns when it has replaced those numbers with their discrete Fourier transform. The size must be a power of two. This proof is for the particular size, 1024, of the array. (If your array size is different you can modify and run my code.)

One prerequisite is a little theory on vector spaces, in particular a 1024 dimensional vector space over the field of complex numbers. The 1024 complex numbers that are the inputs to fft are the components of a vector in the space and so are the returned values. The transformation between input in output is linear.

I provide here a top level view of the proof here with links to elaborations of the steps. This is the only glue that holds the proof together.

  1. I divide the identifiers for numbers into three classes
    Control
    The fixed identifiers (declared “int”)
    bControl
    The complex identifiers “b” & “w”.
    Linear
    These are the complex numbers, (type “c”), aside from b & w.
    Each of these identifiers takes on a sequence of values as the routine runs.
  2. Each Control and bControl identifier takes on a sequence of values during the execution of the routine and these values do not depend on the complex inputs to the routine.
  3. There are no references to uninitialized variables.
  4. Array subscript values are always in bounds.
  5. In summary so far: The order of execution of the statements and expressions of the routine is deterministic. Each control construct, “if”, “for” and “while” is driven by Control values and the sequence of control values does not depend on the complex inputs (a[]).
  6. The Linear values, several per complex identifier, are each a linear combination of the routine inputs.
  7. The output values are each a linear combination of the inputs.
  8. Two linear transformations that agree on each member of some set of basis elements, agree everywhere.
  9. The routine produces the correct values for each call when the input array is all 0 except for one element with value 1.
  10. The above set of input possibilities constitute a basis for the space of inputs for the routine.
  11. The routine is always correct.

This is a multiprocessor version for a 2D transform. See these amusing and useful notes on 2D Fourier transforms. Transformed images