Of course to prove a program correct one needs a concept of the right answer. There is a simple expression for the DFT. (A[i] after the return) = Σjωij(A[j] before the call) where the summation is from 0 thru 1023 and ω = e2πi/1024, or the 1024th root of unity. ω1024 = 1 and ωn → n=0 mod 1024.
The main of this program is a direct, efficient and simple verification that the above values are indeed returned within a tolerance, whenever a basis vector is submitted. It could be simpler except the obvious simpler versions suffer a loss of significance in the computation of p, against which to test the results. The verification code is plain compared with the fft routine, which is truly obscure.