This version is for the floating point FFT.

This FFT is  my  own  classic  'Quad-2'  FFT,  and  is the standard,
'official' FFT for my pi program.

Under optimal conditions, the FFT  has  a maximum limit of 8 million
digits.  Beyond that, the program has to use the fractal multiply to
break large chunks  into  small  enough  blocks.   (At 8m digits, it
consumes 32megs of memory.)

Depending upon the quality  of  your  trig,  and  whether you have a
'long double' FPU, the limit may be as low as 1m digits.

There are versions for x87/68882 systems  that  have  FPU  registers
wider  than  a  standard  53  bit  mantissa  'double', a version for
x87/6882 systems that need some  encouragment to use their wider FPU
registers, and two versions for systems with only a 53 bit  mantissa
(64 bit total size) 'double'.

There is also an  option  to  compile  for 'long double' data.  This
increases the maximum multiplication limit, at the expense  of  more
memory  and  runtime.   (Depending upon the compiler and system, any
where from 25% more to 100% more.)  I'm not quite sure how  high  it
could go.  It'd probably  depend  upon  your trig method, anyway.  I
have it set fairly high.  I can NOT  guarantee  it  will  work  that
high.  Based on the 'trig bits' formula I used to estimate where the
regular  'double'  method  would fail, this should be around where I
set it.  But I can't guarantee it.  It will depend on the quality of
the trig, too.

The FFT itself is fairly standard.  It's been tweaked a bit  to  run
better  on  memory  cache  systems,  but it's still a fairly typical
'real' value FFT.  Many  systems  will  have their own FFT libraries
that are even better tweaked for that particular system.  You  might
be able to use one  of  those  instead  of the RealFFT() function in
FFT.c.  You'll need to  define  the 'USE_EXTERNAL_FFT' in sys_cfg.h,
and do a bit of code modification in fft.c and bigmul.c in order  to
correctly interface your FFT.


There are a lot of things I _could_ do to the FFT but haven't.

I  could  modify  the  trig  to use two small tables.  That might be
efficient enough  to  allow  just  trig  method  to  be  used on all
computers.  I  could  tinker  with  the  trig  and  possibly  use  a
recurance  in  the  iterative part, and a sine call in the recursive
part.  Or perhaps use  a  precomputed  table  in the iterative part.
Or.....

There are a lot of 'ors'.  But I haven't done those  things  because
the  FFT  takes up so much memory that a typical desktop system just
doesn't  have  enough  physical  memory  to compute beyond 8m or 16m
digits of pi this way (without the FractalMul() kicking in.)

I don't know for sure what the limit of the 2 digit version will be.
I  suspect  it  would  be able to go to 1g digits of pi, but I can't
check, of course.  Also, if  you  use  an external FFT library, this
limit will depend upon it.  TEST it!  Use  the  testing  routine  to
check  the multiplication by giving the program a negative number of
digits to  compute:   pi  -512m  would  test  the  multiply for 512m
decimal digits.  If you need to, you can  reduce  the  MaxFFTLen  in
fft.h.


