KISS FFT - A mixed-radix Fast Fourier Transform based upon the principle,
"Keep It Simple, Stupid."

    There are many great FFT libraries already around.  Kiss FFT is not trying
to be better than any of them.  It only attempts to be a reasonably efficient,
moderately useful FFT that can use fixed or floating data types and can be
incorporated into someone's C program in a few minutes with trivial licensing.

USAGE:

    The basic usage for 1-d complex FFT is:

        #include "kiss_fft.h"

        kiss_fft_cfg cfg = kiss_fft_alloc( nfft ,is_inverse_fft ,0,0 );

        while ...

            ... // put kth sample in cx_in[k].r and cx_in[k].i

            kiss_fft( cfg , cx_in , cx_out );

            ... // transformed. DC is in cx_out[0].r and cx_out[0].i

        free(cfg);

    Note: frequency-domain data is stored from DC up to (but not including) 2*pi radians/sample,
    so cx_out[0] is the DC bin of the FFT
    and cx_out[nfft/2] is the Nyquist bin (if nfft is even).
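
    As a rough illustration of the bin ordering described above, the sketch below
    maps an output bin index to a frequency in Hz.  The helper name bin_to_hz and
    the sample_rate_hz parameter are hypothetical and not part of kiss_fft.  It
    assumes the usual DFT convention that bins above nfft/2 correspond to
    negative frequencies.

```c
#include <assert.h>

/* Hypothetical helper (not part of kiss_fft): centre frequency in Hz of
 * output bin k of an nfft-point complex FFT, for data sampled at
 * sample_rate_hz. */
static double bin_to_hz(int k, int nfft, double sample_rate_hz)
{
    assert(nfft > 0 && k >= 0 && k < nfft);

    /* Bins 0..nfft/2 run from DC up to Nyquist; the remaining bins
     * alias to negative frequencies. */
    int signed_k = (k <= nfft / 2) ? k : k - nfft;

    return signed_k * sample_rate_hz / nfft;
}
```

    With nfft = 8 and an 8000 Hz sample rate, bin 4 is the Nyquist bin at 4000 Hz
    and bin 5 lands at -3000 Hz.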

    Declarations are in "kiss_fft.h", along with a brief description of the
functions you'll need to use.

Code definitions for 1-d complex FFTs are in kiss_fft.c.

You can do other cool stuff with the extras you'll find in tools/

    * multi-dimensional FFTs
    * real-optimized FFTs  (returns the positive half-spectrum: (nfft/2+1) complex frequency bins)
    * fast convolution FIR filtering (not available for fixed point)
    * spectrum image creation

The core FFT and most tools/ code can be compiled to use float, double,
Q15 short or Q31 samples. The default is float.
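
    For readers new to the Q15 format mentioned above: a 16-bit signed integer v
    stands for the real value v / 32768, so samples must lie in [-1, 1).  The
    conversion helpers below are a minimal sketch with hypothetical names
    (float_to_q15, q15_to_float); they are not part of kiss_fft, and the
    round-to-nearest and saturation policy is an assumption made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a float in roughly [-1, 1) to Q15,
 * rounding to nearest and saturating at the 16-bit limits. */
static int16_t float_to_q15(float x)
{
    float scaled = x * 32768.0f;

    /* Round half away from zero before truncating. */
    scaled += (scaled >= 0.0f) ? 0.5f : -0.5f;

    if (scaled > 32767.0f)  return INT16_MAX;
    if (scaled < -32768.0f) return INT16_MIN;
    return (int16_t)scaled;
}

/* Hypothetical helper: convert a Q15 value back to float. */
static float q15_to_float(int16_t v)
{
    return (float)v / 32768.0f;
}
```

    Note that +1.0 is not representable in Q15 and saturates to 32767.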


BACKGROUND:

    I started coding this because I couldn't find a fixed point FFT that didn't
use assembly code.  I started with floating point numbers so I could get the
theory straight before working on fixed point issues.  In the end, I had a
little bit of code that could be recompiled easily to do FFTs with short, float
or double (other types should be easy too).

    Once I got my FFT working, I was curious about the speed compared to
a well-respected and highly optimized FFT library.  I don't want to criticize
this great library, so let's call it FFT_BRANDX.
During this process, I learned:

    1. FFT_BRANDX has more than 100K lines of code. The core of kiss_fft is about 500 lines (cpx 1-d).
    2. It took me an embarrassingly long time to get FFT_BRANDX working.
    3. A simple program using FFT_BRANDX is 522KB. A similar program using kiss_fft is 18KB (without optimizing for size).
    4. FFT_BRANDX is roughly twice as fast as KISS FFT in default mode.

    It is wonderful that free, highly optimized libraries like FFT_BRANDX exist.
But such libraries carry a huge burden of complexity necessary to extract every
last bit of performance.

    Sometimes simpler is better, even if it's not better.

FREQUENTLY ASKED QUESTIONS:
	Q: Can I use kissfft in a project with a ___ license?
	A: Yes.  See LICENSE below.

	Q: Why don't I get the output I expect?
	A: The two most common causes of this are
		1) scaling : is there a constant multiplier between what you got and what you want?
		2) mixed build environment -- all code must be compiled with the same preprocessor
		definitions for FIXED_POINT and kiss_fft_scalar

	Q: Will you write/debug my code for me?
	A: Probably not unless you pay me.  I am happy to answer pointed and topical questions, but
	I may refer you to a book, a forum, or some other resource.


PERFORMANCE:
    (on Athlon XP 2100+, with gcc 2.96, float data type)

    Kiss performed 10000 1024-pt cpx FFTs in 0.63 s of CPU time.
    For comparison, it took md5sum twice as long to process the same amount of data.

    Transforming 5 minutes of CD quality audio takes less than a second (nfft=1024).

DO NOT:
    ... use Kiss if you need the Fastest Fourier Transform in the World
    ... ask me to add features that will bloat the code

UNDER THE HOOD:

    Kiss FFT uses a decimation-in-time, mixed-radix, out-of-place FFT. If you give it an input buffer
    and output buffer that are the same, a temporary buffer will be created to hold the data.

    No static data is used.  The core routines of kiss_fft are thread-safe (but not all of the tools directory).

    No scaling is done for the floating point version (for speed).
    Scaling is done both ways for the fixed-point version (for overflow prevention).
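
    One practical consequence of an unscaled transform: a forward FFT followed by
    an inverse FFT returns the input multiplied by nfft, so the caller has to
    divide by nfft.  The sketch below shows this with a tiny 4-point DFT written
    out directly.  It is a stand-in for illustration only, not kiss_fft code; the
    names cpx_int, rotate and dft4_unscaled are hypothetical, and integers are
    used so the result is exact.

```c
#include <assert.h>

typedef struct { int r, i; } cpx_int;   /* illustrative stand-in for a complex sample */

/* Multiply z by (-i)^p for the forward transform, or by (+i)^p for the
 * inverse.  These are the exact twiddle factors of a 4-point DFT. */
static cpx_int rotate(cpx_int z, int p, int inverse)
{
    cpx_int out;

    p &= 3;
    if (inverse)
        p = (4 - p) & 3;

    switch (p) {
    case 0:  out.r =  z.r; out.i =  z.i; break;   /* times  1 */
    case 1:  out.r =  z.i; out.i = -z.r; break;   /* times -i */
    case 2:  out.r = -z.r; out.i = -z.i; break;   /* times -1 */
    default: out.r = -z.i; out.i =  z.r; break;   /* times +i */
    }
    return out;
}

/* Unscaled 4-point DFT (inverse = 0) or inverse DFT (inverse = 1). */
static void dft4_unscaled(const cpx_int *in, cpx_int *out, int inverse)
{
    assert(in != out);   /* out-of-place only */

    for (int k = 0; k < 4; ++k) {
        cpx_int acc = {0, 0};
        for (int n = 0; n < 4; ++n) {
            cpx_int t = rotate(in[n], n * k, inverse);
            acc.r += t.r;
            acc.i += t.i;
        }
        out[k] = acc;
    }
}
```

    Running dft4_unscaled forward and then inverse on any 4 samples yields each
    sample multiplied by 4.  This is the "constant multiplier" that the FAQ
    suggests checking for.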

    Optimized butterflies are used for factors 2, 3, 4, and 5.

    The real (i.e. not complex) optimization code only works for even length FFTs.  It does two half-length
    FFTs in parallel (packed into real&imag), and then combines them via twiddling.  The result is
    nfft/2+1 complex frequency bins from DC to Nyquist.  If you don't know what this means, search the web.
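
    The half spectrum is sufficient because the FFT of real input is conjugate
    symmetric: bin nfft-k is the complex conjugate of bin k.  If the full
    nfft-bin spectrum is ever needed, it can be rebuilt as sketched below.  The
    type cpx_f and the function expand_half_spectrum are hypothetical names used
    only for this example, not part of kiss_fft.

```c
#include <assert.h>

typedef struct { float r, i; } cpx_f;   /* illustrative stand-in for a complex bin */

/* Rebuild all nfft bins from the nfft/2+1 bins (DC..Nyquist) of a
 * real-input transform.  nfft must be even; full must hold nfft bins. */
static void expand_half_spectrum(const cpx_f *half, cpx_f *full, int nfft)
{
    assert(nfft > 0 && nfft % 2 == 0);

    /* DC up to and including Nyquist are copied unchanged. */
    for (int k = 0; k <= nfft / 2; ++k)
        full[k] = half[k];

    /* The remaining bins mirror the lower ones, conjugated. */
    for (int k = nfft / 2 + 1; k < nfft; ++k) {
        full[k].r =  half[nfft - k].r;
        full[k].i = -half[nfft - k].i;
    }
}
```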

    The fast convolution filtering uses the overlap-scrap method (also known as overlap-save), slightly
    modified to put the scrap at the tail.
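
    For orientation, here is a minimal sketch of the textbook overlap-scrap idea.
    It is not the code in kiss_fastfir.c: it keeps the scrap at the head of each
    block (the standard form, not the tail variant described above), and it uses
    a direct circular convolution where a real implementation would multiply
    FFTs.  BLOCK, circular_convolve and overlap_scrap_filter are hypothetical
    names chosen for this example.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK 8   /* hypothetical block length (would be nfft) */

/* Circular convolution of two length-n sequences.  A fast-convolution
 * filter would get the same result from an inverse FFT of the product of
 * two FFTs; the direct O(n^2) form keeps this sketch self-contained. */
static void circular_convolve(const int *a, const int *b, int *out, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int acc = 0;
        for (size_t j = 0; j < n; ++j)
            acc += a[j] * b[(i + n - j) % n];
        out[i] = acc;
    }
}

/* Filter x[0..nx-1] with the FIR taps h[0..nh-1], writing nx outputs to y. */
static void overlap_scrap_filter(const int *x, size_t nx,
                                 const int *h, size_t nh, int *y)
{
    int hpad[BLOCK] = {0};
    int in[BLOCK], out[BLOCK];
    size_t scrap, step;

    assert(nh >= 1 && nh <= BLOCK);
    scrap = nh - 1;          /* outputs per block spoiled by wrap-around */
    step  = BLOCK - scrap;   /* valid outputs produced per block */

    for (size_t i = 0; i < nh; ++i)
        hpad[i] = h[i];

    for (size_t pos = 0; pos < nx; pos += step) {
        /* The block covers x[pos-scrap .. pos+step-1]; samples outside
         * the input are treated as zero. */
        for (size_t i = 0; i < BLOCK; ++i) {
            size_t idx = pos + i;
            in[i] = (idx >= scrap && idx - scrap < nx) ? x[idx - scrap] : 0;
        }

        circular_convolve(in, hpad, out, BLOCK);

        /* Discard the first `scrap` outputs and keep the rest. */
        for (size_t i = 0; i < step && pos + i < nx; ++i)
            y[pos + i] = out[scrap + i];
    }
}
```

    Each block overlaps the previous one by nh-1 input samples, and the same
    number of output samples per block is thrown away, which is where the name
    comes from.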

LICENSE:
    Revised BSD License, see COPYING for verbiage.
    Basically, "free to use&change, give credit where due, no guarantees"
    Note this license is compatible with GPL at one end of the spectrum and closed, commercial software at
    the other end.  See http://www.fsf.org/licensing/licenses

    A commercial license is available which removes the requirement for attribution.  Contact me for details.


TODO:
    *) Add real optimization for odd length FFTs
    *) Document/revisit the input/output fft scaling
    *) Make doc describing the overlap (tail) scrap fast convolution filtering in kiss_fastfir.c
    *) Test all the ./tools/ code with fixed point (kiss_fastfir.c doesn't work, maybe others)

AUTHOR:
    Mark Borgerding
    Mark@Borgerding.net