FFT & Convolution
Longfellow’s FFT family provides the polynomial arithmetic that Reed-Solomon encoding, Ligero commitments, and sumcheck-layer interpolation all stand on. The module ships a cache-oblivious radix-2/4 FFT over any field with roots of unity, a real-FFT variant for fields where elements factor into a real+imaginary pair, a Nussbaumer negacyclic convolution for small sizes, and two convolution wrappers that pre-FFT one of the operands for reuse.
When to reach for it
- You are building a new polynomial routine (interpolation, multiplication, evaluation over a coset) and need the transform primitive directly.
- You are picking a field for a new statement and need to check whether it has a root of unity of the order you want.
- You are tuning a proof-system instantiation and want to know where the polynomial work is — this is it.
Design overview
The module has three conceptual layers:
FFT. FFT<Field> is a cache-oblivious Cooley-Tukey transform with variable radix. Given a size N, choose_radix() factors N = R · S · R with S ≤ 16384 and picks the split that minimises recursion depth; below kBasecase = 16384 the iterative bit-reversed basecase runs in place. The forward transform is fftf(A, n, omega, omega_order, F), the inverse is fftb(A, n, omega_j, j, F); both take the primitive root omega passed by const-ref, together with its order, so the caller can pick a sub-order in a larger-order field.
Real FFT. RFFT<FieldExt> operates on a quadratic extension FieldExt = Fp2<Base> but stores output in half-complex format, exploiting F[n-j] = \overline{F[j]}. It runs at radix-4 with specialized basecases (r2hcI_2, r2hcI_4, hc2hcf_4, hc2rI_4) that avoid multiplies by the fourth root of unity. Used when the input is real (imaginary part zero) and the FFT is part of a longer real-valued pipeline, which saves roughly half the work compared to the complex transform.
Nussbaumer. For small sizes n ≤ 64 (set by kNussbaumerSmall), negacyclic and linear convolutions fall back to a subtractive Karatsuba recursion. For larger n, Nussbaumer<Field> lifts the 1-D convolution to a 2-D problem of size M × R, runs FFTs on both dimensions, and recursively handles the small-size negacyclic residue. Exposes negacyclic, linear, and middle products.
Convolution wrappers. FFTConvolution<Field> precomputes FFT(y) and the 1/N normalization, so later calls to convolution(x) are “FFT x → pointwise multiply → inverse FFT”. FFTExtConvolution<Field, FieldExt> does the same with the real-FFT path. CRTConvolution<CRT, Field> decomposes elements into several small primes, runs independent FFTs over each prime, and reassembles via CRT — useful when the working field’s limb count makes a single large FFT expensive.
API surface
FFT<Field> — in-place cache-oblivious FFT
template <class Field>
struct FFT {
using Elt = typename Field::Elt;
static void fftf(Elt A[], size_t n, const Elt& omega, uint64_t omega_order,
const Field& F);
static void fftb(Elt A[], size_t n, const Elt& omega_j, uint64_t j,
const Field& F);
};nmust be a power of two.omegamust be a primitiveomega_order-th root of unity inField; the field must supply one —Fp128does for very high orders,Fp256through FFT-friendly subgroups via CRT.- Both calls are in-place; allocate
Alarge enough fornelements.
RFFT<FieldExt> — real-input FFT in half-complex form
template <class FieldExt>
struct RFFT {
using RElt = typename FieldExt::BaseField::Elt;
using CElt = typename FieldExt::Elt;
static void r2hc(RElt A[], size_t n, CElt omega, uint64_t order,
const FieldExt& C);
static void hc2r(RElt A[], size_t n, CElt omega, uint64_t order,
const FieldExt& C);
};Output of r2hc stores re(F[j]) at index 2j when 2j ≤ n and im(F[n-j]) otherwise; hc2r is the exact inverse.
Nussbaumer<Field> — negacyclic, linear, and middle products
template <class Field>
struct Nussbaumer {
using Elt = typename Field::Elt;
static void negacyclic(size_t n, Elt z[], Elt x[], Elt y[], const Field& F);
static void linear (size_t n, Elt z[], Elt x[], Elt y[], const Field& F);
static void middle (size_t n, Elt z[], Elt x[], Elt y[], const Field& F);
};linear output has size 2n; negacyclic and middle outputs have size n.
Convolution wrappers
template <class Field>
class FFTConvolution {
public:
// Pre-FFTs y; omega must be a primitive omega_order-th root of unity.
FFTConvolution(size_t n, size_t m, const Field& f, const Elt omega,
uint64_t omega_order, const Elt y[/*m*/]);
// z[k] = sum_{i=0}^{n-1} x[i] y[k-i], first m entries.
void convolution(const Elt x[/*n*/], Elt z[/*m*/]) const;
};
template <class Field, class FieldExt>
class FFTExtConvolution { /* same shape, half-complex internal */ };
template <class CRT, class Field>
class CRTConvolution { /* multi-prime decomposition */ };All three expose a matching Factory class for the lazy-initialization pattern reed_solomon and ligero use — construction is the expensive step (precomputing the FFT of y and the scale), subsequent calls are cheap.
LCH14<Field> — additive FFT over binary fields (gf2k/lch14.h)
LCH14 implements the Lin-Chung-Han 2014 additive FFT algorithm for binary fields (those with Field::kCharacteristicTwo). It precomputes normalized basis values ŵ_i(β_j) at construction time and uses them to compute twiddle factors on the fly.
template <class Field>
class LCH14 {
public:
explicit LCH14(const Field& F); // precomputes ŵ_i(β_j) table
// Forward additive FFT of length 2^l starting at coset offset `coset`.
void FFT(size_t l, size_t coset, Elt B[/* 1 << l */]) const;
// Inverse additive FFT of length 2^l starting at coset offset `coset`.
void IFFT(size_t l, size_t coset, Elt B[/* 1 << l */]) const;
// Bidirectional (truncated) FFT: given k "time" evaluations and 2^l - k
// "frequency" evaluations, fills in the complementary set.
void BidirectionalFFT(size_t l, size_t k, Elt B[/* 1 << l */]) const;
};- The constructor takes only the field reference; there is no
log_nparameter. The maximum supportedlisField::kSubFieldBits. - All three transform methods take
l(log₂ of the transform length) and either acosetoffset (FFT,IFFT) ork(the split point between known time and frequency samples,BidirectionalFFT). - Method names are uppercase (
FFT,IFFT,BidirectionalFFT).
Roots of unity (twiddle.h)
Twiddle<Field> caches powers of the primitive root of unity across transforms. Each FFT invocation passes its omega in, but for long-running code paths the cache in Twiddle avoids recomputing the same omega^k values. You rarely construct a Twiddle directly; the FFT and convolution constructors build one as a private member.
Complexity at a glance
| Routine | Cost |
|---|---|
FFT::fftf / fftb | O(N \log N) field operations |
RFFT::r2hc / hc2r | O(N \log N), ~half the constant of complex FFT |
Nussbaumer::negacyclic (large n) | O(N \log N) via lifted FFT |
Nussbaumer::negacyclic (small n) | O(N^{\log_2 3}) Karatsuba |
FFTConvolution::convolution | O(N \log N) per call; setup amortised |
CRTConvolution::convolution | O(k · N \log N) where k is prime count |
The polynomial interpolation (interpolation.h) and Reed-Solomon encoder (reed_solomon.h) both call into this module rather than implementing their own transforms.
See it used
longfellow-zk/lib/algebra/fft_test.cc— fft_test.cclongfellow-zk/lib/algebra/rfft_test.cc— rfft_test.cclongfellow-zk/lib/algebra/nussbaumer_test.cc— nussbaumer_test.cclongfellow-zk/lib/algebra/reed_solomon.h— the main consumer.
Related
- Reed-Solomon — layered on top of the convolution wrappers.
- Interpolation — reuses the same transforms for Lagrange ↔ monomial conversion.
- Prime Fields — picks the
Fieldthe transform runs over.