Compiling & Caching Circuits

Compiling a non-trivial circuit — SHA plus ECDSA plus a CBOR parser, for instance — is slow. Proving against that circuit is fast enough to do per-request. The asymmetric cost profile is why Longfellow ships lib/proto/: serialise the compiled Circuit<Field> once, cache the bytes, and hand them to every prover and verifier that needs them. This page covers the three moving parts that make that pattern safe: the circuit_id hash, the serialisation format, and the transcript binding.

When to reach for this pattern

You are deploying a fixed circuit (attestation verifier, signed-document predicate) to many provers or verifiers who will use it repeatedly.
Your compile step takes seconds to minutes and your prove step needs to take milliseconds.
Two parties need to agree on “we are talking about the same circuit” without exchanging the source or recompiling.

The `circuit_id` contract

Every compiled Circuit<Field> carries a 32-byte circuit_id — a SHA-256 hash over its structural content: the field type, the input / public-input / copy counts, the per-layer gate structure, and the compile-time constants referenced by any gate. It is computed by sumcheck/circuit_id.h during compilation and stored as Circuit<Field>::id[32].

The id changes when any of the following change:

The base field (Fp256 vs. Fp256k1 — two different ids).
The gate structure: adding, removing, or reordering any gate. The scheduler is deterministic, so identical source produces identical ids.
The compile-time constants referenced by gates. Changing a magic number in the circuit changes the id.

It does not change when:

The prover’s witness changes. (The witness is not part of the circuit.)
The rate / nreq Ligero parameters change. (Those live in ZkProof, not Circuit.)

Note that it does change with the nc batching factor: mkcircuit(nc) bakes nc into the compiled circuit, so two circuits built from identical source with different nc have different ids. If you plan to re-batch, re-compile.

Two parties exchanging the id have a cheap, collision-resistant way to check they are talking about the same circuit.
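The agreement check itself is a 32-byte comparison. A minimal sketch, assuming each side holds the id as a raw 32-byte array in the shape of Circuit<Field>::id[32]; same_circuit is a hypothetical helper name, not part of Longfellow:

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper: compares two 32-byte circuit ids.
// Ids are public values, so an ordinary (non-constant-time) comparison suffices.
inline bool same_circuit(const uint8_t (&mine)[32],
                         const uint8_t (&theirs)[32]) {
  return std::equal(mine, mine + 32, theirs);
}
```

A verifier would typically run this against an id it obtained out of band (pinned in configuration, say) before accepting any proof for that circuit.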

Serialising and shipping

The serialisation primitive is CircuitWriter<Field>::to_bytes. CircuitWriter is an instance class constructed with a Field reference and a FieldID:


std::vector<uint8_t> bytes;
CircuitWriter<Field> writer(field, field_id);
writer.to_bytes(*circuit, bytes);
// bytes now carries the whole circuit, delta-encoded, in a few KB to MB.

The format (covered in full on Serialization (Reference)) deduplicates constants and delta-encodes wire indices — expect roughly 1 MB per million gates pre-compression, a few tens of KB after xz. The header embeds the circuit_id, so readers can cross-check without re-hashing.
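To see why delta-encoding shrinks wire indices, here is a toy sketch. It is not Longfellow's actual wire format: it stores each index as the difference from the previous one and writes that difference as a LEB128-style variable-length integer. The function names and the varint choice are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy delta + varint encoder for a list of wire indices.
// NOT the Longfellow format; an illustration of the idea only.
// Output is compact when indices are non-decreasing and close together.
std::vector<uint8_t> delta_encode(const std::vector<uint64_t>& indices) {
  std::vector<uint8_t> out;
  uint64_t prev = 0;
  for (uint64_t idx : indices) {
    uint64_t delta = idx - prev;  // small when neighbouring indices are close
    prev = idx;
    // LEB128-style varint: 7 payload bits per byte, high bit means "more follows".
    while (delta >= 0x80) {
      out.push_back(static_cast<uint8_t>((delta & 0x7f) | 0x80));
      delta >>= 7;
    }
    out.push_back(static_cast<uint8_t>(delta));
  }
  return out;
}

// Inverse of delta_encode. Assumes well-formed input produced by delta_encode.
std::vector<uint64_t> delta_decode(const std::vector<uint8_t>& bytes) {
  std::vector<uint64_t> out;
  uint64_t prev = 0;
  size_t i = 0;
  while (i < bytes.size()) {
    uint64_t delta = 0;
    int shift = 0;
    uint8_t b = 0;
    do {
      b = bytes[i++];
      delta |= static_cast<uint64_t>(b & 0x7f) << shift;
      shift += 7;
    } while ((b & 0x80) != 0 && i < bytes.size());
    prev += delta;  // undo the differencing
    out.push_back(prev);
  }
  return out;
}
```

Four neighbouring indices around one million take 32 bytes as raw 64-bit integers, but only a handful of bytes in this encoding, because every delta after the first fits in a single byte.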

The reverse is CircuitReader<Field>::from_bytes. Like CircuitWriter, CircuitReader is an instance class constructed with a Field reference and a FieldID; from_bytes reads from a ReadBuffer (from util/readbuffer.h):


ReadBuffer buf{bytes.data(), bytes.size()};
CircuitReader<Field> reader(field, field_id);
auto maybe_circuit = reader.from_bytes(buf, /* enforce_circuit_id */ true);
if (!maybe_circuit) { /* rejected: version mismatch, field ID mismatch, or id mismatch */ }
auto circuit = std::move(*maybe_circuit);

Pass enforce_circuit_id = true when you did not produce the bytes yourself. It recomputes the id after deserialising and rejects if it differs from the header — the cheap defence against a corrupted cache or a silent upstream rebuild that produced a different circuit with the same name.
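The shape of that check can be sketched without the library. The blob layout below (a 32-byte id followed by the payload) and the load_blob name are hypothetical, and the hash is supplied by the caller as a stand-in for the real structural hash; only the recompute-and-compare step mirrors what the paragraph above describes.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <optional>
#include <vector>

using Id = std::array<uint8_t, 32>;
// Stand-in for the real structural hash; the caller supplies it.
using Hasher = std::function<Id(const uint8_t*, size_t)>;

// Hypothetical blob layout, for illustration only: [32-byte id][payload...].
// Returns the payload if the recomputed id matches the header, or if
// enforcement is switched off; returns nullopt otherwise.
std::optional<std::vector<uint8_t>> load_blob(const std::vector<uint8_t>& blob,
                                              const Hasher& hash,
                                              bool enforce_id) {
  if (blob.size() < 32) return std::nullopt;  // truncated header
  Id header{};
  std::copy(blob.begin(), blob.begin() + 32, header.begin());
  std::vector<uint8_t> payload(blob.begin() + 32, blob.end());
  if (enforce_id && hash(payload.data(), payload.size()) != header) {
    return std::nullopt;  // corrupted blob, or a different circuit
  }
  return payload;
}
```

With enforcement off, a payload that no longer matches its header is returned as if nothing were wrong, which is the failure mode the flag exists to prevent.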

Transcript binding

Serialisation only gets you to “both sides hold identical bytes.” Turning that into “and we are going to generate the same challenges” is the transcript’s job. In zk_common.h, both ZkProver and ZkVerifier start the Fiat-Shamir transcript by writing the circuit’s 32-byte id, then each public input element individually, then a pro-forma zero output, and finally nterms() zero bytes for correlation intractability:


Transcript ts(domain_separator, domain_separator_len);
ts.write(circuit.id, sizeof(circuit.id));  // <-- binds the proof to this circuit.
for (size_t i = 0; i < circuit.npub_in; ++i) {
  ts.write(pub.at(i), F);                  // each public input element
}
ts.write(F.zero(), F);                     // pro-forma zero output
ts.write0(circuit.nterms());               // nterms() zero bytes for correlation intractability
// ... everything else the protocol writes ...

Note: ZkProver::prove and ZkVerifier::verify call ZkCommon::initialize_sumcheck_fiat_shamir internally — do not replicate these writes manually.

Because every random challenge is derived from the transcript state, a prover who generates a proof against circuit A cannot pass it off as a proof for circuit B: the two transcripts diverge at the very first write, so every challenge derived from them differs except with negligible probability.
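The binding can be illustrated with a toy transcript. This is a sketch under loud assumptions: ToyTranscript and first_challenge are hypothetical names, and 64-bit FNV-1a (which is not cryptographic) stands in for the real transcript hash. It is for illustration only and is not a substitute for the library's own initialisation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy Fiat-Shamir transcript: absorbs bytes into a 64-bit FNV-1a state and
// exposes that state as the "challenge". FNV-1a is NOT cryptographic; it
// stands in for the real transcript hash purely to show the binding.
class ToyTranscript {
 public:
  void write(const uint8_t* data, size_t len) {
    for (size_t i = 0; i < len; ++i) {
      state_ ^= data[i];
      state_ *= 0x100000001b3ULL;  // FNV-1a 64-bit prime
    }
  }
  uint64_t challenge() const { return state_; }

 private:
  uint64_t state_ = 0xcbf29ce484222325ULL;  // FNV-1a 64-bit offset basis
};

// Hypothetical helper: the first challenge for a given circuit id and
// public-input bytes. The id is absorbed first, as in the text above.
uint64_t first_challenge(const std::array<uint8_t, 32>& circuit_id,
                         const uint8_t* pub, size_t pub_len) {
  ToyTranscript ts;
  ts.write(circuit_id.data(), circuit_id.size());
  ts.write(pub, pub_len);
  return ts.challenge();
}
```

Two parties with the same id and public inputs derive the same challenge; change a single byte of the id and the challenge changes with it, so a proof produced under one id does not verify under another.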

This has an important consequence for caching: the cache key is the circuit_id, not the source file path. Two equivalent-looking circuits that produce different ids (because one uses a stale compile-time constant, say) must not share a cache entry. Using the hex-encoded 32-byte id as the filename of the cached blob is a clean default.
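A minimal sketch of an id-keyed cache, assuming the id is available as 32 raw bytes. Both id_to_hex and CircuitCache are hypothetical names introduced for illustration; a real deployment would back this with a filesystem or object store rather than an in-memory map.

```cpp
#include <array>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: lower-case hex of a 32-byte circuit id (64 characters),
// safe to use as a filename or object-store key.
std::string id_to_hex(const std::array<uint8_t, 32>& id) {
  static const char kDigits[] = "0123456789abcdef";
  std::string out;
  out.reserve(64);
  for (uint8_t byte : id) {
    out.push_back(kDigits[byte >> 4]);
    out.push_back(kDigits[byte & 0x0f]);
  }
  return out;
}

// Illustrative in-memory cache: serialised circuit bytes keyed by hex id.
class CircuitCache {
 public:
  void put(const std::array<uint8_t, 32>& id, std::vector<uint8_t> bytes) {
    blobs_[id_to_hex(id)] = std::move(bytes);
  }
  // Returns nullptr on a cache miss.
  const std::vector<uint8_t>* get(const std::array<uint8_t, 32>& id) const {
    auto it = blobs_.find(id_to_hex(id));
    return it == blobs_.end() ? nullptr : &it->second;
  }

 private:
  std::map<std::string, std::vector<uint8_t>> blobs_;
};
```

Because the key is derived from the circuit's content, a rebuilt circuit with a different constant lands in a different entry instead of overwriting the old one.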

A deployment sketch


// On the build host, once.
auto circuit = compile_my_circuit(field);
std::vector<uint8_t> bytes;
CircuitWriter<Field> writer(field, field_id);
writer.to_bytes(*circuit, bytes);
save_to_object_store(/* key */ hex(circuit->id), bytes);
 
// On each prover, once per circuit.
auto bytes = load_from_object_store(known_circuit_id_hex);
ReadBuffer buf{bytes.data(), bytes.size()};
CircuitReader<Field> reader(field, field_id);
auto circuit = reader.from_bytes(buf, /* enforce */ true);
if (!circuit) { /* rejected: do not prove against this blob */ }
 
// On each prove.
ZkProver<Field, ReedSolomonFactory> prover(*circuit, field, rs_factory);
// ... fill witness, commit, prove.

The verifier follows the same pattern: load the blob once, verify many. Deserialisation is a small one-time cost per circuit, dwarfed by commit and prove.

Pitfalls to watch for

Forgetting to pass enforce_circuit_id = true on untrusted inputs. A corrupted cache blob silently deserialises into a valid-looking circuit with the wrong structure. Every subsequent proof fails transcript verification with an opaque “bad proof” error. Always enforce on production load.
Shipping a circuit compiled with a different nc. mkcircuit(nc) bakes the batching factor into the circuit, and therefore into its id. If the prover wants nc = 4 and the cached blob was compiled with nc = 1, that blob is the wrong circuit; cache one blob per nc.
Using the source-file path as the cache key. The path stays the same when the file's contents, and therefore the compiled circuit, change. Use the id.

Serialization (Reference) — the on-disk format.
Circuit Compiler (Reference) — the step that produces the Circuit<Field> and its id.
End-to-end Prover & Verifier — what happens after the blob is loaded.