Performance
ProveKit is fast enough to ship in production for many workloads, but “fast” depends almost entirely on circuit shape. This page tells you how to measure your circuit honestly, what dimensions matter, and where the benchmark suite lives.
What drives performance
| Dimension | Effect |
|---|---|
| Constraint count | Roughly linear in R1CS row count for the dominant proving phases (witness solve, commitment, sumcheck). |
| Hash choice | skyscraper is the fastest option for BN254 Merkle commitments. SHA-256, Keccak, Blake3, and Poseidon2 are slower in different ways: Keccak and SHA-256 have larger constraint footprints inside Noir circuits, while Skyscraper, Blake3, and Poseidon2 are cheaper in-circuit. |
| Witness layer count | Witness builders execute in layers (see Proving flow). Deep layer graphs add coordination overhead. |
| CPU architecture | aarch64 benefits from SIMD-accelerated BN254 arithmetic in skyscraper/core. x86_64 falls back to portable arithmetic. |
| Parallelism | Proving uses Rayon. More cores help up to the parallelism inherent in the circuit. WASM threading depends on SharedArrayBuffer. |
| Host memory | Mobile FFI hosts can swap to disk via pk_configure_memory(...). File-backed mmap allocation is slower than RAM but unlocks larger circuits. |
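
Because the dominant proving phases scale roughly linearly in R1CS row count, one measured circuit gives a first-order estimate for another size. The sketch below assumes purely linear scaling and ignores fixed overhead, memory pressure, and parallelism limits. The function name and the numbers are hypothetical, not measurements.

```rust
/// First-order estimate of proving time for a circuit with `target_rows`
/// R1CS rows, extrapolated linearly from one measured data point.
/// Assumption: cost is proportional to row count; fixed overhead is ignored.
fn estimate_proving_secs(measured_rows: u64, measured_secs: f64, target_rows: u64) -> f64 {
    assert!(measured_rows > 0, "need a non-empty reference circuit");
    measured_secs * (target_rows as f64 / measured_rows as f64)
}

fn main() {
    // Hypothetical reference point: a 100k-row circuit measured at 2.0 s.
    let estimate = estimate_proving_secs(100_000, 2.0, 400_000);
    println!("estimated proving time: {estimate:.1} s"); // prints 8.0 s
}
```

Treat the result as a sanity check against circuit-stats output, not as a substitute for measuring on the target host.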
Measuring your circuit
The CLI prints span timings and memory statistics through its tracing layer. The simplest measurement:

    cargo run --release --bin provekit-cli -- prove --prover circuit.pkp

Inspect the structured timing output it prints. For finer-grained profiling, build with the Tracy feature:

    cargo run --release --features tracy --bin provekit-cli -- --tracy prove

For programmatic benchmarking across hash configurations, host variants, and circuit sizes, use the provekit-bench crate in tooling/provekit-bench/. It hooks into criterion and reports proving time, verification time, peak memory, and proof size.
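
For a quick wall-clock number before setting up provekit-bench, repeating a run several times and taking the median is usually enough to smooth out noise. The sketch below uses only the Rust standard library. `time_runs` and `median` are hypothetical helpers, not part of ProveKit, and the closure is a placeholder for whatever you actually want to time.

```rust
use std::time::{Duration, Instant};

/// Runs `work` `runs` times and returns the wall-clock duration of each run.
fn time_runs<F: FnMut()>(runs: usize, mut work: F) -> Vec<Duration> {
    (0..runs)
        .map(|_| {
            let start = Instant::now();
            work();
            start.elapsed()
        })
        .collect()
}

/// Median of the samples (upper median for even counts); `None` if empty.
fn median(samples: &[Duration]) -> Option<Duration> {
    let mut sorted = samples.to_vec();
    sorted.sort();
    sorted.get(sorted.len() / 2).copied()
}

fn main() {
    // Placeholder workload. In practice the closure would spawn the CLI,
    // for example through std::process::Command.
    let samples = time_runs(5, || {
        std::hint::black_box((0..1_000_000u64).sum::<u64>());
    });
    println!("median: {:?}", median(&samples));
}
```

The median is less sensitive than the mean to a single slow run caused by a cold cache or background load.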
Inspecting the circuit before proving
Two CLI commands tell you what you’re about to prove:

    # R1CS structure and ACIR statistics.
    cargo run --release --bin provekit-cli -- circuit-stats target/<circuit>.json

    # Size breakdown of the prover key, matrix sparsity, witness builders, hash config.
    cargo run --release --bin provekit-cli -- analyze-pkp <circuit>.pkp

Use circuit-stats to confirm constraint counts match your expectations before committing to a host. A circuit that fits comfortably on a server may exceed practical proving time on mobile.
The ProveKit benchmark suite
noir-examples/csp-benchmarks/ contains the Ethproofs CSP benchmarks, a standardized suite of client-side proving targets used to compare proof systems on common workloads.
| Target | Circuit sizes | Implementation note |
|---|---|---|
| SHA-256 | 128, 256, 512, 1024, 2048 bytes | Uses noir-lang/sha256::sha256_var, lowering compression through Noir’s SHA-256 blackbox. |
| Keccak-256 | 128, 256, 512, 1024, 2048 bytes | Native Noir Keccak circuit with a witness-focused u32 lane representation. |
| Poseidon | 2, 4, 8, 12, 16 field elements | noir-lang/poseidon BN254 native Noir helpers. |
| Poseidon2 | 2, 4, 8, 12, 16 field elements | TaceoLabs/noir-poseidon for states 2, 8, 12, 16; state 4 intentionally exercises Noir’s Poseidon2 blackbox. |
| ECDSA | secp256r1 over a 32-byte digest | zkpassport/noir-ecdsa native P-256 verification (P-256 blackbox is not yet lowered by ProveKit). |
To run any benchmark target:

    cd noir-examples/csp-benchmarks/sha256_512
    cargo run --release --bin provekit-cli -- prepare
    cargo run --release --bin provekit-cli -- prove
    cargo run --release --bin provekit-cli -- verify

Combine that with the CLI’s timing output (or the provekit-bench harness) to capture proving time, verification time, and memory for each target on your machine.
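
To repeat the prepare, prove, and verify sequence over several targets, a small driver can spawn the same commands from each target directory. This is a minimal sketch using only the standard library. `cli_command` is a hypothetical helper, target directories are passed as command-line arguments rather than discovered, and the measured wall-clock time includes cargo's own startup and up-to-date check.

```rust
use std::path::Path;
use std::process::Command;
use std::time::Instant;

/// The three CLI steps, in the order they must run.
const STEPS: [&str; 3] = ["prepare", "prove", "verify"];

/// Builds `cargo run --release --bin provekit-cli -- <step>` for one target directory.
fn cli_command(target_dir: &Path, step: &str) -> Command {
    let mut cmd = Command::new("cargo");
    cmd.args(["run", "--release", "--bin", "provekit-cli", "--", step])
        .current_dir(target_dir);
    cmd
}

fn main() -> std::io::Result<()> {
    // Each argument is a target directory, for example
    // noir-examples/csp-benchmarks/sha256_512.
    for dir in std::env::args().skip(1) {
        for step in STEPS {
            let start = Instant::now();
            let status = cli_command(Path::new(&dir), step).status()?;
            println!("{dir} {step}: {:?} (success: {})", start.elapsed(), status.success());
        }
    }
    Ok(())
}
```

For numbers you intend to compare or publish, prefer the span timings printed by the CLI or the provekit-bench harness, since both exclude process startup.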
What to expect across hosts
The fundamentals don’t change between hosts, but resource constraints do:
- Native Rust on a workstation. The reference platform. Smallest measured proving time, largest available memory.
- WASM in a browser. Slower than native: the proof system runs single-threaded unless SharedArrayBuffer is available, and JavaScript marshalling adds overhead at the boundaries.
- WASM in Node.js. Closer to native than browser WASM, but still single-process unless you orchestrate workers externally.
- iOS / Android via FFI. Bounded by device RAM unless you configure pk_configure_memory for file-backed mmap. Modern phones can prove non-trivial credential circuits on-device; budget memory carefully.
- Verifier server. Verification dominates. Concurrency is configurable through VERIFIER_SEMAPHORE_LIMIT; the default of one keeps memory usage predictable.
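
To illustrate why a semaphore limit keeps memory predictable, the sketch below caps how many jobs run at once using a counting semaphore built from standard-library primitives. It is not ProveKit's verifier implementation. `parse_limit` and `Semaphore` are hypothetical, and falling back to one for unset, unparsable, or zero values is an assumption made for this example.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Parses a concurrency limit, falling back to 1 when unset, unparsable, or zero.
/// (Fallback behaviour is an assumption of this sketch.)
fn parse_limit(raw: Option<&str>) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(1)
}

/// Minimal counting semaphore built from std primitives.
struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Blocks until a permit is free, runs `job` while holding it, then returns the permit.
    fn with_permit<T>(&self, job: impl FnOnce() -> T) -> T {
        let mut free = self.permits.lock().unwrap();
        while *free == 0 {
            free = self.available.wait(free).unwrap();
        }
        *free -= 1;
        drop(free); // release the lock while the job runs
        let result = job();
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
        result
    }
}

fn main() {
    let raw = std::env::var("VERIFIER_SEMAPHORE_LIMIT").ok();
    let gate = Arc::new(Semaphore::new(parse_limit(raw.as_deref())));

    // Four pretend verification requests; at most `limit` run at once.
    let handles: Vec<_> = (0..4)
        .map(|id| {
            let gate = Arc::clone(&gate);
            thread::spawn(move || gate.with_permit(|| println!("verifying request {id}")))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}
```

With a limit of one, peak memory is bounded by a single in-flight job; raising the limit trades memory headroom for throughput.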
Recursive verification cost
The Groth16 wrapper through the Go/gnark recursive verifier adds overhead on top of the base proof: typically a one-time setup cost (proving key generation) and a recurring, fixed per-proof proving cost for the outer proof. The base proof’s verification is the inner workload; everything else is the wrapper. Plan for the Groth16 step to dominate end-to-end latency when on-chain settlement is the goal.
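
The split between one-time and recurring cost matters when budgeting latency. The sketch below is a hypothetical cost model: the struct, its field names, and all numbers are placeholders for illustration, not measurements of ProveKit or gnark.

```rust
/// Hypothetical cost model for the recursive pipeline, all values in seconds.
struct WrapperCosts {
    setup_secs: f64,       // one-time: Groth16 proving key generation
    base_prove_secs: f64,  // recurring: base proof
    outer_prove_secs: f64, // recurring: Groth16 outer proof
}

impl WrapperCosts {
    /// Average end-to-end latency per proof when setup is amortized over `proofs` proofs.
    fn amortized_secs(&self, proofs: u32) -> f64 {
        assert!(proofs > 0, "amortize over at least one proof");
        self.setup_secs / f64::from(proofs) + self.base_prove_secs + self.outer_prove_secs
    }
}

fn main() {
    // Placeholder numbers for illustration only; measure your own pipeline.
    let costs = WrapperCosts { setup_secs: 60.0, base_prove_secs: 2.0, outer_prove_secs: 10.0 };
    for proofs in [1, 10, 100] {
        println!("{proofs:>3} proofs: {:.1} s per proof", costs.amortized_secs(proofs));
    }
}
```

As the proof count grows, the setup term vanishes and the recurring outer-proof cost sets the floor, which is why the Groth16 step is the one to budget for.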
Optimization checklist
If proving is slower than you need:
- Run circuit-stats. Confirm constraint count matches expectations. Unexpected blowups usually indicate accidentally-quadratic constraint generation.
- Switch to skyscraper if you haven’t. It’s the default but worth confirming. Other hashes are slower in-circuit.
- Audit black-box vs native lowerings. Some Noir black boxes (SHA-256, Keccak) are heavier in ProveKit than their native R1CS implementations. The CSP benchmarks call this out explicitly.
- Profile with Tracy. Run cargo run --release --features tracy --bin provekit-cli -- --tracy prove and inspect span timings. Look for layers that dominate the witness solve.
- Split the circuit if possible. Two smaller proofs may verify faster than one giant proof, depending on transport overhead and what the verifier needs to check.
Related pages
- CLI reference: flags for circuit-stats, analyze-pkp, and Tracy profiling.
- Examples catalog: circuits you can benchmark against.
- Proving flow: the conceptual pipeline being measured.