
investigate chunking bug #1806

Open
Trivo25 opened this issue Aug 26, 2024 · 5 comments

Trivo25 commented Aug 26, 2024

Problem: Chunking is implemented in Kimchi and Pickles. However, some specific circuit configurations seem to cause a bug in Pickles. A prior RFC discusses these issues, which seem to be located in the wrap circuit (which needs to implement chunking as well).

Goal: Figure out what would need to be done in order to fully support chunking in o1js. Is it a relatively small engineering task or does it require further research?


querolita commented Sep 17, 2024

Observations after a preliminary exploration of the chunking bug in o1js and how it potentially relates to the Pickles code.

TODOs throughout the Pickles code

Domain size of wrap circuit

    This circuit was compiled for proofs using the wrap domain of size 15, but the actual wrap domain size for the circuit has size 16. You should pass the ~override_wrap_domain argument to set the correct domain size
  • We might need to extend common.ml, which only considers 2^13, 2^14, and 2^15 for log_2_domain_size (the sketch after this list illustrates the 2^15 vs 2^16 gap).
  • We would also need to be able to pass values larger than N2 to override_wrap_domain, which means extending Mina_wire_types.Pickles_base.Proofs_verified.V1.t.
  • Maybe the operations performed inside the wrap circuit also require more than one chunk, though fewer than the previous step?
  • If the wrap circuit needs such a large domain, then it needs to be chunked accordingly, as in the step circuit.
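As a back-of-the-envelope illustration (a minimal standalone sketch, not Pickles code; the helper name is made up), the smallest power-of-two domain that fits a given row count shows why a wrap circuit needing more than 2^15 rows falls outside the options in common.ml:

```rust
// Hypothetical helper, for illustration only: smallest log2 domain size that
// fits `rows` rows.
fn required_log2_domain_size(rows: u64) -> u32 {
    rows.max(1).next_power_of_two().trailing_zeros()
}

fn main() {
    // 2^15 rows fit the largest option currently handled by common.ml.
    assert_eq!(required_log2_domain_size(1 << 15), 15);
    // One more row pushes the requirement to 2^16, which common.ml does not consider.
    assert_eq!(required_log2_domain_size((1 << 15) + 1), 16);
}
```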

Verifier

  • The invalid_proof test, which tries to prove 2^17 rows with an N1 wrap domain size and 4 chunks, fails to verify, returning (Pickles.verify dlog_check). I suppose it needs 4 chunks rather than just 2 in order to fit additional step operations in the circuit.

ZK rows

  • the number of ZK rows should be given by the formula (a minimal sketch follows this list):
   zk_rows := (16 * num_chunks + 5) / 7
  • when num_chunks = 1, then zk_rows = 3.
  • when num_chunks = 2, zk_rows is 4, but the above formula should give 5.
  • running an MVP of chunking with 2 chunks in o1js yields the following error message:
    there are not enough random rows to achieve zero-knowledge (expected: 3, got: 2)
  • this error comes from the Kimchi prover
  • there, index.cs.zk_rows is 2 but zero_knowledge_limit equals 3
  • the zk_rows of the constraint system is set inside the build() function of the Builder trait.
  • when no value is given for num_chunks, zk_rows defaults to 2.
  • some Kimchi bindings hardcode a fixed value of 3
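As referenced above, here is a minimal standalone sketch of the zk_rows formula (an illustration, not the actual Kimchi code), showing the values behind the mismatch:

```rust
// Illustration of zk_rows = (16 * num_chunks + 5) / 7 with integer division.
fn zk_rows(num_chunks: u64) -> u64 {
    (16 * num_chunks + 5) / 7
}

fn main() {
    assert_eq!(zk_rows(1), 3); // matches the 3 hardcoded in the bindings
    assert_eq!(zk_rows(2), 5); // while the prover currently sees 2 (default) or 3 (hardcoded)
}
```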

Exceptions

  • When running a program with chunking in o1js (4 chunks, N1 wrap domain), it fails with an uncaught error:
/workspace_root/src/bindings/ocaml/overrides.js:45
    if (err instanceof Error) throw err;

RuntimeError: unreachable
    at __rg_oom (wasm://wasm/011a91aa:wasm-function[4848]:0x3e27c2)
    at __rust_alloc_error_handler (wasm://wasm/011a91aa:wasm-function[4998]:0x3e3251)
    at alloc::alloc::handle_alloc_error::rt_error::h6c30faefac1fefed (wasm://wasm/011a91aa:wasm-function[5056]:0x3e355f)
    at alloc::alloc::handle_alloc_error::h6289c616c1280e38 (wasm://wasm/011a91aa:wasm-function[5055]:0x3e3554)
    at alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle::h41bc9aef765cae77 (wasm://wasm/011a91aa:wasm-function[3755]:0x3c27a7)
    at <o1_utils::serialization::SerdeAs as serde_with::ser::SerializeAs<T>>::serialize_as::h1da92fe6ae712792 (wasm://wasm/011a91aa:wasm-function[2483]:0x366fca)
    at serde_with::ser::const_arrays::<impl serde_with::ser::SerializeAs<[T; N]> for [As; N]>::serialize_as::h1233a3b3efbeac4b (wasm://wasm/011a91aa:wasm-function[1484]:0x2f0683)
    at kimchi::circuits::constraints::_::<impl serde::ser::Serialize for kimchi::circuits::constraints::ColumnEvaluations<F>>::serialize::h2627bc77795152ac (wasm://wasm/011a91aa:wasm-function[1126]:0x2ad0c6)
    at kimchi::prover_index::_::<impl serde::ser::Serialize for kimchi::prover_index::ProverIndex<G,OpeningProof>>::serialize::hf0dd2bc7ae7d95f7 (wasm://wasm/011a91aa:wasm-function[1589]:0x300c01)
    at caml_pasta_fp_plonk_index_encode (wasm://wasm/011a91aa:wasm-function[1909]:0x32b28c)
    at module.exports.caml_pasta_fp_plonk_index_encode (o1js/dist/node/bindings/compiled/_node_bindings/plonk_wasm.cjs:1475:14)
    at encodeProverKey (o1js/src/lib/proof-system/prover-keys.ts:105:26)
    at write_ (o1js/src/lib/proof-system/zkprogram.ts:1017:48)
    at write$0 (src/bindings/ocaml/lib/pickles_bindings.ml:470:7)
    at spec (src/bindings/ocaml/lib/pickles_bindings.ml:421:40)
    at filter_map$1 (ocaml/base/list.ml:812:14)
    at write$1 (src/bindings/ocaml/lib/pickles_bindings.ml:417:9)
    at write$0 (src/mina/src/lib/key_cache/key_cache.ml:140:5)
    at _iCZ_ (src/mina/src/lib/pickles/cache.ml:114:18)
    at <anonymous> (src/mina/src/lib/promise/js/promise.js:25:37)
  • According to @mitschabaude, it could be a memory issue with storing the proving key.
  • When running with let { verificationKey } = await MyProgram.compile({ cache: Cache.None }); to skip writing the prover key, it complains again about a mismatch in the expected number of zero-knowledge rows.
/workspace_root/src/bindings/ocaml/jsoo_exports/overrides.js:45
    if (err instanceof Error) throw err;
                              ^
Error: there are not enough random rows to achieve zero-knowledge (expected: 8, got: 3)
    at module.exports.__wbindgen_error_new (o1js/dist/node/bindings/compiled/_node_bindings/plonk_wasm.cjs:9723:17)
    at <wasm_bindgen::JsError as core::convert::From<E>>::from::he54f4fac6a715911 (wasm://wasm/01276de2:wasm-function[4435]:0x4041cf)
    at caml_pasta_fp_plonk_proof_create (wasm://wasm/01276de2:wasm-function[660]:0x2496fd)
    at module.exports.caml_pasta_fp_plonk_proof_create (o1js/dist/node/bindings/compiled/_node_bindings/plonk_wasm.cjs:2937:14)
    at caml_pasta_fp_plonk_proof_create (src/mina/src/lib/crypto/kimchi_bindings/js/bindings.js:789:15)
    at <anonymous> (src/mina/src/lib/crypto/kimchi_backend/pasta/vesta_based_plonk.ml:120:15)
  • This suggests a problem in the way we use memory when writing the proving key, and reaffirms that the bindings do not use the right number of zk rows (always 3 instead).

Hardcoded

The codebase is full of places where a hardcoded 3 is used for the number of zero-knowledge rows and a hardcoded 1 for the number of chunks:

  • #L1570, #L184 and #L1841 in /src/lib/pickles/pickles.ml

  • #L457 in /src/lib/pickles/step.ml

  • #L63 and #L120 in /src/lib/pickles/verify.ml

  • #L383, #L882, #L883, #L937 and #L938 in src/lib/pickles/compile.ml

  • #L90 in /src/lib/pickles/unfinalized.ml

  • #L41 in src/lib/pickles/wrap_proof.ml

  • #L1566 in src/lib/pickles/wrap_verifier.ml

  • #L200 in /src/lib/pickles/verification_key.ml

  • #L196 and #L197 in /src/lib/pickles/step_branch_data.ml

  • #L232 in /src/lib/pickles/side_loaded_verification_key.ml

  • #L242 and #L251 in /src/lib/pickles/plonk_checks/plonk_checks.ml

  • #L282 in /src/lib/crypto/kimchi_bindings/stubs/src/pasta_fp_plonk_verifier_index.rs

  • #L281 in /src/lib/crypto/kimchi_bindings/stubs/src/pasta_fq_plonk_verifier_index.rs

Proposals

PR16057

  • in kimchi_bindings/wasm/src/plonk_verifier_index.rs:
    • res.set(zk_w(domain, index.zk_rows)).unwrap();
    • res.set(permutation_vanishing_polynomial(domain, index.zk_rows)).unwrap();
    • but this has no effect on the o1js test
  • in src/lib/pickles/compile.ml
    • ((2 * (permuts + 1) * num_chunks) - 2 + permuts) / permuts instead of -1
  • in src/lib/pickles/step_branch_data.ml
    • ((2 * (permuts + 1) * num_chunks) - 2 + permuts) / permuts instead of -1
The above fixes `invalid_proof.ml`.
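As a sanity check (illustration only, assuming permuts = 7, the value used in Kimchi's permutation argument), the proposed expression reduces to the (16 * num_chunks + 5) / 7 formula from the ZK rows section above:

```rust
// Evaluates the expression proposed in the PR; with permuts = 7 it equals
// (16 * num_chunks + 5) / 7.
fn zk_rows_via_permuts(num_chunks: u64) -> u64 {
    let permuts = 7;
    ((2 * (permuts + 1) * num_chunks) - 2 + permuts) / permuts
}

fn main() {
    assert_eq!(zk_rows_via_permuts(1), 3);
    assert_eq!(zk_rows_via_permuts(2), 5);
}
```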

PR16092

  • define the default number of ZK rows for 1 chunk as a constant, to avoid the many locations with the hardcoded value 3.
    • this is less error-prone and easier to keep track of.

PR2590

  • fix the mismatch in the Kimchi ProverError::NotZeroKnowledge error message
    • the number of rows needs to be at least (16*chunks-2)/7 + 1, but the error message suggests (16*chunks-2)/7 is sufficient.
    • the error must be triggered when the number of rows is <= the strict lower bound, not just <.

  • I couldn't find any more locations where hardcoded values are introduced, so I believe there must be some calls to these functions that use default values.
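A sketch of the corrected bound from PR2590 (illustration only, not the actual Kimchi code): the prover must reject whenever zk_rows is less than or equal to (16 * chunks - 2) / 7, i.e. at least (16 * chunks - 2) / 7 + 1 rows are required:

```rust
// Returns true when there are enough random rows for zero knowledge.
fn has_enough_zk_rows(zk_rows: u64, chunks: u64) -> bool {
    let strict_lower_bound = (16 * chunks - 2) / 7;
    // Rejecting only when zk_rows < strict_lower_bound would wrongly accept
    // the boundary case; the fix rejects on <= (i.e. requires strictly more).
    zk_rows > strict_lower_bound
}

fn main() {
    assert!(has_enough_zk_rows(3, 1));  // (16 - 2) / 7 = 2, so 3 rows suffice
    assert!(!has_enough_zk_rows(2, 1)); // exactly at the bound: not enough
}
```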

Wrap circuit

As the number of chunks increases, the number of commitments and the complexity of the verifier also grow. That means the step circuit grows to fit the verifier computation, and the wrap circuit derived from it grows as well. We would need to measure how much it grows in order to automatically override the wrap circuit size, and when that circuit exceeds 2^16, chunk it accordingly. This needs further engineering work to replicate in the wrap circuit the behavior that step circuits already have. Worst case, I would start by setting the number of chunks required in the wrap circuit to the number of chunks of the step circuit, even though it should be smaller (plus, I wonder if it would be a problem to set the chunks higher than actually needed, leaving "empty" chunks).
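As a rough illustration of the last point (a hypothetical helper, not Pickles code), the chunk count for a wrap circuit that outgrows the 2^16 maximum domain would be the ceiling of rows / 2^16:

```rust
// Hypothetical back-of-the-envelope calculation: number of chunks needed for a
// circuit of `rows` rows, given a maximum domain size of 2^16.
fn wrap_chunks_needed(rows: u64) -> u64 {
    const MAX_DOMAIN: u64 = 1 << 16;
    rows.div_ceil(MAX_DOMAIN)
}

fn main() {
    assert_eq!(wrap_chunks_needed(1 << 16), 1);       // still fits in one chunk
    assert_eq!(wrap_chunks_needed((1 << 16) + 1), 2); // just over: needs two chunks
}
```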


mitschabaude commented Sep 18, 2024

Nice investigation @querolita!!

When running a program with chunking on o1js (4 chunks, N1 wrap domain), it fails with
an uncaught error

Judging from the trace, this is an out-of-memory error ("__rg_oom") that happens when writing the prover key to a file.
You could compile with zkprogram.compile({ cache: Cache.None }) to skip writing the prover key (but the OOM will probably just show up somewhere later).

We might have to investigate and improve memory usage in Kimchi/Wasm!


Trivo25 commented Oct 16, 2024

As originally suspected, the process runs out of memory before it can finish proving a circuit with even 2 chunks (or 2^17 constraints). As a result, we need to decide what approach we want to take next:

  1. wait until the problem resolves itself :)
  2. investigate whether we can optimise our wasm code in a way that enables us to prove larger circuits
  3. (partial fix) enable native prover access; this means we will be able to prove large circuits (likely much larger than 2^17 constraints too), but we won't be able to do this in the browser
  4. investigate what state Memory64 is in and if or when we can utilize it

The current wasm memory limit is 4 GB, which the process reaches quickly when trying to compile and prove a circuit of 2^17 constraints (even slightly above 2^16 constraints). I estimate we would need at least 6 to 8 GB for 2^17 constraints.

@mitschabaude

4. investigate what state Memory64 is in and if or when we can utilize it

Memory64 can be used in Node.js with a flag, and also in some nightly browsers behind a flag, but you can't expect users to set those. Not sure about the state of the Rust compilation pipeline to wasm64.

https://webassembly.org/features/
[screenshot: WebAssembly feature support table]


Trivo25 commented Oct 16, 2024

4. investigate what state Memory64 is in and if or when we can utilize it

Memory64 can be used in Node.js with a flag, and also in some nightly browsers behind a flag, but you can't expect users to set those. Not sure about the state of the Rust compilation pipeline to wasm64.

https://webassembly.org/features/

[screenshot: WebAssembly feature support table]

Rust can target wasm64 (Memory64) as a compile target, but I am unsure whether the rest of our stack is compatible with it. Either way, it would still be worth investigating further. I could even see a world where we release chunking in an experimental state that only works with the correct Node flags set, until Memory64 lands in an official capacity.
