June 16, 2021

SpywareNews.com

Dedicated Forum to help removing adware, malware, spyware, ransomware, trojans, viruses and more!

Reverie: An optimized zero-knowledge proof system

Reverie: An optimized zero-knowledge proof system

Zero-knowledge proofs, once a theoretical curiosity, have recently seen widespread deployment in blockchain systems such as Zcash and Monero. However, most blockchain applications of ZK proofs make proof size and performance tradeoffs that are a poor fit for other use-cases. In particular, these protocols often require an elaborate trusted setup phase and optimize for proof size at the expense of prover time. Many would-be applications of ZK proofs are so complex that the ZK-SNARK protocol used by Zcash would take days (if not months) to run. With this in mind, I believe it is important to provide engineers who want to tinker with ZK proofs an alternative set of tradeoffs.

During my internship at Trail of Bits, I implemented a ZK proof system Reverie that optimizes for prover efficiency and does not require any trusted setup. These optimizations come at the expense of proof size, but Reverie allows proofs to be streamed across a network and verified incrementally so as to avoid loading large proofs into memory all at once. Reverie is also available as reverie-zk on crates.io.

The rest of this blog post will explain the underlying ZK proof system behind Reverie and provide performance benchmarks. Since the proof system uses techniques from secure multiparty computation (MPC), I’ll start by going over some MPC basics and then showing how to build ZK proofs using MPC.

MPC Primer

Reverie: An optimized zero-knowledge proof system

The players have secret inputs X, Y, Z, respectively, and compute f(X, Y, Z) without revealing X, Y or Z.

The goal of a multi-party computation protocol is to allow the distributed computation of a function on inputs from different parties, without the parties revealing their inputs to one other.

For instance:

  1. Tallying a vote without revealing the values of individual votes.
  2. Computing the number of shared customers without revealing your customer databases.
  3. Training/evaluating a machine-learning model on private datasets without sharing confidential data.

Zero-knowledge proofs and multi-party computation have conceptual similarities, but there are crucial differences:

  1. MPC allows the computation upon private data, where zero-knowledge proofs only enable parties to prove properties about private data.
  2. MPC protocols are interactive (their non-interactive cousin is functional encryption), requiring the parties to be online. Zero-knowledge proofs can be either interactive or non-interactive.

In this post we see that these two primitives are indeed related by describing how to “compile” a multi-party computation protocol into a zero-knowledge proof system. This is a beautiful theoretic result: essentially, checking the output of a function inside a zero-knowledge proof will always be more efficient than computing the function using multi-party computation.

Additionally, it can also lead to very simple, concretely efficient proof systems in practice, with applications including practical post-quantum signature schemes. As we’ll see, it’s a very general and versatile framework for constructing zero-knowledge proof systems.

Cryptographic Compilers

We will now explore how to apply a sequence of “cryptographic compilers” to any MPC protocol to obtain a non-interactive zero-knowledge proof system. A cryptographic compiler uses cryptographic tools to produce a new protocol from an existing one. The advantage of “compilers” in protocol design is that they are general: any input protocol satisfying a set of properties will yield a new protocol where a new set of properties are guaranteed to hold.  In other words, you can compose cryptographic primitives in a black-box fashion to derive new protocols in a “mechanical way.”

IKOS Compiler

Given an MPC protocol capable of computing a function “g(x),” the IKOS compiler yields a zero-knowledge proof system for input/output pairs (w, o) of the function “g”: It enables a prover to convince a verifier that it knows a secret input “w” such that “g(w) = o” for some output “o,” e.g., if “g(x) = SHA-256(x)” to prove that it knows an input “w” to SHA-256 resulting in a given hash “o” without revealing the pre-image (“w”).

At a high level, this is done when the prover locally runs the MPC protocol for “g(x)” where the input “w” is shared among the players. For example, in the case of three players, this can be done by picking random X, Y, Z subject to X + Y + Z = w. In this way, knowing just two of the inputs, X, Y, Z reveals nothing about w. Then the function f(X, Y, Z) = g(X + Y + Z) is computed using the multi-party computation protocol. To prevent cheating, the verifier asks the prover to reveal the inputs and communication for a subset of the players and checks that they’re run correctly.

The IKOS cryptographic compiler

More precisely, the IKOS compiler takes:

  1. An MPC protocol.
  2. A cryptographic commitment scheme (covered below).

And yields an interactive (public coin) zero-knowledge proof protocol. A cryptographic commitment scheme consists of a function”Commit,” which takes a key and a message, then returns a “commitment” Commit(key, msg) -> commitment, with the property that:

  1. Finding different messages with the same commitment is intractable (like a collision-resistant hash function)
  2. Discerning whether a commitment is then created from a given message is intractable (with a random key, the commitment hides msg, like encryption)

Commitments can be constructed from cryptographic hash functions, e.g., by using the HMAC construction. For a physical analogy to commitments, one can think of an envelope: you can put a letter inside the envelope and seal it (compute “Commit(key, msg)”), the envelope will hide the contents of the letter (“hiding”), but will not allow you to replace the letter by another (“binding”). At a later point, one can then open the envelope to reveal the contents (by providing someone with “key” and “msg” and having them recompute the commitment).

Uses cryptographic commitments, the MPC protocol is “compiled” as follows:

  1. The prover executes the MPC protocol for f(X, Y, Z) “in the head”, by simulating the players running the protocol: running every player locally as if they were distributed. You can think of this either as the prover executing the code for each player inside a virtual machine and recording the network traffic between them, or, as the prover playing a form of MPC sock-puppet theater wherein it plays the role of every player in the protocol.
  2. The prover records everything the different players send and receive during the execution of the protocol. We call this “the view” of each player: everything the player sees during the protocol. The prover then commits to the views of every player (“putting it inside an envelope”) and sends the commitments (“envelopes”) to the verifier.
  3. The verifier asks the prover to provide it with a subset of the views (2 in our example).

The verifier then checks that the correspondence between the two parties for which the view has been provided are compatible: the players are sending and receiving the messages that the protocol specifies that they should.

This is illustrated below.

The prover executes the MPC protocol between “virtual/imagined” parties and records their views. The verifier checks the consistency between a subset of these purported views.

Note: The “public coin” part just means that the verifier simply responds to the prover by picking random numbers: in this case the indexes to open, i.e. the verifier has no “secrets” that must be hidden from the prover to prevent him from cheating. This will be relevant later.

Why does this work? We need to convince ourselves of two things:

Soundness: If the prover cheats, it is caught

Suppose the prover does not have inputs X, Y, Z such that f(X, Y, Z) = 1. If it executed the MPC protocol correctly, then the computation would not return the expected output “o” (since the MPC protocol correctly computes f). Hence the prover must cheat while executing the MPC protocol “in his head”. In order to cheat, the prover must either:

  1. Cheat in a player, by e.g. having the player do a local computation wrong.
  2. Cheat in the communication between pairs of players, by e.g. pretending like player A received a message from player B, which player B never sent.

These two ways encompass the only ways to cheat in the MPC protocol: either by having a player violate the protocol, or, by subverting the simulated communication medium.

Consider now the case of 3 players with pairwise channels (as illustrated between Alice, the Hare, and the Hatter). Then the cheating prover must cheat between at least one pair of players. Since it commits to the views of all three players, at least one pair of these commitments must be inconsistent: if the verifier opens this pair, it observes that either one of the players is cheating or the messages between the two players do not match. Since there are three pairs of commitments, the cheating prover is caught with a probability of at least ⅓.

To “improve soundness” (to catch the cheating prover with higher probability), the zero-knowledge proof is repeated in parallel: having the prover run the MPC protocol multiple times, then challenge it to open two parties from every repetition. This way, the probability of a cheating prover succeeding after N repetitions drops exponentially in N.

Zero-knowledge: The verifier learns nothing

When two views are opened, the verifier learns the messages and inputs of two of the players. Since the commitment scheme hides the view of the unopened player and the MPC protocol ensures privacy of the input for the players, the verifier does not learn the input of the unopened player (Y in the example), since the input is shared as described above providing two of the inputs (X, Z in the example) leaks no information about “w”.

Fiat-Shamir Compiler

The Fiat-Shamir transform removes the interactive verifier (professor).

The Fiat-Shamir transform/compiler takes:

  1. A public coin zero-knowledge proof system.
  2. A (particularly strong) hash function.

And yields a non-interactive zero-knowledge proof.

The observation underlying the Fiat-Shamir transform is essentially as follows: Since all the verifier does is pick random numbers, we could just have the prover roll a dice instead and open the index that the dice shows. Then everyone can verify that the opened views are consistent.

There is just one obvious problem: a cheating prover can of course just pretend like the dice throw came out however it wanted them to (e.g. not opening the cheating pair of players). The Fiat-Shamir transformation resolves this by essentially using a hash function to “roll the dice in a verifiably random way”.

This is done by having the prover hash his messages (the commitments to each player’s view), the hash digest is then used as the random challenge (e.g. interpreted as a sequence of integers denoting which players to open).

A cheating prover can of course try changing its messages and recomputing the hash until it gets a challenge that allows him to cheat, however, this merely corresponds to running the interactive protocol multiple times in a setting where the verifier allows him to attempt any number of times. Hence the probability of a cheating prover succeeding in the interactive protocol is amplified by the amount of computation the cheating prover has available, e.g. if the interactive protocol allows the prover to cheat with probability 2^{-128}, then the adversary would have to compute the hash an expected 2^{127} times to cheat; which is clearly not feasible.

Note: In reality, the Fiat-Shamir transform requires a “Random Oracle” [see: https://blog.cryptographyengineering.com/2011/09/29/what-is-random-oracle-model-and-why-3/ for more details], a proof artifact that cannot really be instantiated. However in practice replacing the “Random Oracle” with a “good hash function,” where the output of the hash function “looks random” is sufficient.

Reverie

This finally brings us to Reverie. Reverie is an efficient Rust implementation of the MPC-in-the-head proof system described in KKW 2018 derived by applying the previous two compilers to a particularly simple MPC protocol. Since the proof system works for any ring (e.g., finite fields, the integers modulo a composite or vector spaces), Reverie is designed to be both fast and flexible enough to support computations in any ring.

Since the MPC-in-the-head paradigm covered above is quite general and the techniques in KKW generalize very easily, Reverie is also designed to be general and easily extendable. This makes it easier to add features such as moving between rings inside the proof system and adding “random gates” to the circuits.

Optimizations

To improve performance, Reverie uses a number of implementation tricks:

Heavy use of bitslicing. Since the parties in the underlying MPC protocol all execute the same simple operations in parallel, we can significantly improve performance by packing the players’ state continuously in memory, then operating on all players in parallel with a single operation (either in traditional 64-bit registers or with SIMD instructions). Reverie never operates on the state of individual players.

Parallelism. Since the proof system requires a number of repetitions for “soundness amplification,” each of these can be trivially  delegated to a separate thread for a linear increase in performance.

Fast cryptographic primitives. Reverie makes use of Blake3 and ChaCha12, since these were found to be the fastest cryptographic primitives (cryptographic hash function and pseudo random function) while still offering a very comfortable security margin.

Streaming prover and streaming verifier. Reverie allows the proof to be produced and verified in a steaming fashion: this enables the verifier to validate the proof while the proof is still being generated and transmitted by the prover. This means that Reverie’s proof and verification components are fully asynchronous.

Benchmarks

To benchmark Reverie, we measured the speed of verifying the SHA-256 compression function on this circuit. Reverie averages around 20 SHA-256 compression/second on a (32-core) AMD EPYC 7601:

Num. of SHA-256 compression applications Proof generation time AND Gates XOR Gates Proof size
100 4.76 s 2,227,200 9,397,400 22 MB
1000 50.39 s 22,272,000 93,974,000 220 MB
10000 482.97s 222,720,000 939,740,000 2.2 GB

Conclusion

Reverie is the first step in creating high-performance zero-knowledge proofs outside of the dominant SNARK paradigm, and it handles hundreds of millions of AND/XOR gates with relative ease. The project comes with a simple “companion” program (in the “companion” subdirectory) which makes playing with Reverie relatively easy. Try it now on Github and crates.io!