~ 27 min read
Secure Randomness: From Zero to Verifiable Delay Functions, Part 2
We’d like to thank the following two people for helping with this blog post by answering our questions:
Benjamin Wesolowski, who laid a lot of the theoretical groundwork for verifiable delay functions.
Bram Cohen, Co-Founder and CEO of Chia Network (and also known for being the author of the BitTorrent protocol). The Chia blockchain is the only major project to currently employ verifiable delay functions in practice.
Table of contents
- WTF is a VDF?
- Supplementing Our Commit-And-Reveal Scheme With a VDF
- A Promising VDF Candidate: The Wesolowski Construction with Class Groups
- Considerations for Practical Implementation: Towards an On-Chain Contract
This is the second post in a two-part series on different approaches to on-chain randomness.
For many different purposes, the blockchain space needs a trustless, general-purpose source of randomness. In the first post, we saw that almost all sources of general on-chain randomness that are currently in use have fundamental flaws (the severity of which varies). So how can we fix this?
In this post, we focus on employing Verifiable Delay Functions (VDFs) for on-chain randomness. We describe what VDFs are and why it’s not clear they even exist. We also discuss one of the most promising candidates for a VDF and examine what difficulties an on-chain implementation of a scheme using that VDF candidate would face.
Finally, we also spend some time examining a VDF implementation that is already being used in practice, namely in the consensus algorithm of the Chia blockchain.
Be warned: This post is a technical deep-dive into the theory and implementation of VDFs. It dissects concepts down to the nasty details that are inherent in cutting edge research. However, we are convinced it’s worth going all the way.
WTF is a VDF?
Let’s begin by defining what a Verifiable Delay Function (VDF) is.
Informally, a VDF is a conjectured cryptographic primitive that allows anyone to quickly verify that a so-called prover has spent a certain amount of time computing something from an input they were given.
Slightly more formally, a VDF is a function whose evaluation involves a large number of sequential steps, but whose result can be verified in a much smaller amount of steps.
Very formally, using a (slightly modified) definition from this paper, a VDF scheme for a given security parameter and a delay parameter consists of an input domain and an output range as well as two algorithms and . The algorithm is hard to compute; it takes an input from the input domain and produces an output from the output range along with a proof . The algorithm , on the other hand, is easy to compute; it takes and uses the proof to verify that is the output corresponding to . It outputs ‘true’ if the verification is successful, and ‘false’ if not.
A VDF scheme must satisfy the following properties. It should be:
sequential (basically: is hard to compute): Anyone can compute in sequentials steps, but not even an adversary that has a polynomial number of processors available can distinguish from an element picked randomly from the output range in significantly fewer steps. (This is a paranoid cryptographer’s way of saying “Nobody can compute the output of much faster than with the obvious approach”.)
efficiently verifiable (basically: is very easy to compute): runs significantly faster than .
unique (basically: Fooling the verifier is computationally hard): For any input , it is computationally hard to find values and such that but .
A VDF scheme should maintain these properties even if an attacker is granted a polynomial amount of time for pre-computation.
Now that’s quite the definition. What’s the main take-away from this section? For a VDF, when we give someone an input, they will need some time to compute the corresponding function output — but we will be able to verify that it’s the correct output very quickly.
There are candidates for VDF schemes which have not been broken yet — we’ll see one of them in detail later — but much like it’s unknown whether strong public key cryptography exists in general, it’s not known if a VDF scheme satisfying all properties defined above exists. The difference is that there has been significant research into breaking the most common public key cryptography schemes, while VDFs are a relatively new field.
Supplementing Our Commit-And-Reveal Scheme With a VDF
Now, assuming a VDF exists, let’s see how we could employ it.
In the last post, we saw a commit-and-reveal scheme. It worked as follows: In phase one, all participants publish a commit of their fixed-length random binary input. In phase two, everyone reveals their input and the contract verifies that it matches their committed hash. The result is the XOR of all inputs.
We saw that this scheme is almost perfect and guarantees true randomness as long as at least one participant is honest and inputs truly random data.
Unfortunately, it had the fundamental flaw that participants can refuse to reveal their inputs and, by doing so, influence the result. We’re now in a position to fix this flaw by employing VDFs.
The Commit-and-Reveal Scheme Augmented with a VDF
The scheme we construct has three phases. For simplicity, we denote the function as .
(Note that the following scheme is an example we cooked up ourselves. It has not been subjected to due scrutiny, so be very careful when employing this in practice. We’ll see a different scheme that is currently being used in practice later.)
In the first phase, each of the participants chooses a random VDF starting value for themselves and computes . They then take their random input and encrypt it using a key derived from , obtaining .
In the second phase, each participant commits and to the on-chain contract. These values are now public.
In the third phase, anyone can input to the on-chain contract. This essentially says “I know the VDF result for participant : it’s , and the proof for that is ”. The contract can then verify that and if so, calculate .
As with the old commit-and-reveal scheme, the result is .
Some notes on each of the phases:
Phase 1: The length of this phase can be extended at will, and indeed the time period must be long enough such that each participant can compute along with the corresponding proof . Note, however, that the participants can also pre-compute these values.
Phase 2: As soon as participant commits their values and on-chain, anyone can start computing to then compute . Hence, the length of the second phase must be chosen such that this computation is guaranteed not to complete within this phase for any realistic adversary.
Phase 3: Ideally, participant submits their triple of values themselves. However, given the public value , anyone can compute and and thus “force” the reveal of participant ’s input after some time has passed. This way, participants refusing to reveal their input are no longer a problem.
Analyzing the Scheme
Assuming certain security assumptions (that we will discuss later in this post), it should be true that no matter what input anyone inputs for participant in phase three, the verifier ensures (up to negligible probability) that . Hence the result of the decryption is guaranteed to be .
Note that the participant does not need to have committed the correct and the correct encryption of . But if they do not, the decryption done by the contract will either fail — in which case the contract can simply ignore that participant — or it will decrypt to nonsense instead of , which is fine.
Indeed, note that at the end of phase two, when all participants have committed their inputs, the output should be entirely deterministic. As long as there exists at least one person that bothers to reveal any unrevealed inputs, nobody has any control over the set of inputs that are used to compute the result. Hence, as we saw in the discussion for the commit-and-reveal scheme in the last post, if there exists some participant that inputs a truly random input that is independent of the other inputs, the output is truly random. Assuming that participant neither collaborates with anyone nor leaks their input or prematurely in any way, the independency is guaranteed by the fact that noone can compute and hence decrypt until some time after phase two.
It should be noted that there are other protocols based on VDFs. The one we described above is simply the natural extension of the commit-and-reveal scheme we saw in the last post. Later in this post, we’ll discuss a completely different architecture currently used by the Chia blockchain that needs significantly less VDF evaluations.
A Promising VDF Candidate: The Wesolowski Construction with Class Groups
Now that we’ve seen how to employ Verifiable Delay Functions to construct what seems like the perfect scheme for on-chain randomness, let’s see a promising candidate for such a VDF. Keep in mind that most research into VDFs is very recent and hence this construction has likely not been subjected to the amount of scrutiny one might wish.
The General Construction for Groups of Unknown Order
The following is a construction proposed in a paper by Benjamin Wesolowski (see also this paper for a more concise overview of the construction).
The scheme uses a group of unknown finite order as both its input and output domain. We will see later why the fact that the group has an unknown order is important. The scheme is parameterized by the security parameter and the delay parameter .
The algorithm, which gets a group element as input, consists of simply computing via fast exponentiation and selecting a seeded random prime from the first prime numbers. The randomness here should only depend on and (this makes the protocol non-interactive, which is necessary for our VDF-based randomness scheme — in fact, this is an example of applying the Fiat-Shamir heuristic to the interactive version of the protocol). outputs and the proof defined as .
The algorithm gets as input and must verify that quickly. To do this, we compute and accept if and only if .
Note that the verifier obviously accepts outputs that the evaluator algorithm produces, since we can rewrite and hence . Further, the verify algorithm is easy to compute as long as group operations run in a reasonable time: We can compute via fast modular exponentiation with multiplications in the ring , after which the computation of and are just two fast exponentiations that can be done in group operations. Usually, .
The above shows that this scheme satisfies the efficient verifiability property. For it to be a full VDF scheme, we would also have to show that it has the sequentiality and uniqueness property. However, both of these are only conjectured to hold.
For sequentiality (” is hard to compute”), it certainly seems like the fastest way to compute is via fast exponentiation. This is actually also where the assumption that the group has an unknown order comes into play: A well-known and easy corollary to Lagrange’s Theorem says that for any element , we have , where 1 is the multiplicative identity in . Hence, , so we have found a faster way to compute . In other words, if one were to use a group with a known order for the Wesolowski construction, anyone could compute relatively quickly, making the scheme useless.
More formally, one can define a time-lock game whose hardness is sufficient for sequentiality to hold for an implementation in a certain group. For a more formal discussion of this, see section 5, Assumption 2 of Wesolowski’s original paper.
For uniqueness (“Fooling the verifier is computationally hard”), let’s assume that the attacker even has the value with which they want to fool the verifier, and they only need the corresponding to go with it. Let’s define the root-finding game as a two-player game where an attacker is given a group of unknown order, chooses an element and is then given a random prime from the list of the first primes. They must then output a group element and win the game if and only if . In other words, the game tests whether the attacker can compute an random prime root of a group element they chose themselves.
Wesolowski gives a nice informal intuition in his paper for why fooling the verifier comes down to breaking the root-finding game. To do this, he analyzes the interactive version of the protocol, where is chosen at random by the verifier instead of being randomly generated depending on and . We slightly adapt the argument here:
We want that the verifier accepts, meaning we want . We have , hence we can rewrite this as or alternatively . The verifier accepts if and only if that last equation holds. Looking at the last version of equation, we denote its left-hand side by .
For a malicious prover to fool the verifier with a fake proof , he must hence be able to compute a that fulfills the above equation, i.e. he must be able to compute . The hard part here is finding th roots of . Since is randomly chosen, this is an instance of the root-finding problem.
Of course, the above is not a formal proof that fooling the verifier requires breaking the root-finding game. The real proof first formally argues this fact for the interactive version of the protocol in the random oracle model, after which you can apply a general argument that applies for any interactive protocol made non-interactive by the Fiat-Shamir heuristic. If you’re interested in the details of the formal proof, see section 6 of Wesolowski’s original paper.
The takeaway from this section is that employing this Wesolowski construction only makes sense if the time-lock game and root-finding game are conjectured to be hard in the group that we use as a basis for our scheme.
The Class Group of an Imaginary Quadratic Field
We’ve now seen a protocol that relies on a group of unknown order in which computing repeated squares must be done in a sequential fashion, and in which finding arbitrary roots is hard. Ideally, we’d also like this group to have fairly efficient group operations. So are there any good candidates for such a group?
Wesolowski, in his original paper, mentions two such types of groups: RSA groups (i.e., invertible integers modulo a product of two large primes under multiplication) and class groups of an imaginary quadratic field.
The first type is suited for our purposes only in a very limited way because we require that nobody can reconstruct the order of the group, even if many parties collaborate. This is hard to achieve, because the two primes defining the RSA group have to come from somewhere. Wesolowski mentions that there are approaches to generating large integers that are divisible by two unknown primes with large probability, but those have significant drawbacks as well. There are also ways to generate two unknown primes via secure multi-party computation, but that is very complicated to implement, let alone implement on-chain in a verifiable manner. Hence, we will focus on the second type of group, the class group.
This is where some abstract algebra comes into play. Feel free to skip some of the more technical parts of the following paragraphs, they are not essential for the randomness protocol itself.
The class group we will use is constructed as follows. Take a negative prime and define the quadratic extension . Then, take its ring of integers and consider its fractional ideals along with its fractional principal ideals . Then, the class group we use is the quotient group .
Now of course, that’s a very convoluted definition, and it’s certainly not something we’d want to represent computationally as it stands. Fortunately, this group is isomorphic to a group consisting of equivalence classes of binary quadratic forms, along with a well-known and efficiently computable group operation. All quadratic forms across all equivalence classes of that group have the same discriminant, namely the negative prime we chose above.
In fact, the group is identified by its negative prime discriminant . Let us construct the group corresponding to some given . We define two binary quadratic forms to be equivalent if , i.e., if their ranges over integer inputs are the same. To construct the elements of the group corresponding to discriminant , we then take the set of all binary quadratic forms with integer coefficients whose discriminant is . We partition it into equivalence classes according to the equivalency we just defined.
The group operation on two group representative is now defined as follows. Let two representatives and of two equivalence classes be given. We write them as and . Then, the group operation (commonly known as form composition) is defined as follows:
- Set .
- Set .
- Solve for , obtaining solutions . This is a standard problem and can be done efficiently (see e.g. section 7.4 of the Chia paper on class groups.)
- Solve , obtaining solutions .
- Set .
- Return .
It is not hard to prove that if both inputs have the same discriminant, this operation preserves it. The proofs that this forms a group, or indeed that it is isomorphic to the fractional ideal definition mentioned above, are rather more involved. See e.g. this blog post for a concise introduction and argument.
In practice, there are several crucial methods to make group operations even faster. One method involves always operating on a specific group representative that is in reduced form, in this case meaning that , and furthermore that either or that and . It turns out that each equivalence class only has one such reduced form, and that it is very easy to operate on it because its coefficients are reasonably small. Another method comes from the fact that the composition of forms with a prime discriminant is much simpler if both inputs are the same, i.e., if you’re squaring a form. Since this is the main operation in our VDF construction, it greatly reduces its running time.
This concludes the description of the class group. Crucially, for a randomly chosen negative prime discriminant , the order of its corresponding class group is unknown. In fact, the fastest known algorithm for computing the order of that group, the Hafner-McCurley algorithm, runs in expected time (see paper here), where . That’s pretty slow! And this running time even assumes the generalized Riemann hypothesis. For comparison, the currently theoretically fastest known general-purpose algorithm for factoring a number (one of the most researched problems in existence, since a fast algorithm would break RSA), the general number field sieve (GNFS), only takes time (see Wikipedia), where . Note the exponents.
All of this makes this type of class group the best currently known contender for a group to use in our VDF scheme.
Other VDF constructions
We’ve now seen the Wesolowski construction for VDFs, as well as a candidate for a group of unknown order with which it can be used. It should be mentioned that this is not the only viable construction for VDFs available: Pietrzak describes a similar VDF based on repeated squaring in a group of unknown order, differing from the Wesolowski construction by how verification is performed. Two more constructions are described by De Feo, Masson, Petit and Sanso, using supersingular isogenies and pairings. However, these latter constructions require a trusted setup, much like the instantiation of the Wesolowski protocol using an RSA group. Again, this is something we would ideally like to avoid.
It is also an open problem to find a simple VDF that is secure against attackers with access to a quantum computer. Both the Wesolowski and Pietrzak scheme can be broken by computing the order of the underlying group via Shor’s algorithm.
Considerations for Practical Implementation: Towards an On-Chain Contract
We now have a promising candidate for a VDF. Great! Irrespective of any theoretical concerns over hardness assumptions, what’s stopping us from just running with this and implementing our scheme as an on-chain contract?
Not much. In fact, the Chia blockchain already uses VDFs (notably as part of their consensus algorithm and not as an on-chain contract). They use a variant of the Wesolowski protocol that splits the generation of into some number many parts, which makes computing the proof faster by parallelizing the computation of , a variant that is described in section 4.3 of Wesolowski’s paper.
Chia’s protocol is very different from the protocol above with which we motivated VDFs. It does not directly take participant input, and consequently does not need one VDF execution per participant. Instead, they use three interacting VDF chains. When simplified, their method boils down to the following. They generate a continuous stream of challenges, each of which is meant to be fully random. To generate a new challenge , they use , the last VDF challenge that was generated, and the blockhash of the Chia block that was generated directly after the last VDF challenge was published. The new challenge is then generated by hashing F(c’) together with F(h). The idea is that by the time the attacker has computed F(h) and F(c’), they can’t influence the blockhash h anymore, so it’s hard for them to influence the challenge in any directed way.
Ideally, we want to implement either Chia’s or our protocol on-chain. What obstacles are there to making that happen?
Generating Random Numbers
One of the main concerns regarding implementing the Wesolowski scheme is that it relies on large (pseudo)random1 primes: We need a random prime discriminant that defines our group , and in every execution of the protocol, we need a seeded random prime during verification. In the original paper, comes from , which is the set of the first primes, where is the security parameter. Chia itself uses 264-bit primes as (see code here).
Generating a prime takes a non-negligible amount of computation time. The best method currently known — and indeed what almost all cryptographic applications essentially do to obtain random primes — is to generate and test the primality of many candidates.
Let’s take the
num_bigint_dig crate that implements pure-rust bigints along with cryptographic functions including random prime generation. For each candidate, it does a lot of cheap pre-tests like testing divisibility by small primes, followed by 100 rounds of Miller-Rabin, and finally the Lucas primality test. If a candidate fails at any point in these tests, the search immediately continues with the next candidate. If it passes all tests, the search is terminated and the candidate is output as the result.
In practice, using the release flag, this needs something like 100-200ms single-core for generating a 1024-bit prime on an average laptop. That’s a lot!
The openssl prime generation (using
openssl prime -generate -bits 1024) does a bit better, completing in about 30-60ms, but that’s still not great for our purposes. It also does heavy low-level optimization, which isn’t accounted for in most gas costs / computation limit models, which is important if we want to implement this on-chain.
We did some very quick-and-dirty tests for how well prime generation using the
num_bigint_dig crate works on Solana, which currently has a transaction-wide limit of 1.4 million compute units (which are configured to approximate the computation time needed to execute the on-chain contract). The problem was that remainder operations — which are essential for any primality test — were extremely inefficient, with up to 11.000 compute units simply for computing the remainder of a 1024-bit number when simply dividing by 5. This makes it necessary to split a full run of an on-chain primality test over several hundred or even thousands transactions. And keep in mind that you typically need to check hundreds of candidates! Though most of those tests will usually abort much sooner.
Even though this can almost certainly be optimized significantly, it would be surprising to see the entire prime generation process complete in only a couple of transactions, or even a few hundred, in the near future.
Choice of Discriminant
Apart from prime generation, verification also involves calculating and via fast exponentiation, hence requiring squaring and multiplying group elements. Both of these operations involve many costly BigInt operations. The size of these BigInts depends on the size of the discriminant.
So in practice, discriminants should be chosen small enough to make verification relatively quick and cheap, but large enough to make breaking the discriminant infeasible in practice. So what’s the smallest discriminant we can get away with? Research on this is very sparse, but there is one data point of interest.
One way an adversary could beat the time-lock game and root-finding game is by finding a significantly faster way to compute powers of the group element . The obvious way to do this, as we’ve already mentioned above, is to know the order of and then simply compute . In fact, just knowing the order of any group element , i.e. to find a group element other than the identity and a number such that , might already make the group unsafe, since p necessarily divides .
So how long does it take to compute the order of a group element, i.e., to “break” its prime discriminant? Fortunately, we can again look to Chia: they ran a discriminant breaking contest as part of the first iteration of their VDF challenge. The winning entrant was able to break a 240-bit discriminant in slightly less than 10 days on unspecified consumer hardware. However, note that all they did was run
pari-gp, which includes a fast implementation of the Buchmann-McCurley algorithm that computes an algebraic representation of the class group from which its order is easy to read. It might be possible to improve this significantly.
That being said, Bram Cohen pointed us to a paper by Belabas, Kleinjung, Sanso and Wesolowski that showed how to compute the order of a specific element in any group with a Mersenne prime as a discriminant, with some pointer for how it might be possible to extend this technique to other types of primes. This implies that such discriminants should never be used for a run of Wesolowski’s VDF.
He also confirmed that Chia currently uses 1024-bit discriminants. This strikes a balance between ease of prime generation and safety of the protocol. Note that in their protocol, the discriminant is rotated every 10 minutes.
Faster Squaring (Code vs. Hardware)
Research into accelerating iterative squaring, i.e., accelerating the computation of , is still sparse. Of course, any practical implementation will need to know an upper bound for this speed to choose the length of phase two. If there exists an implementation that is quick enough to compute for all before phase two ends, the last participant could completely control the output using the exploit method we discussed for the distributed XOR scheme from the first post.
Let’s start with what’s known for general-purpose hardware. Fortunately, Chia ran a VDF competition in 2018 and 2019, asking for fast implementations of squaring in class groups. They used 2048-bit discriminants. This resulted in a significant improvement in the computation time needed for squaring, from ~170 s per squaring of the reference implementation down to ~40 s per squaring, measured on a i5-8400 @ 2.8GHz. A second round of competition led to the development of further optimizations (note that the results are not directly comparable, since the hardware and parameters were different).
There are several optimizations that were employed in the winning entries, such as implementation of the NUDUPL algorithm (described in this paper) that reduced the size of operands during squaring operations, and a very neat trick discovered by Akashnil Dutta to speed up the reduce operation using that uses u64 approximations for transformation matrix entries, avoiding costly BigInt computations. Doubtless further optimizations are still possible.
A paper from 2020 even evaluates the speedup of using ASICs for squaring forms with 2048-bit discriminants. They achieve ~6.3 s per squaring on TSMC 28-nm CMOS technology, with a frequency of 500MHz. Their software implementation, tested on a 7-6850K @ 3.6GHz, requires roughly double the time per squaring, still clocking in at an impressive ~12.8 s.
Chia itself only uses 1024-bit discriminants for its VDF implementation. The fastest implementation they have observed on their network runs at 360k iterations per second, implying about ~2.7 s per iteration.
In any case, the length of phase two should always be chosen such that there is an ample buffer for implementations faster than any implementation known publicly — especially since research into ASIC implementations is still so sparse. The faster the implementations, the larger the parameter will have to be to keep the length of phase two constant. As a consequence, it might become necessary for all participants to own and run special-purpose hardware to keep up with the current speed requirements, potentially making participation costly.
Incentivization for Participants
One matter that should be mentioned is what the incentives should be for participants to submit inputs. If it becomes necessary to employ special-purpose hardware — which is definitely plausible seeing as we’ve already seen a 2x speedup of ASIC vs software in the paper mentioned above, a value that will presumably increase as more research in this direction is initiated — incentives would need to offset the costs of purchasing and running that hardware.
In the end, VDF-based solutions might be restricted in use to applications with a large amount of money at stake with each generated random string. Applications with smaller amounts at stake might not warrant use of VDFs, relying instead on the normal commit-and-reveal scheme, VRF-based schemes or staking-based schemes, and then simply trusting that all participants reveal their input.
In this post, we’ve seen what we believe is the only known solution for a general-purpose, trustless scheme for generating randomness on-chain. However, we’ve also seen that it relies on cryptographic assumptions that have received comparatively little attention — specifically, the hardness of the root-finding game in the class groups we explored.
We also experimented very quickly with what would need to happen to implement a VDF in an on-chain contract, and recognized that generating random primes on-chain is a notable obstacle. In that regard, we also saw that nonetheless, the Chia blockchain already uses this construction in their consensus algorithm.
Generally available on-chain randomness remains a major problem in the blockchain space. More research into the theory of VDFs, fast implementations of form squaring and verification algorithms, as well as possible alternative protocols, is desperately needed.
This concludes our two-part deep-dive into on-chain randomness. Make sure to check our blog regularly, as we will be releasing more deep-dive posts about Solana bugs, topics related to smart contract audits, and interesting ideas in the blockchain space.
For brevity and readability, I’ll use random and pseudorandom interchangeably in this post. ↩