~ 20 min read
Secure Randomness: From Zero to Verifiable Delay Functions, Part 1
Table of contents
- Basic Solutions, and Why You Should Never Use Them
- VRFs and Commit-Reveal Schemes, and Why They’re Still Not Enough
- Staking-Based Solutions, and Why They’re Not General-Purpose
Do you want to run an on-chain lottery? Or on-chain casino games? Do you want to develop a blockchain game with random events? Fairly airdrop tokens? Fairly mint NFTs of varying rarity? Or randomly select leader nodes for your blockchain?
Well, you better have a good source of randomness. One that everyone can trust to be fair and unbiased.
A secure source of randomness is one of the most critical components of many decentralized applications. However, perhaps surprisingly, there is currently no on-chain source of randomness that is truly trustless. Almost all solutions that have been used in practice are either fundamentally broken or require the participants involved to trust each other or a third party. Why is that?
In this two-part series, we’ll discuss the different attempts to construct a secure source of randomness, and why all currently known solutions have fundamental shortcomings in some aspect.
The first post will begin by exploring basic (insecure) solutions and common pitfalls. These solutions are very broken, but we keep seeing them proposed or even used in the wild, so we think it’s worth reiterating why they’re lacking. We then introduce Verifiable Randomness Functions (VRFs) and Commit-Reveal Schemes, approaches that seem to offer a solution but are still flawed.
The second post will examine what we think is the most promising solution for trustless randomness, namely Verifiable Delay Functions (VDFs). We will discuss what the current challenges are in its theoretical security and its on-chain implementation.
In this post, we’ll begin with very basic constructions and observations intended for a general audience — if you want to understand and recognize harmful approaches to on-chain randomness, this is the blog post for you. The second post will necessarily be more in-depth and is intended for a technical audience. Some familiarity with basic high-level algebra is helpful for understanding parts of the second post, but it is not strictly necessary.
Basic Solutions, and Why You Should Never Use Them
So let’s say you’re running an on-chain lottery or roulette. You need a source of randomness. What are some basic solutions for that, and what are potential problems?
Let’s say you’ve heard of this great website called random.org that promises true randomness. Well, that’s exactly what you need, right? So that’s what you’ll use as your source of on-chain1 randomness.
Everything’s going great and the payout of your lottery keeps rising and rising. After a few weeks, the people who run random.org become aware that your lottery uses their website as a source of randomness. They decide to participate in your lottery with one distinct advantage: They control the output of their website. They manipulate it to let themselves win and make off with the funds. The best thing is that nobody’s the wiser, since they can participate anonymously. To everyone else, it looks like the winner was chosen fairly and was paid out as usual. They can continue doing this every time the lottery is run.
This can obviously happen for any centralized source of randomness. Hence, this is not a good solution.
Randomness from the Blockhash
Another strategy that we often observed in the wild, especially on Solana, is to use the blockhash in some way. Let’s say you’ve heard that secure hash functions make good pseudorandom generators (which is not necessarily true in practice, but let’s forget that for a moment). Well, then the blockhash should be a good source of randomness, right?
The problem with this is that in the slot where they act as leader, a validator has a large degree of control over the blockhash of the block they produce. For example, they can choose to reorder, leave out or insert transactions into the block until they get a blockhash they’re happy with.
Suppose you have a something like a coin flipping game where the last bit of the blockhash is used as the source of randomness. The leader can repeatedly bet on a zero bit and then grind blockhashes until they get one with a zero in the last bit. This enables them to win every time. Even for more complex games, the leader can bias the result in their favor.
In fact, with some MEV products, manipulating the blockhash becomes viable for others than just block producers themselves.
Randomness from Hashing Public Data
Another method that is often proposed or used is to utilize on-chain public data that changes often and unpredictably, such as token prices. These can then be used as inputs to a pseudorandom generator, producing something that hopefully looks like a random output. However, this is also vulnerable to attacks. For example, the leader can manipulate token prices just before executing the transaction that feeds the data into the pseudorandom generator. And in fact, this is not limited to the leader — anyone can do so, if they time it perfectly.
It might seem to be a solution to hash many public data sources, some or all of which may not be manipulable by everyone. Surely an attacker can’t control all of the data sources, so this should make it more secure, right? However, this actually makes it less secure: As long as even one of the sources is in some way manipulable by an attacker, that means they have some degree of control over the output of the pseudorandom generator. This is almost always enough to bias the result in their favor and cheat — see the coin flipping game above for an easy example of where this applies.
As a last resort, you can always exclusively use public data that is manipulable only by trusted parties as input for your pseudorandom generator. Again, though, we would like to avoid needing any trusted participants.
General Pitfall: Single-Transaction Gambling
There’s another major pitfall to avoid when designing on-chain betting: You should never, ever allow the processes of (1) betting on an outcome and (2) the random generation of the outcome to occur in a single transaction.
Let’s see a real-world example of how an on-chain roulette that used on-chain randomness was exploited by a white hat (who later returned the funds) because the developers of the roulette made exactly this mistake.
One of the earliest projects on Solana, COPE, released a roulette game smart contract that used public on-chain data to generate its randomness. This is already flawed by design, but that’s not how the white hat exploited the contract — instead, they noticed that when betting, the random output determining whether you won or lost was generated in the same transaction as the placing of the bet, and you were even immediately paid out if you won the bet.
The exploit is very simple: If any instruction in a transaction fails, the entire transaction is reverted. Hence if you sent the contract a transaction that placed a roulette bet and then tried to transfer your winnings to another account, that transaction only had an effect if you won. If you lost, the transfer instruction failed, hence the entire transaction was reverted and you kept your wager.
The white hat released their exploit along with an explanation on Github after contacting the developer.
Ok, so let’s say you’ve now tried all the basic approaches discussed above to generate randomness. you’ve realized that they are all fundamentally flawed. But now you’ve got another brilliant idea: What if you crowdsource your randomness? Maybe if many people submit supposedly random data, and enough people are honest in submitting truly random data, you can combine all inputs in some way that guarantees that the output is also truly random data.
In fact, there’s a construction that even guarantees that if a single party is honest and pushes truly random data independent of the other inputs, the output is truly random: We require that all parties submit binary data of length . Then, we XOR all submitted inputs. That is, if there are participants — let’s number them according to the order in which they submitted their input — and we denote the length- input data of the th participant as , the output is
It is easy to see that if we choose an , fix all with , and then choose uniformly at random from all binary strings of length , the result is also uniformly distributed over all length binary strings. In other words, if one of the participants submits truly random data that is independent of the other inputs, the output is also truly random. This is what we wanted.
However, note that this only works if the random input is independent of the other inputs. If a dishonest participant knows what participant is going to submit, they can simply submit the same, and the inputs will cancel out. And indeed, if we implement this scheme on-chain, anyone who submits their data after participant knows . This is a major problem.
In fact, the problem is even worse: The last person — participant — that submits their data on-chain already knows the inputs of all other participants, since they are publicly available on the blockchain. In other words, they know and can now freely choose . That means they have complete control over the output. If they are malicious and want the output to be , they can simply choose
Then, we have
Hence this method of crowdsourcing randomness is also horribly broken and easy to exploit. In the next section, we’ll see how to fix this obvious bug by wrapping it with a commit-and-reveal scheme, but we’ll also see that even then, the last person to submit data can still bias the result in their favor.
VRFs and Commit-Reveal Schemes, and Why They’re Still Not Enough
We’ve discussed and refuted the security of the most basic randomness schemes. With decentralized XOR, we saw our first method that worked by crowdsourcing the randomness from multiple participants, some of whom may be dishonest.
Any scheme that employs a crowdsourcing method on-chain has the inherent problem that the th participant can see what participants 1 to submitted, and adjust their input accordingly. In this section, we’ll see two schemes that try to circumvent this problem by hiding information from the participants. One construction does so in a decentralized manner, the other in a centralized manner.
These schemes are a big step up from the basic constructions we’ve seen above. However, they still allow at least one entity to influence the output. We’ll see why in a second.
A Simple Commit-and-Reveal Scheme
First, we’ll describe a simple extension of the decentralized XOR we saw in the last section.
Again, we require the participants to submit length- binary data, and at the end we XOR all of these inputs. This time, however, we do it in two phases. In the first phase, all participants submit a hash of their input — of course, this hash has to make it impossible to discern any relevant property of the input. In the second phase, all participants reveal their inputs. An on-chain contract verifies that the inputs match the hashes that were commited before.
This way, the participants do not have access to the inputs of earlier participants, invalidating the attack on the prior scheme. Furthermore, participants are bound to their inputs by the commit hash they published before.
Let’s number the participants again, this time as to by the order in which they reveal their inputs (to which they had previously committed). Again, let’s assume that participant is honest and submits a binary string that is picked uniformly at random from the set of all length binary strings. So, we’ve now guaranteed that unless participant leaks (part of) their input to other participants, it is independent of all other inputs. By our proof above, this should now also guarantee that the output is truly random.
So is this it? Have we found the perfect scheme for on-chain randomness? Well, obviously not, otherwise this wouldn’t be such a long post.
There is still one variable that participant can control in the formula
and that is the set (and the number) of participants.
Nothing forces the participants that committed to an input to reveal it. The last participant to reveal their input now knows all other inputs again, hence they also know the result . They don’t have complete control over the output as in the last scheme, but they can choose to not reveal if they don’t like the result.
What happens when a participant doesn’t reveal their result in phase two depends on the implementation. The contract can either decide to go with the result so far, i.e.
or it can decide that the result is invalid and start the entire process again. Irrespective of what method is employed, participant has influenced the result.
Let’s do a simple example using a coin flipping game. Let’s suppose participant bets on heads. If the coin lands on heads, they double their money. If the coin lands on tails, they lose everything. The commit-and-reveal scheme is run with , where a result of signals heads and signals tails.
Suppose all participants but participant have revealed their result. Participant observes the result. There’s a chance that it’s heads, in which case he happily reveals his result and doubles his money. There’s also a chance that it’s tails, in which case he chooses not to reveal his input.
In the second case, the contract either uses as the result or restarts the process. Let’s assume for simplicity that participant doesn’t influence the result further if the process is restarted. Then, either way, they have a fifty-fifty chance of winning, even though they should have lost. In total, on average, they multiply their money by 1.5 each time they play. They’ll be able to quickly empty the contract’s payout vault and make off with the money.
Verifiable Randomness Functions (VRFs)
Alright, well that last construction didn’t really work either. But wait! You’ve heard of this amazing new invention called a verifiable randomness function (VRF). It’s even being used right now in a lot of on-chain applications to generate randomness. Surely if so many projects are using it, it can’t be that bad, right?
First, a quick introduction to VRFs. A VRF is a public-key pseudorandom function that acts on some input , given a private key and accompanying public key . Only the party that knows the private key is able to compute , but given the public key, anyone can verify whether is the correct output of for that input. Essentially, it’s a random function that only a single party can predict or compute, but its output can be verified by anyone.
So why is this useful in this context? Imagine we want to implement a raffle. We have players that enter, and then we want to choose a winner. This can be implemented using the above-mentioned commit-and-reveal scheme, but we already pointed out that scheme’s shortcomings. Instead, we require all parties to commit to some arbitrary value that they commit on-chain when entering the raffle. Then, we use a hash of all these value as input to a VRF that an oracle computes off-chain. The oracle then publishes the result back on-chain. The smart contract can verify that the value published by the oracle is actually the evaluation of the VRF, and we then use that value as our source of randomness to choose a winner.
Sounds good, right? But this is, in fact, also broken. Someone has to know the private key that is used to compute the VRF, and that entity can also enter the raffle. By being the last to commit to a hash input value, they can bias the result for themselves — and indeed, it is possible to do this anonymously such that no one would notice the manipulation.
So how about we add the blockhash of the block in which the transaction was included to the input of our VRF? Again, this only pushes the problem back one step. If the validator and VRF provider collude, they can try out different blockhashes until they are happy with the output of the VRF. This method of biasing the random output is virtually undetectable if done right.
What’s more, the oracle provider can decide not to publish the result of their function if they don’t like the output. This works even when block producer and VRF provider don’t collude. The only downside for the VRF provider is that it is very obvious when they try to influence the result this way. Still, if the random output would have meant that they lose a big bet, this might be a viable action for them.
Nonetheless, that protocol is in essence what Switchboard’s VRF implementation on Solana does. It’s a somewhat practical way of generating randomness as collusion between oracle providers and validators can be seen as unlikely, and you can always hope that VRF providers always reveal the output of their function. However, this protocol lacks any guarantees that we would like to see in a truly trustless random oracle. Similarly biasable VRF implementations on other chains include the VRFs of Chainlink and Oraichain (which is broken in even more ways that we won’t elaborate here).2
As a side note, verifying VRF output on-chain is a computation-intensive task. The Switchboard VRF implementation on Solana needs 276 transactions to do so.
Note that you can circumvent the undetectable biasing problem we described above by employing multiple VRFs, i.e. having more than one party with a secret key, and XORing or hashing the outputs of the parties together. If you make participation permissionless, it becomes believable that there is at least one honest participant. But again, this scheme still has the problem that the last entity to reveal their output doesn’t have to reveal the result of the function if they don’t like the overall result. This is much the same problem as we had in the commit-and-reveal scheme above.
We’ve just seen why VRFs are still biasable. Hence our search for a trustless decentralized randomness scheme continues. We will, however, mention one specific case where this biasability can be circumvented below.
A Sensible Special-Purpose VRF Scheme for Raffles
One specific use-case of VRFs that actually makes sense is the example we had above: raffling something among participants. But instead of having a separate centralized VRF mechanism, each participant will set up their own VRF beforehand and publish their public key. We then do a two-round protocol.
In the first round our goal is to produce an output that is guaranteed to have at least some randomness to it. How exactly we do this is not important. We can, for example, employ the commit-and-reveal scheme above, where the last participant can only bias the result but not fully control it. Another method is to let every participant evaluate their VRF on a predefined input, for example 0, and hash all results together.
In the second round, everyone evaluates their VRF using the output of the first round as the input. The winner of the raffle is then the person whose VRF output is the smallest among all participants. Since it is a VRF, everyone can verify the winner’s output via the public key of the winner. They can also compare it to their own, and to other’s outputs.
There are two things that are crucial to this construction. First, the individual VRF outputs of the second round are indistinguishable from a truly random output by the fact that the output of the first round has at least some random bits. Second, it does not matter if a non-winner does not reveal their VRF output in the second round. The winner however has a huge incentive to reveal their output, otherwise they will not be paid out.
This circumvents the non-revealing party problem of the commit-and-reveal scheme as well as the VRF scheme we saw above. In fact, this is more or less what Cardano3, Algorand and Polkadot use to decide who the leader / block producer is in their consensus mechanism — though of course they use slightly more complicated variants, and each has their own implementation.
Note that this protocol requires all participants to actively participate by committing to their own VRF, executing it on the output of the first round, and publishing the result.
So what should you take away from this section? In essence, VRFs do seem to work for the specific use-case of determining one or multiple winner(s) among active participants, but we still lack a general-purpose scheme to generate randomness on-chain for anything that does not fit into that model.
Staking-Based Solutions, and Why They’re Not General-Purpose
One tool we haven’t mentioned yet is staking. If we require all input providers to stake a certain amount, that would discourage them from misbehaving for fear of slashing — assuming that misbehaviour is detectable, of course.
In this case, we could use any scheme that is vulnerable to a non-revealing party attack, and require all such parties to stake tokens in order to participate. In the commit-and-reveal scheme above, we could slash anyone that refuses to reveal their output. In fact, if you’ve heard of on-chain randomness generation using RANDAO, that’s essentially how it works. However, this would mean that we would need all participants to stake more than they could earn by misbehaving in any single execution of the protocol. That’s quite the requirement! It means that our scheme probably can’t be used for any high-stake situations.
There’s one way we can limit the effect of this by making the scheme more centralized: In the commit-and-reveal scheme, only the last participant that has not revealed their output can influence the result. Hence, what if we have a designated last revealer, and require only them to stake a large amount? During the revealing phase of the commit-and-reveal scheme, we would have two stages: One stage in which all other participants can choose to reveal their output, and then another stage in which only the designated participant can reveal their output. If any of the participants in the first stage choose not to reveal their output, that doesn’t matter. If the designated participant chooses not to reveal their output, we can slash the large pool of tokens they staked.
Still, this makes the scheme centralized again, and requires the stake of the designated participant to be higher than any amount they can gain in any single execution of the protocol. Ideally, we don’t want to worry about problems like that when we request randomness.
That concludes our first blog post on on-chain randomness. We’ve seen some basic constructions that have been used surprisingly often even though they are horribly broken, as well as some more advanced constructions that still have inherent flaws.
We saw that the ubiquitous problem with constructing randomness on-chain is that all information is public as soon as it is sent to the chain. This means that the last person to provide an input to the randomness protocol can calculate both the result with their input and the result without their input, and can decide which one they like better.
In part 2 of this blog post, we’ll explore a candidate for a general-purpose on-chain randomness solution that seems to work around this problem: VDF-based schemes. We’ll be diving into cutting-edge research and using it to construct a scheme that is provably unbiasable — under certain theoretical assumptions.
As a side note, for off-chain data like random.org, you also need some way to access it on-chain, which introduces another layer of complexity we won’t discuss here. ↩
Oraonetwork — which is different from Oraichain — has also announced a VRF implementation on Solana, but we were unable to find details or documentation on it. Possibly for good reason. ↩
Cardano does this as part of its consensus protocol Ouroboros Praos, which is specified in this paper. ↩