Nonce Upon a Time, or a Total Loss of Funds - Exploring Solana Core Part 3

This blog post details the first ever loss-of-funds bug we found in Solana Core. The bug would have allowed us to write arbitrary data to any account on the Solana Blockchain. This has many devastating implications. An attacker with this power can:

Mint any amount of any token
Steal other accounts’ SOL or tokens
Change ownership of any NFT
Delete their liabilities in any lending protocol

In short: we could have done pretty much anything we wanted, short of minting new SOL.

Even after we reported the bug through Solana’s amazing bug bounty program in late November 2020 and it was immediately fixed, it stuck with us due to its intriguing nature. Hoping it will help other projects avoid similar bugs, we’ve decided to publish the details now.

In fact, we liked the bug so much that we featured it in the 2021 edition of ALLES! CTF. Due to its subtlety, it remained the only unsolved challenge.

The Background

As the major auditor of Solana’s validator code, we have probably spent more time staring at it than anybody else. Over the last two and a half years alone, we’ve found and reported over 100 vulnerabilities in Solana Core. Yet this vulnerability stands out, firstly due to its devastating impact, and secondly due to how simple it was to exploit. It’s also the first loss-of-funds vulnerability we found in Solana’s validator codebase.

However, to understand what led to the vulnerability, we first have to introduce one of Solana’s mechanisms to prevent replaying transactions: Durable Transaction Nonces (DTNs).

Durable Transaction Nonces

Durable Transaction Nonces play a vital role in Solana’s replay protection. But what are they, and how do they work? Well, imagine you go to Jupiter and trade 10 SOL for USDC. Have you ever asked yourself what prevents people from submitting this exact transaction again at some point in the future? If someone were able to do that over and over, they would force you to sell another 10 SOL each time without your consent, until suddenly you had sold all your SOL without ever wanting to. You certainly signed the original transaction with your private key, so if the attacker just sends it again, a validator would have to accept it, right?

A straightforward way to solve this problem requires validators to check whether they have already seen your signature. If so, they should not accept the transaction. However, this is more complicated than it first sounds due to Solana’s high-performance requirements. So far, Solana has handled over 150 billion transactions. Just the signatures of those transactions take up over 4 terabytes of space. Checking every transaction in history to identify a replayed transaction is not feasible — at least not in an acceptable timeframe.

The solution is simple: just don’t accept transactions that are too old. To do so, each transaction has to include the blockhash of a recently produced block. This way, validators don’t need to hold on to every transaction ever processed, but only to those in this limited time window. New transactions are only compared against transactions in that window. If a transaction contains an out-of-date blockhash, validators simply reject it.
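The windowing idea can be sketched in a few lines of Rust. This is a toy model and not Solana's implementation: `RecentWindow`, `register_block`, `try_accept`, and the `max_age` parameter are names assumed for illustration, and only the 32-byte hash and 64-byte signature sizes are taken from the real system.

```rust
use std::collections::{HashSet, VecDeque};

// Hypothetical stand-ins for the real types; only the sizes match.
type Blockhash = [u8; 32];
type Signature = [u8; 64];

/// Toy replay filter that remembers signatures only for recent blockhashes.
struct RecentWindow {
    max_age: usize,
    // Oldest blockhash at the front, newest at the back.
    blocks: VecDeque<(Blockhash, HashSet<Signature>)>,
}

impl RecentWindow {
    fn new(max_age: usize) -> Self {
        Self { max_age, blocks: VecDeque::new() }
    }

    /// Record a newly produced block and forget everything outside the window.
    fn register_block(&mut self, hash: Blockhash) {
        self.blocks.push_back((hash, HashSet::new()));
        while self.blocks.len() > self.max_age {
            self.blocks.pop_front();
        }
    }

    /// Returns true if the transaction is accepted: its blockhash must still
    /// be inside the window, and its signature must not have been seen yet.
    fn try_accept(&mut self, recent_blockhash: Blockhash, sig: Signature) -> bool {
        match self.blocks.iter_mut().find(|entry| entry.0 == recent_blockhash) {
            // `insert` returns false when the signature was already present.
            Some(entry) => entry.1.insert(sig),
            // Unknown or expired blockhash: reject without any history lookup.
            None => false,
        }
    }
}
```

In this sketch the memory a validator needs is bounded by `max_age` blocks, no matter how many transactions the chain has processed in total.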

However, as usual, life is not that simple… Including a blockhash means that transactions have to be sent pretty quickly after they have been signed; otherwise they will expire. But there’s still a need for a mechanism with which transactions can be signed, but then sent some time later. One use case is offline signing. If a transaction is signed offline, the blockhash it uses might already have expired by the time the signature is transferred back to a computer with an internet connection. This is relevant as many security-focused users do not wish to have their private keys stored on an online device.

That’s where Durable Transaction Nonces come in. They allow signatures without an up-to-date blockhash. Instead of storing a recent blockhash in the transaction, a user can choose to use a Durable Transaction Nonce in its place. Such a nonce needs to be created and stored in an on-chain account ahead of time. Specifically, the nonce is a 32-byte pseudorandom value, and its creation, storage, and advancement are all handled by instructions provided by the System Program. This value can then be used in place of the blockhash in transactions.

But, you may ask, does this not create the same problem that we had before, where everyone can just replay signatures and force us to sell all of our SOL? This is where advancing the nonce comes into play. If we use the nonce in place of the blockhash, the very first instruction in our transaction needs to be an advance_durable_nonce instruction. This replaces the nonce in the on-chain account with a new random value. As a result, a replayed transaction will be rejected, as its nonce no longer matches the nonce stored on chain.
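As a minimal sketch of this check-then-advance step, assume a simplified nonce account that holds nothing but the stored value. The types and the `check_and_advance` function are hypothetical and do not mirror the System Program's actual code; `next_nonce` stands in for the fresh value the runtime would derive.

```rust
type Hash = [u8; 32];

/// Hypothetical, simplified on-chain nonce account.
struct NonceAccount {
    stored_nonce: Hash,
}

/// Hypothetical transaction carrying a durable nonce instead of a recent blockhash.
struct NonceTransaction {
    nonce: Hash,
}

#[derive(Debug, PartialEq)]
enum NonceError {
    Mismatch,
}

/// Accept the transaction only if its nonce matches the stored one, then
/// replace the stored nonce so the same signed transaction cannot match again.
fn check_and_advance(
    account: &mut NonceAccount,
    tx: &NonceTransaction,
    next_nonce: Hash,
) -> Result<(), NonceError> {
    if tx.nonce != account.stored_nonce {
        return Err(NonceError::Mismatch);
    }
    account.stored_nonce = next_nonce;
    Ok(())
}
```

Calling `check_and_advance` twice with the same transaction succeeds the first time and returns `NonceError::Mismatch` the second time, which is the replay protection described above.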

So, that’s how replay protection works? Well… still not quite. There is another important case to mention. Imagine you want to sell SOL, and you accidentally type 10,000 instead of 10. Luckily, you don’t have 10,000 SOL, so the transaction fails. The only money you lost is the 0.000005 SOL transaction fee you paid. However, a failing transaction means no changes are made, right? And the nonce is not changed… Now everyone can replay the transaction over and over again, slowly burning all of your SOL as fees until nothing is left. Even worse, once you made it (after years of grinding) to 10,000 SOL, the transaction would still work, as it never “passed” the system. If someone held on to this transaction, they could now replay it and sell everything on your behalf without your consent.

Fortunately, this can also be fixed. Even when the transaction fails, the durable nonce will still advance on the chain. When a transaction using a durable nonce fails, it will never be able to land again.
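The fee-draining argument can be made concrete with a toy simulation. Everything here is assumed for illustration: the `Wallet` type, the integer nonce, and the `advance_on_failure` switch are hypothetical, and only the 5,000 lamport fee (0.000005 SOL) comes from the example above.

```rust
/// Hypothetical wallet state: a lamport balance plus the nonce stored on chain.
struct Wallet {
    lamports: u64,
    stored_nonce: u64,
}

const FEE_LAMPORTS: u64 = 5_000; // 0.000005 SOL

/// Submit a transaction that is known to fail during execution.
/// Returns true if the validator processed it and charged the fee.
fn submit_failing_tx(wallet: &mut Wallet, tx_nonce: u64, advance_on_failure: bool) -> bool {
    if tx_nonce != wallet.stored_nonce || wallet.lamports < FEE_LAMPORTS {
        return false; // rejected up front, no fee charged
    }
    wallet.lamports -= FEE_LAMPORTS; // the fee is paid even though execution fails
    if advance_on_failure {
        wallet.stored_nonce += 1; // stands in for "replace with a fresh value"
    }
    true
}

/// Replay the same signed transaction `attempts` times and count how often it lands.
fn replay(wallet: &mut Wallet, tx_nonce: u64, attempts: u32, advance_on_failure: bool) -> u32 {
    let mut landed = 0;
    for _ in 0..attempts {
        if submit_failing_tx(wallet, tx_nonce, advance_on_failure) {
            landed += 1;
        }
    }
    landed
}
```

With `advance_on_failure` set to false, a wallet holding 20,000 lamports is emptied after four replays. With it set to true, only the first submission lands and the remaining replays are rejected.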

At the end of the day, Durable Transaction Nonces are an elegant way to prevent transaction replay. However, as we have seen, their usage is more complex than one might expect. And usually, with complexity come bugs.

The Bug

The original bug is subtle, and easy to overlook.

Solana Core has safeguards in place. These safeguards work by failing transactions, for example, when programs write to accounts that they don’t own. As in every good transaction system, the state is rolled back once a transaction fails, rendering all changes of the transaction moot. But whenever a durable nonce is used, some of the state has to be preserved when the transaction fails. Namely, the nonce has to be updated on chain, even when the transaction fails (see previous section). However, pretty much every single program (including the System Program) relies on a total rollback of all state when it fails a transaction, and so do the previously mentioned critical safeguards. We need to be extra careful when implementing this advance_durable_nonce feature, to not roll back too little.
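To illustrate why these safeguards depend on a total rollback, here is a minimal sketch of "check after execution, commit only on success". It is a simplified model and not Solana's runtime: `Account`, `execute`, and the one-byte `Pubkey` are assumptions for illustration, and the sketch ignores account creation and ownership transfers.

```rust
use std::collections::HashMap;

type Pubkey = u8; // hypothetical: one byte is enough to tell keys apart here

#[derive(Clone, Debug, PartialEq)]
struct Account {
    owner: Pubkey,
    data: Vec<u8>,
}

/// Run `program` against a working copy of the accounts, then commit only if
/// every modified account was owned by `program_id`. On failure the original
/// state is left untouched, which models the total rollback.
fn execute<F>(
    state: &mut HashMap<Pubkey, Account>,
    program_id: Pubkey,
    program: F,
) -> Result<(), String>
where
    F: FnOnce(&mut HashMap<Pubkey, Account>),
{
    let mut working = state.clone();
    program(&mut working);
    for (key, before) in state.iter() {
        let modified = working.get(key) != Some(before);
        if modified && before.owner != program_id {
            // The illegal write already happened in `working`; it is harmless
            // only because `working` is discarded here.
            return Err(format!(
                "program {} modified account {} it does not own",
                program_id, key
            ));
        }
    }
    *state = working; // commit
    Ok(())
}
```

Note that the illegal write is detected after the fact. The check is only safe because a failed transaction throws the working copy away, so any path that persists state after a failure bypasses the check entirely.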

This is a lesson you learn over and over as a security professional. Special cases lead to complexity, and complexity leads to bugs. Write that down.

This rollback of state is implemented in Solana Core’s runtime/src/accounts.rs, in the collect_accounts_to_store function. That function is called for every transaction, successful or not, to store modified accounts back to the accounts database.

Here are the relevant lines. Don’t worry if you don’t immediately get what this means; we’ll go over what this code does and where the bug can be found:

let maybe_nonce = match (res, hash_age_kind) {
    (Ok(_), Some(HashAgeKind::DurableNonce(pubkey, acc))) => Some((pubkey, acc)),
    (
        Err(TransactionError::InstructionError(_, _)),
        Some(HashAgeKind::DurableNonce(pubkey, acc)),
    ) => Some((pubkey, acc)),
    (Ok(_), _hash_age_kind) => None,
    (Err(_), _hash_age_kind) => continue,
};

let message = &tx.message();
let acc = raccs.as_mut().unwrap();
for ((i, key), account) in message
    .account_keys
    .iter()
    .enumerate()
    .zip(acc.0.iter_mut())
    .filter(|((i, key), _account)| Self::is_non_loader_key(message, key, *i)) // don't persist accounts that are called as programs
{
    // no more checks here, persist changes
}

This part of the function first checks whether it should store accounts at all. If the transaction fails and returns an error, the function normally skips the store and continues with the next transaction. However, if you look closely at the match expression, you can see that this is not the case when the hash_age_kind of the transaction is DurableNonce, which happens when a Durable Transaction Nonce (DTN) is used in place of the blockhash. A transaction that fails with an InstructionError then takes the same path as a successful one. As a result, this code persists all account changes made by the transaction, even illegal ones, whenever a durable nonce is used, no matter whether the transaction succeeded or failed during execution.
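The decision made by that match can be reduced to a small model. The sketch below uses simplified stand-in types, not the real Solana definitions, and `persist_safer` shows just one possible safer rule assumed for illustration; as the fix section notes, the actual changes are more involved.

```rust
type Pubkey = u8; // hypothetical one-byte key

/// Simplified stand-in for the transaction's replay-protection kind.
#[derive(Clone, Copy)]
enum HashAgeKind {
    RecentBlockhash,
    DurableNonce(Pubkey), // key of the nonce account
}

/// Simplified stand-in for the execution result.
enum TxResult {
    Success,
    InstructionError,
    OtherError,
}

/// Which accounts get written back after a transaction.
#[derive(Debug, PartialEq)]
enum Persist {
    Nothing,
    Everything,
    Only(Pubkey),
}

/// Mirrors the shape of the vulnerable match: a failed instruction combined
/// with a durable nonce takes the same "store all accounts" path as a success.
fn persist_buggy(res: &TxResult, kind: HashAgeKind) -> Persist {
    match (res, kind) {
        (TxResult::Success, _) => Persist::Everything,
        (TxResult::InstructionError, HashAgeKind::DurableNonce(_)) => Persist::Everything,
        (_, _) => Persist::Nothing,
    }
}

/// One possible safer rule (an assumption, not the actual fix): on failure,
/// write back only the nonce account so that the nonce still advances.
fn persist_safer(res: &TxResult, kind: HashAgeKind) -> Persist {
    match (res, kind) {
        (TxResult::Success, _) => Persist::Everything,
        (TxResult::InstructionError, HashAgeKind::DurableNonce(nonce_key)) => {
            Persist::Only(nonce_key)
        }
        (_, _) => Persist::Nothing,
    }
}
```

In this model, the only difference between the two functions is what the failed durable-nonce arm returns, which is why the bug is so easy to overlook in review.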

One can imagine how this behavior can lead to devastating exploits. Once there is a DTN in use, you can store whatever you want, in whatever account you want.

The Impact

This is just about the worst possible thing that can happen. All safety guards can be circumvented entirely by exploiting this bug.

On Solana, the state of every program is written to the accounts it owns. For every account, only the owning program is allowed to have write access. If this is no longer guaranteed, we are in deep trouble. Every program can write to the state of every other program. Anyone can mint themselves an arbitrary amount of any token, even without changing the indicated total supply. Anyone can doctor entries in lending protocols, for example by deleting liabilities. Anyone can change the owner of NFTs or steal SOL from any account. Everything is possible. You get the point. This is as bad as it gets.

Shortly after we reported this bug, Solana’s market cap and TVL shot up to tens of billions of dollars. As such, this was probably the single most impactful bug fixed as a result of our work in the Solana ecosystem.

The Fix

Neodyme’s security researcher Benno reported this vulnerability to the Solana team right away through their bug bounty program in late November 2020. The issue was promptly fixed in this commit, with follow-up work in this pull request. The changes are a bit more involved, so we won’t be summarizing them here. If you’re interested, check out the commit and pull request.

Summary

A subtle bug can go a long way. With a simple malformed match in the right place, everything is possible. It would be easy to blame developers for bugs like that. However, these bugs do exist and they will always exist. Writing perfect code is something we can all only dream of. The Solana devs handled this incident very professionally. That showed us that they have a great mentality towards errors and bugs and motivated us to dive deeper into the ecosystem.

During our audits of blockchain projects, we often encounter critical vulnerabilities like this where devs least expect them. As a rule of thumb, we recommend that you double-check special cases and complex code. Let devs read other devs’ code and see if they find mistakes or logic bugs. A professional audit won’t hurt either ;).
