
Riverguard: Fishing for Loss of Funds in the Stream of Solana Transactions

Riverguard, the free first line of defense for all Solana contracts

Welcome to our new blogpost series, in which we embark on a journey through the world of automated Solana smart contract exploitation and how we use it to protect you with our new, free Riverguard platform.

TLDR:

  • Riverguard is the first line of defense for ALL Solana contracts.
  • It automatically finds exploitable bugs even in closed-source on-chain contracts.
  • Thanks to a Solana Foundation Grant, it’s completely free for everyone.
  • It works, as we’ve found and fixed critical bugs worth millions!
  • If you’re developing a contract yourself, you can sign up now at riverguard.io! We do our best to triage, but it’s time-consuming, especially for closed-source contracts.

In this first post, we kick things off by exploring the what, why, and how of Riverguard. We discuss our motivation for creating this tool, its functionality, and its significance for enhancing the Solana ecosystem’s security. We also take a closer look at one of our early successes, namely “Solcean’s Eleven”, a casino heist scenario on Solana revealed through Riverguard. It is an excellent example of the issues that our tool can detect and prevent.

Our presentation at Breakpoint 2023 serves as an accompaniment to this blogpost. If you prefer watching videos to reading, check out the presentation on YouTube: Breakpoint 2023: Riverguard: Fishing for Loss of Funds in the Stream of Solana Transactions.

1. Background: The Genesis of Riverguard

We first provide some backstory as to how we got here. Feel free to skip ahead to How Does Riverguard Work? for the more technical details. Note that this post assumes a basic familiarity with Solana.

1.1 Let’s Talk Security!

Over the past few years, our team has been deeply involved in enhancing the security of the Solana ecosystem. We’ve read hundreds of smart contracts, found and reported numerous bugs, quite a few of them critical, jumped into action during security breaches, and spent a great deal of time analyzing hacks after they occurred. A significant part of our job involves reviewing open-source contracts; however, with over 6,000 smart contracts on Solana, many of which are closed-source, this is a tall order. We’ve noticed a pattern where similar bugs pop up repeatedly. While things have definitely improved since the early days of Solana, there’s still room for growth. In a perfect world, each project would receive an audit before launch. However, experienced Solana auditors are few in number, and arranging a thorough audit can be pricey. While larger protocols can afford these in-depth audits, smaller projects often cannot.

In dissecting past vulnerabilities and hacks, we’ve also noticed that many were surprisingly simple yet effective. Often, the vulnerabilities stemmed from seemingly minor oversights, such as neglecting to verify user signatures in specific scenarios or failing to authenticate the keys of vault accounts.

Yet, the simplicity of these exploits is deceptive. Identifying such bugs, particularly within a vast network of diverse and complex contracts, is anything but straightforward. This recurring simplicity in Solana’s security vulnerabilities has led to a growing sense of frustration on our part. Bugs are an inevitable part of any technological ecosystem, but we found the ease with which some hackers were exploiting these vulnerabilities in Solana disconcerting.

Driven by this frustration and a determination to fortify Solana’s defenses, we embarked on a mission: to mimic the methods of hackers, but to apply them more efficiently and for good. That is, we seek to do what hackers do, only better and faster.

Therefore, we teamed up with the Solana Foundation to create Riverguard, a new, free security tool that we hope will make it easier for all Solana projects to keep their security tight. Riverguard performs post-deployment testing on all contracts: it tests for common bugs by mutating their on-chain transactions and lists any potential issues in a user-friendly dashboard.

1.2 We Need to Automate!

So how do we actually achieve this? The answer is automation. There are two common contenders when it comes to automated bug-finding — namely fuzzing and symbolic execution. We started to investigate this topic years ago and reached the conclusion that while the aforementioned approaches work, they still require significant work for each contract. While the tools are highly useful for finding unknown and complex bugs or bug classes, they often require source-code access or contract-specific tuning to work well. We needed something applicable to all contracts that could work as a first line of defense for the bugs we see most often. For this, we realized that we could use a much simpler approach.

1.3 Think Like a Hacker

Too many of the more straightforward hacks follow the same basic pattern: a user interacts with a protocol and observes weird behavior, then becomes an attacker and abuses that behavior.

This sequence offers valuable insights into how security testing might be approached. The idea is to mimic genuine user interactions with contracts as a means to uncover vulnerabilities. Let’s treat all contracts like a black box and go bug hunting through the following steps:

  • First, interact with the program via a frontend;
  • Then, look at what the transaction did on-chain;
  • Lastly, mutate the transaction in an educated way and see if it still succeeds.

Automating the latter two steps is straightforward, but the initial interaction poses a challenge. Thus, the following question arises: How can diverse user behavior be replicated without manual intervention for each contract?

Enter: The River.

Other users use Solana programs all the time, so we don’t actually have to interact with frontends; rather, we can replay the transactions of legitimate users. This approach, while seemingly straightforward, comes with its own set of challenges. The volume is immense, with millions of transactions occurring daily and validators processing gigabytes of data every second.

Furthermore, we have to test each transaction for many different kinds of attacks with different parameters. We also have to be careful about the reporting and duplication of findings. A bug identified in one transaction is likely to appear in others of the same kind. To enable efficient triaging, duplicates must be filtered. Riverguard is designed to address these technical challenges, streamlining the process of identifying and categorizing unique vulnerabilities.

2. How Does Riverguard Work?

Quick Recap: Our goal is to automate the security of all Solana programs. We do this by taking real user transactions, tweaking them (or “mutating” them in tech speak), and observing the simulated outcomes to spot potential bugs.

2.1 Basic Overview

The following figure gives an overview of how Riverguard works:

Image 1: High-Level Overview


Here’s a straightforward look at how Riverguard functions in the Solana ecosystem:

  1. Transaction Collection: Riverguard gathers every successful transaction executed on the Solana mainnet. These include everything from simple token transfers to complex actions such as swaps, NFT mints, or multi-sig votes. If it happens on Solana, then Riverguard takes note.

  2. Mutation Engine: Next, these transactions are sent to what we call the “mutation engine”. Here, each transaction is altered in various ways to simulate potential attacks. These mutations can be as simple as removing a signer or as complex as swapping vault accounts or faking account data. Since this is all in simulation, our scope for experimentation is wide open.

  3. Observation and Analysis: Once mutated, we closely observe how each transaction behaves. Does it still succeed, or does it encounter a problem? Where exactly does it fail, and how do the outcomes differ from the original transaction? Are the output accounts still changed? We use simulation results to filter out false positives and determine which program is at fault.

  4. Deduplication: As findings are always attributed to a specific program and instruction data, we deduplicate based on the program, instruction, and mutation-specific metadata. This is a balancing act: on the one hand, we aim to minimize the number of duplicate reports, since too many can clutter the system and obscure critical insights; on the other hand, it’s imperative to avoid missing genuine bugs, which might occur with over-filtering.

  5. Reporting: Our findings are then stored in a database and displayed on a web dashboard at https://riverguard.neodyme.io/. This dashboard serves as a centralized hub for viewing, understanding, and categorizing the vulnerabilities that Riverguard has identified.
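The five steps above can be sketched as a single loop. This is only a minimal illustration under our own assumptions, not Riverguard's actual code: the dictionary-based transaction shape, the `simulate` callback, and the toy program in `toy_simulate` are all hypothetical and exist purely for the example.

```python
from typing import Callable, Optional

# Hypothetical transaction shape: a plain dict with the fields used below.
Tx = dict
Mutation = Callable[[Tx], Optional[Tx]]  # returns None if the rule does not apply


def run_pipeline(transactions: list[Tx],
                 mutations: dict[str, Mutation],
                 simulate: Callable[[Tx], bool]) -> dict[tuple, list[str]]:
    """Mutate each transaction, simulate it, and group mutations that still succeed."""
    findings: dict[tuple, list[str]] = {}
    for tx in transactions:                      # 1. collection
        for name, mutate in mutations.items():   # 2. mutation
            mutated = mutate(tx)
            if mutated is None:
                continue
            if not simulate(mutated):            # 3. observation
                continue  # the program rejected the tampered transaction
            key = (tx["program"], tx["instruction"], name)  # 4. deduplication
            findings.setdefault(key, []).append(tx["signature"])
    return findings                              # 5. input for reporting


def drop_signer(tx: Tx) -> Optional[Tx]:
    """Toy mutation: remove the first signer, if there is one."""
    if not tx["signers"]:
        return None
    return {**tx, "signers": tx["signers"][1:]}


def toy_simulate(tx: Tx) -> bool:
    """Toy program: 'withdraw' checks for a signer, 'claim' forgets to."""
    if tx["instruction"] == "withdraw":
        return len(tx["signers"]) > 0
    return True


txs = [
    {"signature": "sig1", "program": "Prog1", "instruction": "withdraw", "signers": ["alice"]},
    {"signature": "sig2", "program": "Prog1", "instruction": "claim", "signers": ["alice"]},
    {"signature": "sig3", "program": "Prog1", "instruction": "claim", "signers": ["bob"]},
]
findings = run_pipeline(txs, {"missing-signer": drop_signer}, toy_simulate)
for key, sigs in findings.items():
    print(key, "seen in", len(sigs), "transactions")
# ('Prog1', 'claim', 'missing-signer') seen in 2 transactions
```

Both `claim` transactions collapse into one finding, while the `withdraw` transaction produces none because the toy program rejects it once the signer is gone.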

2.2 Transaction Ingestion

We call the process of gathering transactions from mainnet into Riverguard Transaction ingestion. It is trickier than it seems because the transactions you might be familiar with from RPC nodes are incomplete. They contain essential elements such as signatures, instruction data, and account keys; however, they crucially lack necessary metadata, such as input account data and system variables (sysvar) contents. This omission renders it impossible to accurately replay a transaction without additional data. To replay a single transaction, one requires a snapshot of the chain’s state immediately before the transaction. However, accessing this historical chain state is currently complex and challenging. Our team is actively working on solutions to make this data more accessible, but we are not there yet.

To circumvent the missing information, we employ a modded validator that outputs Standalone Transactions. These enhanced transactions include all of the elements required for deterministic execution — not just the basics but also pre- and post-account states and other relevant metadata, such as the status of sysvars. They are distributed to several workers, each of which mutates one transaction at a time and generates findings.
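To make the idea more concrete, here is a rough sketch of what such a self-contained record might hold. The class and field names (`StandaloneTransaction`, `pre_accounts`, `sysvars`, and so on) are hypothetical and chosen for this illustration; they are not the actual format emitted by the modded validator.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AccountState:
    lamports: int
    owner: str          # program that owns the account
    data: bytes = b""


@dataclass
class StandaloneTransaction:
    signature: str
    account_keys: list[str]
    instruction_data: bytes
    # Extra context that a plain RPC transaction lacks:
    pre_accounts: dict[str, AccountState] = field(default_factory=dict)
    post_accounts: dict[str, AccountState] = field(default_factory=dict)
    sysvars: dict[str, bytes] = field(default_factory=dict)

    def changed_accounts(self) -> list[str]:
        """Keys whose state differs before and after execution."""
        return [k for k in self.account_keys
                if self.pre_accounts.get(k) != self.post_accounts.get(k)]


tx = StandaloneTransaction(
    signature="sig1",
    account_keys=["user", "vault", "token_program"],
    instruction_data=b"\x01",
    pre_accounts={"user": AccountState(10, "system"), "vault": AccountState(0, "system")},
    post_accounts={"user": AccountState(7, "system"), "vault": AccountState(3, "system")},
)
print(tx.changed_accounts())  # ['user', 'vault']
```

Keeping the pre- and post-states side by side is what makes it possible to ask, after a mutation, whether the output accounts still change.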

For those interested in the nitty-gritty details of our infrastructure and how we manage these challenges, please keep an eye out for our upcoming infrastructure blogpost. It will delve deeper into the technical aspects of Riverguard’s operation. For a teaser, see the image below:

Image 2: Infrastructure Teaser


2.3 Transaction Mutation

The mutation process is at the core of Riverguard’s capability to simulate attacks and uncover bugs. As of January 2024, Riverguard employs seven distinct mutations, which we categorize into the following three main groups:

First, we have the general mutation rules:

  • Unchecked account owners: Are account owners not correctly asserted?
  • Unchecked signed cross-program invocation (CPI): Do we have an arbitrary CPI with a signature?
  • Missing signer checks: Are signer checks missing?

Second, we have more specific rules for the following:

  • Unchecked instruction sysvar: Can we fake instruction-sysvar contents?
  • Unchecked sigverify program: Can we mess with the sigverify verification?

Finally, we have our best mutation rules, which have had the most promising results thus far:

  • Account creation DoS: Can we cause a denial-of-service by preallocating lamports?
  • Unchecked vault accounts: Can we steal funds by replacing vault accounts?

We will soon release a companion blogpost with details on these mutations. All of them are based on real bugs that we have seen in the past. A good mutation finds real bugs that have an impact. It requires a low false-positive rate and the ability to attribute the error to a specific program so that deduplication can work efficiently.

More mutations are likely to follow. We welcome ideas and insights that could enhance them further. The goal is to maintain a delicate balance between thorough bug detection and efficient, accurate reporting.

2.4 Deduplication

The deduplication of findings is an essential step for efficient triaging. Many programs are called in a variety of different ways. A big question is whether we should only consider top-level instructions or examine cross-program invocations (CPIs) in greater depth. Considering only top-level instructions is much simpler to implement, but it has a glaring issue: too many duplicates. Take multisigs, for example: They execute ‘foreign’ transactions through CPI, but they obviously aren’t at fault for issues in the programs they call. This means that each of our mutations must be aware of which program in the CPI chain causes a bug.

Once we know the location in the call chain, we can use the program’s pubkey, the ‘instruction selector’ (also called the “discriminator” in Anchor, which uses the first 8 bytes of the instruction data), and some mutation-specific metadata as a deduplication key. All of this is hashed, and each hash represents a unique finding with potentially numerous duplicates. While there is room for improvement here, this method works surprisingly well; the tricky part is determining the program at fault.
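A deduplication key of this kind could be computed roughly as follows. This is a sketch under assumptions: the choice of SHA-256, the length-prefixed encoding, and the example metadata (an account index) are ours for illustration, and the 8-byte selector is applied uniformly even though programs not built with Anchor may use a different layout.

```python
import hashlib


def dedup_key(program_id: str, instruction_data: bytes, mutation: str,
              metadata: tuple = ()) -> str:
    """Hash program, instruction selector, and mutation metadata into one key."""
    selector = instruction_data[:8]  # Anchor-style 8-byte discriminator
    h = hashlib.sha256()
    parts = (program_id.encode(), selector, mutation.encode(),
             *(str(m).encode() for m in metadata))
    for part in parts:
        # Length-prefix each part so different splits cannot collide.
        h.update(len(part).to_bytes(4, "little"))
        h.update(part)
    return h.hexdigest()


# Same program, selector, and mutation, but different instruction arguments.
a = dedup_key("Prog1", bytes(8) + b"\x01\x00", "unchecked-vault-account", (2,))
b = dedup_key("Prog1", bytes(8) + b"\xff\xff", "unchecked-vault-account", (2,))
# Same instruction, but a different mutation rule.
c = dedup_key("Prog1", bytes(8) + b"\x01\x00", "missing-signer", (0,))
print(a == b, a == c)  # True False
```

Because only the selector enters the hash, two calls to the same instruction with different arguments count as one finding, while a different mutation on the same instruction is reported separately.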

2.5 Example Mutation: Unchecked Vault Account

For now, we’ll focus on a single rule — Unchecked Vault Account. The essence of this mutation lies in its ability to detect vulnerabilities in token vault transactions. Check out the figure below to see one bug this mutation catches.

Image 3: Unchecked Vault Rule: Self Deposit Example


The following steps describe how it works:

  1. Identifying Potential Transfers: The first step involves scanning transactions for transfers to token vaults. Token vaults are critical in many protocols and often used to store and manage tokens securely.

  2. Simulating a Key Swap: Once a potential transfer to a vault is identified, Riverguard simulates a scenario where the vault in the transaction is replaced with the token source account. This is akin to asking the following question: “What if the destination of the tokens wasn’t the secure vault, but rather the place they came from?”

  3. Observing the Outcome: The crucial part of this test is to determine whether the transaction still succeeds with this swap. In a secure system, such a swap should almost always cause the transaction to fail. If the transaction succeeds, a user could potentially execute a deposit instruction without actually moving any tokens into the vault. This is a serious security concern: the user might still trigger the side effects of a deposit (e.g., crediting rewards or increasing their stake) and might later even be able to withdraw real tokens, effectively stealing funds from the protocol. In other cases, however, such a finding is a false positive. For example, we have seen ‘distributor’ contracts that simply distribute tokens from the user to many different target wallets of the user’s choosing. Here, a self-transfer is possible but not critical, since it triggers no harmful side effects; users transferring tokens to themselves isn’t a problem. Therefore, each finding must still be manually triaged to determine what is possible.
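The three steps can be illustrated with a toy model. Everything here is hypothetical: the dictionary layout, the `swap_vault_for_source` helper, and the deliberately broken `vulnerable_deposit` program are assumptions made up for the example and do not reflect how Riverguard or any real contract is implemented.

```python
from typing import Optional


def swap_vault_for_source(tx: dict) -> Optional[dict]:
    """Replace the transfer destination (the vault) with the token source."""
    transfer = tx.get("token_transfer")
    if transfer is None:
        return None  # no transfer to a vault found, so the rule does not apply
    source, vault = transfer["source"], transfer["destination"]
    accounts = [source if key == vault else key for key in tx["accounts"]]
    return {**tx, "accounts": accounts,
            "token_transfer": {**transfer, "destination": source}}


def vulnerable_deposit(tx: dict, balances: dict, credited: dict) -> bool:
    """Toy deposit that never checks which account receives the tokens."""
    t = tx["token_transfer"]
    if balances[t["source"]] < t["amount"]:
        return False
    balances[t["source"]] -= t["amount"]
    balances[t["destination"]] += t["amount"]
    # Side effect of the deposit: the user is credited regardless of the destination.
    credited[tx["user"]] = credited.get(tx["user"], 0) + t["amount"]
    return True


original = {
    "user": "alice",
    "accounts": ["alice_tokens", "vault", "token_program"],
    "token_transfer": {"source": "alice_tokens", "destination": "vault", "amount": 5},
}
mutated = swap_vault_for_source(original)
balances = {"alice_tokens": 5, "vault": 0}
credited = {}
ok = vulnerable_deposit(mutated, balances, credited)
print(ok, balances, credited)
# True {'alice_tokens': 5, 'vault': 0} {'alice': 5}
```

The mutated deposit succeeds, the vault receives nothing, and the user is still credited: exactly the combination that makes a finding worth triaging.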

3. Solcean’s Eleven — A Casino Heist on Solana

Now, we delve into the perfect Riverguard showcase: a real bug, which we will actually exploit on-chain, with all of the triaging difficulties highlighted. This story is too great to bury this deep in a blogpost, so we’ll provide a follow-up post with more technical details at a later date.

On a cozy autumn afternoon, Riverguard pops up with the following finding:

Image 4: Riverguard Finding. Self-Transfer (since then renamed to unchecked-vault-account)


So we get to work. The first step in such a triage is always to determine what the transaction is supposed to do. Here, this is easy, as the logs tell us: DepositToken.

Image 5: Transaction Logs


Upon further investigation, it becomes clear that this particular transaction belonged to SolCasino, an on-chain casino that operates on Solana. Vault replacement in a casino? Sounds exciting! Let’s look at our Riverguard finding again. We receive a message that we replaced a token account, but the transfer still succeeded. Matching our transaction, we obtain the following:

Image 6: Token Transfer in Transaction


Bingo! This is our vault account, and we can call deposit without actually spending money. We specify ourselves as the vault, and the money comes back to us. The transaction still succeeds, and any account modifications persist. We also notice another log line after the transfer:

Program logged: "DEPOSIT EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v 20000000 0x1c5598509e9e7DeEF06B71Faeb0C69EC231a6ec3 7RNYdRkMWSKL8BM2pES8g8WZEp3Gm9azYh8yJoiFwtrb"

So, what now? At first glance, this seems like a big problem, but we should understand how the casino works before raising any alarms. We investigate a bit, use the casino ourselves, and find the typical user-flow to be as follows:

  • Deposit SPL Tokens to a centralized wallet through Deposit Instruction;
  • Optionally: Play some games 🎲🔥💸;
  • Create a withdrawal request -> Tokens get transferred to your wallet through a regular token transfer.

How curious! They don’t use an on-chain vault for user-fund payouts. This might make exploitation a bit more difficult as we have to convince the bot behind the payout that we have the required funds.

Remember the deposit logline? Let’s look at it again. We try deposits with different amounts and wallets and determine the following:

Program logged: "DEPOSIT EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v 32050566 0x85CA0453c8e1fe8fD20A1dEA9EDf0991b5C26334 7RNYdRkMWSKL8BM2pES8g8WZEp3Gm9azYh8yJoiFwtrb"
Educated Guess: "DEPOSIT <token_mint> <amount> <account_id?> <vault>"

Educated Guess: The backend parses the logs of all transactions that invoke the deposit program and credits the account_id. Maybe it also checks the <vault> key.
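If that guess is right, the relevant backend logic might look roughly like the sketch below. To be clear, this is not SolCasino's code: the `Deposit` type, the `parse_deposit` function, and the optional `expected_vault` check are hypothetical and only mirror the log format guessed above.

```python
from typing import NamedTuple, Optional

PREFIX = 'Program logged: "'


class Deposit(NamedTuple):
    token_mint: str
    amount: int
    account_id: str
    vault: str


def parse_deposit(log_line: str,
                  expected_vault: Optional[str] = None) -> Optional[Deposit]:
    """Parse one log line in the guessed DEPOSIT format; None if it does not match."""
    if not (log_line.startswith(PREFIX) and log_line.endswith('"')):
        return None
    fields = log_line[len(PREFIX):-1].split(" ")
    if len(fields) != 5 or fields[0] != "DEPOSIT" or not fields[2].isdecimal():
        return None
    deposit = Deposit(fields[1], int(fields[2]), fields[3], fields[4])
    if expected_vault is not None and deposit.vault != expected_vault:
        return None  # a careful backend would reject deposits into a foreign vault
    return deposit


line = ('Program logged: "DEPOSIT EPjFWdd5AufqSSqeM2qN1xzybapC8G4wEGGkZwyTDt1v 32050566 '
        '0x85CA0453c8e1fe8fD20A1dEA9EDf0991b5C26334 '
        '7RNYdRkMWSKL8BM2pES8g8WZEp3Gm9azYh8yJoiFwtrb"')
d = parse_deposit(line)
print(d.amount, d.account_id)
# 32050566 0x85CA0453c8e1fe8fD20A1dEA9EDf0991b5C26334
print(parse_deposit(line, expected_vault="SomeOtherVault"))
# None
```

Whether a deposit into the wrong vault gets credited depends entirely on whether a check like `expected_vault` exists on the backend, which we cannot see from the chain.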

Is this exploitable? This is unclear! It is exploitable only if the backend credits the deposit despite the self-transfer and the wrong vault. The next step is to contact the developers, as we cannot exploit this bug locally.

Contacting developers is often the most challenging part of a triage. Here, we are lucky, as we already know that we are dealing with SolCasino. Sometimes, we get stuck at the program pubkey and cannot even identify the protocol name.

Even if you know the name of the protocol, contacting developers is always hard. We routinely try to use Telegram/Twitter/Discord and email for this, depending on what we find; however, we are often at the mercy of community managers to pass along information. For SolCasino, this was the difficult part. At first, we simply got banned from their Discord. After multiple other attempts, we finally found a point of contact. Some back-and-forth later, we received the following message:

Image 7: Proof it first! We’re given the go-ahead.


Alright 😈! To do this, let us think about how an attack might work. We must convince the backend that the deposit happened, even though we transferred the funds back to ourselves. As this likely happens through a log-message, we have the following two possibilities:

  • The backend doesn’t actually check the deposit owner; or
  • The backend doesn’t parse the logs correctly, and we can perform a log injection.

It turns out that both are possible! We can replace the expected token vault with our own account; the backend doesn’t check the vault, and we can withdraw funds. We are now 1 USDC richer (which, of course, we immediately return to the vault)! Please keep an eye out for our soon-to-be-released companion blogpost over at https://neodyme.io/blog/ for the exact exploitation details as well as how the log injection works.

Ultimately, the SolCasino team was grateful and paid us a bug bounty. Riverguard protected approximately 1 million USDC with this bug alone.

We’ve seen that findings are only the starting point. False positives must be ruled out, which is often challenging for closed-source contracts without development support or live testing. Cooperation with the contract’s developers makes this much easier for everyone involved, but reaching the developers can be difficult.

4. Triaging: Help Us to Help You!

We’ve already contacted many projects about potential bugs, but some untriaged bugs remain, especially in smaller projects. To triage the rest, we need your help! Aside from time, there are currently a couple of big issues with effective triage. Specifically, it’s difficult to

  • find the project name;
  • find the source code;
  • contact the developers;
  • rule out false-positives.

If you are a contract developer, you can solve the first three problems quickly. Just add a security.txt to your contract: https://github.com/neodyme-labs/solana-security-txt

Image 8: security.txt as rendered by the explorer


Users typically interact with a Solana smart contract via the project’s web interface, which knows the contract’s address. Security researchers often do not do this. For smaller or private projects in particular, identification from just the contract’s address is both difficult and time-consuming — if not impossible. This slows or prevents bug reports from reaching the developers.

Having standardized information about your project inside your contract makes it easy for white-hat researchers to reach you if they find any problems. To maximize compatibility with existing deployment setups, multisigs and DAOs, this security.txt is implemented to simply be a part of your program rather than an external contract.

4.1 Register!

In addition, you can register for free at riverguard.io! After a quick verification, this will give you access to findings for your own contracts, and you can directly triage yourselves. We will still help with triaging and be able to reach you more easily if we find anything.

5. The Future

We hope you have enjoyed this blogpost as much as we have enjoyed developing Riverguard. We are still maintaining it, and it is running smoothly on mainnet. In the past couple of months, we have worked on optimizing false positives and deduplication. Right now, we are updating Riverguard to support Solana 1.18.1.

As Riverguard still regularly finds bugs and we have yet to get through the triage backlog, we unfortunately cannot make it open-source yet. However, we certainly want to! If you have ideas or any thoughts or require access to similar infrastructure, we’ll hook you up. Just shoot us an email at the following address: riverguard@neodyme.io

Thank you for helping make the ecosystem more secure! Thanks to the Solana Foundation for making this project possible. Stay tuned for our upcoming blogposts detailing the mutations, the infrastructure, and SolCasino in greater depth.