Scalability with State Archival on Stellar vs. Solana’s Avocado

Author: Garand Tyson

TLDR

This is part 2 in a deep-dive series on the state bloat problem. Part 1 was a high-level overview of the State Archival solution on the Stellar network. This article focuses on State Archival’s scalability features and analyzes Solana’s less efficient version of State Archival, called “State Compression.”

Solana’s State Compression design (Avocado) has fundamental flaws:

  • RPC nodes are expected to store all compressed accounts and account data.
  • No protocol guarantees for compressed account data availability, so compressed data can be lost or censored.
  • Some validators still need to store all compressed accounts.
  • Transaction volume will increase and validator arbitrage spam will add to Solana’s congestion issues.

State Archival on the Stellar network is much more efficient and scalable:

  • No increase in transaction volume or spam.
  • Validator snapshots via public History Archives guarantee archived data availability.
  • No validator needs to store all archived accounts.
  • RPC nodes can shard archival state, reducing cost.

The state bloat problem is blockchain’s biggest elephant in the room. Every day, millions of new entries are created on public blockchains like Stellar and Solana, and each validator needs to store all this state indefinitely. Every meme NFT, hello world test contract, and spam token needs to be kept around forever no matter how often the information is actually used, or even if it gets used at all. As more users and use cases come to blockchain, the issue will only grow, causing networks to become slower and more expensive as they try to keep up with all this bloat.

This isn’t a new issue. Vitalik started the conversation in 2021 with his “State Expiry” scheme for Ethereum, and the Stellar Development Foundation started designing a “State Archival” solution for the Stellar network two years ago. Anatoly Yakovenko, Solana’s co-founder, is the latest player to join the party, introducing Solana’s “Avocado” project a few weeks ago. While there are lots of ideas on how to solve state bloat, both Vitalik’s and Anatoly’s designs have significant, fundamental flaws. For instance, Ethereum’s state expiration proposal has a double-spend bug that I found, and its fix is fundamentally incompatible with other plans on Ethereum’s roadmap, such as Weak and Strong Statelessness. I’ll talk more about Ethereum in Part 3 when I focus on security, but for now, let’s talk about Solana and scalability issues.

At a high level, Solana’s “State Compression” (Avocado) and State Archival on Stellar work similarly. Each account pays rent to stay active on the ledger. Whenever the rent runs out, the account is archived (Stellar) or compressed (Solana), and any transaction that touches an archived account fails. Archived accounts are deleted from validators and stored off-chain in Merkle trees. To unarchive and reuse these accounts, RPC nodes generate a Merkle-style proof, validators verify the proof, and the entry is added back to the live ledger state.
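To make that lifecycle concrete, here’s a minimal sketch in Rust of the rent check described above. Every name in it (EntryState, apply_transaction) is illustrative, not an actual Stellar or Solana API.

```rust
// Hypothetical names; not an actual Stellar or Solana API.

enum EntryState {
    Live { rent_paid_until: u64 }, // rent tracked as a ledger sequence number
    Archived,                      // data lives off-chain; only a commitment stays with validators
}

// A transaction touching an archived (or rent-expired) entry fails until it is restored.
fn apply_transaction(entry: &EntryState, current_ledger: u64) -> Result<(), &'static str> {
    match entry {
        EntryState::Live { rent_paid_until } if *rent_paid_until >= current_ledger => Ok(()),
        EntryState::Live { .. } => Err("rent expired: entry is due to be archived"),
        EntryState::Archived => Err("entry archived: submit a restore proof first"),
    }
}

fn main() {
    let live = EntryState::Live { rent_paid_until: 100 };
    assert!(apply_transaction(&live, 50).is_ok());
    assert!(apply_transaction(&EntryState::Archived, 50).is_err());
}
```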

While the high-level strategy is similar, the devil is in the details. I’ll explain how Avocado introduces more problems than it solves, with poor data availability, decreased network performance, and higher fees. I’ll also talk about how all these problems are solved with State Archival on Stellar.


The Solana protocol does not guarantee data availability

Avocado’s “State Compression” has no data availability guarantees for compressed account data. When account data runs out of rent, it is replaced with a hash that validators persist while the data itself is deleted. To restore (or decompress) this data, a transaction uploads the original data, which validators check against the persisted hash.

The problem is, the Solana protocol only persists and maintains the availability of the hash, not the data itself. The hash can’t be reversed to get the original data, so while the protocol can verify that the data is correct, it can’t actually store or reproduce the data. Rather, it’s the responsibility of the user submitting the transaction to come up with the original data. The question is, how can a user actually find this data?
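A toy sketch makes the limitation concrete: the validator-side check below can confirm that uploaded bytes match the stored hash, but it has no way to reproduce those bytes on its own. This is purely illustrative; DefaultHasher stands in for a real cryptographic hash, and none of these names are Solana’s actual API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Stand-in for a cryptographic hash (a real protocol would use SHA-256 or similar).
fn commitment(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

// What validators keep after compression: only the hash, never the bytes.
struct CompressedAccountData {
    data_hash: u64,
}

// Decompression: the submitter must supply the original bytes; the validator can only
// confirm they match the commitment. It cannot recover them from the hash alone.
fn decompress(stored: &CompressedAccountData, uploaded: &[u8]) -> Result<Vec<u8>, &'static str> {
    if commitment(uploaded) == stored.data_hash {
        Ok(uploaded.to_vec())
    } else {
        Err("uploaded bytes do not match the on-chain hash")
    }
}

fn main() {
    let original = b"account data payload".to_vec();
    let stored = CompressedAccountData { data_hash: commitment(&original) };

    // Restoring only succeeds if *someone* still has the original bytes.
    assert!(decompress(&stored, &original).is_ok());
    assert!(decompress(&stored, b"guess").is_err());
}
```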

The proposal does not discuss data availability. Unlike with the “Index Compression” piece of Avocado, account data is not stored in Merkle trees, and there doesn’t appear to be any mechanism to maintain the data at the protocol level. I assume this responsibility will be left to RPC nodes (a common design on Solana). The issue is, Solana RPC nodes are already very expensive to operate today (as in 512 GB of RAM expensive), and this will put more on their plate. It’s easy to imagine that a cost like this will ultimately be passed down to the user, making decompression expensive.

But even more worrying than the price is that there is no guarantee or requirement that RPC nodes faithfully store this information. Once deleted, the account data payload isn’t required for consensus, so approaches like “Proof of Replication” won’t work. There’s no way for the Solana network to guarantee that RPC nodes actually save the deleted data. RPC providers can choose to ignore compressed data because it’s too expensive to store, or may pick and choose, storing only the compressed data that’s valuable enough to be profitable. Think about it: if the protocol isn’t forcing the RPC to store all data, and storing the data is getting more and more expensive, the RPC provider may look to cut costs to turn a profit. If an RPC provider is faced with the choice between storing the NFT you thought would moon and a whale’s USDC balance, one can begin to guess which data will be prioritized.

If you care about getting your data back after it’s archived, build on Stellar. While off-chain RPC nodes on Stellar cache archived state, all data is backed by protocol-level guarantees. Validators routinely publish “Archival Snapshots” that contain not just the hash of your data, but the data itself. Every Tier 1 validator maintains these snapshots and offers them publicly for free via the Stellar Network History Archives. And who’s storing these Archives? Not only some of the most trusted names in the blockchain space, but also traditional finance, including the Fortune 500 company Franklin Templeton.

On Solana, full validators do not serve archival snapshots as part of the protocol, and archived account data is not stored in a Merkle tree, leaving RPC nodes free to pick and choose what archived state to store. The Stellar network, by contrast, guarantees that your data is replicated and readily available even after it’s archived: archived account state is stored in Merkle trees that must be routinely published to the History Archives in full. These History Archives are auditable, verifiable, and freely available to the public, guaranteeing archived data availability in a decentralized way.

TPS ↓ Network Congestion ↑

Solana has historically had periods of high transaction drop rates, such as in April 2024 when bot trading activity caused too much congestion and Solana’s networking layer couldn't keep up with the transaction volume. While Solana was able to improve the situation in a patch, blockchains are fundamentally network bound when it comes to performance, and network congestion will continue to be an issue as the number of users, validators, and TPS increases. With Avocado, congestion will only get worse.

Today, users (and bots) have to burn fees to submit transactions. These fees help prevent network congestion, because without these fees, attackers and bot traders could overwhelm the network with spam. Fees act as a deterrent and make such attacks too financially expensive to be practical.

Yet even when charging fees for transactions, Solana has had significant congestion issues, and Avocado will throw gasoline on the fire. With this proposal, validators will be paid to send more transactions. Let me say that again. Solana, a network that already has a problem with transactions causing significant congestion, is proposing to actually pay validators for sending a new category of transactions, incentivizing even more transactions on the already overwhelmed network.

When an account runs out of rent, it’s not automatically deleted the way account data is. Instead, a “compression transaction” uses zero-knowledge proofs to delete the account and add it to a Merkle binary trie. This trie contains all the compressed accounts, but validators only need to store the root of the trie (kind of). In order to use a compressed account, a user must submit a transaction with a proof of the entry generated from the complete trie.
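The sketch below illustrates that asymmetry with a toy binary Merkle tree (DefaultHasher standing in for a real hash, and the names are mine, not Solana’s actual trie or proof format): producing an inclusion proof requires every leaf, while verifying one requires only the root.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn h(bytes: &[u8]) -> u64 {
    let mut s = DefaultHasher::new();
    bytes.hash(&mut s);
    s.finish()
}

fn h_pair(a: u64, b: u64) -> u64 {
    let mut s = DefaultHasher::new();
    a.hash(&mut s);
    b.hash(&mut s);
    s.finish()
}

// Proof-producer side (RPC node or proof-submitting validator): needs every leaf
// to build the root and an inclusion proof for one leaf.
fn root_and_proof(leaves: &[u64], mut index: usize) -> (u64, Vec<(u64, bool)>) {
    let mut level: Vec<u64> = leaves.to_vec();
    let mut proof = Vec::new(); // (sibling hash, sibling-is-on-the-right)
    while level.len() > 1 {
        if level.len() % 2 == 1 {
            level.push(*level.last().unwrap()); // duplicate last node on odd levels
        }
        let sibling = index ^ 1;
        proof.push((level[sibling], sibling > index));
        level = level.chunks(2).map(|p| h_pair(p[0], p[1])).collect();
        index /= 2;
    }
    (level[0], proof)
}

// Verifier side (a validator storing only the root): replay the proof path.
fn verify(root: u64, leaf: u64, proof: &[(u64, bool)]) -> bool {
    let mut acc = leaf;
    for &(sibling, sibling_right) in proof {
        acc = if sibling_right { h_pair(acc, sibling) } else { h_pair(sibling, acc) };
    }
    acc == root
}

fn main() {
    let leaves: Vec<u64> = ["acct-a", "acct-b", "acct-c", "acct-d"]
        .iter().map(|k| h(k.as_bytes())).collect();
    let (root, proof) = root_and_proof(&leaves, 2);
    assert!(verify(root, leaves[2], &proof));
}
```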

While “technically” validators only need to store the root of the trie to participate in consensus, RPC nodes need to store the full trie to produce proofs. Additionally, validators can make extra money by storing the complete trie and submitting compression transactions, and the network depends on these validators for compression to work properly.

This seems like a poor design tradeoff. Solana already has a problem with too many transactions when transactions cost money, and with this change, validators will be compensated for sending additional, non-user transactions. You might think that this isn’t so bad. After all, it’s just one transaction compressing multiple accounts. There will be a few more transactions than there would be otherwise, but you can save so much storage space!

The issue is, Solana is designing arbitrage spam directly into their network at the protocol level. Since validators make money sending these transactions, it probably won’t take long for validators to compress all the preexisting expired accounts when this “feature” is first enabled. Once all the old accounts have been compressed, validators will be continually searching for new expired accounts. Whenever a new account expires, every validator will race to submit a compression transaction first and claim the reward, resulting in a large number of redundant transactions that all attempt to compress the same account. These redundant transactions will take up limited network bandwidth that could have otherwise gone towards user transactions, reducing overall TPS.

With the State Archival solution on Stellar, all validators deterministically archive the same entries on a given schedule, so there’s no need to send additional transactions. This is very important, as networking is the main bottleneck in blockchain performance. Anatoly considered deterministic compression, but decided against it:

“If all the validators have to do this deterministically then all of them have to maintain what is currently in the binary-trie and that means that the entire state has to be part of the snapshot.”

While that might be true for Solana’s Merkle binary trie, the Stellar network structures archived state much more efficiently. Instead of using one large trie to store all archived information, the Stellar network stores archived state in a collection of small, immutable Merkle trees. This means that validators can archive new entries by only storing a small tree of recently archived entries, while the majority of archived state is offloaded in History Archives:

Stellar validators only need to store a small Merkle tree with recently archived state. See Part 1 of this series for more.

Crucially, this allows Stellar validators to archive entries on a deterministic schedule such that State Archival requires no additional transactions, meaning more bandwidth for user transactions and higher TPS.
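As a rough illustration of why no extra transactions are needed, here’s a sketch of an epoch-style archival pass under assumed names (ArchivalEpoch, close_epoch) that are not the actual Stellar implementation: every validator runs the same pure function over the same ledger state, so they all archive the same entries at the same ledger, and only the small, most recent tree has to stay local.

```rust
// Every validator evaluates the same pure function over the same ledger state,
// so the set of entries to archive is identical everywhere, with no transactions.
fn entries_to_archive(live: &[(String, u64)], current_ledger: u64) -> Vec<String> {
    live.iter()
        .filter(|(_, ttl)| *ttl < current_ledger)
        .map(|(key, _)| key.clone())
        .collect()
}

// One small, immutable tree per archival epoch. Validators keep only the most
// recent tree(s); the full trees are published to the public History Archives.
struct ArchivalEpoch {
    epoch: u64,
    archived_keys: Vec<String>, // a real tree would store hashed entries plus a root
}

fn close_epoch(epoch: u64, live: &mut Vec<(String, u64)>, current_ledger: u64) -> ArchivalEpoch {
    let keys = entries_to_archive(live.as_slice(), current_ledger);
    live.retain(|(k, _)| !keys.contains(k)); // expired entries leave the live state
    ArchivalEpoch { epoch, archived_keys: keys }
}

fn main() {
    let mut live = vec![
        ("contract-a".to_string(), 90),  // TTL already expired at ledger 100
        ("contract-b".to_string(), 500), // still paid up
    ];
    let snapshot = close_epoch(1, &mut live, 100);
    assert_eq!(snapshot.epoch, 1);
    assert_eq!(snapshot.archived_keys, vec!["contract-a".to_string()]);
    assert_eq!(live.len(), 1);
    // `snapshot` would now be hashed into a small Merkle tree and published to the
    // History Archives; validators can then drop its full contents.
}
```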

Avocado increases network costs

The entire point of State Archival is to reduce the amount of state blockchain nodes need to store. This doesn’t just mean validators; RPC nodes should also benefit from State Archival. Unfortunately, Anatoly has decided to use a single trie to store all compressed account state. This is very expensive. To recap, here’s who needs to store the entire compressed account state:

  • RPC nodes (to produce decompression proofs)
  • Validators that make extra money from compression transactions

The only entities that don’t have to store the complete state are validators that don’t send compression transactions. Storing the full compressed state is too expensive for the long-term sustainability of the network, yet Solana’s solution is to make RPC nodes and many validators store it anyway.

As you can see, there are really no cost savings from Solana’s state compression. Network fees will still increase in order to fund the rewards for validators that store the full trie and submit compression transactions. Off-chain fees will also continue to increase, as RPC nodes are just as expensive to run as before, if not more expensive. While some validators won’t need to store the archive trie, the network still depends on the validators that do. This doesn’t seem to be a significant benefit, especially at the cost of higher fees, more complexity, and more network congestion.

The issue is, by using a single trie, archival state cannot be sharded. Any node that interacts with compression or decompression needs to store all compressed state, whether that be an RPC node producing proofs or a validator submitting compression transactions.

The Stellar network has a much more scalable design. Because State Archival on Stellar uses a collection of small, immutable Merkle trees instead of one large, mutable trie, archival state is much cheaper to store long term. RPC nodes don’t need to store all archived state to produce proofs; they can shard trees between multiple nodes, or elect to only store the shards that contain relevant archival state. While History Archives must maintain the full archival state for data availability and decentralization purposes, operators can also shard trees across multiple nodes and store archival state on cheap, network-backed storage devices. Solana’s single trie cannot be sharded, and since it is mutable, it must be stored on expensive, local NVMe drives.
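Here’s a sketch of why immutable, per-epoch trees shard so naturally, using a simple “assign each tree to a shard” policy that is purely illustrative: a restore proof only ever needs the one tree an entry was archived into, so different RPC nodes can hold disjoint subsets.

```rust
// Purely illustrative sharding policy: assign each immutable epoch tree to one shard.
fn shard_for_epoch(epoch: u64, num_shards: u64) -> u64 {
    epoch % num_shards
}

struct RpcShard {
    shard_id: u64,
    stored_epochs: Vec<u64>, // which archival trees this node keeps
}

impl RpcShard {
    // A proof can be produced as long as this node holds the tree the entry was archived into.
    fn can_prove(&self, epoch_of_entry: u64) -> bool {
        self.stored_epochs.contains(&epoch_of_entry)
    }
}

fn main() {
    let num_shards = 3;
    // Each RPC node only stores the epoch trees mapped to its shard.
    let shards: Vec<RpcShard> = (0..num_shards)
        .map(|id| RpcShard {
            shard_id: id,
            stored_epochs: (0..12).filter(|e| shard_for_epoch(*e, num_shards) == id).collect(),
        })
        .collect();

    // A restore proof for an entry archived in epoch 7 is served by shard 1 alone.
    let epoch_of_entry = 7;
    let serving = &shards[shard_for_epoch(epoch_of_entry, num_shards) as usize];
    assert_eq!(serving.shard_id, 1);
    assert!(serving.can_prove(epoch_of_entry));
}
```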

The result is that with the State Archival design for Stellar, the network actually gets cheaper! No validator or RPC instance needs to store all archived state, so builders will see performance and cost saving benefits!

If you want to read more about the State Archival design on Stellar, check out the interface spec here (live on mainnet since February) and the proof implementation details here (coming to mainnet later this year).

Don’t just believe the Solana hype machine, #buildbetter on Stellar.
