Author
Garand Tyson
This article is part of a deep-dive series on the industry’s state bloat problem, which must be solved for blockchains to remain inexpensive, deliver high TPS, and scale to more users. This article will give a high-level overview of the Stellar solution to state bloat: State Archival. Follow-ups will discuss performance optimization, security features, and comparing State Archival to alternative scaling proposals.
When it comes to blockchain scalability, state bloat is the industry’s biggest open question. On every public blockchain, such as Stellar or Ethereum, anyone can submit a transaction for execution, so long as they pay a fee for the resources the transaction consumes. These fees allow blockchain networks to fairly allocate limited resources in a decentralized way, while still being protected from denial-of-service (DoS) attacks and spam. However, when it comes to storage, blockchain fees are fundamentally broken.
The state of a blockchain is the current snapshot of all the data recorded on the ledger. State bloat, or ledger bloat, refers to the continued growth of that state as accounts, assets, smart contracts, and their associated data are stored. Consider a transaction that mints a new NFT. The submitter of this transaction pays a one-time fee for the storage resources consumed when the transaction writes the NFT image to the blockchain.
The problem is that every validator on the network must store this NFT forever, even if the NFT is never transferred or used after its initial creation. This is the “state bloat” problem, where state bloat refers to all the data on a blockchain that’s not being used, but must still be stored in validator databases.
The problem gets even worse when you consider that validators must store this state in data structures that allow for fast hashing, such as Merkle trees. While not all blockchains use Merkle trees (Stellar uses a write-optimized structure called a BucketList), they are the most common fast-hash structure, and all fast-hash structures share similar problems when it comes to caching and write performance.
For Merkle trees in particular, updating any entry requires an arbitrary number of random reads and writes. Because the root Merkle hash must be included in the blockchain, validators must block on these expensive update operations and cannot move on to the next block and start processing new transactions until the Merkle tree has finished updating. Even if an old piece of data was not used by any transactions in a block, if the data’s hash happens to be similar to the hash of a new entry that was updated by a transaction, the old data must be read in order to update the Merkle tree. While some write-optimized structures, such as the BucketList, don’t have to block on writes like Merkle trees, they still must routinely reread and rewrite arbitrary ledger state, even if the data is not being actively used by transactions.
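To make the read amplification concrete, here is a minimal sketch using a plain binary Merkle tree over SHA-256. This is an illustrative assumption, not the structure any particular blockchain uses, and the names `build_levels` and `update_leaf` are hypothetical. It records every untouched node that has to be read in order to recompute the root after a single leaf changes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Build every level of a binary Merkle tree, leaf hashes first.

    Assumes len(leaves) is a power of two to keep the sketch short.
    """
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([h(prev[i] + prev[i + 1]) for i in range(0, len(prev), 2)])
    return levels


def update_leaf(levels: list[list[bytes]], index: int, new_value: bytes) -> list[tuple[int, int]]:
    """Rewrite one leaf in place and recompute the path up to the root.

    Returns the (level, position) of every node that was NOT modified
    but still had to be read to recompute the hashes above it.
    """
    reads = []
    levels[0][index] = h(new_value)
    for depth in range(len(levels) - 1):
        sibling = index ^ 1  # the neighbor that shares a parent with us
        reads.append((depth, sibling))
        left, right = min(index, sibling), max(index, sibling)
        index //= 2
        levels[depth + 1][index] = h(levels[depth][left] + levels[depth][right])
    return reads
```

With eight leaves, changing one leaf reads three nodes that the transaction never touched, one per level. Which nodes those are depends only on where the entry happens to sit in the tree, which is why ordinary caching helps so little.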
This means that a meme NFT minted in 2015 might have to be read in order to process a completely unrelated transaction in 2024. The result is that traditional caching techniques are not applicable, as any random entry on the ledger may have to be read in order to process any random transaction. According to an Ethereum Foundation study, once blockchain state reaches a significant size, the network is limited to 214 TPS with a SATA SSD, or about 2400 TPS with the newest, most expensive PCIe NVMe drive. Of course, some networks can exceed these limitations by storing large portions of ledger state in memory, but this makes validator operation very expensive, especially since blockchain state has grown exponentially over the years. For comparison, Solana validators currently require at least 256 GB of RAM, compared to just 16 GB for a Stellar validator.
If a network wants to remain inexpensive, deliver high TPS, and scale to more users, state bloat cannot continue. The fundamental issue is that a one-time transaction storage fee incurs a recurring cost on the network to maintain this storage, leading to higher validator operating costs, as well as reduced network throughput.
Enter the State Archival protocol on Stellar. With State Archival,
Stellar rolled out the State Archival interface on mainnet earlier this year. If you’re interested in learning more about the user interface, check out the docs here. While the interface is live on mainnet, archived entries are not yet deleted from validators. This post provides a high-level summary of how the full State Archival protocol works under the hood. Specifically, it will walk through the design and discuss how
Every validator maintains two databases locally on disk: the live ledger state, and a lazily constructed Merkle tree containing recently archived entries, called the “Hot Archive.” Whenever an entry runs out of rent and is archived, validators move the entry from the live state database to the Hot Archive. Eventually, the Hot Archive reaches a capacity limit and becomes full. When that happens, validators publish it to the History Archives, store just the Merkle root of the tree, and delete the rest of the archived entries. Then validators initialize a new, empty Hot Archive and repeat the process.
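The lifecycle described above can be sketched as follows. This is a toy model under stated assumptions, not the Stellar implementation: the `Validator` class, the entry-count capacity limit, the leaf encoding, and the simple binary Merkle tree are all hypothetical stand-ins chosen for illustration.

```python
import hashlib


def merkle_root(leaves: list[bytes]) -> bytes:
    """Root of a simple binary Merkle tree (last node duplicated on odd levels)."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    level = [hashlib.sha256(leaf).digest() for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [hashlib.sha256(level[i] + level[i + 1]).digest()
                 for i in range(0, len(level), 2)]
    return level[0]


class Validator:
    def __init__(self, hot_archive_capacity: int):
        self.capacity = hot_archive_capacity        # hypothetical limit, in entries
        self.live_state: dict[str, bytes] = {}      # entries with rent remaining
        self.hot_archive: dict[str, bytes] = {}     # recently archived entries
        self.ast_roots: list[bytes] = []            # ast_roots[i] is the root of AST[i]
        # Stand-in for the History Archives; a real validator publishes
        # the full tree externally instead of keeping it in memory.
        self.history_archive: list[dict[str, bytes]] = []

    def archive(self, key: str) -> None:
        """Move an entry whose rent ran out from live state to the Hot Archive."""
        self.hot_archive[key] = self.live_state.pop(key)
        if len(self.hot_archive) >= self.capacity:
            self._publish_hot_archive()

    def _publish_hot_archive(self) -> None:
        """Publish the full Hot Archive, keep only its root, and start a new one."""
        entries = dict(sorted(self.hot_archive.items()))
        self.history_archive.append(entries)
        leaves = [key.encode() + b"\x00" + value for key, value in entries.items()]
        self.ast_roots.append(merkle_root(leaves))
        self.hot_archive = {}
```

After publication, the only thing the validator keeps for that batch of entries is a single 32-byte root, regardless of how many entries the batch contained.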
Over time, the History Archive will contain many immutable Merkle trees of archived entries generated from validators. This is called the Archived State Tree (AST), where each immutable Merkle tree is a subtree indexed AST[0]...AST[n].
Suppose an entry was archived and is in AST[1]. In order to restore that entry, a transaction containing the entry and a Merkle-style proof-of-inclusion for that entry in AST[1] is submitted to the network. Validators can check that the entry being restored has not been tampered with by verifying the proof-of-inclusion against the saved root hash for AST[1]. If the proof is valid, the entry is then added back to the live state. In this way, validators can still ensure the validity and security of archived entries and the immutability of the blockchain, even though validators do not store the entries themselves.
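The restoration check can be illustrated with a generic Merkle proof-of-inclusion. The sketch below assumes a plain binary SHA-256 tree with hypothetical helper names (`build_levels`, `make_proof`, `verify_proof`); the actual proof format is defined in the proof specification and may differ. It also omits hardening such as domain separation between leaf and interior hashes.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All levels of a binary Merkle tree; odd levels duplicate their last node."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [h(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [level[-1]]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def make_proof(levels: list[list[bytes]], index: int) -> list[tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from the leaf up to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(entry: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """What a validator does: recompute the root from the entry and the proof."""
    node = h(entry)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root
```

The verifier needs only the entry, the proof, and the stored root. Changing a single byte of the entry changes the recomputed root, so a tampered entry fails verification.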
On the Stellar network, every full validator maintains a History Archive (note that a Stellar Full validator is equivalent to an Archive Node in other ecosystems such as Ethereum). These Archives are free, auditable, and verifiable. Full validators include not only some of the oldest names in blockchain, but also trusted mainstream financial institutions, such as the Fortune 500 company Franklin Templeton. By publishing all the information required for entry restoration in the History Archives, the Stellar protocol ensures that archived entries are always readily available in a decentralized way.
While History Archives are the canonical store of archived entries at the protocol level, it would be inefficient and not user friendly to directly generate proofs from these archives. Instead, RPC instances maintain local copies in order to efficiently generate proofs during transaction simulation. Currently, on Stellar and many other blockchain networks, users simulate transactions via RPC before submitting them to the network in order to set resources, inclusion fees, etc. When full State Archival is enabled, RPC nodes will detect if a given transaction requires archived entries during simulation. For each archived entry, the RPC node will automatically generate and attach a proof to the transaction. After simulation, the user will be able to submit the transaction, the relevant entries will be unarchived, and the transaction will execute against the now live entries.
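As a rough sketch of the simulation step, assume a hypothetical `RpcNode` that knows which keys are live and holds a ready-made proof for each archived key. None of these names come from the Stellar RPC API, and the proofs are placeholder strings; the point is only the control flow of attaching a proof for every archived entry a transaction touches.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    footprint: list[str]                       # ledger keys the transaction touches
    restore_proofs: dict[str, str] = field(default_factory=dict)


class RpcNode:
    """Hypothetical RPC node with a local copy of live keys and archived proofs."""

    def __init__(self, live_keys: set[str], archived_proofs: dict[str, str]):
        self.live_keys = live_keys
        self.archived_proofs = archived_proofs  # key -> proof (placeholder string)

    def simulate(self, tx: Transaction) -> Transaction:
        """Attach a proof for every archived entry in the transaction's footprint."""
        for key in tx.footprint:
            if key in self.live_keys:
                continue  # live entries need no proof
            if key not in self.archived_proofs:
                raise KeyError(f"{key} is neither live nor archived")
            tx.restore_proofs[key] = self.archived_proofs[key]
        return tx
```

The caller submits the returned transaction unchanged, so client code that already simulates before submitting needs no extra archival-specific step in this model.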
In addition to simulating transactions, RPC nodes will also expose endpoints to query both live and archived state. This allows developers to abstract away the complexity of State Archival from end users in most applications.
With RPC as the primary entry point, most of the complexity surrounding State Archival is abstracted away from both developers and end users. Let’s examine a wallet application and see what changes need to be made for the wallet to be compatible with State Archival. For this example, the wallet has two functions: showing the user their account balance and submitting payment transactions. In a pre-State Archival world, the wallet works as follows:
After full State Archival is implemented, the wallet behaves almost identically from both the user and developer perspective:
As you can see, RPC handles almost all of the additional complexity of State Archival. For simple wallets, the user does not even need to be notified whether their balance is live or archived, as the user can still quickly submit payment transactions even if the balance is archived. While wallets targeted towards more advanced users might decide to display this data, many dApps may abstract away State Archival entirely.
While the developer and user flow is almost identical, fees will change. For example, making frequent rent payments is cheaper than restoring entries, so a user or wallet implementation may want to automate periodic rent payments. However, the Stellar network has some of the lowest fees of any blockchain, and State Archival will ensure these fees remain low for years to come.
With State Archival, the Stellar network remains one of the most scalable blockchains. Validators maintain very small databases, leading to fast transaction execution and high throughput. Additionally, smaller databases mean that new nodes can quickly sync to the network and that ledger history takes up less space. The result is a cheaper, faster layer 1 network.
While non-validator nodes still need to store archived state, this is still very efficient. The AST is an immutable, append-only data structure, meaning that History Archives can store it on inexpensive network drives as opposed to the expensive local NVMe drives required for high-performance validators. Additionally, because all data is secured by the blockchain, there can be significantly fewer copies of archived state compared to live state without sacrificing security or decentralization.
Because the AST is immutable, it can also be sharded. This is especially helpful for RPC providers, who can shard archived state across multiple nodes and load balance incoming requests accordingly. Since RPC nodes are off-chain and do not participate in consensus, this sharding is significantly more efficient than validator sharding and does not carry its security risks. The end result is a highly scalable and easy-to-use system for developers and users alike.
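One way an RPC provider might shard the AST is by subtree index. The sketch below is an assumption for illustration only: the `ShardedArchive` class and the modulo placement rule are hypothetical, and a real deployment would likely use a placement scheme that tolerates changing the number of shards.

```python
from typing import Optional


class ShardedArchive:
    """Hypothetical RPC-side store that spreads AST subtrees across shards."""

    def __init__(self, num_shards: int):
        # Each shard maps subtree index -> {ledger key: archived entry}.
        self.shards: list[dict[int, dict[str, bytes]]] = [{} for _ in range(num_shards)]

    def shard_of(self, subtree_index: int) -> int:
        """Pick the shard that owns AST[subtree_index] (simple modulo placement)."""
        return subtree_index % len(self.shards)

    def add_subtree(self, subtree_index: int, entries: dict[str, bytes]) -> None:
        """Store a newly published subtree; it is immutable, so it is written once."""
        self.shards[self.shard_of(subtree_index)][subtree_index] = dict(entries)

    def lookup(self, subtree_index: int, key: str) -> Optional[bytes]:
        """Route a request to the owning shard and return the entry, if present."""
        return self.shards[self.shard_of(subtree_index)].get(subtree_index, {}).get(key)
```

Because a published subtree never changes, a shard never has to coordinate updates with other shards. Each one can be replicated or cached independently.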
But that’s just the tip of the iceberg! This has been a quick introduction, but the State Archival protocol is complex, especially when it comes to security and performance optimization. The full technical specification for the State Archival Interface can be found here, while the proof specification can be found here. In future blog posts over the next few weeks, I’ll be taking a deep dive into specific security and performance features, as well as comparing State Archival on the Stellar network to less scalable solutions on Solana and Ethereum.