Peeking under the hood
https://jumpcrypto.com/peeking-under-the-hood/
Rahul Maganti and Saurabh Sharma, Mar 2022
Introduction
With the rapid emergence of cross-chain bridges, new testing frameworks, and other crypto protocols, effectively mapping out blockchain infrastructure remains a key challenge for users, developers, and investors alike. The term “blockchain infrastructure” can encompass a variety of different products and services, ranging from the underlying networking stack to the consensus model or virtual machine. We reserve a more in-depth analysis of the various “core” components that make up L1/L2 chains for a later post (keep your eyes peeled!). In this piece, we specifically aim to:
provide a broad overview of other key components of blockchain infrastructure.
break down those components into clear and digestible sub-segments.
Infrastructure Map
We define the "ecosystem" of blockchain infrastructure as protocols intended to support the development of L1s and L2s in the following key areas:
Layer-0 Infrastructure: (1) Decentralized Cloud Services (Storage, Compute, Indexing); (2) Node Infrastructure (RPC, Staking / Validators)
Middleware: (1) Data Availability; (2) Communication / Messaging Protocols
Blockchain Development: (1) Security and Testing; (2) Developer Tooling (Out-of-the-Box Tools, Front/Backend Libraries, Languages / IDEs)
Layer 0 Infrastructure
Decentralized Cloud
Cloud services have been essential to the growth of Web2 - as the compute and data needs of applications have grown, service providers specialized in making this data and compute quickly available in a cost-effective manner have been critical. Web3 applications have similar demands for data and compute but want to remain true to the ethos of blockchain. As a result, protocols aimed at creating decentralized versions of these Web2 services have emerged. There are 3 core components of the decentralized cloud:
Storage - data / files are stored on servers run by many entities. Because data is replicated or striped across multiple machines, these networks are able to achieve a high degree of fault-tolerance.
Compute - just like with storage, compute is centralized in the Web2 paradigm. Decentralized compute is concerned with distributing this computation across many nodes to achieve a higher degree of fault-tolerance (if one or a set of nodes goes down, the network can still service requests with minimal disruption to performance).
Indexing - in the Web2 world, where data is already stored on one server or set of servers owned and operated by one entity, querying this data is relatively easy. Because blockchain nodes are distributed, data can be siloed and fragmented across different regions and often under incompatible standards. Indexing protocols aggregate this data and provide an easy-to-use and standardized API to access such data.
A couple of projects provide storage, compute, and indexing (Aleph and Akash Network) while others are more specialized (e.g. The Graph for indexing, Arweave / Filecoin for storage). The sketch below illustrates what querying an indexing protocol can look like in practice.
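As a minimal sketch, indexing protocols like The Graph expose GraphQL endpoints that applications can query over plain HTTP. The subgraph URL and the `transfers` entity below are hypothetical placeholders, not a real deployment:

```typescript
// Minimal sketch: querying an indexing protocol's GraphQL endpoint over HTTP.
// The subgraph URL and the `transfers` entity are hypothetical placeholders.
const SUBGRAPH_URL =
  "https://api.thegraph.com/subgraphs/name/example/example-subgraph";

async function fetchRecentTransfers(): Promise<void> {
  const query = `{
    transfers(first: 5, orderBy: timestamp, orderDirection: desc) {
      from
      to
      value
    }
  }`;

  const res = await fetch(SUBGRAPH_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query }), // standard GraphQL-over-HTTP request body
  });
  const { data } = await res.json();
  console.log(data.transfers);
}

fetchRecentTransfers().catch(console.error);
```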
Node Infrastructure
Remote Procedure Calls (RPC) are core to the function of many types of software systems. They allow one program to call or access a program on another computer. This is particularly useful for blockchains, which have to serve a plethora of incoming requests from machines running in various regions and environments. Providers like Alchemy, Syndica, and Infura offer this infrastructure as-a-service, allowing builders to focus on high-level application development instead of the underlying mechanics involved in relaying or routing their calls to nodes. Like many RPC providers, Alchemy owns and operates all of its nodes. For many in the crypto community, the dangers of centralized RPC are evident — it introduces a single point of failure that can endanger applications' access to a blockchain (i.e. if Alchemy goes down, applications will not be able to retrieve or access data on-chain). Recently, there has been a rise in decentralized RPC protocols like Pocket to address these concerns, but the efficacy of this approach remains to be tested at scale.
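To make this concrete, here is a minimal sketch of what an application-level RPC call looks like: a raw JSON-RPC request for the latest Ethereum block number. The endpoint URL is a placeholder for a provider-specific URL (e.g. the one issued by your Alchemy or Infura project):

```typescript
// Minimal sketch of a raw JSON-RPC call to a hosted Ethereum node.
// The endpoint URL is a placeholder for your provider-specific URL.
const RPC_URL = "https://eth-mainnet.example.com/v2/YOUR_API_KEY";

async function getLatestBlockNumber(): Promise<number> {
  const res = await fetch(RPC_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "eth_blockNumber", // standard Ethereum JSON-RPC method
      params: [],
    }),
  });
  const { result } = await res.json();
  return parseInt(result, 16); // the node returns a hex-encoded block number
}

getLatestBlockNumber().then((n) => console.log(`latest block: ${n}`));
```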
Staking / Validators - the security of blockchains relies on a set of distributed nodes validating transactions on-chain, but someone must actually run the nodes that participate in consensus. In many cases, the time, cost, and energy required to run nodes is prohibitively expensive, leading many to opt out and instead rely on others to shoulder the responsibility of ensuring the safety of the chain; however, this attitude poses serious problems - if everybody decided to offload security to someone else, no one would be validating. Services like P2P and Blockdaemon run the infrastructure and allow less sophisticated or less well-capitalized users to participate in consensus, usually by pooling funds. Some have argued that these staking providers introduce an unnecessary degree of centralization, but the alternative is likely worse — in the absence of such providers, the barrier to entry for running nodes would be too high for the average network participant, likely leading to an even higher degree of centralization.
Middleware
Data Availability
Applications are heavy consumers of data. In the Web2 paradigm, this data is often sourced directly from users or third-party providers in a centralized fashion (data providers are compensated directly for aggregating and selling data to specific companies and applications - think Amazon, Google, or other machine-learning data providers).
DApps are also heavy consumers of data but require nodes to make this data available to users or applications running on-chain. To minimize trust assumptions, it’s important that this data is made available in a decentralized manner. There are two primary ways in which applications can quickly and efficiently access high fidelity data:
Data Oracles like Pyth and Chainlink provide access to data streams otherwise not available on-chain, thereby allowing crypto networks to interface with both traditional / legacy systems and other external information in a reliable and decentralized manner. This includes high-quality financial data (e.g. asset prices). This service is of utmost importance for expanding DeFi to broad use cases ranging from trading and lending to sports betting, insurance, and many others (see the sketch after this list for what consuming an oracle feed looks like).
Data Availability Layers are chains that specialize in ordering transactions and making data available to the chains they support. Typically, they generate proofs that give clients high-probability confirmation that all block data has been published on-chain. Proof of data availability is key to guaranteeing the reliability of Rollup sequencers and reducing the cost of Rollup transaction processing. Celestia is a great example of this layer.
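Here is a minimal sketch of the oracle case: reading Chainlink's ETH/USD price feed with ethers.js (v5). The feed address is Chainlink's published mainnet aggregator at the time of writing (verify it against Chainlink's docs before use); the RPC URL is a placeholder:

```typescript
// Minimal sketch of reading a Chainlink price feed with ethers.js (v5).
// FEED_ADDRESS is Chainlink's published ETH/USD aggregator on Ethereum
// mainnet (verify against Chainlink's docs); the RPC URL is a placeholder.
import { ethers } from "ethers";

const FEED_ADDRESS = "0x5f4eC3Df9cbd43714FE2740f5E3616155c5b8419";
const AGGREGATOR_ABI = [
  "function latestRoundData() view returns (uint80 roundId, int256 answer, uint256 startedAt, uint256 updatedAt, uint80 answeredInRound)",
  "function decimals() view returns (uint8)",
];

async function readEthUsd(): Promise<void> {
  const provider = new ethers.providers.JsonRpcProvider(
    "https://eth-mainnet.example.com/v2/YOUR_API_KEY"
  );
  const feed = new ethers.Contract(FEED_ADDRESS, AGGREGATOR_ABI, provider);
  const [, answer] = await feed.latestRoundData(); // answer is the raw price
  const decimals = await feed.decimals();          // scale factor for the feed
  console.log(`ETH/USD: ${ethers.utils.formatUnits(answer, decimals)}`);
}

readEthUsd().catch(console.error);
```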
Communication and Messaging
As the number of layer-1s and their ecosystems grows, there is an even greater need for managing composability and interoperability across chains. Cross-chain bridges allow otherwise siloed ecosystems to interact in a meaningful way - this is analogous to the way new trade routes helped connect otherwise disparate regions, ushering in a new era of knowledge-sharing! Wormhole, LayerZero, and other bridging solutions support generalized message passing, allowing all types of data and information, including tokens, to be moved across multiple ecosystems — applications can even make arbitrary function calls across chains, enabling them to tap into other communities without having to deploy elsewhere! Other protocols like Synapse and Celer are limited to cross-chain transfers of assets or tokens.
On-chain messaging remains a key component of blockchain infrastructure. As DApp development and retail demand grows, the ability for protocols to interact in a meaningful yet decentralized manner with their users will be a key driver of growth. Here are a few potential areas where on-chain messaging could be useful:
Notifications to claim returns / tokens
Allowing for built-in messaging within wallets
Announcements / notifications regarding important protocol updates
Notifications to track critical issues (e.g. risk metrics for DeFi applications, security vulnerabilities)
A few notable projects developing on-chain communication protocols include Dialect, Ethereum Push Notification Service (EPNS) and XMTP.
Blockchain Development
Security and Testing
Security and testing in crypto is relatively nascent and underdeveloped but undeniably critical to the success of the entire ecosystem. Crypto applications are particularly sensitive to security risks because they are often directly securing assets. Small errors in design or implementation can often lead to large economic outcomes.
There are 7 main approaches to security and testing:
Unit testing is a core part of the testing suite for most software systems. Developers write tests that check the behavior of small, atomic parts of the program. There are a variety of useful unit testing frameworks. Some popular ones on Ethereum include Waffle and Truffle, while Anchor's testing framework is standard for Solana.
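For instance, here is a minimal Waffle-style unit test. The Token contract, its artifact path, and its constructor argument are hypothetical placeholders:

```typescript
// Minimal Waffle-style unit test against a hypothetical Token contract.
import { expect, use } from "chai";
import { MockProvider, deployContract, solidity } from "ethereum-waffle";
// Compiled artifact of a hypothetical ERC-20-like contract:
import TokenArtifact from "../build/Token.json";

use(solidity); // enables blockchain-aware chai matchers

describe("Token", () => {
  const [owner, recipient] = new MockProvider().getWallets();

  it("transfers tokens between accounts", async () => {
    // 1000 is a hypothetical initial-supply constructor argument
    const token = await deployContract(owner, TokenArtifact, [1000]);
    await token.transfer(recipient.address, 50);
    expect(await token.balanceOf(recipient.address)).to.equal(50);
  });
});
```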
Integration Testing is concerned with testing various software modules as a group. This testing paradigm is leveraged outside of the crypto arena but is also extremely valuable in blockchain development. Because libraries and higher-level drivers often interact with each other and with lower-level modules in various ways (e.g. a TypeScript library interacting with a set of underlying smart contracts), it’s crucial to test the flow of data and information across these modules.
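Building on the unit-test sketch above, an integration test exercises a higher-level client together with the contract it wraps, rather than either in isolation. The TokenClient module here is a hypothetical TypeScript wrapper, not a real library:

```typescript
// Sketch of an integration test: the high-level TypeScript client and the
// underlying contract are tested together as one flow.
import { expect, use } from "chai";
import { MockProvider, deployContract, solidity } from "ethereum-waffle";
import TokenArtifact from "../build/Token.json"; // hypothetical artifact
import { TokenClient } from "../src/TokenClient"; // hypothetical client library

use(solidity);

describe("TokenClient integration", () => {
  const [owner, recipient] = new MockProvider().getWallets();

  it("moves funds through the full client -> contract path", async () => {
    const token = await deployContract(owner, TokenArtifact, [1000]);
    const client = new TokenClient(token.address, owner); // wraps ethers.js calls
    await client.send(recipient.address, 50); // hypothetical high-level method
    // verify at the contract level that the client's call had the right effect
    expect(await token.balanceOf(recipient.address)).to.equal(50);
  });
});
```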
Auditing has become a core part of the security process for blockchain development. Before releasing smart contracts for public use, protocols often leverage the services of third-party code auditors to check and validate each line of code. Like many members of the community, we rely heavily on auditors to ensure safety to the highest degree. Trail of Bits, OpenZeppelin, and Quantstamp are a few trusted names in this space (the demand for their services is so high that wait times can often be months!).
Formal verification is concerned with checking whether a program or software component satisfies a set of properties. Typically, someone will write a specification that details how the program should behave; the formal verification framework then turns this specification into a set of constraints that it solves and checks. Certora is a leading project that leverages formal verification to bolster smart contract security, along with Runtime Verification.
Simulation - agent-based simulation has long been used by quantitative trading firms to backtest algorithmic trading strategies. Given the high cost of experimentation in blockchain, simulations provide a way to test a wide variety of assumptions and inputs. Chaos Labs and Gauntlet are great examples of platforms leveraging scenario-based simulations to secure blockchains and protocols.
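As a toy illustration (and not any platform's actual methodology), a scenario-based simulation might estimate how often a lending position becomes undercollateralized under random price paths; every parameter below is an assumption:

```typescript
// Toy scenario simulation: estimate the probability that a lending position
// gets liquidated under random price paths. All parameters are assumptions.
const TRIALS = 10_000;
const STEPS = 100;              // time steps per simulated path
const STEP_VOLATILITY = 0.03;   // assumed max per-step price move (+/- 3%)
const COLLATERAL_RATIO = 1.5;   // initial collateral value / debt
const LIQUIDATION_THRESHOLD = 1.1;

let liquidations = 0;
for (let trial = 0; trial < TRIALS; trial++) {
  let price = 100;                       // collateral price at t = 0
  const debt = price / COLLATERAL_RATIO; // borrow against 1 unit of collateral
  for (let step = 0; step < STEPS; step++) {
    price *= 1 + STEP_VOLATILITY * (2 * Math.random() - 1); // crude random walk
    if (price / debt < LIQUIDATION_THRESHOLD) {
      liquidations++;
      break; // position liquidated; stop this path
    }
  }
}
console.log(`estimated liquidation probability: ${(liquidations / TRIALS).toFixed(3)}`);
```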
Bug Bounties - bug bounties leverage the ethos of decentralization in crypto to solve massive security challenges. The monetary rewards incentivize community members and hackers to report and fix critical issues. As a result, bounty programs play a unique role in turning “grey hats” into “white hats” (i.e. turning those on the fence about exploiting bugs to extract funds into participants who report bugs and fortify the security of the protocol). In fact, Wormhole has a bug bounty out on Immunefi right now worth up to $10m (one of the largest software bounties ever), and we encourage anybody and everybody to participate!
Test Networks (testnets) - testnets provide like-for-like representations of mainnets, allowing developers to test and debug parameters in a production-like environment. Many testnets use Proof-of-Authority or other consensus mechanisms with a small number of validators to optimize for speed — the currencies on test networks also have no real value. As a result, there is no process by which users can acquire coins through mining – instead, they are acquired through a faucet. A number of testnets are built to mimic the behavior of popular mainnet L1s (e.g. Rinkeby, Kovan, and Ropsten for Ethereum).
Each approach has its own merits and disadvantages, and the approaches are certainly not mutually exclusive - often these testing flavors are used at different stages of a project’s development:
Stage 1: unit tests are written as the contracts are built.
Stage 2: once higher-level abstractions are built, integration testing becomes important for testing interactions between modules.
Stage 3: code audits are performed closer to testnet / mainnet launch or large feature releases.
Stage 4: formal verification is often paired with code audits and used as an extra guarantee for security. Once the program has been specified, the rest of the process can be automated, making it easy to pair with Continuous Integration or Continuous Deployment tools.
Stage 5: the application is launched on a test network to check throughput, traffic, and other scaling parameters.
Stage 6: a bug bounty is launched after mainnet deployment to leverage community resources to find and fix issues.
Summary of Testing Solutions
Type | Accessibility | Efficacy | Speed | Key Disadvantage
---|---|---|---|---
Unit Testing | High | Low | Medium | Limited code coverage (depends on the developer); hard to test software interactions
Integration Testing | High | Medium | Medium | Requires all of the modules to be tested before finishing (takes time)
Auditing | Low | High | Low | Difficult to repeat; non-automated
Formal Verification | Low | High | Medium | Under-specifying relevant properties can miss critical bugs
Simulations | High | Medium | Medium | Not exhaustive; constrained by the parameters tested
Bug Bounties | Low | Medium | Low | Difficult to coordinate; very capital-inefficient
Test Networks | Low | High | Low | Difficult to spin up, run, and maintain
Developer Tooling
The growth of any technology or ecosystem relies on the success of its developers — this is especially true in crypto. We segment developer tooling into three main categories:
Out-of-the-Box Tools
SDKs for developing new L1s help to abstract away the process of creating and deploying the underlying consensus core. Pre-built modules allow for flexibility and customization while optimizing for development speed and standardization. The Cosmos SDK is a great example of this, enabling the rapid development of new Proof-of-Stake blockchains within the Cosmos ecosystem. Binance Chain and Terra are notable examples of Cosmos-based chains.
Smart Contract Development - there are a number of tools that can help developers spin up smart contracts quickly. For example, Truffle boxes contain simple and useful examples of Solidity contracts (Voting, MetaCoin, etc.), and the community can also suggest additions to this repository.
Frontend / Backend Tooling - there are a number of tools that make it easier to develop applications: libraries for connecting the application to the chain (e.g. ethers.js, web3.js), tools for upgrading and interacting with contracts (e.g. the OpenZeppelin SDK), and ecosystem-specific frameworks (e.g. Anchor for Solana smart contracts, Ink for Parity contracts) that handle writing RPC request handlers, emitting an IDL, and generating clients from that IDL.
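As a small taste of this tooling in action, here is a minimal sketch that uses ethers.js (v5) to connect to a node and read an account's balance; the RPC URL and address are placeholders:

```typescript
// Minimal sketch of frontend/backend tooling: connect to a chain with
// ethers.js (v5) and read an account balance. URL and address are placeholders.
import { ethers } from "ethers";

async function main(): Promise<void> {
  const provider = new ethers.providers.JsonRpcProvider(
    "https://eth-mainnet.example.com/v2/YOUR_API_KEY"
  );
  const balance = await provider.getBalance(
    "0x0000000000000000000000000000000000000000"
  );
  console.log(`balance: ${ethers.utils.formatEther(balance)} ETH`);
}

main().catch(console.error);
```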
Languages and IDEs - the programming model for blockchains is often quite different from that of traditional software systems, and the programming languages used for blockchain development were designed to facilitate this model. For EVM-compatible chains, Solidity and Vyper are used heavily, while other languages like Rust dominate on chains like Solana and Terra.
Conclusion
Blockchain infrastructure can be an overloaded and confusing term - it’s often synonymous with a wide range of products and services that cover everything from smart contract auditing to cross-chain bridging. As a result, discussions about crypto infrastructure have either been too broad and unstructured or simply too specific and targeted for the average reader. We hope that this piece struck the right balance for those just entering crypto and those seeking a more in-depth overview.
Crypto, of course, is changing rapidly, and the protocols referenced in this article will likely no longer constitute a representative sample of the ecosystem in 2 or even 3 months. Even so, we believe that the primary goal of this article (i.e. breaking down infrastructure into more easily understood and digestible segments) will be of even greater relevance going forward. But as the landscape of blockchain infrastructure evolves, we’ll also be sure to provide clear and consistent updates on our thoughts.
Please reach out to Rahul Maganti (@rahulmaganti_) and Saurabh Sharma (@zsparta) with questions or comments - let us know what we got wrong or where you disagree! Special thanks to Nikhil Suri (@nsuri_) and Lucas Baker (@sansgravitas) for providing valuable feedback.