In the first piece of this series, Mastering Web3 Fundamentals: From Node to Network, we covered key concepts around the node layer and layer1 networks, explaining how layer1 blockchains work from a hardware, network and consensus perspective. If you haven’t read it, I recommend reading it first!
In this second piece of the Master Web3 Fundamentals series, we delve into a more advanced topic: interoperability (cross-chain communication).
Before we take the first step into this piece, I would like to clarify two terms:
MULTICHAIN is the idea that multiple networks can co-exist and complement each other
CROSS-CHAIN is the ability for networks to communicate with each other (often referred to as interoperability)
In From Node to Network we categorized web3 infrastructure into multiple sections, reflecting the on-chain ecosystems, the off-chain environment that supports the on-chain ecosystems and middleware that connects decentralized networks with each other and allows these to connect with the off-chain environment.
This piece covers a portion of the upper half of the on-chain ecosystem: we look at the Interoperability Layer (middleware) that connects one network with another.
The Blockchain Trilemma
Before we embark on a journey to understand the interoperability layer, we must first understand why different blockchain networks exist in the first place. Different networks are designed with different purposes in mind, and each purpose will consider a different set of factors to prioritize.
The Scalability Trilemma, often also referred to as the Blockchain Trilemma, is a theory published by Vitalik Buterin. It describes a comparative framework to evaluate blockchain networks, emphasizing that tradeoffs are made between scalability (speed), security and decentralization when designing a network.
Within this framework, the three corners of the triangle mean the following:
Scalability (speed) refers to a blockchain’s ability to handle large amounts of transactions, usually measured in Transactions Per Second (TPS)
Security describes the extent to which the network is secure against attacks (of both economic and technical nature) and its ability to operate as expected
Decentralization relates to the extent to which control is concentrated among a few actors or intermediaries, and is often measured based on the number of nodes a network has and the barriers to entry for new nodes to join the network
The blockchain trilemma framework acts as a three-sided scale, which represents the compromises made between the three properties: a blockchain can occupy one side of the triangle, indicating that a network has decided to prioritize the two adjacent properties and de-prioritize the property opposite to them. This would result in three general profiles for blockchain networks:
Scalable and decentralized, but not secure [Nano, IOTA, VeChain]
Scalable and secure, but not decentralized [Post-merge Ethereum, Binance Smart Chain, Ripple]
Secure and decentralized, but not scalable [Bitcoin, Pre-merge Ethereum, Monero]
Based on this, we can compare networks to each other to understand, in general terms, which area of the triangle each would occupy:
It should be noted that the blockchain trilemma is a model to help conceptualize challenges faced by blockchain networks. It is not deterministic. Nonetheless, the framework helps to highlight that different networks fulfill different purposes and target different niches. As can be seen above, Bitcoin has high security and decentralization properties. Comparatively, Ethereum (post-Merge) is less decentralized but more scalable, due to its higher TPS. Finally, IOTA is very scalable and decentralized, but due to the DAG-based structure of the network, it faces greater security challenges.
Users use networks for different purposes and to benefit from each network’s unique properties. To get the most out of Web3, a user could purchase Bitcoin to serve as a store of value (based on its decentralization and security aspects), while using Ether from the Ethereum network to benefit from its smart contract capabilities and faster transaction finalization.
However, blockchain networks are closed systems: they only see the state changes that occur within them. If a user holding Bitcoin wants to move their Bitcoin to Ethereum, they cannot do it without an interoperability layer that connects both networks and allows them to communicate. This is because the assets themselves are not compatible with other networks, and the networks are not natively interoperable with one another. Let’s look at that in more detail.
Interoperability vs Compatibility
Interoperability refers to the ability of different systems, devices, or applications to connect and to communicate with each other in a coordinated way, with minimal or no effort from the end user. Interoperability differs from compatibility: compatibility requires that two systems can understand each other’s outputs, but does not require that they can connect to or communicate with each other. That ability to connect and communicate is what makes systems interoperable.
In simple terms: compatibility is the property of being able to work with outputs of different blockchain networks without additional modifications to the outputs, and interoperability is about communication and interaction between different blockchain networks irrespective of differences in their underlying technology.
Bitcoin (BTC) the asset, cannot be sent natively from the Bitcoin network to the Ethereum network. Bitcoin and Ethereum are different systems, and their assets are coded differently. As a result, Bitcoin (BTC) is not compatible with the Ethereum network and Ether (ETH) is not compatible with the Bitcoin network.
Nonetheless, you may have seen an asset called Wrapped Bitcoin (wBTC) trading on Ethereum. wBTC is an ERC-20 asset created on Ethereum which is pegged to the value of Bitcoin, meaning 1 wBTC can be exchanged for 1 BTC. wBTC is an Ethereum-compatible representation of BTC, made to be usable on the Ethereum network.
Because wBTC is compatible with Ethereum and because BTC and wBTC are pegged to be equal in value, an interface between Bitcoin and Ethereum networks can allow them to communicate with each other. This interface would collect BTC on the Bitcoin network and keep it safe, while minting wBTC in equal denominations on Ethereum, thus “translating” Bitcoin-native BTC into the Ethereum-native ERC-20 wBTC asset.
This interface between two networks is called a bridge. If you imagine each network is a city, with its own language, own currency and own economy, then the bridge is what connects these two cities with one another and allows for communication and value transfers between both cities.
When designing an interface to enable the transfer of assets between two networks (a bridge), there are three inter-related design areas that need to be considered:
Cross-chain communication mechanism (oracle, relayer, light clients, centralized/trusted entity)
Determines the technology used to pass messages from one network to another. For example, to communicate the verification of received funds on one network to trigger the release of funds on the other network.
Cross-chain asset equivalence (lock-and-mint, burn-and-redeem, local liquidity pools)
Determines how assets are received and issued in equivalent value, as well as the format of assets received on the target network (native tokens, wrapped or pegged tokens).
Asset translation process (notary schemes, sidechains, relay networks, atomic swaps)
Determines the process by which assets are moved between the source and target networks, and the verification process before assets are released.
Generally speaking, all three design aspects are closely related.
To further separate these three design areas, think of it this way: the asset translation process describes the theoretical asset transfer mechanism, while the communication mechanism and asset equivalence refer to technical implementations used to facilitate communication and actually “move” the assets.
Cross-chain communication mechanism
As mentioned earlier, interoperability requires two networks to be able to communicate with one another, but blockchains are closed-loop systems: they only understand what happens within them. To overcome this, two separate technologies have been developed: oracles and relayers. Both enable cross-chain communication, but they function very differently.
Oracles are third-party services on decentralized blockchain networks, which are able to write outside data to the network. There are many reasons why one would want outside data on a blockchain network: price feeds (price of assets not natively available on their network), synthetic assets (tokens whose value is pegged to real-world assets) or any trigger event on another network (e.g., user initiates a cross-chain transfer). The most popular oracle providers include Chainlink, Band Protocol and API3.
With these examples it should become clear that oracles are not the source of data, but instead they are responsible for authenticating data sources and making information available on a blockchain network. This is achieved by writing the data to the network through calling a smart contract and including the data to be stored in the transaction’s payload. This is known as an inbound oracle, as it takes external data into the blockchain environment. Example: if an asset hits a certain price (external data to write to blockchain), then execute a buy order (smart contract functionality triggered by oracle price feed data).
Outbound oracles react to trigger events that occur on the blockchain. When a trigger event takes place, the smart contract takes note of this and signals certain actions to be taken. Off-chain oracle nodes which monitor the smart contract pick up the signal and make the data available outside of the network. Example: if payment is received (on-chain trigger), dispense a food item from a vending machine (off-chain event triggered by on-chain event).
When combining both inbound and outbound oracles, communication between two separate networks can be achieved: data from one network can be made available off-chain through an outbound oracle, and can then be transferred to another network through an inbound oracle.
Apart from distinguishing between inbound and outbound oracles, they can also be differentiated by the extent of their centralization. An oracle service, which is off-chain software that sends RPC commands to a blockchain node, can be either centralized or decentralized.
In the case of a centralized oracle, the node the oracle client runs on is entirely operated by a single entity, which is the sole provider of information. The accuracy of data and the security of the oracle would be entirely dependent on the design and security efforts implemented by the entity.
In a decentralized oracle, many nodes cross-reference data inputs from different sources to ensure that more accurate data is transmitted. These decentralized oracle networks (DONs) have their own consensus mechanisms with which greater data reliability is achieved.
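To make the cross-referencing idea concrete, here is a minimal sketch of how several oracle nodes' reports could be combined into one value. The median-plus-outlier-filter rule, the function name `aggregate_reports`, the node ids and the thresholds are all assumptions for illustration; real DONs use their own consensus mechanisms.

```python
from statistics import median

def aggregate_reports(reports, min_reports=3, max_deviation=0.05):
    """Combine price reports from independent oracle nodes into one value.

    reports: dict mapping a (hypothetical) node id to its reported price.
    Reports further than max_deviation (as a fraction) from the median are dropped.
    """
    if len(reports) < min_reports:
        raise ValueError("not enough reports to reach a quorum")
    mid = median(reports.values())
    accepted = [p for p in reports.values() if abs(p - mid) <= max_deviation * mid]
    if len(accepted) < min_reports:
        raise ValueError("reports disagree too much to publish a value")
    return median(accepted)

# Three nodes roughly agree; one node reports a wildly different price.
reports = {"node-a": 100.0, "node-b": 101.0, "node-c": 99.5, "node-d": 250.0}
price = aggregate_reports(reports)
print(price)  # 100.0
```

A single faulty or malicious node cannot move the published value in this sketch, which is the reliability gain the decentralized design is after.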
While oracles provide opportunities for new and novel applications in decentralized blockchain ecosystems, there is also a risk involved with using oracles: if an oracle is compromised, it can be assumed that any smart contract that relies on the oracle is also compromised. Furthermore, even if oracles are not compromised, the data sources could be compromised. The garbage-in-garbage-out principle exemplifies that oracles may verify input data that is correct (received from trusted sources) and pass it on to a smart contract, but the data itself may be flawed.
Generally speaking, a relayer is an entity that relays information from one party to another. When looking at different projects, the term “relayer” may be used to describe any such transfer between two parties. We can discern three types of relayers in blockchain projects:
An entity that aggregates individual trade orders into an orderbook for users to store and find matched orders off-chain, where only the final transaction is submitted (relayed) to the network. See 0x.
An entity that executes transactions on behalf of users (the transaction is relayed by a third party). See Tornado Cash.
An entity that is responsible for the transfer of information between two nodes on different networks (data is relayed from one network to another). See LayerZero or Relay Chain.
In the context of blockchain interoperability, the third type of relayer plays an important role in enabling cross-chain communication.
Cross-chain communication is achieved by having a relayer client connect both source and target networks. The relayer client continuously monitors the source network for certain trigger transactions to take place. Once such a transaction takes place, the relayer client relays the actions specified in that transaction to the target network. To be able to monitor and relay transactions, the relayer client must be installed on a node together with the client software of both the source and target networks. This means there will be a cross-section of nodes that are connected through a relay layer.
A single node needs the source network client, the target network client and the relay client installed to relay messages between networks.
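The monitor-and-relay loop described above can be sketched as follows. This is a toy in-memory model, not a real relayer client: the `Chain` and `Relayer` classes, the `bridge_request` event type and the field names are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Chain:
    """Toy stand-in for a network client running on the relayer's node."""
    name: str
    events: list = field(default_factory=list)  # events emitted on this chain
    inbox: list = field(default_factory=list)   # messages delivered by relayers

    def emit(self, event: dict) -> None:
        self.events.append(event)

class Relayer:
    def __init__(self, source: Chain, target: Chain):
        self.source = source
        self.target = target
        self.cursor = 0  # index of the next source event to inspect

    def poll(self) -> int:
        """Relay new trigger events to the target; return how many were relayed."""
        new_events = self.source.events[self.cursor:]
        relayed = 0
        for event in new_events:
            if event.get("type") == "bridge_request":  # hypothetical trigger type
                self.target.inbox.append({"from_chain": self.source.name, **event})
                relayed += 1
        self.cursor += len(new_events)  # never process the same event twice
        return relayed

network_a, network_b = Chain("network-a"), Chain("network-b")
relayer = Relayer(network_a, network_b)
network_a.emit({"type": "transfer", "amount": 1})  # ordinary transaction, ignored
network_a.emit({"type": "bridge_request", "amount": 5, "recipient": "0xabc"})
first_poll = relayer.poll()   # relays the one trigger event
second_poll = relayer.poll()  # nothing new to relay
```

The cursor is the important design detail: a relayer has to remember how far it has read on the source network so that a trigger is relayed exactly once.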
Light Clients
Similar to relayers, light clients need to be installed on a node that has both the source network client and the target network client installed. Unlike relayers, which monitor events on both the source and the target network, the role of light clients is to verify transactions with as little information as possible (hence light client). Instead of downloading the entire blockchains for the source and target networks, the light client downloads and verifies transactions by using only the block headers of both networks.
This is achieved through a “Proof-of-Assets” mechanism, which verifies that a transaction has taken place on the source network. The Light Client stores the block headers of the source network, and when a transaction needs to be verified, the user provides a Merkle proof to the Light Client, which uses it to recompute the Merkle root and check it against the root in the stored block header.
A Merkle proof usually consists of three parts:
The Merkle root: the root of the Merkle tree that includes the transferred assets
The Merkle path: the sibling hashes along the route from the leaf node that represents the transaction where assets are locked up to the Merkle root
The transaction index: the position of the transaction among the leaves of the Merkle tree, which determines on which side each sibling hash is combined
Since the Merkle root is part of the block header, using the Merkle proof the Light Client can verify that a transaction is indeed part of a specific Merkle tree, and then verify that the Merkle tree belongs to a specific block header of the source network (see the “The Block Structure” and “Merkle Root” sections of Master Web3 Fundamentals: From Node to Network for more information on block headers and Merkle trees). With this mechanism a user can prove that assets were transferred to a bridge account on the source network (Proof-of-Assets) without additional cross-chain communication requirements.
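As an illustration of the verification step, the sketch below builds a small Merkle tree, extracts a proof for one transaction and checks it against the root. It assumes single SHA-256 hashing and duplication of the last node on odd levels; real networks differ in these details (Bitcoin, for example, hashes twice), and the function names and transaction payloads are made up.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def _next_level(level):
    if len(level) % 2 == 1:
        level = level + [level[-1]]  # duplicate the last node on odd levels
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]

def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]  # leaves must be non-empty
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def merkle_path(leaves, index):
    """Collect the sibling hashes from the leaf at `index` up to the root."""
    level, path = [h(leaf) for leaf in leaves], []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        path.append(level[index ^ 1])  # sibling sits at the neighbouring position
        level = _next_level(level)
        index //= 2
    return path

def verify(leaf, index, path, root):
    """Recompute the root from the leaf and its path, then compare."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

txs = [b"tx0", b"tx1", b"lock 1 BTC for bridge", b"tx3", b"tx4"]
root = merkle_root(txs)        # in practice this comes from the stored block header
proof = merkle_path(txs, 2)
valid = verify(txs[2], 2, proof, root)
forged = verify(b"lock 100 BTC for bridge", 2, proof, root)
```

Note that the verifier only needs the leaf, its index, a handful of hashes and the root from the header, never the full list of transactions.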
Oracles vs. Relayers vs Light Clients
While oracles and relayers both enable communication between two networks, they differ fundamentally with regard to the integrity of the data communicated.
Oracles provide data to a decentralized network, and the integrity of the data is a result of the integrity of the source of the data. If the source provides inaccurate data that is then made available on-chain, the oracle has no way of judging the accuracy of the data – the data is provided as-is. Furthermore, oracles provide data to smart contracts.
Relayers, on the other hand, do not act as a source of truth for data provided. Instead, they focus solely on passing on (relaying) data across networks in a trustless manner, and their communication is between network nodes only.
Finally, similar to relayers, light clients live on network nodes and do not act as a source of truth either. Unlike relayers, their main purpose is not the relaying of messages between networks. Their purpose is to store block headers and validate transactions on-demand.
A centralized or trusted entity is a black box. Any of the aforementioned technologies (oracles, relayers, light clients) can be used in isolation or in combination to pass messages between blockchain networks. A centralized entity could also use proprietary client software that they developed in-house. It is impossible to know for sure without the source code being made publicly available.
Cross-chain Asset Equivalence
When an asset is moved from one network to another, it is not physically moved like an apple can be moved from one bag to another. Instead, the asset is made unavailable on the source network, then an equivalent asset is made available on the target network. This is due to the closed nature of blockchain networks: they cannot communicate outside of their own networks, which is why an interface (bridge) is needed to facilitate this communication.
Generally speaking, there are three approaches to making assets available on a target network, which result in different kinds of assets being made available:
Lock-and-mint Mechanism 🡪 wrapped tokens
Burn-and-redeem Mechanism 🡪 native tokens
Local Liquidity Pools 🡪 wrapped/native tokens (traded token equivalent)
In the lock-and-mint mechanism, assets are received by a bridge on one network where the assets are locked, and a token that represents the value of the source token is minted on the target network. These representative tokens are called “wrapped” or “pegged” tokens, implying the source token is wrapped in a container that is compatible with the target network and that the value of the asset on the target network is pegged to the value on the source network. Examples include wrapped Bitcoin (wBTC) or pegged Bitcoin (pBTC) issued by pNetwork on the Ethereum network.
Wrapped or pegged tokens are an “I owe you” from the bridge to the user that can be redeemed 1:1 for the source tokens. While this means that the wrapped tokens maintain the value of the source token, this also means that any issues with the bridge can undermine the value of the wrapped token.
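The bookkeeping behind lock-and-mint can be sketched in a few lines. This is a toy model under the assumption that one object can see both networks; the class `LockAndMintBridge` and its method names are hypothetical, and amounts are integers in the asset's smallest unit.

```python
class LockAndMintBridge:
    """Toy ledger: source-chain tokens locked vs. wrapped tokens outstanding."""

    def __init__(self):
        self.locked = 0    # source tokens held in custody by the bridge
        self.wrapped = {}  # user -> wrapped-token balance on the target network

    def lock_and_mint(self, user: str, amount: int) -> None:
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.locked += amount                                    # lock on source
        self.wrapped[user] = self.wrapped.get(user, 0) + amount  # mint on target

    def burn_and_release(self, user: str, amount: int) -> int:
        if amount <= 0 or self.wrapped.get(user, 0) < amount:
            raise ValueError("insufficient wrapped balance")
        self.wrapped[user] -= amount  # burn the wrapped tokens
        self.locked -= amount         # release the originals 1:1
        return amount

    def fully_backed(self) -> bool:
        """The 1:1 peg holds only while custody matches wrapped supply."""
        return self.locked == sum(self.wrapped.values())

bridge = LockAndMintBridge()
bridge.lock_and_mint("alice", 150)
released = bridge.burn_and_release("alice", 50)
backed = bridge.fully_backed()

# Simulate an exploit that drains custody: the wrapped tokens are no longer backed.
bridge.locked = 0
backed_after_exploit = bridge.fully_backed()
```

The last two lines show why the "I owe you" is only as good as the bridge: once the locked funds are gone, the wrapped tokens have nothing to redeem against.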
Another mechanism for making assets available on a target network is the burn-and-redeem mechanism. Unlike the lock-and-mint mechanism where the bridge holds the assets on the source chain, the burn-and-redeem mechanism destroys the assets on the source chain and mints an amount equal to the destroyed assets on the target chain, which can be redeemed by the user.
The minting of a specific asset on the target chain, however, must be executed by the issuer of a specific asset. For example, the issuance of USDC on the Ethereum network and USDC on the Solana network are both handled by Circle, the issuer of USDC. There is only one real USDC, which is the one issued by Circle, just like there is only one real USD, which is the fiat currency issued by the United States Government.
This, however, also means that relevant smart contracts must be set up by Circle on both networks before a bridge can implement the burn-and-redeem mechanism between both networks. This is referred to as a “multichain” smart contract deployment, and USDC can thus be called a multichain asset, which is a native asset on both source and target networks.
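A minimal sketch of burn-and-redeem, assuming the issuer has already deployed a token contract on each network. The `IssuerToken` class and `burn_and_redeem` function are illustrative names, not any issuer's actual contract interface.

```python
class IssuerToken:
    """Toy issuer-controlled token contract deployed on one network."""

    def __init__(self, network: str):
        self.network = network
        self.balances = {}

    def mint(self, user: str, amount: int) -> None:
        self.balances[user] = self.balances.get(user, 0) + amount

    def burn(self, user: str, amount: int) -> None:
        if amount <= 0 or self.balances.get(user, 0) < amount:
            raise ValueError("insufficient balance to burn")
        self.balances[user] -= amount

    def total_supply(self) -> int:
        return sum(self.balances.values())

def burn_and_redeem(source: IssuerToken, target: IssuerToken, user: str, amount: int) -> None:
    source.burn(user, amount)  # raises before anything is minted if the burn fails
    target.mint(user, amount)  # native tokens appear on the target network

token_on_a = IssuerToken("network-a")
token_on_b = IssuerToken("network-b")
token_on_a.mint("alice", 1_000)  # initial issuance by the issuer
supply_before = token_on_a.total_supply() + token_on_b.total_supply()

burn_and_redeem(token_on_a, token_on_b, "alice", 400)
supply_after = token_on_a.total_supply() + token_on_b.total_supply()
```

Because tokens are destroyed rather than held in custody, the combined supply across both networks stays constant and there is no pool of locked funds for an attacker to drain.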
Multichain vs Cross-chain
Multichain and cross-chain are two different concepts. Multichain refers to the issuance of smart contracts and assets on multiple networks, and is built on the idea of a “multichain future”. This theory states that there will not be one single blockchain to rule them all, but instead there will be many blockchains with unique characteristics that will fulfill specific needs.
However, as mentioned in the opening of this piece, users may want to use different networks for different purposes. Cross-chain describes the interoperability between different networks and enables communication and asset transfers between networks.
Note: there is an asset bridge aptly named “multichain”. The bridge is a web3 product that is not related to the above concepts (multichain theory, multichain smart contract deployment and multichain assets).
Local Liquidity Pools
The last approach to making assets available on a target network is to use assets that already exist instead of minting new assets. In this approach the bridge maintains liquidity pools on both networks from which bridged assets are redeemed. A liquidity pool is essentially a bucket of assets that are collected within a smart contract.
This approach requires the bridge operator to maintain liquidity pools on both networks, and to ensure the pools on both networks always have sufficient assets to execute a bridging request. Examples of bridges that implement this mechanism include XY Finance and Thorswap.
Maintaining liquidity pools also allows for cross-chain swaps involving different assets, letting users trade one asset (e.g., wBTC) on one network for another asset (e.g., ETH) on another network at a pre-determined exchange rate. The assets traded have an equivalent value, but are not the same asset or representative asset.
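The pool mechanics can be sketched as below. The `PoolBridge` class, the fixed `rate` parameter and the example numbers are assumptions for illustration; a real bridge would also handle fees, pricing and rebalancing.

```python
from fractions import Fraction

class PoolBridge:
    """Toy bridge that pays out from an existing pool instead of minting."""

    def __init__(self, source_pool: int, target_pool: int, rate=Fraction(1)):
        self.source_pool = source_pool  # assets held on the source network
        self.target_pool = target_pool  # assets held on the target network
        self.rate = Fraction(rate)      # target units paid per source unit

    def bridge(self, amount_in: int) -> int:
        if amount_in <= 0:
            raise ValueError("amount must be positive")
        amount_out = int(amount_in * self.rate)  # rounded down to whole units
        if amount_out > self.target_pool:
            raise ValueError("insufficient liquidity on the target network")
        self.source_pool += amount_in    # user's deposit joins the source pool
        self.target_pool -= amount_out   # payout leaves the target pool
        return amount_out

# Same asset on both sides (rate 1): a plain bridging request.
same_asset = PoolBridge(source_pool=1_000, target_pool=500)
received = same_asset.bridge(200)

# Different assets at a pre-determined rate, e.g. 15 target units per source unit.
swap = PoolBridge(source_pool=0, target_pool=10_000, rate=Fraction(15))
swapped = swap.bridge(3)
```

The liquidity check is the operational burden mentioned above: a request larger than the remaining target pool simply cannot be served until the operator rebalances.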
Bridge Asset Security and Complexity Trade-offs
Each of these three mechanisms implies a trade-off between security and operational complexity. Local liquidity pools are the most complex to implement and operate, but the compartmentalization of asset pairs can help to limit the impact of potential exploits. The lock-and-mint and burn-and-redeem mechanisms are easier to implement and introduce far less complexity, but an attack on these bridges could lead to far greater losses.
Asset Translation Process
By combining the vehicle of communication (oracles and relayers) with an asset management and issuance approach (lock-and-mint, burn-and-redeem and local liquidity), a process can be designed to translate the value of an asset on a source chain to an asset of equal value on a target chain. Or phrased differently: assets can be bridged from one chain to another using an interface.
There are multiple ways of creating an interface between two or more networks, which can broadly be classified into the following four approaches:
Notary schemes (require a trusted third party, or a federation of parties, acting as notary)
Sidechains or Relay Networks (require smart contracts and a relayer)
Simple Payment Verification (requires smart contracts and a relayer or oracle)
Atomic swaps (require hash-locking schemes and communication between both parties)
Notary Scheme Bridges
In non-Web3 terms, a notary is an official who has the legal authority to verify the authenticity of documents and serve as an impartial witness when legal documents are signed between two parties.
In the notary scheme approach to bridge design, a centralized third party acts as the notary for a cross-chain transaction, verifying that assets have been received on the source network, and confirming to the target network that equivalent assets are to be sent to the user.
The user will first signal intent to bridge assets to the trusted bridge, indicating which address to receive assets with on the target network. Because blockchain networks are closed networks with no means of communicating outside of their own boundaries, the bridge must monitor the account or address on the source chain that is to receive funds from the user. Once the bridge observes and verifies that the asset has been received (and sufficient blocks have been validated following receipt, ensuring no reorganization of the last blocks in the blockchain), the bridge will send a command to the target network to make an equal denomination of assets available and send these to the user-designated address.
A federated notary scheme, in which a group of notaries takes the place of the single centralized party, is more secure than a single-signature notary scheme, as the same bridge request needs to be verified by multiple parties before the funds are released on the target chain. A federated bridge can either be trusted, i.e., users trust that the parties will not act maliciously, or it can be bonded. In a bonded bridge, every party must put up collateral that can be slashed if a party acts maliciously or negligently.
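A bonded federation can be sketched as a threshold check plus a collateral ledger. The class `BondedFederation`, the 2-of-3 threshold and the "slash the whole bond" rule are assumptions chosen to keep the example short.

```python
class BondedFederation:
    """Toy m-of-n notary federation in which each notary posts collateral."""

    def __init__(self, bonds: dict, threshold: int):
        if not 0 < threshold <= len(bonds):
            raise ValueError("threshold must be between 1 and the number of notaries")
        self.bonds = dict(bonds)  # notary -> collateral posted
        self.threshold = threshold
        self.approvals = {}       # request id -> set of approving notaries

    def approve(self, notary: str, request_id: str) -> None:
        if self.bonds.get(notary, 0) <= 0:
            raise PermissionError("only bonded notaries may approve")
        # A set ignores repeated approvals from the same notary.
        self.approvals.setdefault(request_id, set()).add(notary)

    def can_release(self, request_id: str) -> bool:
        return len(self.approvals.get(request_id, set())) >= self.threshold

    def slash(self, notary: str) -> int:
        """Confiscate a misbehaving notary's entire bond (simplification)."""
        penalty = self.bonds[notary]
        self.bonds[notary] = 0
        return penalty

federation = BondedFederation({"n1": 100, "n2": 100, "n3": 100}, threshold=2)
federation.approve("n1", "req-42")
federation.approve("n1", "req-42")  # duplicate approval does not count twice
after_one = federation.can_release("req-42")
federation.approve("n2", "req-42")
after_two = federation.can_release("req-42")
penalty = federation.slash("n3")
```

In the trusted variant the `bonds` ledger would simply not exist; the threshold check stays the same.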
Sidechains & Relay Chain Bridges
Sidechains and relay chains rely on a separate network with their own consensus layers and validator sets to verify certain events have occurred on the source network and deliver data to the target network. The difference between sidechains and relay chains lies in their purpose and bridging mechanism.
Relay chains are purpose-built to relay transactions to a large number of networks. Having a single relay network allows for standardization of cross-chain transfers: any network that connects to the relay network can easily bridge to any network that is already set up with the relay network. Furthermore, the relay network’s blockchain acts as immutable evidence that transactions were initiated.
Sidechains are also purpose-built networks, but their focus is usually on alleviating challenges the mainchain faces. Ronin and Gnosis Chain (formerly known as xDai) are examples of this. Ronin is a sidechain specifically for the Web3 game Axie Infinity, and it exists solely to support Axie Infinity’s ecosystem with reduced transaction cost and faster transaction finality (‘scalability’ from the blockchain trilemma). Gnosis Chain is an EVM-based sidechain to the Ethereum network, allowing for cheaper transactions compared to Ethereum. Gnosis Chain is arguably best known for its role in popularizing POAPs.
Blockchains such as Ronin and Gnosis Chain are purpose-specific sidechains which do not act as standardized bridges to other networks. Polygon, however, has a vast ecosystem and has solidified its position as a highly interoperable EVM-compatible sidechain. As a result, many different networks use Polygon as an interface to connect to different layer1 networks.
Blockchains of blockchains (also referred to as BoBs) also utilize the relay network approach to pass messages and assets between networks. For example, Polkadot has its own relay chain which supports cross-network communication between its parachains. Parachains are essentially Polkadot-compatible networks that run in parallel to each other and are hooked up to the relay chain.
Cosmos is another example of a BoB. While the Inter-Blockchain Communication (IBC) protocol developed by Cosmos can connect any blockchain, Cosmos released their own layer1 network called Cosmos Hub which acts as a hub to connect different blockchain networks. If every network were to connect directly with every other network using IBC, the number of connections and the infrastructure needed would grow quadratically with the number of networks. Cosmos Hub is not a relay chain; however, it can be considered the Cosmos equivalent of Polkadot’s Relay Chain.
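A quick calculation shows why a hub helps. With n networks, connecting every pair directly takes n(n-1)/2 connections, which grows with the square of n, while a hub needs only one connection per network. The sketch assumes exactly one connection per pair and one per network to the hub; the function names are illustrative.

```python
def direct_connections(n: int) -> int:
    """Every network connects to every other network: n * (n - 1) / 2 pairs."""
    return n * (n - 1) // 2

def hub_connections(n: int) -> int:
    """Every network connects only to the hub."""
    return n

comparison = {n: (direct_connections(n), hub_connections(n)) for n in (5, 10, 100)}
for n, (direct, hub) in comparison.items():
    print(f"{n} networks: {direct} direct connections vs {hub} via a hub")
# 5 networks: 10 direct connections vs 5 via a hub
# 10 networks: 45 direct connections vs 10 via a hub
# 100 networks: 4950 direct connections vs 100 via a hub
```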
We cover IBC briefly at the end of this piece.
Simple Payment Verification (SPV) Bridges
Unlike relay-based bridges where the relayer maintains full copies of both source and target blockchains, bridges based on Simple Payment Verification (SPV) require far fewer resources. The idea behind SPV-based bridges is to allow the target network to verify that a transaction has taken place on the source network by storing only the block headers of the source network instead of the entire transaction history. This is achieved by using the “Proof-of-Assets” mechanism described in the Light Clients section, in which a Merkle proof is passed along and compared against a block header to verify transactions.
To bridge assets using an SPV-based bridge, users first send their assets to a bridge contract on the source chain. The bridge contract creates a “commitment transaction” on the source chain, which includes the Merkle proof of the transferred assets and a unique identifier. An interface (a relayer, an oracle or a light client) monitors the incoming commitment transactions and saves the block headers of the source network which contain commitment transactions. The user redeems the funds on the target network by submitting the unique identifier and Merkle proof to the bridge account. Using the Proof-of-Assets (Merkle proof and unique identifier), the light client recomputes the Merkle root and checks it against the Merkle root in the block header stored by the interface. If this check returns a valid result, the funds are released to the user on the target network.
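The redeem step can be sketched as follows: the target side stores only Merkle roots from source-chain headers, checks a submitted proof against them, and remembers which transfers have been paid out. The `SpvBridge` class is hypothetical, single SHA-256 is assumed, and deriving the unique identifier from the hash of the commitment is an assumption made here for brevity.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def root_from_proof(leaf: bytes, index: int, path: list) -> bytes:
    """Recompute the Merkle root from a leaf, its position and its sibling hashes."""
    node = h(leaf)
    for sibling in path:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node

class SpvBridge:
    def __init__(self):
        self.headers = {}      # block height -> Merkle root taken from the header
        self.redeemed = set()  # identifiers of transfers already paid out

    def store_header(self, height: int, merkle_root: bytes) -> None:
        self.headers[height] = merkle_root

    def redeem(self, height: int, commitment: bytes, index: int, path: list) -> bool:
        transfer_id = h(commitment).hex()  # assumed form of the unique identifier
        if transfer_id in self.redeemed:
            return False                   # replay of an already-used proof
        if self.headers.get(height) != root_from_proof(commitment, index, path):
            return False                   # unknown header or invalid proof
        self.redeemed.add(transfer_id)
        return True                        # funds would be released here

# A block with two transactions; the first one is the user's commitment.
commitment, other_tx = b"id=7;lock 5 tokens", b"unrelated tx"
block_root = h(h(commitment) + h(other_tx))

bridge = SpvBridge()
bridge.store_header(100, block_root)
first = bridge.redeem(100, commitment, 0, [h(other_tx)])
replay = bridge.redeem(100, commitment, 0, [h(other_tx)])
forged = bridge.redeem(100, b"id=8;lock 500 tokens", 0, [h(other_tx)])
```

The `redeemed` set plays the role of the unique identifier in the text: without it, one valid proof could be used to withdraw funds repeatedly.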
Light clients are often used for SPV-based bridges, because they do not require the node to store the full blockchains of the source and target networks, and they allow for trustless verification of transactions without requiring a quorum of nodes to agree on transaction validity. In the context of SPV-based bridges, light clients fulfill two main tasks:
Storing of block headers (data storage)
Validation of transactions (computation)
As long as the storage and computational requirements can be fulfilled, SPV-based bridges can use relayers and oracles instead of light clients. For example, a relayer could store block headers locally for faster retrieval by the target network and transaction validation could be executed by smart contracts or outsourced to an oracle.
When designing an SPV-based bridge, considerations of cost (smart contract computation vs. client computation), complexity, security and speed requirements will help to determine the optimal SPV-based bridge design for a specific use case.
Atomic Swaps
Atomic swaps are used to facilitate the peer-to-peer transfer of tokens between two parties across different blockchains, without the blockchains interacting directly with each other. For this method of transferring tokens, token/network compatibility and network interoperability are not required to trade tokens between two parties across two networks.
Atomic swaps got their name from the idea that the exchange of one token for another happens atomically, which means that the trade occurs all at once. Traditionally a swap is executed over multiple steps, including placing an order, waiting for it to be filled and finally receiving the funds. With an atomic swap, the exchange of tokens can happen at once without the need for a trusted third party.
Unlike the notary scheme bridge and relay chain bridge, which move an individual’s assets between networks, atomic swaps enable the trade of assets between users separately on two networks without the networks having to ever communicate with one another. This is achieved with a “Hashed Time-Locked Contract” (HTLC) on both networks.
Hashed Time-Locked Contracts (HTLC)
HTLCs were first introduced on the Bitcoin network in BIP-199, and are a combination of a hashlock and a timelock. A hashlock requires the receiver of a payment to provide a passphrase to accept the transaction, while the timelock specifies that the transaction must be executed within a specific amount of time, otherwise an alternative set of redeem conditions is activated that allows the funds to be returned to the sender. HTLCs are timebound conditional payment contracts.
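The two spending paths of an HTLC can be modelled in a few lines. This is a plain-Python sketch of the logic, not Bitcoin Script: the `HTLC` class, the use of SHA-256 for the hashlock and the integer clock passed in as `now` are assumptions.

```python
import hashlib

class HTLC:
    """Toy hashed time-locked contract with a claim path and a refund path."""

    def __init__(self, sender, receiver, amount, hashlock, expires_at):
        self.sender = sender
        self.receiver = receiver
        self.amount = amount
        self.hashlock = hashlock      # hash of the secret passphrase
        self.expires_at = expires_at  # time from which the sender may refund
        self.state = "locked"

    def claim(self, passphrase: bytes, now: int) -> str:
        """Hashlock path: the receiver reveals the passphrase before expiry."""
        if self.state != "locked":
            raise RuntimeError("funds already paid out")
        if now >= self.expires_at:
            raise RuntimeError("timelock expired, only a refund is possible")
        if hashlib.sha256(passphrase).digest() != self.hashlock:
            raise ValueError("passphrase does not match the hashlock")
        self.state = "claimed"
        return self.receiver  # account that receives self.amount

    def refund(self, now: int) -> str:
        """Timelock path: the sender takes the funds back after expiry."""
        if self.state != "locked":
            raise RuntimeError("funds already paid out")
        if now < self.expires_at:
            raise RuntimeError("timelock has not expired yet")
        self.state = "refunded"
        return self.sender

secret = b"correct horse battery staple"
hashlock = hashlib.sha256(secret).digest()

paid = HTLC("alice", "bob", 5, hashlock, expires_at=100)
paid_to = paid.claim(secret, now=10)      # bob claims in time

expired = HTLC("alice", "bob", 5, hashlock, expires_at=100)
refunded_to = expired.refund(now=100)     # nobody claimed, alice refunds
```

Exactly one of the two paths can ever succeed for a given contract, which is what makes the payment conditional and timebound.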
Atomic Swaps: Successful Swap Scenario
Atomic swaps work by deploying two HTLCs, one by each party on each network. This means that Party A deploys an HTLC on Network A, and Party B deploys an HTLC on Network B. One party must first decide a secret passphrase and hash it. The hash of the passphrase is then used as an input parameter for the HTLC’s hashlock functionality. Furthermore, both parties must set a time threshold after which the funds are unlocked. These are set in staggered time intervals: the party without the passphrase (Party B) sets a shorter time threshold (e.g., 24 hours), while the party with the passphrase sets a longer time threshold (e.g., 48 hours).
After Party A sets the passphrase and generates the hash (step 1), Party A shares the hash with Party B, and both parties deploy their HTLCs and lock the funds they agreed to trade (step 2): Party A on Network A, and Party B on Network B. Once both HTLCs are set up, Party A can use the passphrase to redeem funds from Party B’s HTLC on Network B (step 3). By doing so, the passphrase is revealed on-chain to Party B (step 4), and Party B can use the passphrase to redeem the funds that Party A locked into an HTLC on Network A.
Atomic Swaps: Unsuccessful Swap Scenario
However, what happens if Party A never redeems the funds on Network B and the passphrase is never revealed? This is where the timelock comes in.
The timelock enables a transaction to be refunded after a certain amount of time has passed. Remember: the timelocks are staggered:
Party A’s timelock is 48 hours (they hold the secret passphrase)
Party B’s timelock is 24 hours
If Party A does not reveal the secret passphrase within 24 hours, then Party B can refund the funds they locked in their HTLC on Network B (step 3). After 48 hours, Party A can do the same on Network A.
This staggering of timelocks is extremely important: if both timelocks were set for the same duration (e.g., 24 hours), the party with the passphrase (Party A) could redeem Party B’s funds right before the time threshold passes (e.g., at 23 hours 59 minutes) and then refund their own HTLC the moment the threshold passes (24 hours), leaving Party B almost no time to use the revealed passphrase. Staggering the timelocks allows both parties sufficient time to either complete the swap or to refund their assets.
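The effect of staggering is simple arithmetic, sketched below in Python. The helper `time_left_for_b` is a hypothetical name; it computes how long Party B has to redeem on Network A after Party A reveals the passphrase on Network B.

```python
from datetime import timedelta


def time_left_for_b(timelock_a, timelock_b, reveal_at):
    """Time Party B has to redeem on Network A after the reveal."""
    if reveal_at >= timelock_b:
        raise ValueError("Party A can no longer redeem on Network B")
    return timelock_a - reveal_at


late_reveal = timedelta(hours=23, minutes=59)

# Equal timelocks: Party B is left with a single minute.
print(time_left_for_b(timedelta(hours=24), timedelta(hours=24), late_reveal))
# 0:01:00

# Staggered timelocks: Party B still has more than a day.
print(time_left_for_b(timedelta(hours=48), timedelta(hours=24), late_reveal))
# 1 day, 0:01:00
```

In the worst case, Party B is guaranteed a window equal to the difference between the two timelocks.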
Atomic Swaps Between UTXO-based Networks
On the Bitcoin network, every transaction is a code block that contains an unlocking script and a locking script. The unlocking script verifies that the unspent transaction output (UTXO) is valid and unlocks the funds for spending, while the locking script handles the spending criteria and ensures that funds can be spent (i.e., the UTXO used for a new transaction) only when certain conditions are met (see the UTXO section of Master Web3 Fundamentals: From Node to Network).
On the Bitcoin network, which criteria can be used to unlock a transaction and which kind of addresses are required to receive certain types of transactions are all carefully defined. To use an HTLC on Bitcoin, the user must create a P2SH transaction.
Generally speaking, transactions on the Bitcoin network have both an unlocking script and a locking script. The unlocking script unlocks the UTXO that is used as an input for the transaction, and the locking script locks the funds using the recipient’s public key so that they can only be redeemed with the recipient address’s signature. This is called a Pay-to-Public-Key-Hash (P2PKH) transaction.
A Pay-to-Script-Hash (P2SH) transaction is different in that instead of locking the funds to the recipient’s public key, it locks them to the hash of a script that contains the unlocking conditions. In a P2PKH transaction the unlocking condition involves using the private key corresponding to the public key in the locking script, but in a P2SH transaction the script can contain complex logic defined by the sender. The P2SH transaction can be unlocked by anybody, as long as they present a script that, when hashed, matches the script hash stored in the P2SH transaction.
This means that anybody can unlock a P2SH transaction, as long as they can present the proper script during redemption. For this reason, the script presented at redemption is called a “redeem script”. To prevent just anybody from redeeming a P2SH transaction, an additional condition can be added requiring the signature of a specific recipient.
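The hash-matching step can be illustrated with a few lines of Python. This is a simplified sketch: Bitcoin’s P2SH uses HASH160 (RIPEMD-160 over SHA-256), whereas plain SHA-256 is assumed here to keep the example dependency-free, and the redeem script is a placeholder byte string rather than real Bitcoin Script.

```python
import hashlib


def script_hash(script: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the HASH160 used by real P2SH.
    return hashlib.sha256(script).digest()


def matches_lock(locked_hash: bytes, redeem_script: bytes) -> bool:
    """First P2SH check: the presented script must hash to the locked hash."""
    return script_hash(redeem_script) == locked_hash


# The sender commits only to the hash of the redeem script.
redeem_script = b"<hashlock + signature conditions>"  # placeholder content
locked_hash = script_hash(redeem_script)

print(matches_lock(locked_hash, redeem_script))        # True
print(matches_lock(locked_hash, b"some other script")) # False
```

Passing this check only proves the right script was presented; the conditions inside the script (such as a required signature) must still be satisfied before the funds move.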
HTLC on EVM-compatible Networks (Account model)
Since an HTLC transaction is just a transaction that has two conditions that must be met before it can be spent, it can easily be replicated using a smart contract on EVM-compatible networks. In UTXO-based networks anybody can send a P2SH transaction, but in EVM-compatible networks such logic must first be deployed as a smart contract before it can be used.
The Atomic Swap Process (Overview)
Let us assume we have two parties A and B, who want to swap tokens at a pre-determined rate with each other on two separate networks within a specific timeframe. For simplicity’s sake, we will assume that the swap takes place between two UTXO-based networks (e.g., Bitcoin and Cardano).
The overview below shows every step of the atomic swap. The sections that follow explain these steps in greater detail.
Setting up the Atomic Swap
For this atomic swap to work, two HTLCs need to be used (one on each network), with the following set as unlocking conditions:
Passphrase (same on each network)
Signatures of sender and recipient
Since both networks are UTXO-based, P2SH transactions can be deployed on both which enable the HTLC functionality.
The Atomic Swap Process: Asset Movement Transaction Preparation
First, Party A must pick a passphrase. This passphrase is hashed and the hash is sent to Party B. Then, each party prepares a P2SH transaction on the network from which they are sending funds and uses the hashed passphrase as one of the inputs:
Party A prepares a transaction P2SH1 with a UTXO1 that can be redeemed by:
The passphrase and Party B’s signature
Party A’s and Party B’s signature
Party A then signs this transaction, but does not broadcast it yet.
Party B does the same; they prepare a transaction P2SH2 with a UTXO2 that can be redeemed by:
The passphrase and Party A’s signature
Party A’s and Party B’s signature
The transaction is also signed by Party B, but Party B does not broadcast it yet. It is very important that neither transaction is broadcast to the network at this point, because if Party B broadcast their transaction, Party A could immediately redeem it with the passphrase. Instead, another step is required that acts as a failsafe and allows a party to get their funds back, should the other party not fulfill their obligations.
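The two redeem paths of each P2SH output can be written as a single predicate. The Python sketch below models signatures as a set of party names, which is an assumption for readability; no real signature verification takes place, and `can_redeem` is a hypothetical helper.

```python
import hashlib


def can_redeem(hashlock, claimant, passphrase, signers):
    """True if either redeem path of the swap output is satisfied."""
    # Path 1: correct passphrase plus the counterparty's signature.
    hash_path = (passphrase is not None
                 and hashlib.sha256(passphrase).digest() == hashlock
                 and claimant in signers)
    # Path 2: both parties sign (used later by the refund transaction).
    both_sign_path = {"A", "B"} <= set(signers)
    return hash_path or both_sign_path


secret = b"swap-secret"
hashlock = hashlib.sha256(secret).digest()

# UTXO1 (funded by Party A) is claimable by Party B with the passphrase.
print(can_redeem(hashlock, "B", secret, {"B"}))    # True
# Without the passphrase, Party B's signature alone is not enough.
print(can_redeem(hashlock, "B", None, {"B"}))      # False
# Both signatures unlock the output without any passphrase.
print(can_redeem(hashlock, "B", None, {"A", "B"})) # True
```

For UTXO2 the same predicate applies with the claimant set to `"A"`.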
The Atomic Swap Process: Asset Refund Transaction Preparation
Each of the parties creates a second transaction that is timelocked, meaning it can only be redeemed after a certain amount of time has passed, and which redeems the funds from the UTXO of their respective P2SH transaction. Each party then passes the timelocked transaction to the other party, who signs it and returns it. These transactions are not broadcast either. This transaction fulfills the second redeem condition of each of the UTXOs (Party A’s and Party B’s signature). In summary:
Party A creates a timelocked transaction (e.g., 48 hours) which returns funds from UTXO1 to their wallet, and has Party B sign it and return it to Party A. Party A does not yet sign and broadcast the transaction.
Party B creates a timelocked transaction (e.g., 24 hours) which returns funds from UTXO2 to their wallet, and has Party A sign it and return it to Party B. Party B does not yet sign and broadcast the transaction.
Now each party is in possession of a transaction that only requires their own signature to return the funds from the first transaction. This acts as an insurance in case the other party does not complete their side of the transaction. Because both transactions are timelocked and not yet broadcast, each party cannot refund their initial transaction as long as the time threshold has not passed.
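A small Python model of the refund transactions shows why they are safe to hold. The `RefundTx` class and its fields are hypothetical, signatures are modeled as names in a set, and time is counted in whole hours.

```python
from dataclasses import dataclass, field


@dataclass
class RefundTx:
    """Toy timelocked refund transaction (illustrative only)."""
    owner: str
    counterparty: str
    locktime_hours: int
    signatures: set = field(default_factory=set)

    def sign(self, party: str) -> None:
        if party not in (self.owner, self.counterparty):
            raise ValueError("unknown signer")
        self.signatures.add(party)

    def can_broadcast(self, now_hours: int) -> bool:
        """Valid only with both signatures and after the timelock."""
        fully_signed = self.signatures == {self.owner, self.counterparty}
        return fully_signed and now_hours >= self.locktime_hours


# Each refund is pre-signed by the counterparty and handed back.
refund_a = RefundTx("A", "B", locktime_hours=48)
refund_a.sign("B")
refund_b = RefundTx("B", "A", locktime_hours=24)
refund_b.sign("A")

# Party B decides to refund and adds their own signature.
refund_b.sign("B")
print(refund_b.can_broadcast(23))  # False: timelock still active
print(refund_b.can_broadcast(24))  # True
print(refund_a.can_broadcast(48))  # False: Party A has not signed yet
```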
Note that the timelocks on both transactions are different in duration. This is so that one party can first observe whether the other party is sticking to the agreement. If both timelocks were identical, both parties might simply wait for the other to broadcast their asset movement transaction until the time has expired.
The Atomic Swap Process: Executing the Swap
Each of the parties submits their asset movement P2SH transaction to the network from which they are sending funds:
Party A broadcasts P2SH1 on Network A
Party B broadcasts P2SH2 on Network B
At this stage Party A reveals the passphrase to Party B. Now Party A and Party B have everything they need to fulfill the first unlock condition of the asset movement P2SH transaction (passphrase and signature of other party).
If Party A is malicious and does not share the passphrase with Party B, Party B can sign the asset refund transaction and submit it to the network to get their funds back from the previously broadcast P2SH2 transaction (by fulfilling the second redeem condition: both Party A’s and Party B’s signature).
Final Notes on Atomic Swaps
Atomic swaps require active participation and communication from both parties. While the description above has both parties communicating directly with each other, this communication could be automated with oracles or relayers.
Atomic swaps allow for trustless trades between individuals; however, they do pose another challenge: how do two individuals who want to trade a specific amount of tokens between two specific networks find each other? That’s where Atomic Exchanges come in. Atomic Exchanges allow users to submit atomic swaps, and act as a marketplace for peer-to-peer trades. Popular Atomic Exchanges include Atomic DEX and Atomex.
Other Interoperability Technologies
Inter-Blockchain Communication (IBC) Protocol
The Inter-Blockchain Communication (IBC) Protocol is a communication standard that enables cross-chain communication between two blockchains in the Cosmos ecosystem. Fundamentally, IBC relies on relayers for passing messages between networks and light clients that reside on nodes of both sending and receiving networks to verify incoming messages.
IBC light clients are similar to SPV light clients, in that they allow for the verification of transactions on the source network without maintaining a full copy of the blockchain. In practice, IBC light clients are more flexible, as they allow for the verification of a broader scope of networks by using cryptographic proofs beyond only Merkle proofs.
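For readers unfamiliar with Merkle proofs, the Python sketch below shows the kind of check a light client performs: given only a trusted root, it verifies that one item is part of a larger set. It is a generic illustration, not IBC’s or Bitcoin’s actual proof format; single SHA-256 and the helper names are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def next_level(level):
    """Pair up nodes, duplicating the last one if the count is odd."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(leaves):
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        level = next_level(level)
    return level[0]


def merkle_proof(leaves, index):
    """Collect the sibling hashes on the path from a leaf to the root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        padded = level + [level[-1]] if len(level) % 2 else level
        sibling = index ^ 1
        proof.append((padded[sibling], sibling < index))
        level = next_level(level)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """What the light client runs: it needs only the leaf, proof and root."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


txs = [b"tx-a", b"tx-b", b"tx-c"]
root = merkle_root(txs)
proof = merkle_proof(txs, 2)
print(verify(b"tx-c", proof, root))    # True
print(verify(b"tx-fake", proof, root)) # False
```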
The remaining components (IBC/TAO and IBC/APP) are standards pertaining to the Transport, Authentication and Ordering of packets, and specifications for the type of data or assets being sent (e.g., ICS-20 and ICS-721, which are ERC-20 and ERC-721 equivalents in the Cosmos ecosystem).
Aiming to be the underlying communication protocol between blockchain networks, the aptly named LayerZero (see whitepaper) relies on relayers and what they call “Ultra Light Nodes” to transmit messages between networks and verify incoming messages.
Unlike Light Nodes, which require a client to be run on a node of the source and target networks, an Ultra Light Node (ULN) exists as a smart contract on both networks. This heavily reduces the costs associated with running an endpoint, because it entirely removes the requirement to run your own node to run an endpoint.
The way that ULNs work is that they are responsible for verification of incoming transactions, but they pull the block-related data from an oracle on-demand instead of storing them locally.
In this setup, relayers are responsible for passing on messages, while oracles are used to fetch block-related data required for the verification of the data sent by the relayer. This way security is outsourced from the endpoints to the oracles and relayers. In this setup, the only way that malicious transactions can be passed on between two networks is if relayers and oracles collude.
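The collusion argument can be demonstrated with a deliberately simplified Python model. It is not LayerZero’s actual protocol or API: the oracle is assumed to report a single hash commitment for the source block, the relayer delivers the message itself, and `endpoint_accepts` is a hypothetical stand-in for the on-chain verification.

```python
import hashlib


def commit(message: bytes) -> str:
    """Toy block commitment: the hash of the one message in the block."""
    return hashlib.sha256(message).hexdigest()


def endpoint_accepts(relayed_message: bytes, oracle_commitment: str) -> bool:
    """Accept only if the relayer's message matches the oracle's data."""
    return commit(relayed_message) == oracle_commitment


genuine = b"transfer 5 tokens to bob"
forged = b"transfer 5000 tokens to mallory"
honest_oracle = commit(genuine)

# Honest relayer, honest oracle: the message is accepted.
print(endpoint_accepts(genuine, honest_oracle))   # True
# Malicious relayer alone: the forged message is rejected.
print(endpoint_accepts(forged, honest_oracle))    # False
# Relayer and oracle collude: the forged message is accepted.
print(endpoint_accepts(forged, commit(forged)))   # True
```

Neither party can cheat alone, which is why the independence of oracle and relayer matters so much in this design.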
LayerZero allows developers to choose which oracles and relayers they wish to use. If a developer is malicious, they could set up LayerZero endpoints to use private oracles and relayers. However, if public oracles such as Chainlink are used, the likelihood of malicious activity can be heavily reduced.
Now that we understand all the components involved in designing interoperability, we can take a brief look at some of the most prominent bridge hacks and trace their security shortcomings to some of the aforementioned design decisions.
All bridges require some form of validation that assets have been received on the source network to make assets available on the target network. In many bridge exploits, that validation is spoofed to release assets on the source or the target network, even though the attacker does not hold the relevant assets.
Let’s look at some exploits of 2022 that cover various attack vectors.
The Binance Bridge is an interoperability protocol that allows for asset transfers between Binance Chain and Binance Smart Chain, two separate blockchains that are part of the BNB network. The bridge uses a lock-and-mint mechanism, in which wrapped tokens are minted on Binance Smart Chain in a 1:1 ratio (i.e., 1 BNB = 1 wBNB). Binance Smart Chain is a fork of the Ethereum network, and regular BNB tokens from the Binance Chain are incompatible with the Binance Smart Chain. As a result, the BEP2-based BNB tokens from Binance Chain are wrapped into BEP20 BNB tokens (ERC20 equivalent) on Binance Smart Chain. So, while the token is called “BNB” on Binance Smart Chain, it is actually a wrapped BNB token.
The attacker was able to mint and release 2,000,000 BNB on Binance Smart Chain (target network) by tricking the bridge into thinking that an equivalent amount was deposited on the Binance Chain (source network). The team at Binance quickly caught on and asked validators to suspend the network, which allowed them to roll back the network state to before the attack took place. Before the network was suspended, the attacker was able to bridge out roughly US$100m worth of BNB tokens to other networks.
The Ronin Network hack was one of the largest hacks ever seen in Web3. The Ronin Network is a bespoke network for the popular Web3 play-to-earn game Axie Infinity, and is a sidechain to the Ethereum network. The Ronin network acts as a more efficient means to play Axie Infinity, offering lower transaction costs and faster transactions than Ethereum.
The Ronin sidechain, however, only has 9 validators which verify transactions. The low number of validators is what drives the efficiency of the network, but is also a security concern, because an attacker needs control of only 5 of the 9 validators to execute a 51% attack, and this is exactly what happened. An attacker managed to get ahold of the keys that control 5 of the validators, and then signed their own transactions bridging out assets worth over US$600m, primarily in Ronin Wrapped Ether, from Ronin (source network) back to Ethereum (target network).
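The 5-of-9 threshold can be expressed in a few lines of Python. This is a generic sketch of a threshold check, not Ronin’s actual bridge code; the validator names and the `withdrawal_approved` helper are hypothetical, and holding a key is modeled simply as appearing in the signer set.

```python
VALIDATORS = {f"validator-{i}" for i in range(1, 10)}  # 9 validators
THRESHOLD = 5  # signatures required to approve a withdrawal


def withdrawal_approved(signers) -> bool:
    """Approve if enough distinct, recognized validators have signed."""
    valid = set(signers) & VALIDATORS
    return len(valid) >= THRESHOLD


four_keys = {f"validator-{i}" for i in range(1, 5)}
five_keys = {f"validator-{i}" for i in range(1, 6)}

print(withdrawal_approved(four_keys))  # False
print(withdrawal_approved(five_keys))  # True: attacker-signed withdrawals pass
```

The check cannot distinguish an honest validator from an attacker holding that validator’s key, so the security of the bridge reduces to how hard it is to compromise 5 keys.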
$80m FEI Protocol and Rari Capital Pool Attack
While not a bridge or interoperability hack, the attack on FEI Protocol and Rari Capital shows how liquidity pools can limit the extent of an attack by requiring the attacker to repeat an attack on multiple pools.
In this exploit, the attacker deposited $150m USDC obtained via a flash loan to take out a loan of 1977 ETH. Through a smart contract exploit, the attacker managed to withdraw the initially deposited collateral of $150m USDC, while keeping the loan of 1977 ETH. This is known as a reentrancy attack. The attacker then had to repeat this process on multiple other liquidity pools, draining them one by one.
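The general shape of a reentrancy bug can be simulated in Python. This is a toy model with abstract units, not the actual Rari Capital contract code: `VulnerablePool` is a hypothetical class whose `borrow` method hands control to the borrower (the `on_receive` callback) before it records the debt.

```python
class VulnerablePool:
    """Toy lending pool with a reentrancy bug (illustrative only)."""

    def __init__(self):
        self.collateral = {}
        self.debt = {}

    def deposit(self, user, amount):
        self.collateral[user] = self.collateral.get(user, 0) + amount

    def borrow(self, user, amount, on_receive):
        if self.collateral.get(user, 0) < amount:
            raise ValueError("insufficient collateral")
        on_receive()  # BUG: external call happens before the debt is recorded
        self.debt[user] = self.debt.get(user, 0) + amount

    def withdraw(self, user):
        if self.debt.get(user, 0) > 0:
            raise ValueError("outstanding debt")
        return self.collateral.pop(user, 0)


pool = VulnerablePool()
pool.deposit("attacker", 150)

# While receiving the loan, the attacker re-enters and pulls the collateral.
loot = []
pool.borrow("attacker", 100,
            on_receive=lambda: loot.append(pool.withdraw("attacker")))

print(loot, pool.collateral, pool.debt)
# [150] {} {'attacker': 100}
```

The attacker ends up with both the collateral and the loan. Recording the debt before making the external call would cause the nested `withdraw` to fail.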
While in this case liquidity pools did not prevent other pools from being drained, it did require the attacker to repeat the process, which can provide valuable time for teams to freeze operations before greater losses are incurred.
$0.8m Luna 2.0 Oracle Exploit
This exploit is not a bridge hack, but it illustrates how oracle data is not always right and can be exploited by an attacker.
When Luna 2.0 launched, the yield protocol Anchor accepted deposits of both Luna 1.0 and Luna 2.0. However, the oracle price feed for Luna 1.0 was incorrect: while the real-world value was US$0.001, the oracle reported a value of roughly US$5, which coincidentally was the same price as Luna 2.0. Anchor’s smart contracts cannot tell whether this number is correct or not, and instead treat whatever number the oracle provides as the source of truth.
An attacker spotted this and deposited a large amount of almost worthless Luna 1.0 tokens, which Anchor interpreted as being worth far more than they actually were. The attacker used those tokens to take out a loan in the UST stablecoin, which was used for further activity on Anchor. The attacker made off with roughly US$0.8m in profits.
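The size of the mispricing is easy to quantify. The Python sketch below uses the two prices quoted above; the deposit of 1,000,000 tokens is a hypothetical figure chosen for illustration, and `collateral_value` is not part of Anchor’s code.

```python
from fractions import Fraction

REAL_PRICE = Fraction("0.001")  # US$ per Luna 1.0 token (market value)
ORACLE_PRICE = Fraction(5)      # US$ per token as reported by the oracle


def collateral_value(tokens: int, price: Fraction) -> Fraction:
    """Value the protocol assigns to a deposit at a given price."""
    return tokens * price


deposit = 1_000_000  # hypothetical deposit size

print(collateral_value(deposit, REAL_PRICE))    # 1000
print(collateral_value(deposit, ORACLE_PRICE))  # 5000000
print(ORACLE_PRICE / REAL_PRICE)                # 5000
```

With these prices, every deposited token is overvalued by a factor of 5,000, so a loan sized against the oracle’s valuation is almost entirely unbacked.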
Summary – Interoperability
The blockchain trilemma can be used to explain why multiple blockchain networks exist by illustrating that networks need to make trade-offs between scalability, security and decentralization. Different networks make certain trade-offs between these three properties, which enable networks to be suitable for specific use-cases (e.g., gaming on highly scalable networks, or censorship resistance on highly decentralized networks). The existence of multiple blockchains is referred to as “multichain”.
Users may wish to move assets between networks to use each network for the specific use cases that it excels at. To achieve this, networks need to be interoperable. “Cross-chain” communication describes the passing of messages (including transactions or assets) between a source and a target network. The primitives that are used by various projects and protocols to enable cross-chain communication include oracles, relayers, light clients and centralized entities.
Beyond being able to communicate, assets also need to be compatible between the networks, or a compatible equivalent representation of the source asset needs to be made available on the target network. The mechanisms by which a compatible asset can be made available on a target network are the lock-and-mint mechanism, the burn-and-redeem mechanism and local liquidity pools.
By combining the interoperability primitives with a mechanism to ensure cross-chain asset equivalence, we can design an asset translation process that enables the passing of messages from one network to another, and the issuance of an asset on the target network that is equivalent to an asset on the source network. The resulting product is usually referred to as a “bridge”. Bridge designs that have been described in this piece include notary schemes, sidechains, relay networks and atomic swaps.
Moving on, we took a brief look at the Cosmos Inter-Blockchain Communication (IBC) protocol and the LayerZero protocol, which use the above primitives to create a flexible low-level communication interface between blockchains.
Finally, a few recent high profile bridge exploits were reviewed based on the various aspects of interoperability explored within this piece.
This is it for the second part of the Master Web3 Fundamentals series, thank you for reading! If you enjoyed this piece, please consider sharing it!
If you have any feedback about this piece or want to discuss its contents, reach out to @0xPhillan on Twitter.