30/08/22

Storj V3: AWS-Compatible Decentralized Cloud Storage

Storj is a decentralized file storage and content delivery network that aims to replace Amazon Web Services (AWS) S3. The Storj protocol implements a peer-to-peer storage system that encrypts, shards and distributes data to nodes around the world. It bears some similarities to Sia, but avoids the use of a blockchain. This design decision is meant to foster greater scalability: in a system where actions are meant to have only milliseconds of latency, waiting for a blockchain to reach consensus adds too much overhead for a decentralized storage provider that aims to match AWS in both scale and performance.

Storj is currently on its third major iteration, hence “V3”.

Token

The Storj network uses the Storj token on the Ethereum network as the default payment mechanism for storage and bandwidth. While the Storj token is the default, the network is designed to allow other payment mechanisms to be adopted in the future.

Storage Technology & Storage Mechanism

The Storj network differentiates actors in the network based on three peer classes:

  • Storage nodes (object storage servers): provide storage space and bandwidth, expected to remain online at all times
  • Uplink (clients): application or service that wants to store or retrieve data, not expected to remain online
  • Satellite (metadata servers): caches node address information, stores per-object metadata, maintains storage node reputation, manages billing and payments, verifies file integrity, reconstructs files, and manages authorization
Figure 1: The three peer classes. Source: Storj V3 Whitepaper

These three peer classes form a symbiotic relationship in that they all rely on each other for this system to work in a trustless manner. Unlike other systems where consensus must be achieved to mine a block to a blockchain that runs validations, the peer classes operate independently and can form clusters within which file storage and transfer operations are executed in what is essentially a decentralized reputation-based file storage system.

The main actions executed by actors on the Storj network include:

  • Node identity creation
  • Data storage
  • Data transfer (inbound & outbound)
  • Audits (validation of integrity of stored data)
  • Data repair (reconstructing files with poor integrity)
  • Authorization management
  • Reputation database management
  • Payment and billing

At a high level, storage users pay satellites to coordinate storage with storage nodes. To connect with a satellite, users run a client application, the uplink, which handles communication with the satellite. Once the satellite has selected nodes, the uplink connects directly to those storage nodes to store files. Files are split into equally sized segments using erasure encoding before the uplink transmits them.

In this protocol, satellites are the coordinators of the system and manage various administrative overheads, including maintaining an up-to-date database of where file segments have been stored, launching audits and coordinating data repair efforts. Satellites are also the billing and payment centers of Storj, as they track how much data has been retrieved from storage nodes, how much data storage nodes hold at any given moment, and which repair efforts a storage node was involved in. Satellites then pay storage nodes on a regular basis using the Storj ERC20 token that lives on the Ethereum blockchain. The Storj token is in this sense purely a utility token meant to pay for storage transactions in the network.

Storage nodes rent out their hard drive space and provide bandwidth so that uplinks can store data on them or retrieve data from them; satellites coordinate these connections. When an uplink receives data to upload, it splits the data using Reed-Solomon erasure codes, similarly to Sia. This creates 80 constant-size file segments, each encrypted using a different encryption key. The satellite tells the uplink which storage nodes to connect to and signs that message. Finally, the uplink connects to the storage nodes, presenting the signed message from the satellite (called a bandwidth allocation), and transfers the individual file segments to them.
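To make the k-of-n idea behind erasure coding concrete, here is a toy sketch in Python. It is not Storj's implementation: the prime field GF(257), the one-byte-per-share layout, and the names `encode` and `decode` are assumptions chosen for brevity. Reed-Solomon codes rest on the same polynomial-interpolation principle, but production libraries use other fields and far more efficient algorithms.

```python
# Toy k-of-n erasure code over the prime field GF(257).
# Illustrative only: field size, share layout and names are assumptions.
P = 257  # smallest prime above 255, so every byte value is a field element


def _interpolate(points, x):
    """Evaluate at x the unique polynomial through `points` (Lagrange form, mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, n):
    """Treat the k data bytes as polynomial values at x = 0..k-1 and extend to n shares."""
    points = list(enumerate(data))
    return [(x, _interpolate(points, x)) for x in range(n)]


def decode(shares, k):
    """Rebuild the k original bytes from any k surviving (x, value) shares."""
    subset = shares[:k]
    return bytes(_interpolate(subset, x) for x in range(k))
```

For example, `encode(b"storj", 8)` yields eight shares, and passing any five of them to `decode(..., 5)` returns the original bytes. The first five shares are the data itself, and the remaining three are redundancy.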

A file is not transmitted all at once. Instead, the segment and its bandwidth allocation are broken down into smaller pieces, which are transmitted and validated one by one. Storage nodes keep the bandwidth allocations in order to claim bandwidth payments from satellites. Splitting up a transmission like this ensures that a storage node cannot go offline to avoid receiving and storing a full file and still claim payment for the full bandwidth allocation.
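The effect of piecewise transmission can be sketched as follows. This is a simplified model, not the actual Storj wire protocol: `transfer_in_chunks` and the `node_accepts` callback are hypothetical names, and signature handling is omitted entirely.

```python
def transfer_in_chunks(segment, chunk_size, node_accepts):
    """Send a segment chunk by chunk and return the number of bytes acknowledged.

    `node_accepts` is a hypothetical callback standing in for the storage node:
    it returns True when the node received and validated the chunk.
    """
    acknowledged = 0
    for offset in range(0, len(segment), chunk_size):
        chunk = segment[offset:offset + chunk_size]
        if not node_accepts(chunk):
            break  # node went offline or rejected the chunk: allocate nothing further
        acknowledged += len(chunk)
    return acknowledged
```

In this model, a node that disappears halfway through only ever holds allocations for the bytes it acknowledged, so it has nothing with which to claim payment for the remainder.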

Figure 2: Diagram of a put operation. Source: Storj V3 Whitepaper

It’s important to note that initial inbound transmissions of data for storage are not paid for on the Storj network. Instead, storage nodes are paid for all other uses of their bandwidth (e.g., data retrieval, repair work) and for the storage space used.

In this system there is no single source of truth, such as a blockchain, which stores and manages all storage orders and token transactions. Instead, every satellite and storage node maintains its own database of metadata about the network participants it has interacted with. While satellites also hold metadata on file locations (i.e., server addresses), both satellites and storage nodes hold data relating to the performance of their counterparts. These databases form Storj’s audit, data repair and reputation systems.

Audits are executed regularly by satellites and ensure data availability on storage nodes. A satellite sends a challenge to a storage node requesting proof that the node has indeed stored the data it is expected to have stored. Audits first choose a “stripe” – a subset of a segment – and then run an algorithm across all erasure shares stored across storage nodes to identify faulty data. When sufficient storage nodes return correct information (which they are incentivized to do), any incorrect or missing responses can be identified. Audit results feed into Storj’s reputation system – more on that later. Storage nodes that fail these audits are eventually removed from a satellite’s storage node database and can lose funds held in escrow. These storage nodes may also receive limited or no future payments, which further incentivizes nodes to adhere to the protocol.

When a node goes offline, it takes with it the pieces of every segment stored on that node. If the number of available pieces for a segment falls below a certain safety threshold (set by uplinks as the desired durability), the satellite marks the pieces as missing and starts the data repair process. Since the satellite holds the data locations in a local database, it can look up the other storage locations of these segments. The satellite downloads the remaining pieces, reconstructs the segment, then regenerates the missing pieces and uploads them to new nodes. To assess whether the repair was successful, a validation hash is stored in the satellite’s database and compared against a piece hash retrieved from the storage node after storage is complete.
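The repair trigger described above amounts to a threshold check per segment. The sketch below is a minimal illustration: the function names, the dictionary layout, and the example threshold of 52 are assumptions, with only the total of 80 pieces taken from the text.

```python
def segments_to_repair(healthy_pieces_by_segment, repair_threshold):
    """Return IDs of segments whose healthy piece count fell below the threshold."""
    return [
        segment_id
        for segment_id, healthy in healthy_pieces_by_segment.items()
        if healthy < repair_threshold
    ]


def pieces_to_regenerate(total_pieces, healthy_pieces):
    """Number of pieces a repair must recreate and upload to new nodes."""
    return max(total_pieces - healthy_pieces, 0)
```

With a hypothetical threshold of 52, a segment with 50 of its 80 pieces still online would be queued for repair, and 30 pieces would have to be regenerated.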

Storj’s storage node reputation system consists of four parts:

  • Proof of Work (PoW) identity system: to enter the network and communicate with satellites, storage nodes must prove they are invested by solving a PoW puzzle. The difficulty of the puzzle is set arbitrarily by satellites and the system is expected to self-balance over time.
  • Vetting process: unvetted storage nodes are slowly added as additional storage targets for storage requests. They are selected as additional nodes on top of a satellite’s existing preferred storage nodes so as not to affect network integrity, while still allowing the satellite to collect data about the node.
  • Filtering system: nodes that fail audits, fail to return data, are too slow or do not have enough uptime are disqualified by the satellite for future storage operations. Once disqualified, a storage node must restart the vetting process to re-enter the network.
  • Preference system: based on latency, history of reliability and uptime, geographic location and other collected data, storage nodes with better scores are given a greater selection likelihood for new data uploads.
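The proof-of-work identity step from the first bullet can be sketched as a generic hash puzzle. This is an illustration only: the use of SHA-256, the nonce layout, the leading-zero-bits rule, and the function names are assumptions, not Storj's actual identity scheme.

```python
import hashlib
import itertools


def leading_zero_bits(digest):
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def _digest(seed, nonce):
    return hashlib.sha256(seed + nonce.to_bytes(8, "big")).digest()


def mine_identity(seed, difficulty):
    """Search for a nonce whose hash has at least `difficulty` leading zero bits."""
    for nonce in itertools.count():
        if leading_zero_bits(_digest(seed, nonce)) >= difficulty:
            return nonce


def verify_identity(seed, nonce, difficulty):
    """Verification costs one hash, however long the search took."""
    return leading_zero_bits(_digest(seed, nonce)) >= difficulty
```

The asymmetry is the point of the design: each extra bit of difficulty doubles the expected search effort for the node, while the satellite still verifies with a single hash.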

The preference system only determines where new data is stored, and does not affect already stored data or repair data.
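One plausible way to combine the filtering and preference steps is weighted random sampling without replacement, sketched below. The node dictionary layout, the single positive `score` value, and the function name are assumptions; the source does not specify how satellites actually weight nodes.

```python
import random


def select_nodes(nodes, count, rng=None):
    """Pick up to `count` distinct nodes, favouring higher scores.

    Each node is a dict with hypothetical keys "id", "score" (positive number)
    and "disqualified" (bool). Disqualified nodes are filtered out first.
    """
    rng = rng or random.Random()
    pool = [node for node in nodes if not node["disqualified"]]
    chosen = []
    while pool and len(chosen) < count:
        weights = [node["score"] for node in pool]
        pick = rng.choices(pool, weights=weights, k=1)[0]
        chosen.append(pick)
        pool.remove(pick)  # sample without replacement
    return chosen
```

Because selection stays random rather than strictly ranked, lower-scored nodes still receive some uploads, which keeps data spread across many operators.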

Since Storj is a trustless system, storage nodes have a reputation system of their own to determine trustworthy satellites. Storage nodes collect data on payment, demand generation and performance history. If satellites score poorly, storage nodes will avoid accepting their data. Furthermore, when new satellites join the system, storage nodes will start a vetting process of their own, restricting interactions with new satellites and collecting data to gauge their trustworthiness.

The role of satellites is extremely important in this system, and satellites must run constantly: should they go offline, repair processes would stop and eventually all stored copies of the file segments would disappear. Note that a satellite instance does not necessarily constitute one physical machine. Instead, a satellite can run as several servers and can be backed by a horizontally scalable trusted database to ensure greater uptime. Furthermore, uplinks can connect to multiple satellites to increase data availability and permanence. Nonetheless, if a satellite goes offline, all data coordinated by that satellite becomes inaccessible: the data remains on the storage nodes, but cannot be retrieved because the satellite’s retrieval mechanism is unavailable.

As can be seen from the above, Storj implements various techniques to ensure data availability, tamper-protection and privacy. Storj also allows the editing of data – part of its mission to become an AWS S3 competitor – through the use of authorizations. Users communicate with a satellite to request adding, removing and editing of files. If they hold the right authorizations, they are authenticated and can make changes to their uploaded data according to their authorization configurations.

Pricing Mechanics & Data Permanence

In Storj, uplinks, satellites and storage nodes are three distinct actors with distinct functions, all facing different target groups. Uplinks are end-user facing and give people with data storage requirements a user-friendly way to store data without interacting with the backend architecture, namely the satellites and storage nodes. Users – through uplinks – end up paying satellites to coordinate communication with storage nodes, which can be operated by anybody who has bandwidth and storage capacity to spare.

In this structure, satellites act as a sort of escrow: they collect and hold user payments, while storage nodes deliver on the storage and retrieval requirements. Satellites pay storage nodes in STORJ tokens in regular cycles, based on files stored and bandwidth used. Uplinks handle user payments to satellites in multiple currency formats (STORJ, fiat money or other means).

Public storage node payments appear to be set centrally by Storj, and are as follows:

  • Storage – STORJ tokens at $1.50 per TB per month (including the increased data size resulting from erasure encoding). If 2TB of storage is used in a month, storage revenue would be $3 worth of STORJ tokens.
  • Egress bandwidth – $20 per TB for egress bandwidth related to file retrieval
  • Audit & repair bandwidth – $10 per TB for egress bandwidth related to file audit and repair bandwidth
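The payout rates above translate into a simple monthly calculation. The sketch below uses the three listed rates; the function name and the choice to express everything in USD (rather than converting to STORJ at a market rate) are assumptions.

```python
# Rates quoted in the list above, in USD.
STORAGE_USD_PER_TB_MONTH = 1.50
EGRESS_USD_PER_TB = 20.00
REPAIR_AUDIT_USD_PER_TB = 10.00


def node_monthly_revenue_usd(stored_tb, egress_tb, repair_audit_tb):
    """Gross monthly revenue of a storage node, before any withholding."""
    return (
        stored_tb * STORAGE_USD_PER_TB_MONTH
        + egress_tb * EGRESS_USD_PER_TB
        + repair_audit_tb * REPAIR_AUDIT_USD_PER_TB
    )
```

For instance, `node_monthly_revenue_usd(2, 0, 0)` reproduces the $3 example from the storage bullet, and half a terabyte of retrieval egress on top adds another $10.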

Through the Storj website, users can get a free plan to start testing Storj decentralized cloud services with a 150 GB storage limit and 150 GB of bandwidth per month, or get a pro account, for which storage costs $4 per TB per month and bandwidth costs $7 per TB (an additional per-segment fee of $0.0000088 applies as well).
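A pro-account bill can be estimated from the three prices just quoted. The function name is hypothetical, and the sketch assumes the per-segment fee is simply multiplied by the number of stored segments in the billing month.

```python
# Pro-account prices quoted above, in USD.
STORAGE_USD_PER_TB_MONTH = 4.00
BANDWIDTH_USD_PER_TB = 7.00
SEGMENT_FEE_USD = 0.0000088


def customer_monthly_bill_usd(stored_tb, bandwidth_tb, segments):
    """Estimated monthly bill for a pro account (assumes a flat fee per stored segment)."""
    return (
        stored_tb * STORAGE_USD_PER_TB_MONTH
        + bandwidth_tb * BANDWIDTH_USD_PER_TB
        + segments * SEGMENT_FEE_USD
    )
```

Storing 1 TB and downloading 1 TB comes to $11 before segment fees. One million segments would add $8.80, which shows why many small objects cost more than a few large ones.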

Since users pay only for what they use, there is a clear difference between what users pay and what storage nodes earn. This has two reasons: first, Storj incentivizes higher bandwidth with higher revenue, and second, Storj withholds a certain amount of revenue, which storage node operators lose if they leave the network. Since this is a public network and users regularly join and leave the network, the revenue from collected withholdings is used to further fund storage node bandwidth. The withholding schedule is as follows:

  • Months 1-3: 75% of revenue is withheld, 25% is paid to the Node Operator
  • Months 4-6: 50% of revenue is withheld, 50% is paid to the Node Operator
  • Months 7-9: 25% of revenue is withheld, 75% is paid to the Node Operator
  • Months 10-15: 100% of Storage Node revenue is paid to the Node Operator
  • After Month 15: 50% of total withholdings are returned, with the remaining 50% held until the Node gracefully exits the network

It is important to note here that as time passes, previous withholdings are not returned to the storage node; instead, the withholding rate applied to new revenue is reduced. If a storage node stays operational for 15 months, half of the historical withholdings are returned, and the other half becomes claimable upon a graceful exit. A graceful exit refers to a storage node permanently leaving the network: the node triggers a special command with the satellite, which coordinates moving all data stored on the node to other storage nodes.
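The withholding schedule can be checked with a short simulation. The sketch assumes constant monthly revenue and that the 50% release happens once month 15 has completed; the function names are hypothetical.

```python
def withheld_fraction(month):
    """Share of a month's revenue withheld, by node age in months (1-based)."""
    if month <= 3:
        return 0.75
    if month <= 6:
        return 0.50
    if month <= 9:
        return 0.25
    return 0.0


def simulate_payouts(monthly_revenue_usd, months):
    """Return (total paid out, total still held) after `months` months of operation."""
    paid, held = 0.0, 0.0
    for month in range(1, months + 1):
        withheld = monthly_revenue_usd * withheld_fraction(month)
        held += withheld
        paid += monthly_revenue_usd - withheld
        if month == 15:
            release = held * 0.5  # half of historical withholdings is returned
            held -= release
            paid += release
    return paid, held
```

For a node earning $10 per month, $45 is withheld over the first nine months. After month 15 the operator has received $127.50 in total, and $22.50 remains held until a graceful exit.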

Storj assumes that all data is meant to be stored permanently, unless the file is given a specific time-to-live (TTL) value during upload, which is essentially an expiration date. If no TTL value is set, the files will stay online for as long as the user pays the satellite through the uplink and the uplink remains online.

Tokenomics

The STORJ token is the utility token used to pay for storage and bandwidth on the Storj decentralized storage network. Storj previously operated using a Bitcoin-based token with the ticker SJCX; throughout 2017, however, Storj began allowing users to convert their Bitcoin-based tokens to their Ethereum ERC20 equivalent, the STORJ token. The total supply of STORJ ERC20 tokens is fixed and pre-minted at 425 million STORJ.

Of the 425 million STORJ in total supply, as of March 2022, 203.5 million are in public circulation and 221.5 million are in the custody of Storj Labs (see items 18 and 19 in the figure below).

Figure 3: STORJ Token Balances & Flows Report: Q1 ‘22. Source: https://www.storj.io/blog/storj-token-balances-and-flows-report-q1-2022

The tokens in Storj Labs custody are split into eight tranches of 30.625 million STORJ each, with one tranche unlocking per quarter over eight consecutive quarters. At the time of writing (May 25th, 2022), one tranche was worth roughly $16.8 million USD. Each tranche sits in a time-locked smart contract, so even if Storj Labs wanted to redeem these tokens, it would need eight consecutive quarters to withdraw the full amount.
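The unlocking schedule is plain arithmetic, sketched here under the assumption that exactly one tranche unlocks per quarter and none are relocked. The function name is hypothetical.

```python
TRANCHE_MILLION_STORJ = 30.625
TRANCHE_COUNT = 8


def cumulative_unlocked_million(quarters_elapsed):
    """Million STORJ unlocked after a number of quarters, capped at all eight tranches."""
    quarters = max(0, min(quarters_elapsed, TRANCHE_COUNT))
    return quarters * TRANCHE_MILLION_STORJ
```

Eight tranches of 30.625 million add up to 245 million STORJ, and that full amount only becomes available after the eighth quarter.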

In December 2018, when these tranches were introduced, Storj Labs committed to relocking the first tranche and appending it after the last tranche, essentially delaying any unlocking without affecting the one-tranche-per-quarter arrangement. Whenever a tranche is not used, it is relocked, as Storj Labs attempts to finance its operations by operating the public Storj network and keeps these tranches as financial reserves.

Storj committed to giving 60 days’ notice of any changes to the tranche unlocking schedule. Of the eight tranches, two have so far not been relocked. The first was left unlocked at the end of Q3 2020, and the second at the end of Q1 2022. The stated purpose for unlocking these tokens was to support the network’s continuing growth. The Q1 2022 unlock increased tokens in circulation to 234.1 million STORJ and reduced tokens in Storj Labs custody to 190.8 million STORJ.
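These figures can be cross-checked against the March 2022 balances, assuming the Q1 2022 unlock moved exactly one 30.625 million tranche from custody into circulation. The function name is hypothetical.

```python
TOTAL_SUPPLY_MILLION = 425.0
TRANCHE_MILLION_STORJ = 30.625


def balances_after_unlock(circulating_million, custody_million, tranches=1):
    """Move unlocked tranches from custody to circulation; total supply is unchanged."""
    moved = tranches * TRANCHE_MILLION_STORJ
    return circulating_million + moved, custody_million - moved
```

Starting from 203.5 million in circulation and 221.5 million in custody, one tranche gives 234.125 and 190.875 million respectively. That agrees with the quoted 234.1 and 190.8 million up to rounding, and the two balances still sum to 425 million.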

Figure 5: Storj Labs custody Ethereum addresses as of April 28th. Source: https://www.storj.io/blog/storj-token-balances-and-flows-report-q1-2022

If we ignore changes in circulating tokens and only consider the fixed supply, it can be assumed that the value of the STORJ token will increase in the long term, provided network usage grows, since both supply-side and demand-side actors in the Storj network need the STORJ token for storage activities.

References

All references were accessed between May 5th and 31st, 2022.

Gleeson, J. (2019) Cloud Storage Prices Haven’t Changed Much in 4 Years but They’re About To. Available at: https://www.storj.io/blog/cloud-storage-prices-havent-changed-much-in-4-years-but-theyre-about-to

Gleeson, J. and Ihnatiuk, V. (2021). Sharing Space for Fun and Profit—Part 2. Available at: https://www.storj.io/blog/sharing-space-for-fun-and-profit-part-2

Johnson, K. (2022) STORJ Token Balances and Flows Report: Q1 2022. Available at: https://www.storj.io/blog/storj-token-balances-and-flows-report-q1-2022

Storj DCS Docs (n.d.) Getting Started on DCS. Available at: https://storj-labs.gitbook.io/dcs/

Storj Docs (n.d.) Billing, Payment and Accounts. Available at: https://docs.storj.io/dcs/billing-payment-and-accounts-1/pricing/

Storj Forum (2020) Single point of Failure if a Satellite is down? Available at: https://forum.storj.io/t/single-point-of-failure-if-a-satellite-is-down/4577

Storj Labs (n.d.) Transparent Pricing. Available at: https://www.storj.io/pricing

Storj Labs, Inc (2018) Storj: A Decentralized Cloud Storage Network Framework. Available at: https://www.storj.io/storjv3.pdf

Storj Labs, Inc and Subsidiaries (2018) V3 White Paper Executive Summary. Available at: https://www.storj.io/Storj-White-Paper-Executive-Summary.pdf

Storj Blog (2019) What Storage Node Operators Need to Know About Satellites. Available at: https://www.storj.io/blog/what-storage-node-operators-need-to-know-about-satellites
