Overview of the present and future of decentralized file systems

1. What is a Decentralized File System?

A decentralized file storage system stores data across a peer-to-peer network rather than on servers run by a single provider. It can host a DApp’s frontend along with images, audio, and video files, which reduces the risk of a single point of failure and lets the DApp run without its own servers. It also removes the need for a trusted third party, and it can offer better security and lower cost than centralized storage.

Decentralized file storage is a replacement for centralized storage and an important part of the Web3 technology stack.

A decentralized file system is suitable for storing:

  • Hot data (high-frequency access): DApp frontends, NFT metadata and images, and DApp file data such as content on Lens, blogs, images, audio, and video.
  • Cold data (low-frequency access): archived historical data and backups.

2. The Current Status of Decentralized File Systems

According to Messari’s statistics, the combined market capitalization of the top four decentralized file storage protocols is nearly $1.6 billion, an 83% decrease from $9.4 billion. Total storage capacity is 17 million TB, up 2% year over year (YoY), while used storage capacity is 532,500 TB, up 1,280% YoY.
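
As a quick sanity check on these figures, the short Python sketch below recomputes the market-cap decline and derives two numbers the text does not state directly: the share of capacity actually in use, and the used capacity implied for a year earlier. The variable and function names are illustrative, and the year-earlier figure assumes the 1,280% increase is measured against the prior year's used capacity.

```python
# Figures quoted from Messari in the text; names are illustrative only.
market_cap_now_usd = 1.6e9
market_cap_before_usd = 9.4e9
total_capacity_tb = 17_000_000
used_capacity_tb = 532_500
used_growth_yoy = 12.80  # a 1,280% increase, written as a fraction


def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new (negative means a decrease)."""
    return (new - old) / old * 100


# Market-cap change: should come out close to the quoted 83% decrease.
decline = pct_change(market_cap_before_usd, market_cap_now_usd)

# Share of total capacity that is actually storing data.
utilization = used_capacity_tb / total_capacity_tb * 100

# Used capacity a year earlier, assuming growth is relative to that figure.
used_last_year_tb = used_capacity_tb / (1 + used_growth_yoy)

print(f"market cap change: {decline:.1f}%")
print(f"utilization: {utilization:.2f}% of total capacity")
print(f"implied used capacity a year earlier: {used_last_year_tb:,.0f} TB")
```

The derived utilization is only a few percent of total capacity, which fits the picture of supply growing slowly while usage grows quickly from a small base.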

Let’s take a look at the current status of some popular decentralized storage projects. All of these protocols are cheaper than AWS, some of them by several orders of magnitude: AWS charges about $23 per TB per month, while these protocols cost between $0.0002 and $20 per TB per month.

  • IPFS: IPFS is the most widely used protocol for storing NFT images and metadata. It is well suited for storing “hot” data with high access frequency. However, IPFS has no built-in way to incentivize storage, to prove that data is stored correctly, or to reach agreement among participants the way a blockchain does. This means that data stored only on IPFS is at risk of being lost. For example, Infura’s IPFS service will delete data that has not been accessed in six months. Therefore, it is best to run your own IPFS node if you want to keep your data available for a long time.
  • Filecoin: Filecoin provides low-cost storage and is mainly used for storing “cold” data such as archive data. Filecoin does not have a built-in data retrieval fee mechanism. Currently, some miners accept low-quality data to get rewards and refuse to help with data retrieval. The Filecoin community is actively addressing this issue and implementing measures to improve the overall quality of stored data.
  • Arweave: Arweave’s permanent storage concept is suitable for storing DApp data. The ecosystem is developing well, with decentralized database systems that store their database files on Arweave and layer-two scalability solutions built on Arweave. Arweave’s payment model charges for storage but has no bandwidth fee. As a result, nodes have little incentive to serve data, and some provide only storage services and no retrieval services.
  • Swarm: Swarm charges bandwidth fees for storage and retrieval. The system is highly decentralized and has high bandwidth requirements for nodes.
  • StorJ: StorJ is partially decentralized and has good retrieval speed, which sets it apart from other protocols. It has been proven to be very effective for sharing large video files.
  • Sia: Skynet Labs closed due to a lack of new funding, which also led to a decrease in Sia’s usage.
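
To put the per-TB prices quoted before this list in perspective, here is a minimal Python sketch that compares a year of storage at the AWS rate with the low and high ends of the decentralized range. The function names and the 10 TB workload are assumptions chosen for illustration, and the calculation covers storage fees only; it ignores any retrieval or bandwidth charges a protocol may add.

```python
# Prices quoted in the text, in USD per TB per month.
AWS_USD_PER_TB_MONTH = 23.0
DECENTRALIZED_LOW = 0.0002
DECENTRALIZED_HIGH = 20.0


def yearly_cost(tb: float, usd_per_tb_month: float, months: int = 12) -> float:
    """Storage-only cost for keeping `tb` terabytes for `months` months."""
    return tb * usd_per_tb_month * months


def savings_vs_aws(usd_per_tb_month: float) -> float:
    """Percentage saved relative to the quoted AWS price."""
    return (1 - usd_per_tb_month / AWS_USD_PER_TB_MONTH) * 100


workload_tb = 10  # hypothetical workload size
for label, price in [
    ("AWS", AWS_USD_PER_TB_MONTH),
    ("decentralized (high end)", DECENTRALIZED_HIGH),
    ("decentralized (low end)", DECENTRALIZED_LOW),
]:
    cost = yearly_cost(workload_tb, price)
    print(f"{label}: ${cost:,.3f} per year, {savings_vs_aws(price):.2f}% below AWS")
```

At the high end of the range the saving is modest, so the choice of protocol matters far more than the choice between decentralized storage and AWS as such.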

We evaluate the usability of decentralized file storage protocols mainly based on the following three factors:

  1. Data retrieval speed. This is critical because it determines how efficiently the storage system responds to DApp requests, which directly affects the DApp’s user experience. Factors that may affect data retrieval speed include: whether data queries are charged, the degree of decentralization of nodes, the quality of nodes, the data forwarding logic, and facilities such as CDNs that accelerate queries.
  2. Incentive model and token economics. The incentive model and token economics affect the participation of storage nodes, which in turn affects their behavior. Currently, mainstream pricing models consist of storage and bandwidth costs, which means that users need to pay storage fees when storing data and bandwidth fees when accessing it. If data queries are free, nodes usually lack the motivation to provide such services. In addition, the incentive model and token economics affect the income of miners, which in turn affects the number of service nodes and storage capacity.
  3. Data availability assurance algorithm. This is the algorithm a decentralized network uses to verify that nodes keep providing data services, so that data remains continuously available. Currently, the most widely used method is the Proof of Random Access algorithm.
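
To make the third factor more concrete, the sketch below shows the general idea behind random-access style checks: a verifier that holds only a small commitment to the data challenges a storage node for a randomly chosen chunk and checks the answer. This is a toy illustration under stated assumptions, not the Proof of Random Access algorithm itself or any protocol's implementation. The Merkle-tree commitment, the tiny chunk size, and all function names are choices made for this example.

```python
import hashlib
import secrets

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to trace


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    return [data[i:i + size] for i in range(0, len(data), size)]


def build_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every Merkle tree level, leaf hashes first, root level last."""
    level = [h(c) for c in chunks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # odd level: pad by repeating the last hash
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(chunks: list[bytes], index: int) -> tuple[bytes, list[bytes]]:
    """Storage node: return the challenged chunk plus its sibling hashes."""
    path = []
    i = index
    for level in build_levels(chunks)[:-1]:
        path.append(level[i ^ 1])  # sibling of node i on this level
        i //= 2
    return chunks[index], path


def verify(root: bytes, index: int, chunk: bytes, path: list[bytes]) -> bool:
    """Verifier: recompute the root from the chunk and the sibling hashes."""
    node = h(chunk)
    i = index
    for sibling in path:
        node = h(node + sibling) if i % 2 == 0 else h(sibling + node)
        i //= 2
    return node == root


# The verifier keeps only `root`; the node must keep every chunk to answer.
data = b"decentralized storage demo!"
chunks = split_chunks(data)
root = build_levels(chunks)[-1][0]

challenge = secrets.randbelow(len(chunks))  # unpredictable chunk index
chunk, path = prove(chunks, challenge)
ok = verify(root, challenge, chunk, path)
print("challenge", challenge, "passed" if ok else "failed")
```

Because the node cannot predict which chunk will be requested, repeatedly passing such challenges is evidence that it still holds the whole file. Real protocols add pieces this sketch omits, such as tying challenges to on-chain randomness and penalizing failed responses.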


In summary:

  1. Products and services using decentralized storage protocols are still in the early stages.
  2. The main focus of improving storage protocols will be on shortening retrieval time.
  3. Data retrieval speed, the incentive model and token economics, and the data availability assurance algorithm are the key factors that determine whether a protocol becomes widely used.