Revisiting the Question: Feasibility of Large-Scale Storj Nodes

Hello Storj Community,

I’m revisiting a topic that has been discussed several times in the past, but without clear conclusions. With access to multiple locations offering 10Gbps unlimited bandwidth at no cost, and considering setting up professional-grade nodes totaling 144TB, I seek fresh insights into this scenario, especially as I am based in France.

Key Questions:

  1. Profitability and Feasibility: Given the lack of recent discussions, is it profitable and feasible to run large nodes like these on Storj in today’s context?
  2. Network Adaptation: Is it necessary to divide a large node into smaller virtual nodes to comply with Storj network’s working principles?
  3. Space Filling Rate: How long might it take to fill 144TB of space on the Storj network, based on current trends?
  4. Community Experience: Are there users with large nodes? I’d appreciate hearing about your experiences and suggestions.
  5. Additional Insights: Any other relevant questions or advice from the community would be greatly valued.

Currently, I’m experimenting with just one node, but am considering a major expansion. Your insights and experiences will be highly valuable in guiding my decision.

Thanks in advance for your answers!


Hello @S0ly,
Welcome to the forum!

We need nodes, but geographically distant and physically separated ones, not a huge setup in one physical location.
Hence the limits: all of your nodes behind the same /24 subnet of public IPs are treated as one big node for customer uploads, but as separate nodes for customer downloads, repair and audit traffic, and online checks - we want to be as decentralized as possible. So it doesn't matter whether you run one big node or dozens of smaller ones: together they will receive the same amount of traffic as a single node. Multiple nodes make sense only if you run one node per disk - then you don't need RAID, and if one disk fails you lose only that small part of the common data, not everything as with a RAID failure, especially RAID0 (JBOD, a simple LVM volume, a simple ZFS pool, MergerFS, etc.).
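
As a quick illustration of that /24 grouping: two addresses land in the same group when they share the same /24 network. A minimal check with Python's standard ipaddress module (a sketch only; the actual node selection happens on the satellites):

```python
import ipaddress

def same_24_subnet(ip_a: str, ip_b: str) -> bool:
    """True if two IPv4 addresses fall into the same /24 network."""
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    net_b = ipaddress.ip_network(f"{ip_b}/24", strict=False)
    return net_a == net_b

# Nodes on these two addresses would share customer uploads like a single node:
print(same_24_subnet("203.0.113.10", "203.0.113.200"))  # True
# These two would receive uploads independently:
print(same_24_subnet("203.0.113.10", "198.51.100.10"))  # False
```
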
The second problem is the equilibrium point - the balance between uploaded and deleted data (it is all usage from the customers, so it's not predictable). The last known equilibrium point is

but it's likely outdated.

You may use this Community Estimator to get an idea of how long it may take to fill all the free space and how much you may earn:

The other alternative, if you have SOC2 and/or EU ISO 27001 compliance, would be to try to register as a commercial SNO, if there is demand: Put your commercial storage capacity to work


Hiya @S0ly - Alexey has answered many of your questions; I'll give them a shot as well.

  1. Profitability and Feasibility: Given the lack of recent discussions, is it profitable and feasible to run large nodes like these on Storj in today’s context?
  • It all depends on your hosting costs. Electricity is expensive in France. I personally calculate that my nodes generate around $2.5/TB. Large nodes are significantly more profitable in my setup than smaller nodes, because a 1TB node on a 20TB HDD uses just as much power as a 20TB node on a 20TB HDD.

  2. Network Adaptation: Is it necessary to divide a large node into smaller virtual nodes to comply with Storj network’s working principles?
  • Necessary, no. Good idea, yes.

  3. Space Filling Rate: How long might it take to fill 144TB of space on the Storj network, based on current trends?
  • It really, really depends. Take a look at the trend of all other nodes, but with all the recent changes it’s hard to predict. We’ve just had a large purge of test data, which slowed ingress for a few months. Then we had some very big days (I saw +90GB of ingress on multiple days), and now it seems more normal again.

  4. Community Experience: Are there users with large nodes? I’d appreciate hearing about your experiences and suggestions.
  • What is a large node to you?
    • Is it a full 10TB node on a 10TB disk?
    • Is it a 10TB node on a 60TB disk?
    • Is it 8x 10TB nodes on 8x 10TB disks?

I total about 50TB now on around 20 different nodes. I don’t consider myself a large operator by any means. Perhaps @Vadim, @arrogantrabbit or @BrightSilence can chime in here? If not they, then certainly @Th3Van can. Then again, even he might not be a large node operator, depending on your requirements for that predicate.


  5. Additional Insights: Any other relevant questions or advice from the community would be greatly valued.

Start small; Start simple; Start Now.
Do things the right way from the start, and if you’re satisfied with your current and projected results, start additional nodes.


It will be much easier to elaborate if you describe your setup, because if you try to fill 144 TB as one node it will take forever, but if it is 10 nodes with 10 IPs from different /24 subnets, it will take 3-5 years. Also, you have to comply with the ToS: you can't run several nodes on one HDD. For me the best way is one node per HDD. Today I have 95 nodes with 490TB of space and 320TB filled with Storj data.
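
As a rough back-of-the-envelope sketch of where estimates like 3-5 years come from (the assumed net ingress per /24 subnet is illustrative only; the real rate varies a lot from month to month and slows as nodes grow):

```python
def years_to_fill(capacity_tb: float, subnets: int, net_tb_per_subnet_month: float) -> float:
    """Naive estimate: capacity divided by total net monthly ingress."""
    return capacity_tb / (subnets * net_tb_per_subnet_month) / 12

# Hypothetical numbers: 144 TB spread over 10 different /24 subnets,
# assuming ~0.3 TB of net growth per subnet per month.
print(f"{years_to_fill(144, 10, 0.3):.1f} years")  # -> 4.0 years
```
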

Thank you all for your insights. @Alexey @Ottetal

I’ve gone through other discussions and am aware of considerations like avoiding unnecessary hardware purchases and adhering to the /24 IP rules, but I have some difficulty understanding the 88TB limit; I have used and seen this estimator. I’m intrigued by the users who manage extensive setups (some have around 2PB), even though I’ve seen that the payout rates for hosts have changed lately, and I am curious about their strategies.

As for my setup, I have access to three locations, potentially allowing for two nodes at each, with separate public IPs. While I understand there are many nodes in Europe, with a relatively lower density in France, I’m optimistic this won’t be a significant hindrance.

Regarding profitability, my calculations indicate that to break even, I need at least 30% usage of each node. With a rate of 2.5 euros per TB, this translates to needing around 40 TB of utilized storage per node to start generating revenue.
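
Spelled out as a rough sketch (the 2.5 EUR/TB rate and the 30% break-even threshold are estimates, not exact figures):

```python
# Rough break-even arithmetic for one 144 TB location (estimates only)
capacity_tb = 144
payout_eur_per_tb = 2.5        # assumed net payout per stored TB per month
break_even_utilization = 0.30  # estimated share of capacity needed to cover costs

break_even_tb = capacity_tb * break_even_utilization
break_even_eur = break_even_tb * payout_eur_per_tb
print(f"{break_even_tb:.0f} TB filled ≈ {break_even_eur:.0f} EUR/month")  # 43 TB ≈ 108 EUR/month
```
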

And no, I don't have any compliance certifications for the moment :smiley:

So for the moment, based on your advice, I should start small and make smaller virtual nodes for downtime or something like that? If I have a 144TB server, I can run 6 x 24TB nodes, matching the recommended maximum.
But of course I will not start directly with 144TB. Just for information, the physical server can handle 8 disks and tolerate losing 2 disks at the same time, so I think I will be fine with data integrity.


Thanks for your reply. To give you more context:
For the moment I have one test server. It's a storage server with professional-grade components, built to consume little electricity, handle a 10Gbps connection, and manage lots of disks.

And I have 3 locations available for the moment, maybe more in the future. Each of these physical servers can handle 144TB with 2-disk redundancy, because it has 10 disks: 2 for redundancy, 8 for storage.

This server can run VMs or Docker, so if Storj really needs nodes to be small, I can do that :slight_smile:

I think everyone just wants to stress that ingress has been slow, historically. And that it will take a very long time to fill your nodes. They will likely lose you money for quite a while until they have gained enough data to turn that around and you start earning more than your power costs.

If you understand this going in, you shouldn't have any issues. We just see a fair amount of people starting nodes thinking they will fill up quickly and earn income on their total capacity right away, and when they see that is not the case, they complain and quit.

If your expectations are realistic and you understand the length of time it takes to fill a drive, you should be good to go.


To be honest, it is an overkill setup. If it is only for Storj, it will take a long time to become profitable. But if the servers run anyway, then it is OK. You can also create the nodes and fill all the free space with Chia, and as the nodes grow, just delete Chia plots; with 140 TB, Chia will give you an additional $20-30 a month, which will help cover running costs.


Thanks for everyone's answers. Yes, I understand that filling that much space can take a long time, but I just want to know: even if it will be difficult and slow, do I have a chance of one day seeing profits out of nodes that big?

And yeah, I plan to use other storage providers if Storj is not filling up… I won't name names, but there are a lot of similar projects; Storj is the most promising for the moment.


If you’re willing to cover any costs of spreading your space over say ten /24’s, then you should reach a point of covering your ongoing monthly costs after a year.

So perhaps in year two you start to see some returns?


I have been running a node since before the last network purge, meaning my node is as old as possible. During that time, with the exception of one week, my node has always had free space available for new data, and the success percentage for egress is usually above 99%.

As such, I think my node has pretty much the most data that is possible to have right now. And that is 22.41TB, according to the dashboard. It used to have more but then two test satellites have been shut down and their data deleted. Right now the used space is slowly climbing up again.


(note that these are binary terabytes, i.e. TiB, not decimal TB)

Customers upload new files. Customers also delete files. New files are distributed among all nodes, but deleted files are only deleted from the nodes that have them (obviously).
The more data a node has, the larger share of deletes it will get. If customers are deleting 1000 files per second and your node has 1% of the total files on the network, you will get 10 deletes per second, while a node with 0.1% of the total files will get 1 delete per second. But both nodes will get the same number of uploads.
From this, you can see that as the node grows, the growth slows down (compared to other, smaller nodes), because more files are deleted from it. This means that at a certain node size the amount of files deleted would equal the amount of new files uploaded, so the node would stop growing. Apparently this is about 88TB, which I do not think any node has reached.
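
A toy simulation of that effect (the ingress and delete rates below are made up purely so the curve levels off near the 88TB figure; real network numbers differ):

```python
# Toy model: fixed monthly ingress per /24, deletes proportional to stored data.
ingress_tb = 2.2     # assumed net new data per month
delete_rate = 0.025  # assumed fraction of stored data deleted each month

stored = 0.0
for month in range(1, 241):
    stored = stored * (1 - delete_rate) + ingress_tb
    if month % 60 == 0:
        print(f"month {month}: {stored:.1f} TB")

# Growth levels off where deletes equal ingress:
# stored * delete_rate == ingress_tb  ->  2.2 / 0.025 = 88 TB
```
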


Thanks for sharing this information. When you say your node has the most data, is it 1 physical server, or do you have multiple virtual nodes on 1 server while respecting the 24TB limit?

Because, for example, I've seen insane setups like the one from Post pictures of your storagenode rig(s) - #550 by Th3Van

You can even see his live stats; he has around 100 virtual nodes with 10TB disks.
I don't know if that's allowed.

The setup has more than 1 PB and apparently isn't hitting the 88TB limit. That limit seems related to the software and not the hardware; is that why he has more than 100 virtual nodes?

It is one node, running inside a VM. I have one public IP (I have a second connection but it’s slow), so I just run one node. Running more nodes in separate VMs would not be better for me.

My server has the same chassis as the ones in the picture in your link. The storage right now is raidz2 from 6x4TB drives and raidz2 from 6x6TB drives.

He probably uses VPNs to get multiple public IPs from different /24 subnets. That’s the reason why he has more data.

If I had two nodes with the same IP, created at the same time, they would each have ~11.2TB, with the total still being 22.4TB. If I had two public IPs from different /24 subnets, I could have two nodes with 22.4TB each, for a total of 44.8TB.

~22TB is probably the current maximum you could have with a single public IP or a single /24 subnet.
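
The same arithmetic as a quick sketch (assuming ingress is shared evenly between nodes behind one /24, which is roughly how it averages out over time):

```python
per_subnet_tb = 22.4  # the per-/24 ceiling observed above

# Two nodes behind the same /24 split that share:
print(per_subnet_tb / 2)  # -> 11.2 TB each, 22.4 TB total
# One node on each of two different /24 subnets:
print(per_subnet_tb * 2)  # -> 22.4 TB each, 44.8 TB total
```
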

Yes, but it's not authorized to use a VPN to bypass the rule, right?

Aaaand so why can I only have a maximum of about 22TB with one IP? Because of a software limitation or something like that?

Sorry, I don't know if I need to reply for you to see that I replied.

Pentium100 gave a great answer above. It's not that software is limiting anything to 22TB… it's more that Storj is only popular enough right now for customers to upload so much data per month (compared to how much is deleted). So Pentium100 is seeing that at around 22TB… new uploads per month are about the same as deletes per month… so their node isn't getting larger.

If tomorrow customers started uploading 2x as much data… then nodes would get larger before reaching that upload/delete balance point.

If you have more space to offer, and want it to fill sooner, offering that extra space on a different /24 IP is one way to do it. Some people have multiple IPs, some can request them from their ISP (like extra PPPoE sessions), some have VPS capacity, some use VPNs: lots of legit sources of IPs.

There’s not enough data on the network. Alternatively, there are too many nodes.
When a customer uploads a file, that file is split into segments, each segment is erasure coded and split into 80 pieces and the pieces get uploaded to nodes. Each /24 subnet gets no more than one piece of each segment.
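
For a sense of the per-segment numbers, here is a sketch using the commonly cited defaults (64MB maximum segments, Reed-Solomon 29-of-80); treat the exact figures as assumptions that may have changed:

```python
# Per-segment math under assumed Reed-Solomon defaults (any 29 of 80 pieces rebuild a segment)
segment_mb = 64        # assumed maximum segment size
needed, stored = 29, 80

piece_mb = segment_mb / needed  # size of each stored piece
expansion = stored / needed     # raw data stored vs. customer data
subnet_share = 1 / needed       # one piece per /24 -> share of a segment per subnet

print(f"piece ≈ {piece_mb:.1f} MB, expansion ≈ {expansion:.2f}x, "
      f"per-/24 share ≈ {subnet_share:.1%}")
# -> piece ≈ 2.2 MB, expansion ≈ 2.76x, per-/24 share ≈ 3.4%
```
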

This says that the total amount of data on the network is 25.9PB. It also says that there are 22.4k nodes, but does not say anything about subnets.
This says that there are about 12000 different /24 subnets with at least one node.
Divide 25.9PB by 12000 and you get about 2.15TB per subnet on average. I’d say having 10 times the average amount is pretty good. My node got some of the data in the past, when there were fewer nodes operational.

So, it’s not a software limit and my node is growing slowly right now, but this is just because there is not enough customer data to go around.

  1. Depends on many things. Is it only unused hardware that runs anyway? No? Then what are the costs of hardware and energy?

  2. Roughly 12 years on a single IP. It depends on many things; currently there is a lot of new data flowing in. Monthly updated new node report - #30 by IsThisOn

  3. Don’t buy hardware. Don’t base decisions on past data points. Remember that the current pricing is not set in stone; we had two price drops this year alone. The STORJ ICO money, or runway money, very much depends on the current BTC price. There is a real risk of Binance or Tether blowing up, tanking BTC and with it the runway money.

OK, I understand now. So why have multiple virtual nodes? Is it to bypass the /24 rule or to avoid RAID? If I don't want to bypass it (because I think that's against the ToS) and I already use RAID, is there any benefit to having multiple virtual nodes?

From what I understand,
the network balances the data between all nodes in some way, but doesn't give you more data just because you have more nodes, because if they use the same router and IP they are considered one node for the decentralization of the network.

So there is not enough data to fill big nodes? And based on the current pricing, investing in big nodes is not profitable because of how little data there is and how much time it takes to fill up?