5 nodes on the same HDD vs 5 nodes on separate disks

On average they will die at the same rate. All it would really do is spread out repair. But I don’t think the repair workers have ever failed to get repair done in time. And I’m pretty sure they can now be easily scaled on demand with the trusted delegated repair implementation. I’d say the risk of loss to the network is about the same whether it happens all at once or spread over time, as long as they use the same IP.

Yes! 5 separate HDDs will not all die at the same time with a probability of 1. One disk with 5 nodes puts 5 nodes at risk at once: if that disk dies, the probability that all 5 nodes die with it is 1, even if the probability of that disk dying is the same as for any one of the 5 separate disks.
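To put rough numbers on that, here is a quick sketch with a made-up annual failure probability per disk (assuming failures are independent, which is a simplification):

```python
# Quick comparison: 5 nodes on one disk vs 5 nodes on 5 separate disks.
# Assumes every disk fails independently with the same annual probability p.
# p is a made-up number, just for illustration.
p = 0.05            # assumed annual failure probability of a single disk
data_per_node = 4   # TB stored by each node

# One disk holding all 5 nodes: either everything survives or everything is gone.
p_lose_all_one_disk = p                          # 5% chance of losing all 20TB
expected_loss_one_disk = p * 5 * data_per_node   # 1.0 TB expected loss per year

# Five separate disks: losing *everything* requires all 5 disks to fail.
p_lose_all_five_disks = p ** 5                   # about 0.00003% chance
expected_loss_five_disks = 5 * (p * data_per_node)  # still 1.0 TB expected per year

print(f"P(lose all 20TB), one disk:   {p_lose_all_one_disk:.6%}")
print(f"P(lose all 20TB), five disks: {p_lose_all_five_disks:.6%}")
print(f"Expected TB lost per year:    {expected_loss_one_disk} in both cases")
```

The expected amount of data lost per year comes out the same either way; what the separate disks change is the chance of losing everything at once.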

People, please move this question to a different topic; it’s not related to this one.

2 Likes

4x5TB nodes on a single drive and single public IP. Drive dies, 20TB of data is lost.
1x20TB node on a single drive. Drive dies, 20TB of data is lost.

There should be no difference to the network, unless the network gave multiple pieces of the same segment in the first case, but that should not happen, right?
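As far as I understand it doesn’t: the satellite picks at most one node per /24 subnet for each segment, so two nodes behind the same IP never hold pieces of the same segment. A toy sketch of the idea (not the actual node selection code; the function names and data here are just illustrative):

```python
# Toy illustration of the "one piece per /24 subnet" idea, not the real
# satellite selection code: pieces of one segment never land on two nodes
# that share a /24, so nodes behind the same public IP cannot hold
# duplicate pieces of the same segment.
import random

def subnet_24(ip: str) -> str:
    """Return the /24 network of an IPv4 address, e.g. '203.0.113.7' -> '203.0.113'."""
    return ".".join(ip.split(".")[:3])

def select_nodes_for_segment(nodes: dict, pieces_needed: int) -> list:
    """Hypothetical helper: pick at most one node from each /24 subnet."""
    by_subnet = {}
    for node_id, ip in nodes.items():
        by_subnet.setdefault(subnet_24(ip), []).append(node_id)
    # One candidate per subnet, then take a random subset of subnets.
    candidates = [random.choice(group) for group in by_subnet.values()]
    random.shuffle(candidates)
    return candidates[:pieces_needed]

# Five nodes behind the same public IP count as a single candidate per segment.
nodes = {
    "node-1": "203.0.113.7", "node-2": "203.0.113.7", "node-3": "203.0.113.7",
    "node-4": "203.0.113.7", "node-5": "203.0.113.7", "node-6": "198.51.100.4",
}
print(select_nodes_for_segment(nodes, pieces_needed=2))
```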

3 Likes

The only difference is between 5 nodes on the same HDD and 5 nodes on 5 HDDs.
There’s no difference between 5 nodes with a total size of 20TB and 1 node of 20TB behind the same /24 subnet of public IPs.

In the case of 5 nodes on the same HDD, you will lose 20TB at once (the same as for 1 node of 20TB), but in the case of 5 separate disks, the probability of not losing everything is much higher.

1 Like

Obviously. So, since you allow SNOs to own 20TB disks, why would you not allow them to hold multiple nodes on the same disk?
The data they are holding is exactly the same, whether it’s 1 node or 5 nodes.

You can make a rule prohibiting SNOs from having large disks on the basis of assuring a better network. I agree with the reasoning.
You have zero reason to prohibit SNOs from having multiple nodes on the same disk.

Right, but since you’re allowed to have one large disk holding all of the data that 5 disks would hold, that argument is superfluous.
The comparison must be between losing 20TB in a single node or losing 20TB in 5 nodes. Again, what is the difference?

Same location, same hardware and same disk bring more correlation to failures, apparently. But you cope well with unreliable hardware (RPis) connected (by USB!) to a series of old, individually unreliable disks. Yet the idea of having more than one node on a big, new, reliable disk(*) somehow gets under your skin…

In the end it’s still a matter of statistics. 20TB going down is 20TB going down. If the 20TB that goes down is distributed over 20 different nodes (on the same disk) instead of a single 20TB node, it shouldn’t make any difference to the network, precisely because the data was distributed per IP (not per node!). There is therefore no higher correlation in the data held by twenty 1TB nodes compared to the data held by a single node on a 20TB disk.

(*) I said “reliable” because I wouldn’t care about the nodes if I had 20 1TB disks holding 20 nodes. But if I had one 20TB disk holding 20 1TB nodes, you bet I would care for it. I wouldn’t let the disk get too old, and I would move the nodes if it started showing signs of trouble.

The certainty was assumed in order to simplify the reasoning. Just that.
You are deluded if you’re sure you’ll be able to copy the data before you lose everything… you might… but don’t be sure…

But the data is not correlated on the 5-node disk!!!
One HDD with 5 nodes dying is exactly the same as 5 HDDs on different IPs dying.
The only rule that would make sense would be imposing a maximum size for disks; that would have an impact on the network. Imposing a maximum number of nodes per disk makes no sense.

The data is not correlated, but nodes on the same disk, on the same device, in the same location are highly correlated.
The probability of losing all 5 nodes when each is on its own disk is much lower than the probability of losing one disk with 5 nodes on it.
You cannot change this fact.

1 Like

Running 5 nodes on the same HDD is impossible in practice. When the piece-scan mechanism starts, the computer becomes completely unresponsive. You are lucky if it only starts on 2 of the 5 nodes at once; if it starts on more than 3 of the 5 it becomes endless work for the hardware. It takes days, if not weeks, because that many IOPS can’t be served any faster. And then comes the second piece scan, which occurs about twice a month. The HDD gets overused, the nodes can’t respond to network calls from any satellite in any reliable time, and you get disqualified in no time. It takes just a few weeks until you realize the reality.
All of this is just philosophy about something that is possible on paper, but not in reality. But if you feel confident about this, just do it and tell us your experience after 1 year of operating like that. I have no doubt that you will give up after a few months of operating. The satellites and the overall network will absorb your failure, but you will just lose your precious time and money because of your inoperable setup.

1 Like

I had nodes on the same disk for a long time; there was no problem.

However, I now moved the nodes back to individual disks because I wanted more performance and because of the rules. I have a new 20TB disk arriving today that completes the process.

It will be only 500GB, so the ROI isn’t great.

???
I’m not trying to change that fact. I’m trying to state the fact that the probability of losing 20TB of data in “one node/one disk” is exactly the same as losing 20TB of data in “5 nodes/1 disk”.

What? You have many nodes (lots of data), a new 20TB disk, and you’re starting the new disk with just 500GB? :upside_down_face:
Last time I read about the calculation of the maximum amount of data (not per node) one /24 IP can hold, it was about 20TB…

It’s about 88TB now. And there are signs that there is permanent data that will never be deleted, so it will likely keep growing beyond that, just quite slowly.

1 Like

But how do you calculate how many TB a single IP can hold?

I do this in the earnings estimator. It’s a function of average ingress and delete % per month. It calculates when the deletes will roughly match ingress. There’s more to it, but you can look at the estimator yourself to see the formulas.
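Very roughly, the idea is that stored data keeps growing until the monthly deletes (a fraction of what you already store) cancel out the monthly ingress. A simplified sketch with made-up numbers (not the actual estimator inputs; I picked them so the equilibrium lands near the ~88TB mentioned above):

```python
# Back-of-the-envelope for the steady-state amount of data behind one /24 IP:
# stored data grows until the monthly deletes (a fixed fraction of what is
# stored) roughly cancel out the monthly ingress.  Both inputs are made up.
monthly_ingress_tb = 2.2    # assumed average ingress per month, in TB
monthly_delete_pct = 0.025  # assumed fraction of stored data deleted per month

stored = 0.0
for month in range(240):    # simulate 20 years month by month
    stored = stored * (1 - monthly_delete_pct) + monthly_ingress_tb

# Closed form: at equilibrium the deletes equal the ingress,
# so stored == ingress / delete fraction.
equilibrium = monthly_ingress_tb / monthly_delete_pct
print(f"after 20 years: {stored:.1f} TB, equilibrium: {equilibrium:.1f} TB")
```

That’s essentially why the stored amount converges to a cap instead of growing forever.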

2 Likes

No one is arguing that. The probability of a disk dying is the same no matter how many nodes it holds. It will vary slightly based on temperature and amount of use, but mostly it’s going to die around the same time as the disks made in the same batch. 1 disk will die when 1 disk will die. They’re just saying you should have 5 nodes on 5 disks. A separate idea. For the resiliency of the network and of your payment. I’ve had 5 nodes on 1 disk before, and while it was not fun and risked the disqualification of the nodes, all of those nodes are thriving on their own drives now. They’re just going to keep telling you the recommendation: 1 node gets 1 disk.

Well, I am arguing that the strain on the network of one 20TB node dying vs five 4TB nodes dying at different times is the same. As long as they all run on the same IP address and don’t share pieces of the same segments.