Repair egress significant, why?

I noticed on my node that repair egress is significant.
463 GB data egress and 365 GB repair egress
Does that mean that my node is holding old data that needs to be repaired because of other nodes disqualified and/or leaving the network?


Yeah, pretty much. On new nodes you won’t see repair egress for a while, but the older the data gets, the more of it will need repairs.

It becomes a fairly decent part of your node's income at some point.


To add color to BrightSilence's point: I have a 4mo node with 5.1TB stored where repair is currently about 3% of its total egress, while a slightly newer one at 3mo with 2.7TB stored is doing less than 1% (~0.2%, to be exact).

Both of these nodes are quite new compared to many others in the network. I would put more than a few Storj behind Bright's statement that repair becomes a pretty high share of the payout once you near the 12-15mo mark.

You’d win that bet. :wink:
This is about a year in (since the last network wipe; the node itself is older, but the wipe reset that counter).
12.3TB stored right now.

For comparison, this is a much newer node. About 3 months old.
1.6TB stored.

Wow, your share of repair traffic is even higher than mine.

My node is 10 months old and stores 13.5TB.

That’s probably because I started 2 new nodes in the past few months. Normal downloads tend to happen on relatively recently uploaded data, while repair happens on older data. And newly uploaded data has been spread across 3 nodes for me for the past 3-4 months. The largest node only has 1/3rd of the newly uploaded data. I think if you add all my nodes together the numbers are more similar to yours.

This is the 4mo node with 5.1TB:

I guess these next 8mo will be an exciting shift.


My test nodes (a network health sensor). Not vetted, 17 days old:

These are short-cycle nodes for testing purposes. They are created, run for a while (1, 2, or 3 months), and are then deleted without a graceful exit.

By the way, it can be clearly seen here that being a good operator is not necessary. Nodes of any quality and age have equal priority, regardless of any "reputation". I do not want to disclose all my nodes, but I can state that node quality does not affect income.

I was going to create a new topic asking this myself. Guess it makes sense: the older the data, the more chance of repair. Mine just reached the 75% payout stage.

  • 1.8TB Space
  • Currently at 426GB Egress this month
  • Over the last month I've seen a lot more repair; for the last 5 days I'm averaging 35GB of egress and 12GB of repair per day.

I’m based in the UK and I’ve seen over 70% of my traffic is on the Salt Lake satellite.

Let the good times roll… £££ :partying_face:



I have confirmed your observations. Ever since I started my node, I have observed with surprise that I immediately get tasked with storing “after repair” elements. I don’t trust my node myself, and I get tasked with storing these… Something seems fishy.

Could this be explained by anyone from Storj Labs?

AFAIK, there is no difference between “repaired” data and brand new data. During repair all lost pieces are reconstructed and distributed to nodes.
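To make that concrete: Storj uses Reed-Solomon erasure coding, where any k of the n stored pieces are enough to rebuild a missing one bit-for-bit. The toy 2-of-3 XOR code below is purely illustrative (not the real scheme or its parameters), but it shows why a repaired piece is indistinguishable from a freshly uploaded one:

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes) -> list[bytes]:
    # Split into two data pieces plus one XOR parity piece (pad to even length).
    if len(data) % 2:
        data += b"\0"
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor(a, b)]

def repair(pieces: list) -> list[bytes]:
    # Any two surviving pieces rebuild the third exactly, just as a
    # repair job reconstructs lost pieces before redistributing them.
    a, b, p = pieces
    if a is None:
        a = xor(b, p)
    elif b is None:
        b = xor(a, p)
    else:
        p = xor(a, b)
    return [a, b, p]
```

The reconstructed piece is byte-identical to what the lost node held, so there's nothing special about "after repair" data.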

Anyway, here’s the stats for my node (over 18 months old):

Upload                  Ingress         -not paid-                         349.94 GB
Upload Repair           Ingress         -not paid-                         320.36 GB
Download                Egress          20   USD / TB                        1.17 TB         23.45 USD
Download Repair         Egress          10   USD / TB                      636.79 GB          6.37 USD
Download Audit          Egress          10   USD / TB                       11.53 MB          0.00 USD
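Those payout lines are just rate × volume. A quick sketch, using the per-TB rates shown in the table above (they may change over time, and the displayed volumes are rounded, so the result lands close to 23.45 + 6.37 USD rather than exactly on it):

```python
# Rates from the table above, in USD per TB of egress.
DOWNLOAD_USD_PER_TB = 20.0
REPAIR_USD_PER_TB = 10.0   # audit egress is paid at this rate too

def payout_usd(download_tb: float, repair_tb: float, audit_tb: float = 0.0) -> float:
    """Egress payout only: ingress (upload and upload repair) is unpaid."""
    return (download_tb * DOWNLOAD_USD_PER_TB
            + (repair_tb + audit_tb) * REPAIR_USD_PER_TB)

# 1.17 TB download + 636.79 GB repair egress, as in the table:
total = payout_usd(1.17, 0.63679)   # roughly 29.77 USD
```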

@peem data that needs repair has to be stored on new storage nodes, so I guess everybody running a node gets upload repair traffic (ingress). Ingress either way is not paid. Download/egress earns money, and there you only have 3 MB repair vs 5 GB download. All good, I think.


@Pentium100 thanks what is your node storage size and location?

The node is in Lithuania.

Right now it has 15.7TB stored (according to the CLI dashboard) and 0.8TB free. I’m going to have to expand the virtual disk soon.

You don’t need anyone from Storj Labs; the code is open source. I can tell you that node selection for upload is an all-or-nothing deal. It selects nodes at random from a pool of nodes that are online, have free space, and are not disqualified, suspended, in containment, etc. All of these criteria are binary: you’re either in or you’re out. Technically it first selects a subnet with “healthy” nodes by those criteria and then picks a random “healthy” node within that subnet.
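Roughly, that selection logic could be sketched like this (field names and the node representation are made up for illustration; the actual storj.io/storj code queries the satellite database and differs in detail):

```python
import random

def eligible(node: dict) -> bool:
    """The all-or-nothing filter: every criterion is in-or-out,
    with no reputation-based weighting."""
    return (node["online"]
            and node["free_space"] > 0
            and not node["disqualified"]
            and not node["suspended"])

def select_nodes(nodes: list[dict], count: int) -> list[dict]:
    """Group healthy nodes by subnet, pick `count` distinct subnets at
    random, then one random healthy node within each chosen subnet."""
    by_subnet: dict[str, list[dict]] = {}
    for node in filter(eligible, nodes):
        by_subnet.setdefault(node["subnet"], []).append(node)
    chosen = random.sample(list(by_subnet), min(count, len(by_subnet)))
    return [random.choice(by_subnet[s]) for s in chosen]
```

The subnet grouping is what keeps uploads spread across distinct networks: two nodes behind the same /24 compete for one slot.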

There is indeed currently no differentiation between reputation levels. Reputation levels are just used to disqualify or suspend a node. Of course I can’t say if it’ll stay that way. Early on they posted a blog with big ambitions around using statistical models to score and prioritize nodes for selection. But any preferential treatment of high quality nodes also leads to a less even distribution. So it’s a trade off.


In my opinion it could make sense to exclude unvetted nodes from repair traffic because they are more likely to fail and in the worst case the network would need to repair the same file again soon.
But the problem with that approach could be that a new satellite wouldn’t be able to send any repair traffic as long as its first batch of nodes is still unvetted. However, that might not matter, because on a new satellite nodes get vetted in a matter of days. Or an exception could be used.

Anyways, repair traffic is expensive, and IIRC Storj Labs is already working on a solution to make repair traffic cheaper for them.

Quite the opposite. Even starting out it doesn’t matter whether an unvetted node gets a repaired piece or a new piece, if it turns out to not be trustworthy it’ll lead to repair either way. The satellite won’t care much if it has to repair the same segment again or a new segment.

That said, you could say that the nodes that still had pieces for the segment that needed repair have already proved to be relatively long term reliable. Selecting a few higher risk nodes in that scenario would actually be safer than selecting higher risk nodes on a new segment where all nodes are selected at random.

There most likely is no difference to the network between having to repair the same file 10 times and having to repair 10 different files once.


Yeah you’re probably right.

Thanks for the clarification.
The node’s reputation seems irrelevant at the moment… I wonder if it ever will…