Repair egress significant, why?

I noticed on my node that repair egress is significant.
463 GB data egress and 365 GB repair egress
Does that mean that my node is holding old data that needs to be repaired because of other nodes disqualified and/or leaving the network?


Yeah, pretty much. On new nodes you won’t see repair egress for a while, but the older the data gets, the more of it will need repairs.

It becomes a fairly decent part of your node's income at some point.


To add color to BrightSilence's point: I have a 4mo node with 5.1TB stored where repair is currently about 3% of its total egress, while a slightly newer one at 3mo with 2.7TB stored is doing less than 1% (~0.2%, to be exact).

Both of these nodes are quite new compared to many others in the network. I would put more than a few Storj behind Bright's statement that repair becomes a pretty high share of the payout once you near the 12-15mo mark.

You’d win that bet. :wink:
This is about a year in (since the last network wipe; the node itself is older, but the wipe reset that counter).
12.3TB stored right now.

For comparison, this is a much newer node. About 3 months old.
1.6TB stored.

Wow, your share of repair traffic is even higher than mine.

My node is 10 months old and stores 13.5TB.

That’s probably because I started 2 new nodes in the past few months. Normal downloads tend to happen on relatively recently uploaded data, while repair happens on older data. And newly uploaded data has been spread across 3 nodes for me for the past 3-4 months. The largest node only has 1/3rd of the newly uploaded data. I think if you add all my nodes together the numbers are more similar to yours.

This is the 4mo node with 5.1TB:

I guess these next 8mo will be an exciting shift.


My test nodes (a network health sensor). Not vetted, 17 days old:

These are short-cycle nodes for testing purposes. They are created, run for a while (1, 2, or 3 months), and are then deleted without a graceful exit.

By the way, it can be clearly seen here that being a good operator is not necessary. Nodes of any quality and age have equal priority, regardless of any "reputation". I do not want to disclose all my nodes, but I can state that node quality does not affect income.

I was going to create a new topic asking this myself. Guess it makes sense: the older the data, the more chance of repair. Mine just reached the 75% payout stage.

  • 1.8TB Space
  • Currently at 426GB Egress this month
  • Over the last month I've seen a lot more repair; for the last 5 days I'm averaging 35GB of egress and 12GB of repair per day.

I’m based in the UK and I’ve seen over 70% of my traffic is on the Salt Lake satellite.

Let the good times roll… £££ :partying_face:



I have confirmed your observations. Ever since I started my node, I have observed with surprise that I immediately get tasked with storing “after repair” elements. I don’t trust my node myself, and I get tasked with storing these… Something seems fishy.

Could this be explained by anyone from Storj Labs?

AFAIK, there is no difference between “repaired” data and brand new data. During repair all lost pieces are reconstructed and distributed to nodes.
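To make that concrete: Storj uses Reed-Solomon erasure coding, where any k of the n stored pieces are enough to rebuild a missing one bit-for-bit. The toy 2-of-3 XOR code below is purely illustrative (not the real scheme or its parameters), but it shows why a repaired piece is indistinguishable from a freshly uploaded one:

```python
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(data: bytes) -> list[bytes]:
    # Split into two data pieces plus one XOR parity piece (pad to even length).
    if len(data) % 2:
        data += b"\0"
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor(a, b)]

def repair(pieces: list) -> list[bytes]:
    # Any two surviving pieces rebuild the third exactly, just as a
    # repair job reconstructs lost pieces before redistributing them.
    a, b, p = pieces
    if a is None:
        a = xor(b, p)
    elif b is None:
        b = xor(a, p)
    else:
        p = xor(a, b)
    return [a, b, p]
```

The reconstructed piece is byte-identical to what the lost node held, so there's nothing special about "after repair" data.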

Anyway, here’s the stats for my node (over 18 months old):

Upload                  Ingress         -not paid-                         349.94 GB
Upload Repair           Ingress         -not paid-                         320.36 GB
Download                Egress          20   USD / TB                        1.17 TB         23.45 USD
Download Repair         Egress          10   USD / TB                      636.79 GB          6.37 USD
Download Audit          Egress          10   USD / TB                       11.53 MB          0.00 USD
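Those payout lines are just rate × volume. A quick sketch, using the per-TB rates shown in the table above (they may change over time, and the displayed volumes are rounded, so the result lands close to 23.45 + 6.37 USD rather than exactly on it):

```python
# Rates from the table above, in USD per TB of egress.
DOWNLOAD_USD_PER_TB = 20.0
REPAIR_USD_PER_TB = 10.0   # audit egress is paid at this rate too

def payout_usd(download_tb: float, repair_tb: float, audit_tb: float = 0.0) -> float:
    """Egress payout only: ingress (upload and upload repair) is unpaid."""
    return (download_tb * DOWNLOAD_USD_PER_TB
            + (repair_tb + audit_tb) * REPAIR_USD_PER_TB)

# 1.17 TB download + 636.79 GB repair egress, as in the table:
total = payout_usd(1.17, 0.63679)   # roughly 29.77 USD
```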

@peem data that needs repair has to be stored on new storage nodes, so I guess everybody running a node gets upload repair traffic (ingress). Ingress either way is not paid. Download/egress earns money, and there you only have 3 MB repair vs 5 GB download. All good, I think.


@Pentium100 thanks what is your node storage size and location?

The node is in Lithuania.

Right now it has 15.7TB stored (according to the CLI dashboard) and 0.8TB free. I’m going to have to expand the virtual disk soon.

You don’t need anyone from Storj Labs; the code is open source. I can tell you that node selection for upload is an all-or-nothing deal. It selects nodes at random from a pool of nodes that are online, have free space, and are not disqualified, suspended, in containment, etc. All of these criteria are binary: you’re either in or you’re out. Technically it first selects a subnet with “healthy” nodes by those criteria and then picks a random “healthy” node within that subnet.
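Roughly, that selection logic could be sketched like this (field names and the node representation are made up for illustration; the actual storj.io/storj code queries the satellite database and differs in detail):

```python
import random

def eligible(node: dict) -> bool:
    """The all-or-nothing filter: every criterion is in-or-out,
    with no reputation-based weighting."""
    return (node["online"]
            and node["free_space"] > 0
            and not node["disqualified"]
            and not node["suspended"])

def select_nodes(nodes: list[dict], count: int) -> list[dict]:
    """Group healthy nodes by subnet, pick `count` distinct subnets at
    random, then one random healthy node within each chosen subnet."""
    by_subnet: dict[str, list[dict]] = {}
    for node in filter(eligible, nodes):
        by_subnet.setdefault(node["subnet"], []).append(node)
    chosen = random.sample(list(by_subnet), min(count, len(by_subnet)))
    return [random.choice(by_subnet[s]) for s in chosen]
```

The subnet grouping is what keeps uploads spread across distinct networks: two nodes behind the same /24 compete for one slot.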

There is indeed currently no differentiation between reputation levels. Reputation levels are just used to disqualify or suspend a node. Of course I can’t say if it’ll stay that way. Early on they posted a blog with big ambitions around using statistical models to score and prioritize nodes for selection. But any preferential treatment of high quality nodes also leads to a less even distribution. So it’s a trade off.


In my opinion it could make sense to exclude unvetted nodes from repair traffic because they are more likely to fail and in the worst case the network would need to repair the same file again soon.
But the problem with that approach could be that a new satellite wouldn’t be able to send any repair traffic as long as its first batch of nodes is still unvetted. However, that might not matter, because on a new satellite nodes get vetted in a matter of days. Or an exception could be used.

Anyways, repair traffic is expensive, and IIRC Storj Labs is already working on a solution to make repair traffic cheaper for them.

Quite the opposite. Even starting out it doesn’t matter whether an unvetted node gets a repaired piece or a new piece, if it turns out to not be trustworthy it’ll lead to repair either way. The satellite won’t care much if it has to repair the same segment again or a new segment.

That said, you could say that the nodes that still had pieces for the segment that needed repair have already proved to be relatively long term reliable. Selecting a few higher risk nodes in that scenario would actually be safer than selecting higher risk nodes on a new segment where all nodes are selected at random.

There most likely is no difference to the network between having to repair the same file 10 times and having to repair 10 different files once.


Yeah you’re probably right.

Thanks for the clarification.
The node’s reputation seems irrelevant at the moment… I wonder if it ever will…