Yeah, I get a lot of data now also.
Thank you for the continued open communication, friend.
Would I love even more detailed and even more frequent information? Yes.
Is this already more than I can ask for? Also yes.
Keep up the good work
Is the deleting of old data working properly?
The earnings.py script for my node shows 29.65TB as "uncollected garbage". It seems that while the virtual disk stays full or near full, the amount of data that the satellite thinks my node should have is going down when the test data expires.
Sounds like you have the same problem as me: https://forum.storj.io/t/how-do-you-solve-slow-file-deletion-on-ext4/27260
Having the same thing happen over here.
If you are still on v1.105.4, then this is likely to occur. The collector, when deleting, does not update the used space on that version. IIRC it was fixed in v1.107.
To correct it you'll need to manually update to the latest version (or wait for the rollout) and then trigger a used-space calculation.
That does not seem to be the case for me, unless TTL expired deletes are different from normal deletes. I remember when Storj deleted a lot of test data from Saltlake (in preparation for the current tests) it went smoothly, the disk space was freed quickly.
I restarted my node to let the used space filewalker run, hopefully it will find some space, but it still is concerning that the satellite reports that my node has way less data than there is on the disk.
Yes, I believe that this issue is not solved yet. My nodes have reports from the satellites saying they should hold about 4.42TB out of 6.78TB used.
But, they are still running a GC.
To my amazement: it looks like your team is squeezing out even a bit more speed over the last few hours. You must be reaching the limits of many SNOs' internet connections: congrats!
Certainly you've exceeded potential-customer expectations: and now you're just showing off. Perhaps a new sales whitepaper soon showing how Storj's S3 speed kicks Amazon in the…?
None of this is a complaint: just a comment: we'll reserve all the capacity you can afford.
It's actually been quite "meh" for me.
Maybe I have too many full nodes now…
I would rather see them reaching the limits of my disk space. This test data is going as fast as it comes, so a lot of wear for no gain.
Please pay attention to the warning most likely displayed underneath that top overview. Recently it has been very common for the last used space report from the satellites to be incomplete, which unfortunately means that the uncollected garbage calculation is likely inaccurate. I hope that issue gets resolved soon, but I can only report what the node knows, and if it gets incomplete data from satellites, that report is just going to be wrong. Maybe it's not as bad as it looks. That said, I have quite a few nodes for which that graph just visibly shows a drop even on days where it is complete. It seems to me that expiration deletion is for some reason not catching everything and GC is still way behind on Saltlake.
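For context, here's a rough sketch of how such an estimate is typically derived. This is not the exact earnings.py code, and all the numbers and satellite names are made up; it just illustrates why an incomplete satellite report immediately shows up as extra "garbage":

```python
# Rough illustration of why an incomplete satellite report inflates the
# "uncollected garbage" estimate. All numbers below are made up.

disk_used_bytes = 48e12  # what the node itself reports as used on disk

# Bytes each satellite says the node should be storing. If one satellite's
# report is missing or only covers part of the day, its entry is too low.
satellite_reported_bytes = {
    "us1": 10e12,
    "eu1": 4e12,
    "saltlake": 15e12,  # suppose the real value is ~25e12 but the report was incomplete
}

uncollected_garbage = disk_used_bytes - sum(satellite_reported_bytes.values())
print(f"estimated uncollected garbage: {uncollected_garbage / 1e12:.2f} TB")
# A short or missing report from any satellite directly shows up as extra
# "garbage" here, even though the data on disk may be perfectly valid.
```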
I was thinking about that recently. Let's see if I can get the thought out of my head in a way that makes sense…
It seems like "old" SNOs were able to fill a disk then park it on a port… and start a new disk… then when full park it on a port, etc.… basically always having one node growing… and the others idling. That worked because data tended to hang around longer… so idle/full nodes didn't delete much: so it was no big deal if they shared an IP with the growing node. They didn't need much ingress to stay full.
Now though… full nodes that are switched to share an IP… are still going to lose data at the full TTL rate: but not refill it as fast (as they're now sharing a /24). So would newly-filled nodes "leak" a little the first month they share that IP? I guess it would balance out eventually.
But… since underneath all the TTL data, there's still some natural growth of long-term data… would that mean "old" nodes are even more valuable? Like they'd be filling a larger and larger percentage of their space with that long-term data… so a freshly-filled node may be 90% TTL data… but a 3-year-old full node may only be 10% TTL data?
If that's true… then it's more important now to make your nodes a bit more durable. Because if you lose them it's not just refilling (which may be quick with TTL data)… it's refilling with that long-term data (which reduces the TTL churn the node has to deal with).
Does that sound right?
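If it helps, here's a toy model of that intuition. The split between TTL and long-term ingress, and the assumption that the node fully refills every month, are just made-up numbers to show the direction of the effect, not real Storj traffic figures:

```python
# Toy model: a full node where expired 30-day-TTL data is replaced each month
# by new ingress that is 90% TTL and 10% long-term. All ratios are assumptions
# chosen only to illustrate the trend.

capacity_tb = 10.0
long_term_tb = 0.0          # starts freshly filled with pure TTL data
ttl_tb = capacity_tb

LONG_TERM_SHARE = 0.10      # assumed fraction of new ingress with no TTL

for month in range(1, 37):
    freed = ttl_tb                          # all TTL data expires within the month
    long_term_tb += freed * LONG_TERM_SHARE # long-term data never leaves
    ttl_tb = freed * (1 - LONG_TERM_SHARE)  # the rest is refilled with new TTL data
    # node stays full: ttl_tb + long_term_tb == capacity_tb
    if month in (1, 6, 12, 24, 36):
        print(f"month {month:2d}: long-term {long_term_tb:4.1f} TB "
              f"({long_term_tb / capacity_tb:.0%}), TTL {ttl_tb:4.1f} TB")
```

Under those assumptions the long-term share climbs toward 100% over a few years, which matches the intuition that an old full node ends up carrying mostly long-lived data and far less TTL churn than a freshly-filled one.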
I saw the warning, that's why I wrote here to ask others about the problem. It may or may not be accurate, but if the test data was uploaded with a 30-day TTL then I think it is at least somewhat accurate.
My node got almost zero ingress in the last two weeks, because it was full. So, if the test data was uploaded with a 30-day TTL, then about half of it should have expired by now.
However, the used disk space does not reflect that (the graph shows the actual used space, like what you get with df):
From the peak of 48TB it's now down to 44TB (and the node still thinks that it's full because of the other bug that can be temporarily fixed by restarting the node and letting it run the filewalker).
So, I think that the "uncollected garbage" value is more likely to be correct than not, especially since it's growing.
How do you get the value of "amount of data the satellite thinks my node has"? From the database or the API? I could create a graph for it.
It's directly from the database. The API already does some calculation to summarize per day, which can compound issues. Feel free to have a look at the code in the script. I have to do some calculation since each report just shows byte-hours over a variable period.
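In case someone wants to poke at this without the full script, here's a minimal sketch. It assumes the storage_usage table in the node's storage_usage.db still has the long-standing satellite_id / at_rest_total (byte-hours) / interval_start columns; check your own database schema before relying on it:

```python
# Minimal sketch: read the per-satellite storage reports straight from the
# node's storage_usage.db and convert byte-hours into an average stored size.
# Schema and path are assumptions; verify against your own node first.
import sqlite3

DB_PATH = "/path/to/storage/storage_usage.db"  # adjust to your node's storage dir

con = sqlite3.connect(DB_PATH)
rows = con.execute(
    "SELECT hex(satellite_id), at_rest_total, interval_start "
    "FROM storage_usage ORDER BY interval_start"
).fetchall()
con.close()

for sat_id, at_rest_byte_hours, interval_start in rows[-5:]:  # last few reports
    # at_rest_total is in byte-hours; dividing by 24 gives the average bytes
    # stored over a full day. Reports can cover shorter periods, which is why
    # earnings.py does extra bookkeeping with the interval boundaries.
    avg_tb = at_rest_byte_hours / 24 / 1e12
    print(f"{interval_start}  {sat_id[:8]}…  ~{avg_tb:.2f} TB (assuming a 24h interval)")
```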
With the new traffic pattern and node selection, the number of IPs is not as important as it was in the past. Calculate how much download traffic your internet connection can handle per month to get a rough estimate of how much storage you can reach. There might still be some long-lived data, but I can also see a lot of non-TTL data not even living 30 days.
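A back-of-envelope version of that calculation (all numbers are placeholders; it simply assumes the bulk of the data carries a 30-day TTL, so steady-state storage roughly equals one month of ingress):

```python
# Back-of-envelope: if most data expires after 30 days, a node roughly levels
# off at about one month's worth of ingress. Numbers below are placeholders.

ingress_mbit_s = 100          # sustained ingress your connection/disk can take
utilization = 0.5             # assume you only win ~half the races / share a /24

bytes_per_month = ingress_mbit_s / 8 * 1e6 * utilization * 30 * 24 * 3600
print(f"~{bytes_per_month / 1e12:.1f} TB of 30-day-TTL data at steady state")
# With 100 Mbit/s at 50% utilization this comes out to roughly 16 TB; long-lived
# (non-TTL) data would slowly accumulate on top of that.
```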
If this is a single node I would guess it needs a 100MB+ bloom filter from US1 to have a decent false positive rate. What is the max size you have received so far?
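For reference, the textbook Bloom filter sizing formula gives a feel for why the filter gets that large. The piece count and target false-positive rate below are assumptions, not the satellites' actual parameters:

```python
# Textbook Bloom filter sizing: m = -n * ln(p) / (ln 2)^2 bits for n elements
# at false-positive rate p. Piece count and target rate are assumptions.
import math

def bloom_filter_bytes(n_pieces: int, false_positive_rate: float) -> float:
    bits = -n_pieces * math.log(false_positive_rate) / (math.log(2) ** 2)
    return bits / 8

# e.g. ~200 million pieces on one node
n = 200_000_000
for p in (0.1, 0.01):
    print(f"{n:,} pieces @ p={p}: ~{bloom_filter_bytes(n, p) / 1e6:.0f} MB")
# At a 10% false-positive target this already lands above 100 MB.
```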
Where can I read about this new node selection?
Search for "node selection" in this topic, and check out the posts by littleskunk. They've been updating us on the different selection criteria being tested by the devs.
The same bug also affects any kind of scripts you are running. It is not the satellite side that is wrong. The storagenode has incorrect numbers for what it believes it has on disk. In your case the node is doing all the TTL cleanup but without updating the numbers. On disk your node is shrinking, while the scripts suggest you would have uncollected garbage.