Trash comparison thread

Raspberry Pi node, SMR disk, negligible trash
image

2 Likes

This one really isn’t necessary. Why would the satellite care? If it delivered the message that the piece should be deleted, the job of the satellite is done.

Honestly I don’t think deletes are causing the garbage we’re seeing. It could be incomplete uploads that were cancelled after a few segments were already uploaded. It could be anything that leaves data behind that shouldn’t be there. What’s interesting is that my 3 nodes all seem to have about the same amount of trash, despite one being significantly bigger than the other two. This suggests it’s caused by upload-related traffic, as that’s the same across all nodes. Deletes happen a LOT more on the larger node, but it doesn’t have more trash. So it’s almost certainly not related to that.

image

Today it grew a bit again.

I can see why this could be a problem for other nodes.
I’ve under-provisioned this drive quite a lot: it is a 4TB drive and I only share 3TB to have enough room to spare.

Hey you all!

I’m trying to give some better explanation and insight into why trash exists or sometimes piles up.
There is a simple reason for it, and that is the process called GC (garbage collection). Many of you are right that a node with 100% uptime should never miss a delete, but there are ways for pieces to end up even on those nodes that a user cannot delete themselves.
These are the so-called zombie segments (all files are split into 64MB segments during upload), which can be left behind if an upload is canceled midway through.
If the user does not upload a file to the same key/path again, those leftover segments are never cleaned up and are not visible to the user. In fact, the user is currently still charged for them. We are working on a fix for that which will also allow file versioning, but there is no fixed timeline yet for when it will hit production.
That said, there is an automated process that cleans up such “zombie segments” twice a week. After those runs (which do not execute actual network deletion requests), the garbage collection system will pick those pieces up on the storage nodes. This can in fact result in a LOT of trash being collected. For our own safety in case of a bug, we keep trash for 7 days.
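To put a rough number on that, here is a small sketch (not Storj code; the 1 GB file size and the cancelation point are hypothetical) of how much data a canceled upload can strand as zombie segments, given the 64MB segment size mentioned above:

```go
package main

import "fmt"

// Back-of-the-envelope sketch: files are split into 64MB segments during
// upload, so a client that cancels midway leaves the already-uploaded
// segments behind as "zombie segments" until the periodic cleanup and the
// next garbage collection run catch them.
func main() {
	const segmentSize = 64 // MB per segment

	fileSize := 1024       // MB; a hypothetical 1 GB upload
	segmentsTotal := fileSize / segmentSize
	segmentsUploaded := 10 // assume the client cancels after 10 segments

	zombieData := segmentsUploaded * segmentSize
	fmt.Printf("%d of %d segments uploaded before cancel -> %d MB of zombie segments left behind\n",
		segmentsUploaded, segmentsTotal, zombieData)
}
```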
A simple explanation why:
Let’s assume there is a fundamental bug in the garbage collection (GC for short) code that deletes all pieces on all nodes. Without the trash functionality, that would mean GAME OVER. With the grace period of 7 days, in which we keep the trash on the nodes so we are able to recover it, you might occasionally see quite a big chunk.

As a last explanation for that extensive (sometimes multiple hundred GB) trash folder: this was mainly a test on my side while cleaning up data that nodes stored via my satellite. This should be cleaned up by now, however, as that deletion happened multiple weeks ago.

Hope this helps explain the trash that might otherwise appear unnecessary.

11 Likes

Thanks @stefanbenten for clearing that up!

As for the large batch, it could take a little longer for that to be cleaned up entirely, as the bloom filters sent by the satellite will clean up about 90-95% of garbage during each garbage collection run. Each bloom filter uses a different seed, so with every run a different 90-95% will be collected and the amount of remaining garbage drops fast. But that could leave some garbage around for more than a week.
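As a rough illustration (not Storj code; the 300 GB starting amount is made up and 90% is the conservative end of the quoted range), this is why the leftover garbage shrinks quickly from run to run:

```go
package main

import "fmt"

// Each garbage collection run uses a bloom filter with a different seed and
// catches roughly 90-95% of whatever garbage is still on the node, so the
// residue after n runs is roughly garbage * (1 - rate)^n.
func main() {
	garbage := 300.0  // GB of garbage before the first run (hypothetical)
	const rate = 0.90 // conservative end of the 90-95% range

	for run := 1; run <= 4; run++ {
		garbage *= 1 - rate
		fmt.Printf("after run %d: %.2f GB of garbage still waiting\n", run, garbage)
	}
}
```

So even a large batch is mostly gone after two or three runs, but the last few GB can linger for a while, on top of the 7 days each collected batch sits in trash.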

Basically all these things combined will cause your node to pretty much always have at least some garbage. But it shouldn’t normally be more than a few GB.

@stefanbenten can you explain why there is such a discrepancy between different nodes though? It seems that most of the things you describe would normally hit all active nodes similarly. My node has been accepting data since the early days, so if any node would have large amounts of trash it should be mine, but that doesn’t seem to be the case.

1 Like

Thanks for explaining.

I think most of my garbage would be canceled uploads then.
I can accept that.

Again, thanks for making it much clearer!

3 Likes

It highly depends on how much data from a given satellite they store.
A good example was the test from my satellite. Nodes that never held any data from my satellite won’t notice it, while others got a huge amount of trash.
Besides that, it also highly depends on where the uploads that caused zombie segments came from.

I will go through this thread, compile a list of nodes, and check them against the logs I have.

1 Like

@stefanbenten Didn’t your extensive cleanup happen back in late June? As of just yesterday my trash increased by 150 GB. This is happening now, not from before.

My node has already gone through the trash from stefanbenten’s test satellite back in late June. All the trash now is from current test satellites. My node has been up for 479h since the new 1.6.4 patch. If brightscience’s node has also been online since the 1.6.4 upgrade, why doesn’t his node have an extensive trash folder while some of us do? This doesn’t make sense.

image

As said, this can happen at any time. My satellite was a good example.
With every GC run you can get such amounts, depending on how successful the overall customer uploads (upload failures/cancels) to that specific satellite were.
You can take a look inside the trash folder to see which satellite it came from.

It still does not explain how some SNOs do not have this happen to them at all when we all have 100% uptime. If these are from zombie-segment uploads, this should not be a targeted event. These pieces are being uploaded to random nodes, correct?

Just to check: can long-tail cancelations end up as garbage? I wouldn’t expect that, but it could explain some differences if that’s the case.

1 Like

Assumption from: https://support.storj.io/hc/en-us/articles/360028695691-INFO-piecestore-upload-canceled

Every upload is cut up into 110 erasure encoded pieces of which 29 are needed to recreate the data. Uploads stop after 80 of those pieces have been uploaded successfully.

Using the above quote, do the last 30 cancelled pieces always end up in GC? If that is true, why can we not just delete them right away? This is not about zombie segments, it’s about being too slow in the race and having to suffer the consequences (IO thrashing from copy/delete and a 7-day trash hold).

*My node is in Los Angeles and it looks like the test data originated from Germany. So it looks like nodes that are closer to the source will always win the race. Again, an assumption.
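For reference, here is a back-of-the-envelope sketch (not Storj code, just the numbers from the quoted support article) of how many pieces per segment can be long-tail canceled:

```go
package main

import "fmt"

// Per the quoted article: every segment is erasure-encoded into 110 pieces,
// any 29 of them are enough to reconstruct the data, and the upload stops
// once 80 pieces have landed. The remaining in-flight pieces are canceled
// on whichever nodes were slower.
func main() {
	const (
		totalPieces   = 110 // pieces generated per segment
		neededPieces  = 29  // pieces required to reconstruct the segment
		successTarget = 80  // upload stops after this many succeed
	)

	canceledPerSegment := totalPieces - successTarget
	fmt.Printf("up to %d of %d pieces per segment are long-tail canceled\n",
		canceledPerSegment, totalPieces)
	fmt.Printf("stored redundancy: %d pieces kept, %d needed for reconstruction\n",
		successTarget, neededPieces)
}
```

Whether those canceled pieces actually become garbage on the losing nodes is exactly the open question in this post.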

Not directly. Simply because of latency and the location of the upload, some nodes might only ever end up being long-tail cut.
E.g.: I upload from Germany and you have a node in Brazil. Then it’s fairly unlikely that you would be faster than most of the other nodes, simply due to latency.
This means that, based on those circumstances, some nodes will never collect data from a satellite/user whose uploads were canceled.
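A toy simulation (not Storj code; the node names, latencies, and a cutoff of 3 are all made up) of what that long-tail cut looks like from the nodes’ perspective:

```go
package main

import (
	"fmt"
	"sort"
)

// Toy model of long-tail cancelation: the uplink starts transfers to many
// nodes and keeps only the fastest ones, so a high-latency node rarely
// finishes a piece before the cutoff and rarely stores that customer's data.
func main() {
	// Hypothetical round-trip latencies in ms for a few candidate nodes.
	latencies := map[string]int{
		"germany-node": 15,
		"france-node":  25,
		"poland-node":  35,
		"us-node":      120,
		"brazil-node":  210,
	}
	keep := 3 // keep the fastest 3 transfers, long-tail cancel the rest

	type node struct {
		name string
		ms   int
	}
	ranked := make([]node, 0, len(latencies))
	for name, ms := range latencies {
		ranked = append(ranked, node{name, ms})
	}
	sort.Slice(ranked, func(i, j int) bool { return ranked[i].ms < ranked[j].ms })

	for i, n := range ranked {
		status := "kept"
		if i >= keep {
			status = "long-tail canceled"
		}
		fmt.Printf("%-13s %3d ms -> %s\n", n.name, n.ms, status)
	}
}
```

Nodes that are always in the canceled group for a given uploader never receive that uploader’s pieces at all, so they also never end up with that uploader’s zombie-segment garbage.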

So do long-tail cancelations end up in GC?

No, then you are understanding it wrong. Long-tail cancelation explains why you do not get the same data as other nodes, even when it is handled by the same satellite.

If your node does not get data from users that caused zombie segments, due to latency etc., then you will not get much trash.

If it is not from long-tail cancellations and it is from zombie segments, why are there such great discrepancies between SNOs? I have 300GB of it and other SNOs have only a few MBs or none. It doesn’t seem very distributed.

It’s cancelled upload remnants which go into the trash…

But because the log currently doesn’t show cancelled uploads accurately, we are blind to the stats on this… and it seems random… (however, the log issue with cancelled uploads has been identified and a fix will hopefully be coming out with the next update…)

Then it should be much clearer what is happening…

Oh, almost forgot to say… CALLED IT!!! xD

Node ID: 1ubaZxVGDbFCMtN5s6XdUSURgeNde2DLfBZYzhJoSLAQzfmXHK
Location: Los Angeles, USA

image

244G    ./6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
5.0G    ./pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
11G     ./qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
13G     ./v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa
14G     ./ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
14M     ./abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
285G 

Just delete it then if I’m slow to the race. Stop letting it go to GC. :nauseated_face:

One thing though… if it’s cancelled pieces that were completed… then they wouldn’t be checked or needed later, right… so one could essentially… well, I’m not supposed to suggest stuff like that… :smiley:
Though I don’t really understand how that works… so most likely not a good idea… and I wouldn’t, but if they aren’t checked… then… well, I’m sure they will look into it in the next update. Seems like a simple solution to just not trash the pieces but actively delete them instead of saving them, since they don’t serve any purpose…

But yeah… just me thinking out loud… it’s okay to be scared lol

Not the time to get cocky. @stefanbenten just said it wasn’t the long tail cancelations, but the cancelations could have caused some people to not get high trash amounts because they didn’t get the pieces in the first place.

I’m not totally convinced by that argument, as I think my node doesn’t lose very many races, which means I should be getting large amounts of trash as well.