Trash comparison thread

well there goes that idea lol

1 Like

mind you, i did not have ingress for a couple of days because of the trash folder being this big.

Might make a difference.

Deletes and canceled uploads are not going into the trash folder.

The only pieces going into trash are the ones filtered out by the garbage collection process (and thus also the zombie segments).

Thanks a lot for your replies and for helping us figure out what goes on in our nodes!

@stefanbenten
sorry about me not understanding the context completely.

if bright explained it correctly, then the zombie segments are essentially the same as cancelled uploads, and thus wouldn't the answer simply be to delete them as well upon... disruption of the connection...

or is there a reason to keep them, in case the connection is resumed or what not...
and in that case wouldn't it just be sensible to reduce the time before deletion of these particular zombie segments... i mean it sucks when a download crashes... but it also seems kinda crazy if i can resume it in a full week... (and if one could do that, can't it be exploited...)

currently some of the 1 TB nodes have like 300-400 GB trash... going from a weekly to a daily schedule would reduce this by a factor of 7, making it a much more reasonable number...

ofc just postulating...
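To put rough numbers on the 7-day vs 1-day idea above, here is a minimal sketch. It assumes trash size scales linearly with the retention period (constant daily inflow, nothing else changes). The function name and the linear model are my own assumptions for illustration, not how the node actually accounts for trash.

```python
def trash_after_retention_change(current_trash_gb: float,
                                 current_days: float,
                                 new_days: float) -> float:
    """Estimate steady-state trash size if only the retention period changes.

    Assumption: a constant daily inflow into trash, so the steady-state
    size is inflow_per_day * retention_days.
    """
    inflow_per_day = current_trash_gb / current_days
    return inflow_per_day * new_days


# The 300-400 GB range mentioned above, going from 7 days to 1 day.
for trash_gb in (300, 400):
    estimate = trash_after_retention_change(trash_gb, current_days=7, new_days=1)
    print(f"{trash_gb} GB at 7 days -> about {estimate:.0f} GB at 1 day")
```

Under that assumption, a 1 TB node would carry a few tens of GB of trash instead of several hundred.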

1 Like

Zombie segments are easy to visualize. Let's take a file of 512 MB.
With the previously explained segment size of 64 MB, you will need to upload 8 segments to store your 512 MB file.

Let's assume it goes like this:
Segment 1 done,
Segment 2 done,
Segment 3 done,
Segment 4 done,
Segment 5 started,
User presses CTRL+C and cancels the upload.

In this case the pieces of segment 5 are deleted immediately. On the other hand, the pieces of segments 1-4 are fully transferred and settled/committed. That means the nodes store them and the satellite keeps track of those segments, even though the upload was canceled.
As of right now, the user won't see those segments in a listing operation, but actually pays for those leftover segments, which do not correspond to a file.

For exactly these scenarios, the segment-reaper searches for files that are not fully committed (we know upfront how many segments to expect; I do not want to go into more detail here) and deletes the remaining segments from the tracking system.
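The canceled-upload scenario and the reaper step can be sketched roughly like this. It is a toy model only: the `Upload` class, the `reap_zombies` function, and the in-memory list are hypothetical names made up for illustration, and the real satellite bookkeeping is certainly more involved.

```python
import math
from dataclasses import dataclass, field

SEGMENT_SIZE_MB = 64  # segment size quoted in the post above


@dataclass
class Upload:
    """Hypothetical record of one upload, for illustration only."""
    name: str
    size_mb: int
    committed: list[int] = field(default_factory=list)  # committed segment numbers

    @property
    def expected_segments(self) -> int:
        # Known upfront from the file size and the segment size.
        return math.ceil(self.size_mb / SEGMENT_SIZE_MB)

    @property
    def complete(self) -> bool:
        return len(self.committed) == self.expected_segments


def reap_zombies(uploads: list[Upload]) -> dict[str, list[int]]:
    """Drop committed segments of incomplete uploads from the tracking data.

    Returns the removed segment numbers, keyed by upload name.
    """
    reaped = {}
    for upload in uploads:
        if not upload.complete and upload.committed:
            reaped[upload.name] = upload.committed
            upload.committed = []  # satellite no longer tracks these segments
    return reaped


# 512 MB file: segments 1-4 committed, segment 5 interrupted by CTRL+C.
canceled = Upload("canceled-file", 512, committed=[1, 2, 3, 4])
finished = Upload("finished-file", 128, committed=[1, 2])

print(canceled.expected_segments)          # 8 segments expected, only 4 committed
print(reap_zombies([canceled, finished]))  # only the canceled upload is reaped
```

Note that the sketch only removes the segments from the tracking data. The pieces themselves stay on the nodes until garbage collection catches up with them.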

Shortly after, the GC process will generate the bloom filters that describe which pieces you should and should not have. Once that process has sent these filters to your node, the node will move all pieces that do not match the filter into the trash directory.
This is purely a safety measure on our side to ensure we do not accidentally delete all data because of a bug in the GC process.
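A minimal sketch of the bloom filter idea, to make the GC step more concrete. Everything here is an assumption for illustration: the filter size, the SHA-256 based hashing, the piece IDs, and the `collect_garbage` helper are invented and do not reflect the actual satellite or storage node code.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter; size and hash scheme are illustrative assumptions."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, item: str):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


def collect_garbage(stored: set[str], keep_filter: BloomFilter) -> tuple[set[str], set[str]]:
    """Split a node's pieces into (kept, trash) using the received filter."""
    kept = {piece for piece in stored if piece in keep_filter}
    return kept, stored - kept


# Satellite side: only pieces it still tracks are added to the filter.
keep_filter = BloomFilter()
for piece_id in ("piece-a", "piece-b"):
    keep_filter.add(piece_id)

# Node side: "piece-zombie" is no longer tracked by the satellite.
kept, trash = collect_garbage({"piece-a", "piece-b", "piece-zombie"}, keep_filter)
print("kept:", sorted(kept))
print("trash:", sorted(trash))
```

A Bloom filter can give false positives but never false negatives, so in this sketch a node might hold on to a little garbage for longer, but it never moves a piece that was added to the filter into trash.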

Hope this explains the different scenarios, and which ones end up in the trash folder and which do not.

5 Likes

but what's the point in keeping a partial upload (zombie segments) that already got 1 segment deleted...?

i'm a big proponent of failsafes xD

does that mean SNOs also get paid for the trash folder? because i suppose that's mainly what all the crying is about

1 Like

There is no atomic, safe way to delete the remaining, already fully committed segments with the current design. As said previously, we have a design document for how to fix this issue, but it is not yet implemented.
And no, the trash folder currently is not paid.

2 Likes

Also not really possible to pay for it. If you somehow do start paying for it I'll make sure it's always filled. :wink:

1 Like

image
image

2 Likes

at least you are getting hot coffee...

Same for me, 1/8th of storage is trash and it's growing; also, free space has gone negative.

1 Like

It's good to hear that you guys are working on resolving this issue.

Just a question: Why would the customer be charged for these zombie segments when SNOs are not being paid for them, as they're in trash?

2 Likes

As long as they are zombie segments, the customer has to pay for them and the storage nodes are getting paid. The data gets moved into the trash folder after the zombie segment reaper has deleted the zombie segment, and at that point the customer doesn't need to pay for it.

3 Likes

not sure if it's much use... a bit lower than usual
image

Why is there such a big discrepancy between SNOs for zombie segments? Isn't storj distributed? Everyone should have an equal distribution of zombie segments if this was the case. @goldyka where are you located geographically?

That's a really good post. It needs more visibility.

I'm also located in Romania

we really need a trash graph in the "disk space used this month" graph on the dashboard, just like we need daily deletes in the "bandwidth used this month" graph.

like having graphs in multiple colors that switch color when overlapping... so it's easy to see two graphs in one location...

like this two easy to read graphs in one


PS mostly storj data, tho not quite... it was just a good example

1 Like

image

It deleted some!

Already on its way to being filled again.