Current situation with garbage collection

That’s going to be so nice! Between maintenance tasks, occasional ISP reconnects, and just regular updates, it feels like the used-space filewalker gets interrupted quite often when nodes restart. I don’t care if it loses some precision, as long as it isn’t starting from scratch all the time.

2 Likes

Exactly. Now all of the previous segments that reference that changed letter need to be deleted, because the erasure coding no longer matches. Your node got 1 new piece and has to delete its 1 old piece. We may be arguing semantics, and I’m all for it, but essentially your ZFS array has to allocate space for the 1 new piece, wait for the bloom filter to come (the 1 old piece is still held on your node), then delete that 1 old piece. As in: copy the changed file, wait for the copy to complete, then delete the old file.

This is great to hear thank you for following up on this.

1 Like

That is great news indeed. Though the last one I received was also 4100003 bytes. How come that one was received without issue then?

1 Like

There was a deployment in between. The new code will try to send any bloom filter that is bigger than the magic number the new way, and your node can’t deal with that.

2 Likes

So for the adventurous ones among us, does that mean that using the latest versions (higher than what is currently being deployed) will use the new blooms properly?

One of my Windows nodes is on 1.101.2. Looking good so far…

PS C:\Users\Administrator> cat "C:\Program Files\Storj7\Storage Node\storagenode.log" | sls "bloom"

2024-04-05T15:58:34+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode", "createdBefore": "2024-04-01T17:59:59Z", "bloomFilterSize": 2675}
2024-04-06T10:03:27+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "createdBefore": "2024-03-27T17:59:59Z", "bloomFilterSize": 192700}
2024-04-07T04:22:43+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "createdBefore": "2024-04-02T17:59:59Z", "bloomFilterSize": 41194}


PS C:\Users\Administrator> cat "C:\Program Files\Storj7\Storage Node\storagenode.log" | sls "gc-filewalker"

2024-04-05T15:58:34+02:00       INFO    lazyfilewalker.gc-filewalker    starting subprocess     {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-04-05T15:58:34+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess started      {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-04-05T15:58:34+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess Database started        {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode"}
2024-04-05T15:58:34+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Process": "storagenode", "createdBefore": "2024-04-01T17:59:59Z", "bloomFilterSize": 2675}
2024-04-05T15:58:59+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "piecesCount": 7070, "piecesSkippedCount": 0, "Process": "storagenode"}
2024-04-05T15:58:59+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess finished successfully        {"satelliteID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2024-04-06T10:03:27+02:00       INFO    lazyfilewalker.gc-filewalker    starting subprocess     {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-04-06T10:03:27+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess started      {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-04-06T10:03:27+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess Database started        {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode"}
2024-04-06T10:03:27+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "createdBefore": "2024-03-27T17:59:59Z", "bloomFilterSize": 192700}
2024-04-06T10:45:27+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Process": "storagenode", "piecesCount": 724411, "piecesSkippedCount": 0}
2024-04-06T10:45:27+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess finished successfully        {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-04-07T04:22:43+02:00       INFO    lazyfilewalker.gc-filewalker    starting subprocess     {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-04-07T04:22:43+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess started      {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2024-04-07T04:22:43+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess Database started        {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode"}
2024-04-07T04:22:43+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker started   {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "createdBefore": "2024-04-02T17:59:59Z", "bloomFilterSize": 41194}
2024-04-07T04:26:28+02:00       INFO    lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Process": "storagenode", "piecesCount": 94601, "piecesSkippedCount": 0}
2024-04-07T04:26:28+02:00       INFO    lazyfilewalker.gc-filewalker    subprocess finished successfully        {"satelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
1 Like

Negative. The idea was that the satellite sends the bloom filter the old or the new way depending on the size. The new way should never be used unless we increase the maximum bloom filter size. Before we do that we need to roll out the new storage node version to as many nodes as possible, and we also need to test the new way of sending bloom filters.

So for now the bloom filters are still limited to 4,100,003 bytes regardless of how big your storage node is. We can fix that without having to wait for a storage node rollout: the satellite can still send a filter of that size the old way. That buys us a bit of time. It’s better to run a bloom filter with a high false positive rate (which is what bigger storage nodes will get) than to run no bloom filter at all.
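
To illustrate the decision described above (purely a sketch, not the actual satellite code; only the 4,100,003-byte limit comes from this thread, everything else is made up for illustration):

// Hypothetical sketch of the size-based decision described above; the
// constant matches the limit quoted in this thread, everything else is
// illustrative and not the real satellite code.
package main

import "fmt"

// maxSingleMessageSize is the "magic number" from the thread: bloom filters
// up to this size can still be sent the old way in a single message.
const maxSingleMessageSize = 4100003

func sendBloomFilter(filter []byte) {
	if len(filter) <= maxSingleMessageSize {
		// Old path: one message, understood by all node versions.
		fmt.Printf("sending %d bytes the old way\n", len(filter))
		return
	}
	// New path: only safe once enough nodes run a version that supports it.
	fmt.Printf("sending %d bytes the new way\n", len(filter))
}

func main() {
	sendBloomFilter(make([]byte, 192700))  // small filter -> old way
	sendBloomFilter(make([]byte, 5000000)) // oversized filter -> new way
}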

2 Likes

Please stop asking me for sources. You can use Google, DuckDuckGo, or Bing.

1 Like

“~70%” (direct quote) is closer to “80%” (direct quote) than to “50%” (direct quote).

In my opinion Storj is the best case for ZFS. It is always complete files. Those files are always small, and with the default blocksize the majority of files are just one block. Even if ZFS is reporting high fragmentation, there should be a good chance that new files don’t get fragmented.

1 Like

Wait a second. Does that mean that if a node is offline when the filter is sent, the associated garbage remains forever? Or do I misunderstand?

No, it is still a bloom filter. It is kind of the opposite: instead of a list of pieces to delete, the bloom filter allows your node to check which pieces it has to keep on disk, and it can trash any piece that does not match the bloom filter. So if you skip one bloom filter, the next one will just clean up the garbage that wasn’t removed by the previous run.
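
A minimal sketch of that keep-or-trash logic, assuming a toy bloom filter (FNV-based double hashing) and made-up piece IDs; this is not the real storagenode code:

// Toy illustration of garbage collection with a bloom filter: the filter
// encodes "pieces to KEEP"; anything that does not match can be trashed.
package main

import (
	"fmt"
	"hash/fnv"
)

type bloom struct {
	bits []byte
	k    uint32
}

func newBloom(sizeBytes int, k uint32) *bloom {
	return &bloom{bits: make([]byte, sizeBytes), k: k}
}

// hashes derives two independent hash values for double hashing.
func (b *bloom) hashes(id string) (uint32, uint32) {
	h1 := fnv.New32a()
	h1.Write([]byte(id))
	h2 := fnv.New32()
	h2.Write([]byte(id))
	return h1.Sum32(), h2.Sum32()
}

func (b *bloom) set(i uint32)      { b.bits[(i/8)%uint32(len(b.bits))] |= 1 << (i % 8) }
func (b *bloom) get(i uint32) bool { return b.bits[(i/8)%uint32(len(b.bits))]&(1<<(i%8)) != 0 }

func (b *bloom) Add(id string) {
	h1, h2 := b.hashes(id)
	for i := uint32(0); i < b.k; i++ {
		b.set(h1 + i*h2)
	}
}

// MaybeContains can return false positives, but never false negatives:
// a piece the node should keep is never reported as garbage.
func (b *bloom) MaybeContains(id string) bool {
	h1, h2 := b.hashes(id)
	for i := uint32(0); i < b.k; i++ {
		if !b.get(h1 + i*h2) {
			return false
		}
	}
	return true
}

func main() {
	// The satellite builds the filter from the pieces the node SHOULD have.
	keep := newBloom(1024, 5)
	keep.Add("piece-A")
	keep.Add("piece-B")

	// The node walks what it actually stores and trashes non-matches.
	for _, id := range []string{"piece-A", "piece-B", "piece-garbage"} {
		if keep.MaybeContains(id) {
			fmt.Println("keep ", id)
		} else {
			fmt.Println("trash", id)
		}
	}
}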

Same for the false positive rate. A normal node will delete 90% of the garbage, but 10% will be a false positive match and the node keeps it on disk. The next bloom filter removes another 90% of the leftover garbage. That leaves 1% of the original garbage, plus 10% of any new garbage as fresh false positives. One week later we are looking at 0.1%, and so on. Even with a 10% false positive rate it should get cleaned up quickly.

That’s for normal nodes. Things change when the bloom filter isn’t big enough to keep the false positive rate at 10%.
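
As a rough back-of-the-envelope check of that decay, assuming a constant 10% false positive rate (illustrative numbers only):

// Leftover garbage after each GC run with a constant 10% false positive
// rate: each run keeps only the false-positive fraction of what was left.
package main

import "fmt"

func main() {
	leftover := 1.0 // 100% of the original garbage
	for run := 1; run <= 4; run++ {
		leftover *= 0.10
		fmt.Printf("after run %d: %.4f%% of the original garbage remains\n",
			run, leftover*100)
	}
	// With an undersized filter the false positive rate climbs well above
	// 10% and this geometric cleanup slows down accordingly.
}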

6 Likes

Thank you, yes, this was my understanding of how the bloom filter is used here, but I was confused by the discussion of re-sending the missed filter. What would be the point of resending it then, if the next one should do the job?

Since the delete-by-bloom-filter approach is such a mess, I wonder why you are doing it this way? From my point of view as an SNO I can’t see any advantage over direct deletes.

1 Like

The bigger nodes haven’t received a bloom filter in a month now. Sending them the missed bloom filter now cuts the time it takes until the space gets freed up. And since that bloom filter was already expensive to generate, we might as well send it for free instead of waiting for the next expensive bloom filter generation. We also get faster feedback on whether the fix worked; waiting for the next bloom filter could set us back another week.

4 Likes

How is your node going to receive a direct delete while it is offline? → We need a fallback process for this anyway. And it turns out bloom filters are amazing, so we might as well use them as the primary solution to delete any garbage.

Direct deletes have been disabled because they don’t work well with server-side copy. Let’s say the customer uploads segment A and makes a copy of it, then deletes segment A. In this case the satellite is not allowed to send direct deletes, because that would wipe the copy as well: two segments in the satellite database point to the same set of nodes and pieceIDs. Only after the last copy gets deleted is it safe to delete from the storage nodes. Garbage collection handles this with ease, while with direct deletes it gets quite expensive to find the last copy of a segment.
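
A stripped-down illustration of why direct deletes are unsafe here: two segments can reference the same pieces, so a piece may only be removed once nothing references it anymore. The data model below is hypothetical, not the satellite schema:

// Hypothetical illustration of the server-side copy problem: segment A and
// its copy point at the same pieces, so deleting A must not touch the
// pieces while the copy still references them.
package main

import "fmt"

type segment struct {
	name     string
	pieceIDs []string // shared with copies made via server-side copy
}

// refCount counts how many remaining segments still reference a piece.
// Doing this scan for every delete is the expensive part; garbage
// collection sidesteps it because a still-referenced piece simply stays
// in the "keep" bloom filter.
func refCount(segments []segment, pieceID string) int {
	n := 0
	for _, s := range segments {
		for _, id := range s.pieceIDs {
			if id == pieceID {
				n++
			}
		}
	}
	return n
}

func main() {
	shared := []string{"piece-1", "piece-2"}
	segments := []segment{
		{name: "A", pieceIDs: shared},
		{name: "copy-of-A", pieceIDs: shared},
	}

	// Customer deletes segment A; the copy survives.
	segments = segments[1:]

	for _, id := range shared {
		if refCount(segments, id) == 0 {
			fmt.Println("safe to direct-delete", id)
		} else {
			fmt.Println("must NOT delete", id, "- still referenced by a copy")
		}
	}
}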

2 Likes

It cannot receive a bloom filter either when it is offline. The only difference would be to issue another direct delete instead of adding it to a bloom filter again.

Regarding the server-side copy issue… this does not seem to be a node-related problem and should be addressed where it happens.

So you would like the satellite to keep a list of billions of direct deletes that couldn’t be sent and retry a few times? When should the satellite give up cleaning up that backlog? When should it assume the node is not going to come back and just drop the delete message?

Oh, it will be a problem for the storage nodes. Server-side copy is a requirement for some of the recent customers. Would you like to get some usage from these customers, or should we tell them to go away and store the data somewhere else?

1 Like

How does the satellite know what pieces to include in the bloom filter? Isn’t this a much bigger list?