GC into old trash date folder?

Was something changed?
I see a Bloom filter with file system date 12th of September. It is my understanding that that was the day it arrived on the node.
There is no other Bloom filter for this satellite.
But I see that garbage collection is happening into the date folder 2024-08-29, which is 14 days ago. Shouldn’t a date folder have been created with the date 12th September, and garbage collection happen into that folder rather than into a date folder that is 2 weeks old?

ls -l  /config/retain/
-rw-r--r-- 1 root root    5676 Sep 12 21:30 pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa-1725415045536926000.pb
-rw-r--r-- 1 root root   96732 Sep 12 21:35 qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa-1726077599555265000.pb
-rw-r--r-- 1 root root 3618783 Sep 12 16:46 ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa-1725818399995868000.pb
-rw-r--r-- 1 root root  670313 Sep 12 03:11 v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa-1725991199962559000.pb

ls /trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
2024-08-27  2024-08-29

ls trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29/xc/ | wc -l
4895

ls trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29/xc/ | wc -l
4896

ls trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29/xc/ | wc -l
5084

Is there something wrong?

What is the date inside BF?

How do I check this?

In your logs, retain prints the date before which data must have been created for the filter to be applied.

Unfortunately it is no longer in the logs.
Does it work that way? Can a Bloom filter from the 12th tell the node to collect garbage into such an old date folder?
Because such an old folder would be up for deletion by the trash chore immediately, and that would mean no 7-day non-deletion period for the collected trash.
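
If the log line is gone, one possible cross-check is the retain file name itself. The sketch below assumes that the numeric suffix before .pb is a Unix timestamp in nanoseconds; that is a guess from the file names listed above, not something confirmed in this thread, and the helper name is made up for illustration.

```python
from datetime import datetime, timezone


def retain_file_timestamp(filename: str) -> datetime:
    """Parse '<satellite-id>-<number>.pb' into a UTC datetime.

    Assumption: <number> is a Unix timestamp in nanoseconds.
    """
    stem = filename.rsplit(".", 1)[0]      # drop the '.pb' extension
    nanos = int(stem.rsplit("-", 1)[1])    # digits after the last '-'
    seconds = nanos // 1_000_000_000       # keep whole seconds only
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


name = "pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa-1725415045536926000.pb"
print(retain_file_timestamp(name).isoformat())
```

For the file name above this prints 2024-09-04T01:57:25+00:00. Whether that value is the date the node actually uses for the filter is exactly the part that would need confirming from the logs.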

I am just trying to figure out what’s going on with your node, because my nodes put their trash in date-named folders where the date in the name matches the date when the BF started to be processed.
And when the node was restarted a day later due to an upgrade, it created a new folder named with the new date while still processing the BF it had started the day before.
Is it possible that your BF, which started on 2024-08-29, is still processing?
You can check via the /mon/ps endpoint on your debug port.
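
A rough sketch of that check. The /mon/ps path is the one named above. Everything else is an assumption for illustration: the debug address 127.0.0.1:5999 is a placeholder for whatever your node is configured with, and filtering the output for the word "retain" is only a guess at how a running Bloom filter would show up.

```python
from urllib.request import urlopen


def mon_ps_url(debug_addr: str) -> str:
    """Build the URL of the /mon/ps endpoint for a given host:port."""
    return f"http://{debug_addr}/mon/ps"


def lines_mentioning_retain(ps_text: str) -> list:
    """Return the lines of the /mon/ps output that mention 'retain'."""
    return [line for line in ps_text.splitlines() if "retain" in line.lower()]


def fetch_retain_lines(debug_addr: str) -> list:
    """Fetch /mon/ps from the debug port and keep only retain-related lines."""
    with urlopen(mon_ps_url(debug_addr), timeout=10) as resp:
        return lines_mentioning_retain(resp.read().decode("utf-8", errors="replace"))


# Example call (placeholder address, replace with your own debug address):
# print("\n".join(fetch_retain_lines("127.0.0.1:5999")))
```

An empty result would suggest that no retain process is currently running, under the assumptions above.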

This is what I have seen on other nodes as well: BF date → start process → date folder = start date of processing.
That’s why I asked whether something has changed: perhaps when a new Bloom filter arrives while an older one is still processing, the new behavior is that it gets merged with the existing one and processing keeps going, or something like that.
But I have really no idea. I could restart the node and see what happens.

I do not think there has been a change to merge BFs for different dates; it’s too dangerous to merge them. The best course of action is to replace the unprocessed BF with the newly arrived one, and this is how it works now.

This would happen:

Actually, that means that this half-processed data would be removed one day later.

Well, if I restart the node now, my guess is that it would resume the GC but create a new trash date folder with today’s date, 2024-09-13. So normally it should immediately start deleting the date folder 2024-08-29, and the date folder 2024-09-13 would only be deleted after the 20th of September.
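
The date arithmetic behind that guess, as a small sketch. It assumes the 7-day retention discussed in this thread and that a folder becomes eligible for deletion once its name plus 7 days has been reached. The exact comparison the node uses is an assumption, and the function names are made up.

```python
from datetime import date, timedelta

RETENTION_DAYS = 7  # the 7-day non-deletion period mentioned in this thread


def earliest_deletion(folder_name: str, retention_days: int = RETENTION_DAYS) -> date:
    """Date on which a trash folder named YYYY-MM-DD would become eligible."""
    return date.fromisoformat(folder_name) + timedelta(days=retention_days)


def is_expired(folder_name: str, today: date, retention_days: int = RETENTION_DAYS) -> bool:
    """True if the folder is already past its retention period on 'today'."""
    return today >= earliest_deletion(folder_name, retention_days)


today = date(2024, 9, 13)
for folder in ("2024-08-29", "2024-09-13"):
    print(folder, earliest_deletion(folder), is_expired(folder, today))
```

Under these assumptions, 2024-08-29 has been eligible since 2024-09-05, while 2024-09-13 would be kept until 2024-09-20.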

Edit:
Just a little update. The node has finished gc-ing into xc and xd and is currently doing xe:

ls /trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29/xe | wc -l
3507

So garbage collection into the old trash date folder is ongoing. I wonder if this could have something to do with expired pieces that had not been collected?
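
Instead of running ls | wc -l for one prefix folder at a time, the same count can be taken for every prefix folder in one pass. This is only a convenience sketch; the function name is made up, and the path in the comment is the one from the commands above.

```python
import os


def count_per_prefix(date_dir: str) -> dict:
    """Count directory entries in each prefix subfolder (xc, xd, ...) of a trash date folder."""
    counts = {}
    with os.scandir(date_dir) as entries:
        prefix_dirs = sorted((e for e in entries if e.is_dir()), key=lambda e: e.name)
    for prefix in prefix_dirs:
        with os.scandir(prefix.path) as pieces:
            counts[prefix.name] = sum(1 for _ in pieces)
    return counts


# Example call with the path used in this thread:
# for name, n in count_per_prefix(
#     "/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29"
# ).items():
#     print(name, n)
```

Running it twice a few minutes apart shows which prefix folders are growing (GC moving pieces in) and which are shrinking (trash being deleted).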

Exactly. At least that is what happened with my node on that date.

Yes, it will do, see

More likely, it’s related to a very slow garbage collection process on that node.

But the question is still the same: Does it make sense for a Bloom filter from 12th September to trash pieces into a 29th August date folder?
If it is assured that the pieces can be deleted without the 7-day non-deletion period, then it is fine.
If the non-deletion period is required, then this sounds like a failure to honor it.

You could just restart the node and it would process the remaining pieces into another folder, which would be deleted 7 days later.
I do not think it’s worth it. Just let it finish its work.
Moreover, it is in your interest that it continues to trash into the older folder, because that folder will likely be deleted at the same time.

I don’t think I will do that now because interesting things are happening:
The node is deleting trash in the same date folder into which it is currently garbage collecting. It is currently deleting the contents of the "it" folder while it is moving garbage into the "xw" folder.
This could get interesting.

As I understand it, it’s now working as a TTL collector with an additional move call.

I can see that the folders it is currently deleting have last modified date around September 2nd.
So there must have been a Bloom filter around that date and before 12th September that trashed the garbage in there.
It looks like 2 different Bloom filters were set to trash into the same date folder.
I don’t know if this is intentional. It sounds weird.
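
To see at a glance which prefix folders were last touched on which day, something like the following could group them by modification date. It is a sketch with a made-up function name. Note that a directory’s modification time changes whenever entries are added to or removed from it, so a folder that is currently being deleted from will show a recent date.

```python
import os
from collections import defaultdict
from datetime import datetime, timezone


def prefixes_by_mtime_date(date_dir: str) -> dict:
    """Group prefix subfolders of a trash date folder by their last-modified date (UTC)."""
    groups = defaultdict(list)
    with os.scandir(date_dir) as entries:
        for entry in entries:
            if entry.is_dir():
                mtime = datetime.fromtimestamp(entry.stat().st_mtime, tz=timezone.utc)
                groups[mtime.date().isoformat()].append(entry.name)
    # Sort days and folder names so the output is stable
    return {day: sorted(names) for day, names in sorted(groups.items())}


# Example call with the path used in this thread:
# print(prefixes_by_mtime_date(
#     "/trash/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/2024-08-29"
# ))
```

Two clearly separated groups of dates would support the idea that two different runs wrote into the same date folder.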

As far as I understand, the date folder is created when the BF starts processing, and pieces are moved there. If you didn’t increase the retain concurrency, then reusing the same dated folder is only possible if the first BF completed before 2024-09-03T00:00:00Z.
So I do not think it’s possible that the BF dated before 2024-09-12 could use the 2024-09-02 folder. By the way, you can also track it with BrightSilence’s script.