Current situation with garbage collection

For some reason Discourse truncated the links (even though I can still see the fragment identifier when editing the post). See section 4.1 for the inode, and section 4.3 for the directory data structure.

If I’m wrong, then I stand corrected.

The original file’s data is still untouched though :slight_smile:.

Never said the file’s contents change.

You said the original file is copied to a new directory and the original is then removed, or am I misunderstanding?

Sorry, I don’t see the word copy in my messages?

??? Unless you have the RAM to handle a 100GB file while it is read from a drive, deleted, then re-written to a new directory, that sounds like a copy to me. As I said, this is not the case. Don’t take my word for it, test it out with a 100GB file.

By remove the file from directory I meant remove the inode. By add the file to a directory I meant add the inode. The file (inode or whatever equivalent other file systems have) is the same. Sorry if this became confusing.


Then we are in agreement that only inodes get touched, not file data.
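The point above is easy to demonstrate: a minimal Python sketch (directory and file names are made up for illustration) that renames a file into another directory on the same filesystem and checks that the inode number, and therefore the file data it points to, is unchanged:

```python
import os
import tempfile

# Moving a file within one filesystem only rewires directory entries;
# the inode and the data blocks it references are untouched.
with tempfile.TemporaryDirectory() as root:
    src_dir = os.path.join(root, "blobs")
    dst_dir = os.path.join(root, "trash")
    os.makedirs(src_dir)
    os.makedirs(dst_dir)

    src = os.path.join(src_dir, "piece.dat")
    with open(src, "wb") as f:
        f.write(b"piece data")

    inode_before = os.stat(src).st_ino

    dst = os.path.join(dst_dir, "piece.dat")
    os.rename(src, dst)  # metadata-only operation, not a copy

    inode_after = os.stat(dst).st_ino
    print("same inode:", inode_before == inode_after)
```

This is also why the move is instant even for a 100 GB file: no data is read or written, only two directory entries change.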

I think the upcoming solution for trash is the least IO-expensive way for an SNO: files get moved to a per-date folder, and each folder gets deleted X days later. There is no need to rescan the entire trash folder every time you need to figure out what is older than X.
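A minimal sketch of that idea, assuming per-day folders named with ISO dates (the real node's trash layout and retention value may differ; `RETENTION_DAYS` is illustrative):

```python
import datetime
import os
import shutil
import tempfile

RETENTION_DAYS = 7  # the "X" above; illustrative value

def prune_trash(trash_root: str, today: datetime.date) -> list[str]:
    """Delete per-day trash folders older than RETENTION_DAYS.

    The date is encoded in the directory name (assumed YYYY-MM-DD here),
    so deciding what to delete never requires scanning file mtimes.
    """
    removed = []
    cutoff = today - datetime.timedelta(days=RETENTION_DAYS)
    for name in os.listdir(trash_root):
        try:
            folder_date = datetime.date.fromisoformat(name)
        except ValueError:
            continue  # not a per-day folder, leave it alone
        if folder_date < cutoff:
            shutil.rmtree(os.path.join(trash_root, name))
            removed.append(name)
    return removed

# Tiny demo: one folder past the cutoff, one within it.
root = tempfile.mkdtemp()
for name in ("2024-01-01", "2024-01-20"):
    os.makedirs(os.path.join(root, name))
gone = prune_trash(root, datetime.date(2024, 1, 24))
print(gone)  # only the folder older than the cutoff is removed
```

The cost of pruning is one `listdir` of the trash root plus one recursive delete per expired folder, instead of a stat of every trashed file.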


Don’t you guys start agreeing yet! I want another ext4 vs. zfs argument! :wink:


Why is that? I have a node which is way older, at around 13 TB or so, and it has only GBs in trash. Maybe it is because there is no space left (there is space on the drive, but not allocated to Storj), so it cannot move / delete stuff around and is stuck?
1.57 TB of trash is gigantic…

That argument was settled a long time ago: ZFS is perfect for long term storage of never changing data. ZFS does not have any sort of defragmentation, other than “create a new dataset and send everything over”. If you use ZFS for storj, you’ll see 50-70% fragmentation in the first year. If you use ext4 you’ll see 3-4% over the next 20 years.


As far as I know a node can’t stay full forever. Eventually someone somewhere deletes something and this in turn trickles down to your node. There was an idea a few months ago to mark full nodes as unhealthy and migrate some data off them, dunno if that was implemented. If that node only shows GB in trash, that means it’s not going through the blooms properly or you got really really lucky (tip: you are not that lucky :slight_smile: ).

To anyone freaking out with trash: it hasn’t even been a week of deletes. In a couple of days that trash will start being cleared up. Whether or not it will be immediately replaced with more trash is irrelevant. Everyone should get a nice cup of (decaf) coffee and wait for the nodes to clear themselves up.


Let me check my node at home; my VPN doesn't work, so I cannot check now, but I will tonight. Maybe it's more.
For the other one: I am chill, so let's see in a week's time how it develops.

No disagreement here, just bad wording leading to misunderstanding.


Turns out there is also a bug on the storage node dashboard.
Left side: ~3 740 000 000 000 bytes displayed as 3.41 TB (base 1024)
Right side: ~3 990 000 000 000 bytes displayed as 3.99 TB (base 1000)

Because the dashboard mixes the two unit bases, it looks like there are 580 GB of unpaid space. In reality the unpaid space is only around 150 GB.
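The mismatch is easy to reproduce with the approximate byte counts quoted above (the post's exact figures would give slightly different results): the left side divides by 1024⁴ but labels the result "TB", while the right side divides by 1000⁴, so subtracting the two inflates the apparent gap.

```python
# Same byte counts rendered two ways: base-1024 divisor with a "TB"
# label vs. proper base-1000 TB. Byte values are the approximations
# from the post above.
left_bytes = 3_740_000_000_000   # used space (approx.)
right_bytes = 3_990_000_000_000  # total space (approx.)

tib = left_bytes / 1024**4   # base 1024, but labelled "TB" on the dashboard
tb = left_bytes / 1000**4    # correct base-1000 TB

print(f"{tib:.2f} 'TB' (really TiB) vs {tb:.2f} TB")

# Mixing bases makes used space look ~0.34 TB smaller than it is,
# so the "unpaid" gap (total minus used) looks far larger than the
# gap computed in consistent units.
apparent_gap = right_bytes / 1000**4 - tib       # mixed bases: wrong
real_gap = (right_bytes - left_bytes) / 1000**4  # consistent units
print(f"apparent gap {apparent_gap:.2f} TB vs real gap {real_gap:.2f} TB")
```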


They stopped making reliable things that just work, like they used to.
Cars, electronics, filesystems… :sweat_smile:
The good ol' ext4 is still kicking the ass of any new kid on the block.

This freakin' dashboard again… :man_facepalming:t2: How hard can it be to implement it the right way? There are just a few fields and numbers, yet these minor bugs have made so many Spartans angry.


You are welcome to submit a pull request. You will find out quickly how hard it will be :wink:


That’s why I don’t code :sweat_smile:. I was just whining to pass the time.
