Current situation with garbage collection

For some reason Discourse truncated the links (even though I can still see the fragment identifier when editing the post). See section 4.1 for the inode, and section 4.3 for the directory data structure.

If I’m wrong, then I stand corrected.

The original file’s data is still untouched though :slight_smile:.

Never said the file’s contents change.

You said the original file is copied to a new directory and the original is then removed, or am I misunderstanding?

Sorry, I don’t see the word copy in my messages?

??? Unless you have the RAM to handle a 100GB file while it is read from a drive, deleted, then re-written to a new directory, that sounds like a copy to me. As I said, this is not the case. Don’t take my word for it, test it out with a 100GB file.

By remove the file from directory I meant remove the inode. By add the file to a directory I meant add the inode. The file (inode or whatever equivalent other file systems have) is the same. Sorry if this became confusing.


Then we are in agreement that only inodes get touched, not file data.
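The point above is easy to demonstrate: a minimal Python sketch (directory and file names are made up for illustration) that renames a file into another directory on the same filesystem and checks that the inode number, and therefore the file data it points to, is unchanged:

```python
import os
import tempfile

# Moving a file within one filesystem only rewires directory entries;
# the inode and the data blocks it references are untouched.
with tempfile.TemporaryDirectory() as root:
    src_dir = os.path.join(root, "blobs")
    dst_dir = os.path.join(root, "trash")
    os.makedirs(src_dir)
    os.makedirs(dst_dir)

    src = os.path.join(src_dir, "piece.dat")
    with open(src, "wb") as f:
        f.write(b"piece data")

    inode_before = os.stat(src).st_ino

    dst = os.path.join(dst_dir, "piece.dat")
    os.rename(src, dst)  # metadata-only operation, not a copy

    inode_after = os.stat(dst).st_ino
    print("same inode:", inode_before == inode_after)
```

This is also why the move is instant even for a 100 GB file: no data is read or written, only two directory entries change.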

I think the upcoming solution for trash is the least IO-expensive way for an SNO: files get moved to a per-date folder, and each folder gets deleted X days later. There is no need to rescan the entire trash folder every time you need to figure out what is older than X.
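A minimal sketch of that idea, assuming per-day folders named with ISO dates (the real node's trash layout and retention value may differ; `RETENTION_DAYS` is illustrative):

```python
import datetime
import os
import shutil
import tempfile

RETENTION_DAYS = 7  # the "X" above; illustrative value

def prune_trash(trash_root: str, today: datetime.date) -> list[str]:
    """Delete per-day trash folders older than RETENTION_DAYS.

    The date is encoded in the directory name (assumed YYYY-MM-DD here),
    so deciding what to delete never requires scanning file mtimes.
    """
    removed = []
    cutoff = today - datetime.timedelta(days=RETENTION_DAYS)
    for name in os.listdir(trash_root):
        try:
            folder_date = datetime.date.fromisoformat(name)
        except ValueError:
            continue  # not a per-day folder, leave it alone
        if folder_date < cutoff:
            shutil.rmtree(os.path.join(trash_root, name))
            removed.append(name)
    return removed

# Tiny demo: one folder past the cutoff, one within it.
root = tempfile.mkdtemp()
for name in ("2024-01-01", "2024-01-20"):
    os.makedirs(os.path.join(root, name))
gone = prune_trash(root, datetime.date(2024, 1, 24))
print(gone)  # only the folder older than the cutoff is removed
```

The cost of pruning is one `listdir` of the trash root plus one recursive delete per expired folder, instead of a stat of every trashed file.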


Don’t you guys start agreeing yet! I want another ext4 vs. zfs argument! :wink:


Why is that? I have a node which is way older, at around 13 TB or so, and it has only GBs in trash. Maybe it is because there is no space left (there is space on the drive, but not allocated to Storj), so it cannot move / delete stuff around and is stuck?
1.57 TB of trash is gigantic…

That argument was settled a long time ago: ZFS is perfect for long term storage of never changing data. ZFS does not have any sort of defragmentation, other than “create a new dataset and send everything over”. If you use ZFS for storj, you’ll see 50-70% fragmentation in the first year. If you use ext4 you’ll see 3-4% over the next 20 years.


As far as I know a node can’t stay full forever. Eventually someone somewhere deletes something and this in turn trickles down to your node. There was an idea a few months ago to mark full nodes as unhealthy and migrate some data off them, dunno if that was implemented. If that node only shows GB in trash, that means it’s not going through the blooms properly or you got really really lucky (tip: you are not that lucky :slight_smile: ).

To anyone freaking out with trash: it hasn’t even been a week of deletes. In a couple of days that trash will start being cleared up. Whether or not it will be immediately replaced with more trash is irrelevant. Everyone should get a nice cup of (decaf) coffee and wait for the nodes to clear themselves up.


Let me check my node at home; my VPN doesn't work, so I cannot check now, but I will tonight. Maybe it's more.
For the other one: I am chill, so let's see in a week's time how it develops.

No disagreement here, just bad wording leading to misunderstanding.


Turns out there is also a bug on the storage node dashboard.
Left side: ~3 740 000 000 000 bytes displayed as 3.41 TB (base 1024)
Right side: ~3 990 000 000 000 bytes displayed as 3.99 TB (base 1000)

Because the dashboard mixes the two unit bases, it looks like there are 580 GB of unpaid space. In reality the unpaid space is only around 150 GB.
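The mismatch is easy to reproduce with the approximate byte counts quoted above (the post's exact figures would give slightly different results): the left side divides by 1024⁴ but labels the result "TB", while the right side divides by 1000⁴, so subtracting the two inflates the apparent gap.

```python
# Same byte counts rendered two ways: base-1024 divisor with a "TB"
# label vs. proper base-1000 TB. Byte values are the approximations
# from the post above.
left_bytes = 3_740_000_000_000   # used space (approx.)
right_bytes = 3_990_000_000_000  # total space (approx.)

tib = left_bytes / 1024**4   # base 1024, but labelled "TB" on the dashboard
tb = left_bytes / 1000**4    # correct base-1000 TB

print(f"{tib:.2f} 'TB' (really TiB) vs {tb:.2f} TB")

# Mixing bases makes used space look ~0.34 TB smaller than it is,
# so the "unpaid" gap (total minus used) looks far larger than the
# gap computed in consistent units.
apparent_gap = right_bytes / 1000**4 - tib       # mixed bases: wrong
real_gap = (right_bytes - left_bytes) / 1000**4  # consistent units
print(f"apparent gap {apparent_gap:.2f} TB vs real gap {real_gap:.2f} TB")
```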


They stopped making reliable things that just work, like they used to.
Cars, electronics, filesystems… :sweat_smile:
The good ol' ext4 is still kicking the ass of any new kid on the block.

This freakin' dashboard again… :man_facepalming:t2: How hard can it be to implement it the right way? There are just a few fields and numbers, yet these minor bugs have made so many Spartans angry.


You are welcome to submit a pull request. You will find out quickly how hard it will be :wink:


That’s why I don’t code :sweat_smile:. I was just whining to pass the time.
