Trash does not go away in 7 days

striker43 · May 3, 2024, 6:34pm

Yes, if you don’t run the used-space-filewalker, the node still thinks it has a huge amount of trash, even if your trash folder is empty. Once it has run, it updates all the values and shows 0.

Mitsos · May 3, 2024, 9:26pm

Allegedly. On a couple of my nodes, the gap between df and the node-reported space is skyrocketing. I’m monitoring this across my other nodes. For example, a node that df previously showed at 98% used is now at 83% used, yet the node still thinks there are only 494.46MB free. The used-space FW was run when this node was upgraded on April 24th.

/dev/sdb1                     3.6T  3.0T  646G  83% /mnt/node16

Yeah, I get the overhead from the underlying storage cluster size, but we aren’t talking about a GB of difference here. This node was previously reporting free space pretty accurately, with no changes made to it. It is 100% related to the (edit: automatic) trash deletion, but as Powell (has always) said, “we need more data to support this”.

Edited (yet again) for full disclosure:

du -sh /mnt/node16/storagenode/storage/trash
191G    /mnt/node16/storagenode/storage/trash

Screenshot_20240504_005314
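
The mismatch described in this post can be put into numbers. The following is a minimal sketch, not part of the storagenode software: the function names are hypothetical, the 646G from df is assumed to be GiB, and the 494.46MB from the dashboard is assumed to be decimal megabytes.

```python
import shutil


def free_space_gap(fs_free_bytes: int, node_free_bytes: int) -> int:
    """Bytes the filesystem reports as free beyond what the node believes is free."""
    return fs_free_bytes - node_free_bytes


def check_mount(mount_point: str, node_free_bytes: int) -> int:
    """Compare the live filesystem free space with a node-reported value.

    node_free_bytes has to be read off the dashboard by hand; this sketch
    does not query the node.
    """
    # shutil.disk_usage returns a (total, used, free) tuple in bytes.
    fs_free = shutil.disk_usage(mount_point).free
    return free_space_gap(fs_free, node_free_bytes)


if __name__ == "__main__":
    # Figures quoted in the post: 646G available per df, 494.46MB per the node.
    gap = free_space_gap(646 * 1024**3, 494_460_000)
    print(f"{gap / 1024**3:.1f} GiB free on disk that the node does not see")
```

A gap of a few GB could be explained by cluster-size overhead; a gap of hundreds of GB points at stale accounting instead.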

Mitsos · May 3, 2024, 11:02pm

The allegations are true!
Screenshot_20240504_015416

Obviously the solution will be to constantly run the used-space FW. Can a future update take care of that?
I understand that the following show-stopping bug is the highest priority (rumor has it all company vacation has been recalled and there will be a meeting first thing Monday morning with everybody physically present, from the janitor to the CEO, to discuss how to fix it), but can someone please allocate some time to continuously running the used-space FW? I’m not expecting an ETA this year, maybe 5-6 years down the line, it’s not too serious.

Screenshot_20240504_015655

For those that are obviously humans, the “version” padding is off, it needs to be centered above the version number. If I had caused that bug, I would have handed in my resignation already, it’s very serious and extremely high priority, hence Monday’s morning meeting.

Mitsos · May 4, 2024, 12:43am

Re-ran the used-space FW on another node. Old free space was ~500MB. New free space is 1.91TB.

And again: used-space FW was run when these nodes were upgraded last week or so.

snorkel · May 4, 2024, 2:13am

It seems that those recursive dirs have been deleted. I don’t see them on my nodes anymore. And no dirs older than 7 days either. I didn’t do anything manually, just let the software do its thing.

Ambifacient · May 4, 2024, 3:37am

For me the trash is removed properly, but the trash reported space is not updated. My trash growth has been monotonic for the past month, but the true value is much lower. Right now to avoid doing the full used-space filewalker I just run du -sb in the trash folder and update the databases manually.
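
For anyone who wants the same number without du, here is a minimal sketch that walks the trash folder and sums the file sizes. The function name is hypothetical, and the result can differ slightly from du -sb, which also counts the apparent size of the directories themselves. It only measures; writing the value into the node's databases is left out on purpose, since the schema is not described in this thread.

```python
import os


def trash_size_bytes(trash_dir: str) -> int:
    """Sum the apparent sizes of all files under trash_dir.

    Symlinks are not followed; a symlink counts with its own size.
    """
    total = 0
    for root, _dirs, files in os.walk(trash_dir):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except FileNotFoundError:
                # The file was deleted between listing and stat,
                # e.g. by a trash cleanup running at the same time.
                continue
    return total


if __name__ == "__main__":
    import sys

    # Usage: python trash_size.py /path/to/storage/trash
    print(trash_size_bytes(sys.argv[1]))
```

This still touches every file in trash, but only in trash, so it is much cheaper than walking the whole blobs folder.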

thelastspark · May 4, 2024, 3:42am

Just to confirm: if we are okay leaving a messed-up trash folder, will you guys be getting around to making it clean itself up?

Or will it always require a manual deletion step from SNOs (similar to unlisting the satellites a short while ago)?

Mark · May 4, 2024, 5:39am

Same problem on my node. The used-space database (for trash) did not update after the trash was deleted. That caused my node to keep thinking it was full when it actually had space available, which prevented new ingress. In my case, I re-enabled the filewalker in the config file and restarted the node to activate it. Next time, I might just edit the DB like you did. I don’t enjoy wasting additional electricity and wearing out the drive faster by walking millions of files just so the node can realize that the trash folder is empty…
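
Why a stale trash value blocks ingress can be shown with a toy model. This is a sketch under an assumption suggested by the symptoms in this thread, not taken from the storagenode code: the node is assumed to compute free space as allocated space minus stored pieces minus the recorded trash value. The function name and all numbers are hypothetical.

```python
def reported_free_bytes(allocated: int, used: int, recorded_trash: int) -> int:
    """Free space as the node would see it under the assumed accounting model."""
    return max(allocated - used - recorded_trash, 0)


if __name__ == "__main__":
    GB = 10**9
    allocated = 3_500 * GB  # hypothetical allocation
    used = 2_800 * GB       # hypothetical size of stored pieces

    # The database still records 700 GB of trash that was already deleted.
    print(reported_free_bytes(allocated, used, 700 * GB))

    # After a resync the recorded value matches the nearly empty folder.
    print(reported_free_bytes(allocated, used, 5 * GB))
```

With the stale value the node sees no free space and refuses uploads, even though the disk has hundreds of GB available.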

snorkel · May 4, 2024, 8:39am

So now the databases don’t update? This trash is an endless problem…

Alexey · May 5, 2024, 3:58am

11 posts were split to a new topic: A new logo for storagenode

kocoten1992 · May 4, 2024, 8:48am

Could we have some sort of command to resync it?

./storagenode doctor:trash, ./storagenode doctor:trash:resync … etc? I realize there is no way to have an ACID transaction between a database and a filesystem, and sometimes power loss happens. Wouldn’t providing a utility command like that, so node operators can schedule the fix themselves, be the best way to move forward?

00riddler · May 4, 2024, 9:55am

That’s what I don’t understand.
Why does trash deletion not automatically update the database for the used disk space?
Why do we need to run an additional used-space FW for that reason?

Mitsos · May 4, 2024, 10:23am

It used to update it, now it doesn’t.

snorkel · May 4, 2024, 10:39am

Too many code changes, too little time, not enough testing… Now we do the testing.

Alexey · May 4, 2024, 11:41am

Perhaps, unfortunately I cannot check this anymore.

Alexey · May 4, 2024, 11:43am

I would expect the pieces:trash chore to update the databases like any other chore (if it has finished, of course).

Alexey · May 4, 2024, 11:47am

Then perhaps you need to check our vacancies? Because we treat people with respect. This bug only means that we need to create a test after the fix, not fire people. Otherwise, how would they grow?

Alexey · May 4, 2024, 11:49am

I think it should be fixed in coming releases.

Mitsos · May 4, 2024, 11:50am

The chores are finishing successfully, as shown in the log output I have already provided a few replies back. The fact of the matter is that out of all the new “improvements” to the storagenode, none are currently working as expected.

I didn’t say fire people. I said that the version padding being off is a high-priority-show-stopping bug, which if I had caused, I would have handed in my resignation.

Alexey · May 4, 2024, 11:52am

I still don’t get it, sorry. Perhaps it’s sarcasm, which I don’t understand even in my native language…

However, if the trash chore didn’t update the databases after a successful run, I would call it a bug.
Could you please submit an issue on our GitHub?