Trash does not go away in 7 days

No, it does not; the file's modification time stays the same, because moving it doesn't modify its contents. (I tested this before posting the previous reply, because I thought I was going mad reading all of this.)
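For anyone who wants to reproduce this themselves, a minimal sketch (hypothetical paths, GNU coreutils assumed):

# create a file in one directory, note its mtime, then move it elsewhere
mkdir -p /tmp/src /tmp/dst
touch /tmp/src/piece.sj1
stat -c 'mtime: %y  %n' /tmp/src/piece.sj1
mv /tmp/src/piece.sj1 /tmp/dst/
stat -c 'mtime: %y  %n' /tmp/dst/piece.sj1   # same timestamp as before the move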

To confirm that trash does indeed get cleaned up: I had some folders in trash that were deleted a few minutes past 00:00 UTC last night (I was waiting for that, just to double-check I'm not going mad).

So, to sum this topic up and hopefully stop the hourly topics on the subject: on bare-metal Linux, running version 1.102.3, the date folders have been verified to be deleted as scheduled.

I checked my nodes and noticed that the trash is being deleted, but very slowly (some nodes are still deleting 2024-04-16). Slow removal is okay for me, but I still don't understand why, after removing all the files, the trash filewalker kept the directories (I mean all the "date/prefix" directories). Is this expected behavior? It's really confusing, because without checking manually it's impossible to tell whether the trash is actually empty. (Ubuntu, Docker, 1.101.3)
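A rough way to check for those leftover empty directories, if anyone wants to compare (the trash path below is hypothetical; adjust it to your node's storage location):

# count empty date/prefix directories left behind under trash
find /path/to/storage/trash -mindepth 2 -type d -empty | wc -l
# list a few of them to see where they sit
find /path/to/storage/trash -mindepth 2 -type d -empty | head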

1 Like

Did you actually check your claim with the scripts above?
If not, please do it.

Because we supposedly have a bug here related to removing empty directories.

2 Likes

I checked it with ls, which shows the modification time by default. I moved a file that was created many months ago to a different directory and back. I also see this when manually updating my nodes: the executable shows a modification time a few days in the past after I unzip it and move it to the proper directory.

Unzipping?!
Pieces are not zipped in the first place. What did you check then?

Alexey, please re-read my comment. I never said I checked zipped trash pieces.

Yes, I have re-read it, but zip is not a format we use for anything, so it has no special meaning here. This is yet another discrepancy to sort out.

The agreement between us is that moving a file from one directory to another changes neither its creation date nor its modification date.

In Storj terms:
Files moved into trash will show their original creation date (= modification date), and that is what all the scripting in this thread shows (there are commands shown that explicitly look for files). Those files do not interest us at all (as I have previously said).

What does interest us is whether there are date folders (or any other folders) left behind long after their scheduled deletion. As I have shown, that is not the case: all folders under trash are being deleted as scheduled.

So yes, we are in agreement that trash is being cleaned exactly as planned and that there is nothing wrong with the trash deletion process.
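If anyone wants to run the same check on their own node, a rough sketch (the trash path is hypothetical; assumes GNU date and the new layout where the date-named folders sit directly under each satellite folder, so the folder name itself carries the date):

# list date-named trash folders older than the 7-day retention
cutoff=$(date -u -d '7 days ago' +%F)
find /path/to/storage/trash -mindepth 2 -maxdepth 2 -type d -name '20??-??-??' \
  | awk -F/ -v cutoff="$cutoff" '$NF < cutoff'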

1 Like

I am posting an ls output from one of my nodes that may have been restarted during the migration to the new date-named folders and still shows a mix of date-named and the old prefix-named folders. I will post the output of them being cleaned later tonight (if I am awake), or tomorrow as soon as possible, to show that those folders will be deleted exactly as planned as well.

drwx------ 1026 node8 node8  20K Jun 30  2022 2024-04-18
drwx------ 1026 node8 node8  20K Apr 25 09:10 2024-04-25
drwx------    2 node8 node8 4.0K Apr 18 16:52 22
drwx------    2 node8 node8 4.0K Apr 18 16:52 23
drwx------    2 node8 node8 4.0K Apr 18 16:53 24
>>>> some lines removed due to max posting limit <<<<
drwx------    2 node8 node8 4.0K Apr 18 16:52 zx
drwx------    2 node8 node8 4.0K Apr 18 16:52 zy
drwx------    2 node8 node8 4.0K Apr 18 16:52 zz

As promised, here is the output of ls from the same node and satellite:

drwx------ 1026 node8 node8  20K Apr 25 09:10 2024-04-25
drwx------    2 node8 node8 4.0K Apr 18 16:52 22
drwx------    2 node8 node8 4.0K Apr 18 16:52 23
drwx------    2 node8 node8 4.0K Apr 18 16:53 24
>>>> some lines removed due to max posting limit <<<<
drwx------    2 node8 node8 4.0K Apr 18 16:52 zx
drwx------    2 node8 node8 4.0K Apr 18 16:52 zy
drwx------    2 node8 node8 4.0K Apr 18 16:52 zz

The date-named folder 2024-04-18 was deleted automatically 8 days later (26 - 18 = 8).
The prefix folders should be deleted with tonight’s run (because their dates do not YET match the requirements).
Below is the Relatively Exact™ time the node got updated:

2024-04-18 19:34:22	1.101.3
2024-04-18 19:19:22	1.99.3

I will post the output of the prefix folders being deleted as well.

1 Like

Could you please explain it to me? I do not understand what you mean here by "unzipping", sorry…
We do not use any zipping anywhere, so it's confusing to me…

Downloading storagenode_linux_amd64.zip from GitHub, unzipping it, and moving the resulting storagenode executable to the proper (for my setup) directory before restarting the node.

The modification time of that storagenode executable is days earlier than the date I moved it, which shows that moving a file from one directory to another leaves its modification date unchanged.

More details:
-rwxr-xr-x 1 root root 37M Apr 16 13:07 node1 (I rename storagenode to the node ID for my own use)
This node was updated 2024-04-24 00:19:15 to 1.102.3 (i.e. I moved that node1 executable on the 24th of April; its current modification time is the 16th of April).

EDIT: Oh yes, I forgot that even renaming it doesn't change its modification time (as it shouldn't: the mv command on Linux = renaming).

Oh, now I get it. You mean unzipping the storagenode binary release, not zipping/unzipping pieces…

3 Likes

I have similar experiences; my trash is much older than 7 days.

drwx------ 3 root root 4096 Apr 18 06:01 
drwx------ 3 root root 4096 Apr 18 06:01 
drwx------ 3 root root 4096 Apr 18 06:01 
drwx------ 2 root root 4096 Apr 26 08:10 
drwx------ 3 root root 4096 Apr 26 08:13 
drwx------ 4 root root 4096 Apr 27 09:28 
drwx------ 4 root root 4096 Apr 26 08:07 

The node has been stuck for >8 days with 1.64 TB of trash on it (from the 18th of April);
it seems this behaviour was first seen with v1.101.3.

After the watchtower update from 1.99 to 1.101 I found tons of the following logs:

2024-03-27T16:00:22Z	ERROR	collector	unable to delete piece	{"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "HEUJFR5PLARV5TMZSIP36VQ6PKUTYIIHMLICIR56AX4CHELQBIQQ", "error": "pieces error: context canceled; v0pieceinfodb: context canceled", "errorVerbose": "pieces error: context canceled; v0pieceinfodb: context canceled\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteExpired:365\n\tstorj.io/storj/storagenode/pieces.(*Store).Delete:344\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:97\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

followed by a bunch of:

2024-04-12T13:00:28Z	ERROR	collector	unable to update piece info	{"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "47KWWXM3FN67VXVIHZZAKLQO5SEQTKO5LXT7MMJANS4OQVQE66LA", "error": "pieceexpirationdb: context canceled", "errorVerbose": "pieceexpirationdb: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*pieceExpirationDB).DeleteFailed:99\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteFailed:597\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:109\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

Unfortunately, the node ran full last night, with 2.24 TB of trash on it now.
However, I noticed that the disk is not actually full; the 1.64 TB of trash seems to have already been deleted on disk, but the space was never released again in the node's accounting.

Filesystem                         Size  Used Avail Use% Mounted on
/dev/sdb                            25T   22T  1,7T  93% /mnt/datastore
Storage Node Dashboard ( Node Version: v1.101.3 )

======================

ID     ;)
Status ONLINE
Uptime 227h43m2s

                   Available         Used      Egress     Ingress
     Bandwidth           N/A      2.14 TB     0.88 TB     1.26 TB (since Apr 1)
          Disk     457.67 MB     25.10 TB
Internal 127.0.0.1:7778
root@storj: find /mnt/datastore/storage/trash/ -mtime +7 -type f -exec du --block-size=1000 '{}' ';' | awk '{total+=$1; count++}END{print "TOTAL", total/1000, "MB\n" "count", count}'
TOTAL 582340 MB
count 2043449
root@storj: 
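To tell whether the trash still physically occupies disk space (as opposed to the node's accounting simply not being updated), a rough sanity check could be to compare the on-disk size of the trash folder with what the filesystem reports, using the same path as above:

# actual space still held by the trash folder on disk
du -sh /mnt/datastore/storage/trash/
# free space as the filesystem sees it
df -h /mnt/datastore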

Bug?

Docker, Ubuntu 20.04.6 LTS

I shared your info with the team.
Judging from the available data, I would assume that's possible.

1 Like

I'm seeing a high level of deletions on my nodes and it's kind of strange: the trash keeps filling up, and it hasn't been emptied for weeks now.

In some post I saw mention of certain days of the month, but I don't remember which post, and I also don't understand why space is only released on certain days when, if it's something that gets deleted normally, it should be retained for as little as one week.

Of course, the difference between space used and what the satellite counts is still not right.

We also have a bug in displaying the value on the left graph: it's in GiB, not GB as on the right side.
So you need to convert the left graph to SI:

8.92 TiB = 8.92 × 1024^4 bytes / 1e12 = 9.80764371976192 TB

However, I guess your node version is not 1.10x.x and you didn't have all the filewalkers finish successfully, since even then you have a discrepancy of about 300 GB.

Update on the trash saga: it turns out that the prefix-named folders aren't getting deleted.

find /mnt/node8/storagenode/storage/trash/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/ -maxdepth 1 -type d -ctime +8 | wc -l
1024

find /mnt/node8/storagenode/storage/trash/qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa/ -maxdepth 1 -type d -mtime +8 | wc -l
1024

More info on these folders in my previous comments.

2 Likes

Thanks! I have passed this to the team.
Would you be able to provide the storagenode version as well?