Logging behind trash / delete explained

So as I understand, first of all pieces are sent to trash and only later actually deleted from disk. Which is what I can witness by lines like:

2023-01-01T01:11:44.794Z        INFO    piecedeleter    delete piece sent to trash      {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "JER35XTPIPLZRSB5RHEBAKQGCY3BL4FUPNFMVLZG6DPOKSQA4ZHQ"}

and those appear very often:

And then I’m seeking to find actual deletion from disk being logged, which is (as I understand) lines like these:

2023-01-01T01:11:45.712Z        DEBUG   blobscache      deleted piece   {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "disk space freed in bytes": 929536}
2023-01-01T01:11:45.776Z        DEBUG   pieces  deleted piece   {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Piece ID": "WOBLUMFFKDIYAHI6EFR7637Z45YFKRG5QW6EWQ7MRRNPLRR6JNHQ"}
2023-01-01T01:11:45.776Z        INFO    collector       deleted expired piece   {"Process": "storagenode", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Piece ID": "WOBLUMFFKDIYAHI6EFR7637Z45YFKRG5QW6EWQ7MRRNPLRR6JNHQ"}

However those appear to be relatively rare. What I witnessed tonight, was ~2GB deletion in 1 hour:

But all I see in log around that time regarding deletion is the start of emptying trash:

2023-01-03T01:11:44.101Z        DEBUG   pieces:trash    starting to empty trash {"Process": "storagenode"}

And nothing about actual deletion:

[root@localhost ~]# grep -P "2023-01-03T0(1|2)" /storj/node.log | egrep -v "download started|downloaded|download canceled|download failed|upload started|uploaded|upload|canceled|sent to trash|pinged|orders"
2023-01-03T01:11:44.059Z        INFO    bandwidth       Performing bandwidth usage rollups      {"Process": "storagenode"}
2023-01-03T01:11:44.101Z        DEBUG   pieces:trash    starting to empty trash {"Process": "storagenode"}
2023-01-03T01:11:44.689Z        DEBUG   version Allowed minimum version from control server.    {"Process": "storagenode", "Minimum Version": "1.24.0"}
2023-01-03T01:11:44.690Z        DEBUG   version Running on allowed version.     {"Process": "storagenode", "Version": "1.69.2"}
2023-01-03T02:11:44.060Z        INFO    bandwidth       Performing bandwidth usage rollups      {"Process": "storagenode"}
[root@localhost ~]#

So my question is - how do I correctly identify actual deletion from disk?

Usually only some deleted pieces are sent to the trash, when you node is actually was not online when the request from the customer comes to your node.
Usually deleted pieces should be deleted right away, but if your node were offline when the customer requested to delete data, it will be collected by the garbage collector later and sent to the trash.
After 7 days it will be removed from the trash.

1 Like

Thanks but still not entirely clear about what happened last night and why that deletion was not logged anyhow?

And @Alexey - how about this Deleted pieces will be sent to trash by default ? It states pieces will not be deleted immediately…

2 Likes

Yes, but this were a temporary solution. The main behaviour is to delete piece when the customer requests to delete a piece.
If your node is not responding, or is offline, it will miss this message, so the garbage collector will move these pieces to the trash and they will be permanently deleted after 7 days.
However, I will ask the team, may be I’m wrong and this decision is still in place.

Try to search for deleted expired piece in your logs.
You may see all logs from the collector too - search for collector.

But @Alexey that’s exactly what I’m asking… :slight_smile: I see those lines… But number of them is so low if compared to “sent to trash”… Look at yesterday and two other days full day for example:

[root@localhost ~]# grep -c "2023-01-03.*deleted expired piece" /storj/node.log
50
[root@localhost ~]# grep -c "2023-01-03.*delete piece sent to trash" /storj/node.log
15684
[root@localhost ~]# grep -c "2023-01-02.*deleted expired piece" /storj/node.log
24
[root@localhost ~]# grep -c "2023-01-02.*delete piece sent to trash" /storj/node.log
13606
[root@localhost ~]# grep -c "2023-01-01.*deleted expired piece" /storj/node.log
22
[root@localhost ~]# grep -c "2023-01-01.*delete piece sent to trash" /storj/node.log
12795
[root@localhost ~]#

And it’s not just few days, I rotate log weekly so have a look at full week:

[root@localhost ~]# grep -c "deleted expired piece" /storj/node.log-20230101
886
[root@localhost ~]# grep -c "delete piece sent to trash" /storj/node.log-20230101
87108
[root@localhost ~]#

So I send so many things to trash but rarely delete them? I think I delete them, but that is for some reason not logged? My initial post explains the problem, 2GB deleted in 1 hour without single log line telling about it, except that “starting to empty trash”.

Nothing lasts longer than a temporary solution :smiley: (Even on the roadmap I don’t see the topic that would replace the temporary solution)

Let’s say the customer uploads a file with an optional expiration date. The customer can delete the file earlier like any other file that has no expiration date. Or the customer just waits until the expiration date is reached and the file will get deleted automatically. Now if we restore our database from a backup we also need to restore the pieces from trash because in the backup we will be missing the information that the customer has chosen to delete the file. The expiration date is permanent and can’t be changed. → deleted expired piece doesn’t touch the trash folder. Well, at least that is what I would expect. I wouldn’t be surprised to see a bug here. Either way, this is not the job that empties the trash folder.

I would expect to see this log line: storj/storagenode/pieces/trashchore.go at 9544a670d730f19f9ead5864dfb3c9da9f7f0b6d · storj/storj · GitHub

It looks like that is the only log line. So every now and than your node logs that one line and deletes maybe halve of your trash folder without showing any additional log line. The next day it produce the same log line but this time doesn’t delete a single piece out of the trash folder. Yea I guess that almost counts as not logged.

2 Likes