Trash comparison thread

Is it older than 7 days? Check the modification time

No, the trash is less than 7 days old.

As I remember, it was designed so that if the node is online all the time, data is deleted immediately, and if the node didn't get the request at the moment of deletion, it goes to trash. Or am I mistaken somewhere?

Offline or unable to do it when asked.

My node has been up for 408h 24m, which is more than 7 days, and the trash is all under 7 days old.

So if the node was online, there should not be so much trash at all.

@Alexey @Sasha so which is it? My assumption was that because my node is slow, it cannot perform the delete right away. Is this assumption correct?

Should I be worried?
This happened 3 days ago, I didn't change anything in the config, and my free space went into the negative.

Depends on connection speed maybe, because:

Windows docker node: [screenshot]

Windows GUI node: [screenshot]

Raspberry Pi 3: [screenshot]

If I were to hazard a guess, the trash deletions are either related to success rates (not the logged ones, the real success rates). In the bandwidth comparison we see that almost all nodes get exactly the same ingress with slight deviations, often within 1% of each other, unless they have other factors affecting it…

If everybody gets the same amount of ingress and it's a race, then it is possible that those that lose the “race” / have lower success rates delete the cancelled data after it has been received, because it gets completely downloaded before the node has time to cancel it.

Another possibility is that new or certain data is deleted more often, and thus some nodes will experience long periods with large delete-to-stored ratios… just like some will have increased egress because of the particular data their node holds…

This is, however, purely speculation, though I think both are valid arguments from how I understand the fundamental concepts… but I have no detailed or real understanding of how the system really works…

This is not financial advice, YMMV, I am not a health professional, if you have health concerns you should go see a doctor, I am in no way responsible for any damages, mental issues and/or insults you may derive from reading my comments… xD

Just covering all my bases hehe

Trash is very small for me. Large node:
[screenshot]
Small node:
[screenshot]
Medium node on an unstable connection with ~5 hours of downtime every day, still not a lot of trash.
[screenshot]

The funny thing is… this node is monitored by me.
As is my internet connection.

Neither was down during the last couple of weeks.

So, based on the information I got, the node should just delete the data right away.
It should not be moved to trash.

My question is still: why is there so much data in trash?
Is it because a satellite is misbehaving, is my node misbehaving… or is it something else?

Negative earth beams maybe?

For the sake of completeness:
133G 6r2fgwqz3manwt4aogq343bfkh2n5vvg4ohqqgggrrunaaaaaaaa
378M abforhuxbzyd35blusvrifvdwmfx4hmocsva4vmpp3rgqaaaaaaa
8.4G pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
9.3G qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa
13G ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
11G v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa

If my node is online but, due to being slow, cannot finish the delete right away when the delete request is submitted, why can't the node just queue the delete and report back to the satellite when the delete is finished? Why does the data need to be copied and moved to trash and wait 7 days before the delete happens?

The crappy part is that “because my node is slow” now incurs a “copy and delete” penalty to the trash folder AND my node needs to hold the trash for 7 days. fml…

I do not think that is the way it works… or should work.

Let's wait and see what others have to say about this behavior.

There is a delete queue now (check the last changelog).
So IIRC, if your node is too slow when the request is sent, at some point the delete instruction won't get executed anymore, and the garbage collector will eventually run and find pieces that shouldn't be on your node anymore. This process copies the files to the trash (instead of moving them…), where they stay for 7 days. That seems to be important because garbage collection works with a bloom filter that doesn't always hit the correct pieces, and as I could observe on my node, it can happen that multiple hundred MB are moved to the trash only to be moved back to blobs some minutes/hours later, which is kind of strange… This mostly happens when the node restarts, so I'm not sure it is the same mechanism, but that's my observation so far.
I was waiting for my trash to go up again so I could investigate, but currently it is at 0 bytes, so I have to wait a bit longer…
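
To illustrate the bloom filter point above, here is a minimal sketch in Go of how a bloom-filter based garbage collection pass behaves. It uses a toy filter and made-up names (bloomFilter, garbageCollect, the piece IDs); it is not the actual storagenode code. The key property: the filter can only answer "definitely not present" or "maybe present", so only pieces the satellite definitely does not know about get trashed, while false positives only ever cause garbage to be kept.

```go
// Hypothetical sketch of a bloom-filter based garbage collection pass.
// Not the actual storagenode code; all names here are made up.
package main

import (
	"fmt"
	"hash/fnv"
)

// bloomFilter is a minimal Bloom filter: it can report "definitely not
// present" or "possibly present", and never gives false negatives.
type bloomFilter struct {
	bits []bool
	k    int // number of hash probes per key
}

func newBloomFilter(size, k int) *bloomFilter {
	return &bloomFilter{bits: make([]bool, size), k: k}
}

// positions derives k bit positions for a key from salted FNV hashes.
func (b *bloomFilter) positions(key string) []int {
	pos := make([]int, 0, b.k)
	for i := 0; i < b.k; i++ {
		h := fnv.New64a()
		fmt.Fprintf(h, "%d:%s", i, key)
		pos = append(pos, int(h.Sum64()%uint64(len(b.bits))))
	}
	return pos
}

func (b *bloomFilter) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MayContain returns false only if the key was definitely never added.
func (b *bloomFilter) MayContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

// garbageCollect flags every local piece that is *not* in the satellite's
// "keep" filter. In the real node such pieces end up in trash for 7 days;
// because of false positives, some deleted pieces may survive a pass, but
// no piece the satellite still wants is ever flagged.
func garbageCollect(localPieces []string, keep *bloomFilter) (trashed []string) {
	for _, id := range localPieces {
		if !keep.MayContain(id) {
			trashed = append(trashed, id)
		}
	}
	return trashed
}

func main() {
	keep := newBloomFilter(1024, 3)
	keep.Add("piece-A")
	keep.Add("piece-B")

	// piece-C was deleted on the satellite, so it is absent from the filter.
	local := []string{"piece-A", "piece-B", "piece-C"}
	fmt.Println("moved to trash:", garbageCollect(local, keep))
}
```

If that is roughly how it works, the 7-day trash period would be the safety net for the other direction: pieces trashed by mistake can still be brought back, which could explain the "moved back to blobs" behaviour described above.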

So if my node is online and received the delete requests, just let it keep queuing the deletes and report back to the satellite when it's done. Why does it need to go to garbage collection? Leave garbage collection to the nodes that were offline and never received the delete requests.

If there is a delete queue, the delete request should stay in the queue until it's completed, and valid delete requests shouldn't fall through to a "garbage collection" scenario.

The delete queue should be persistent, i.e. if you shut down the node and start it up again, the queue should contain the same information as it did before the shutdown.
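
As a rough sketch of what such a persistent queue could look like (hypothetical Go code; names like deleteQueue and openDeleteQueue are invented, and this is not how the storagenode actually implements its queue): each delete request is appended to a file and synced before it is considered accepted, and the file is replayed on startup so pending deletes survive a restart.

```go
// Hypothetical sketch of a persistent delete queue. Not the actual
// storagenode implementation; names and file format are made up.
package main

import (
	"bufio"
	"fmt"
	"os"
)

type deleteQueue struct {
	file    *os.File // append-only journal of pending piece IDs
	pending []string
}

// openDeleteQueue replays any entries left over from before a shutdown.
func openDeleteQueue(path string) (*deleteQueue, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	q := &deleteQueue{file: f}
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		q.pending = append(q.pending, scanner.Text())
	}
	return q, scanner.Err()
}

// Enqueue durably records a delete request before acknowledging it,
// so a crash or restart cannot lose it.
func (q *deleteQueue) Enqueue(pieceID string) error {
	if _, err := fmt.Fprintln(q.file, pieceID); err != nil {
		return err
	}
	if err := q.file.Sync(); err != nil {
		return err
	}
	q.pending = append(q.pending, pieceID)
	return nil
}

func main() {
	q, err := openDeleteQueue("delete-queue.log")
	if err != nil {
		panic(err)
	}
	if err := q.Enqueue("piece-C"); err != nil {
		panic(err)
	}
	fmt.Println("pending deletes:", q.pending)
}
```

The catch is that the durability comes from an extra write and sync per delete request, which is exactly the kind of I/O a slow node already struggles with, so it's not obviously cheaper than the current trash approach.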

Is it that when a node is too slow handling the delete requests and the queue is not emptied fast enough, the request times out… and then nothing happens until the GC comes along?

Seems a waste of computing power.

Why not (as Sasha said) keep the list indefinitely and report when the request is received and again when the request is processed?

Then nothing would need to be moved by the GC.

Also, my node does not report that it “context canceled” a delete, at least not for a long time.
========== DELETE =============
Failed: 0
Fail Rate: 0.000%
Successful: 58075
Success Rate: 100.000%

Idk, that's how I understood it, but I agree that the current solution is quite weird and still leads to a lot of trash on some nodes.
Likewise, copying files to the trash folder is unnecessary IO on an already slow device (but it could have been offline and therefore missed the deletes).

This should be investigated further by Storj with a high-spec node to identify the root cause and address the concerns raised by SNOs.