Thanks for your thoughts, I forwarded them to the team.
I disabled trash on my slowest nodes with db on USB stick, and they are flying. I don’t care if they get DQed.
Does it actually work? Because I read at least one report that it doesn’t, but I have no confirmations so far.
I will tell you in 2 weeks if there is more trash or zero trash. Now I have a few GB on both. Some storjling said it works.
I have db on stick and the second node dashboard wasn’t loading before. Now it works.
It should work, but I have not tested it yet and haven’t had reports from others.
The trash increased on the first one.
I stopped, rm and started again just to get a clean log and make sure it takes the config into account.
I’ll keep monitoring.
With trash disabled, retain still runs, right? And should it say the same thing, like “moved n pieces to trash”, even though it deletes them?
I’m not sure. Could you please check - did it move this piece to the trash?
You may calculate its name using this guide:
Since they have been telling us not to use it for years, I wouldn’t be surprised if direct delete is not even implemented…
It was. But it was so painfully slow for the customer…
I mean direct deletion of bloom-filter-negative files when ‘pieces.delete-to-trash: false’ is set. Was this implemented?
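To make the question concrete, here is a minimal sketch of what "direct delete of bloom-filter-negative pieces" would mean. It is not Storj's code: `TinyBloom`, `retain` and the `delete_to_trash` argument are hypothetical names, and the flag only mimics what the ‘pieces.delete-to-trash’ setting is described as doing in this thread.

```python
import hashlib


class TinyBloom:
    """Toy Bloom filter for illustration; not the storagenode implementation."""

    def __init__(self, size_bits=1024, hashes=3):
        self.size_bits = size_bits
        self.hashes = hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, piece_id):
        # Derive several bit positions per piece from salted SHA-256 digests.
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{piece_id}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, piece_id):
        for pos in self._positions(piece_id):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, piece_id):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(piece_id)
        )


def retain(stored, bloom, delete_to_trash=True):
    """Split stored pieces into kept, trashed and directly deleted."""
    kept, trashed, deleted = [], [], []
    for piece_id in stored:
        if piece_id in bloom:
            kept.append(piece_id)      # filter says the satellite still wants it
        elif delete_to_trash:
            trashed.append(piece_id)   # default: park it in the trash first
        else:
            deleted.append(piece_id)   # hypothetical direct-delete path
    return kept, trashed, deleted
```

A Bloom filter never gives a false negative, so pieces the satellite listed are always kept; the only difference between the two modes in this sketch is where the non-matching pieces end up.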
I am pretty new to this and just discovered what the “trash” is; I was really wondering what that trash folder was on my node. My take is this:
Storj pays the SNO to store data that it might retrieve later. When Storj decides it is not going to retrieve some data anymore, it stops paying for it.
I mean that is the whole business model, right?
Whether that data is labeled storage, trash or pink unicorn, it does not change the nature of the data. If it is stored with the goal to retrieve it later, it should be paid because… that is the whole point of the agreement and business case…
Seems very sleazy to me to just call it “trash” therefore not pay for it, but it should NOT be deleted by the SNO because the data might be recalled later. Which is exactly the same with all the other data: don’t delete it because it might be used later and if it’s not there you won’t get paid or might even be disqualified.
That is truly mind boggling.
I find that a good way to tell whether something is logical is to push it to the extreme and see if it still makes sense. Here it goes:
Storj should just move all data uploaded to nodes into the trash folder within 1 second of uploading it. It’s a winning proposal for Storj. It will only have to pay for the data that accrued 1 second of storage, because trash is unpaid. So the monthly average will be really low. And Storj can just retrieve anything the customer needs from the trash, since SNOs are barred from deleting it.
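The arithmetic behind that extreme case, as a rough sketch. The rate below is an assumed placeholder, not a quoted Storj payout figure, and the 30-day month is a simplification; `storage_payout` is a made-up helper.

```python
SECONDS_PER_MONTH = 30 * 24 * 3600  # simplification: a 30-day month


def storage_payout(tb_stored, seconds_paid, rate_per_tb_month):
    """Payout if storage is paid pro rata for the time it counts as 'stored'."""
    return tb_stored * rate_per_tb_month * seconds_paid / SECONDS_PER_MONTH


RATE = 1.50  # assumed USD per TB-month, for illustration only

full_month = storage_payout(10, SECONDS_PER_MONTH, RATE)  # 10 TB paid all month
one_second = storage_payout(10, 1, RATE)                  # same 10 TB, paid for 1 s

print(f"paid for the whole month: ${full_month:.2f}")
print(f"paid for one second:      ${one_second:.8f}")
```

The disk holds the same 10 TB in both cases; only the label on the data changes the payout, by a factor equal to the number of seconds in the month.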
Big win for Storj.
I think everyone can see this scenario would not be right or just.
Therefore that trash system should not exist. It’s absurd and bordering on a scam.
Either Storj pays for the storage it uses, or it doesn’t. Calling it trash is just weaselly semantics to avoid responsibility.
Now, I understand that storj wants to offer the possibility to their clients to restore files they deleted, but that is Storj’s problem. Storj offers an extra service on top of just “store your files” so they have to figure how to implement it within the confines of the agreement with the SNO.
Of course, the agreement with SNOs might include all kinds of exceptions in legalese, and make this modus operandi technically allowed. That does not mean it’s just or ethical.
I am very sad to have found out what that trash really is and how it works.
Yet another reason a truly decentralized storage market would be great, such things would not happen.
I trust progress will continue and such issues will be obsoleted soon.
No, as far as I know it is not for customers use. It is more like an additional backup for storj.
same thing, for an SNO
one more reason Storj should pay for it then
Of course they should pay for it.
That’s not what trash is used for.
Trash is there in case a satellite’s database suffers a catastrophic failure and needs to be restored from a backup. In large databases that can never happen because the backups simply can’t be made on such a large scale without affecting the actual databases. What instead happens is a rolling backup. This is called a log, in most instances.
Take this scenario for example: you take a backup at 01:00. Next morning at 09:00 you noticed that the satellite has crashed because it saw a database issue. You start working on getting it back up and running and see that the databases are corrupted. Your only solution is to restore the 01:00 backup. You have just lost 8+ hours of the database.
Actual scenario in large deployments: You stream the transaction log to a different machine. This machine performs hourly offsite backups. You notice the satellite going down, etc etc. You revert the transactions based on the log. I hear some skeptics already typing “BuT wHaT aBoUt wHeN dAtA iS dElEtEd?!?!”. The transaction log contains every single transaction that affects the database. Every. Single. Transaction. You back up the database based on this transaction log (which is easier to do on a different machine and doesn’t affect the satellites in any way, i.e. the misreporting of data by US1 all the time). You noticed that the crash happened at 07:54. You restore a previous backup taken by the other machine at 04:00 and replay all the transactions leading up to 07:54.
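A toy sketch of that restore-and-replay idea, under heavy simplification: the "database" is a dict, the log is a list of dicts, and `apply_txn` and `restore_to` are made-up names. It says nothing about how the satellites actually do it; real databases ship this as point-in-time recovery.

```python
def apply_txn(db, txn):
    """Apply one logged transaction to the in-memory 'database'."""
    if txn["op"] == "put":
        db[txn["key"]] = txn["value"]
    elif txn["op"] == "delete":
        db.pop(txn["key"], None)


def restore_to(snapshot, snapshot_time, log, target_time):
    """Start from a snapshot and replay the log up to target_time (inclusive)."""
    db = dict(snapshot)
    for txn in sorted(log, key=lambda t: t["time"]):
        # Zero-padded same-day "HH:MM" strings compare correctly as text.
        if snapshot_time < txn["time"] <= target_time:
            apply_txn(db, txn)
    return db


snapshot_0400 = {"seg1": "node A", "seg2": "node B"}
log = [
    {"time": "03:10", "op": "put", "key": "seg2", "value": "node B"},
    {"time": "05:30", "op": "put", "key": "seg3", "value": "node C"},
    {"time": "06:45", "op": "delete", "key": "seg1"},
    {"time": "07:58", "op": "put", "key": "seg4", "value": "corrupt"},
]

restored = restore_to(snapshot_0400, "04:00", log, "07:54")
print(restored)
```

Deletes are just another entry in the log: the 06:45 delete of `seg1` is replayed like everything else, and the 07:58 write after the crash point is left out.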
Ideally trash should only be kept for 24 hours. I’ll give the customary tolerance for margin of error and call it 48 hours. If a satellite error can’t be noticed and corrected within 48 hours, going back a week is more disastrous. That also means that the satellites can send blooms every 48 hours ofc.
Storj should put it in their TOS that they might store deleted data for a week in an unpaid trash. Then these discussions will hopefully stop.
Compared to the current 2-3 weeks reducing it to a few days would be a huge improvement.
To be fair, when a node has plenty of disk space, this whole trash situation is still quite unpleasant but bearable. It gets much worse when the disks are full and half of your disk gets blocked for 2-3 weeks, as happened to me and some others. This is not just losing your money, it’s really demotivating.
It would be fair if Storj admitted it needs to be fixed or at least significantly improved and worked on that.
At the start of the deletes I had ~30TB of trash, trust me I completely understand
Still hovering around 10TB.
Even though I am in no way affiliated with Storj, the official response will be “it’s not a priority right now”.
At least they should make bloom filters big enough. It seems like we are still far away from 10% false positive rate for big nodes.
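For a feel of the numbers, here is a sketch using the standard Bloom filter sizing formulas (optimal number of hash functions assumed). The piece counts are made up and nothing here reflects the filter sizes the satellites actually send; `bloom_size_bits` and `fp_rate_for` are hypothetical helpers.

```python
import math


def bloom_size_bits(n_items, fp_rate):
    """Bits needed for n_items at the target false positive rate."""
    return math.ceil(-n_items * math.log(fp_rate) / (math.log(2) ** 2))


def fp_rate_for(n_items, size_bits):
    """Approximate false positive rate when the filter size is fixed."""
    k = max(1, round(size_bits / n_items * math.log(2)))  # best hash count
    return (1 - math.exp(-k * n_items / size_bits)) ** k


pieces = 10_000_000  # assumed piece count, for illustration only
bits = bloom_size_bits(pieces, 0.10)
print(f"{bits / 8 / 1e6:.1f} MB for {pieces:,} pieces at a 10% target")

# Same filter size, four times as many pieces on the node:
print(f"false positive rate: {fp_rate_for(4 * pieces, bits):.0%}")
```

A 10% target works out to roughly 4.8 bits per piece, so a filter sized for a small node lets far more garbage through as false positives once the node holds several times as many pieces.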
Why do the pieces have to be moved to the trash folder? Can’t they just be marked as trash and deleted at a later date?