Testing the garbage collector

I have a question.
When data stops being paid for, is it just marked as deleted, or is it really deleted?

It should be paid for as long as it is stored on the HDD; that would also mean fewer problems with the calculations.
It would also be good if it were stored for 30 days instead of 7. Then the client could undo a deletion or changes within 30 days. Dropbox has this feature, and it saved us from a lot of problems when a crypto virus got onto the accountant's PC.

So if I change my storage node to ignore all delete messages, I can increase my payout? That sounds like a great way to cheat :slight_smile:

1 Like

The point of garbage collection is not to provide a trash folder like Dropbox does, but to delete pieces that shouldn’t even be there anymore because the node somehow missed the delete command.

A trash bin has to be implemented by the developer of a service, e.g. if someone builds a Dropbox-like service on top of Storj.

3 Likes

I don’t believe that was the meaning of @Vadim’s post. There is a problem here with sending a “Delete Piece” command to an SN without actually removing the data piece. Those deleted-but-not-removed pieces occupy hard drive space on the SN. So, in effect, a new and unpaid Storage Node function has been created… the Undelete Button… and the SNs are now paying for that function through decreased effective storage space.

However, Vadim’s suggestion would only increase the amount of unpaid storage space a given SN would be required to utilize for this Undelete Button controlled by the satellite.

Hopefully, in the future, the garbage collection service will remove deleted pieces much faster than in 7 days… or, in other terms, 7/days_in_the_month = 7/31 ≈ 22.58% of a month in unpaid storage use.

The trash folder is only used by garbage collection, because garbage collection carries a high risk: one bug, like an empty bloom filter, and we would wipe out the entire network. We need the trash folder in order to recover from such a situation.

A piece removed by a normal delete message from the customer still gets deleted immediately, without any trash folder. In an ideal world without any bugs, you would get delete messages from the customers every time. Only when you are offline for a short moment because of updates, restarts, or connection issues might you miss a few delete messages. If you are online all the time, garbage collection should move 0 pieces into the trash folder. We still have to fix a lot of bugs to get to that long-term goal.
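For illustration, here is a minimal sketch of the mechanism described above, assuming a toy single-hash bloom filter and placeholder directory names; this is not the actual storagenode code. The satellite’s filter lists the pieces it still knows about, and anything not matched is moved to the trash folder rather than deleted outright:

```go
// Toy sketch of bloom-filter-based garbage collection (NOT the real
// storagenode code). Pieces that the satellite's filter does not list are
// moved into a trash folder instead of being deleted, so a buggy filter
// (e.g. an empty one) can still be recovered from.
package main

import (
	"fmt"
	"hash/fnv"
	"os"
	"path/filepath"
)

// bloomFilter is a toy single-hash stand-in for the real filter.
type bloomFilter struct {
	bits []bool
}

func newBloomFilter(size int) *bloomFilter {
	return &bloomFilter{bits: make([]bool, size)}
}

func (b *bloomFilter) index(pieceID string) int {
	h := fnv.New32a()
	h.Write([]byte(pieceID))
	return int(h.Sum32() % uint32(len(b.bits)))
}

// Add marks a piece as "still wanted by the satellite".
func (b *bloomFilter) Add(pieceID string) { b.bits[b.index(pieceID)] = true }

// MayContain reports whether the piece might still be wanted. False positives
// are possible, false negatives are not -- which is what makes it safe to use
// the filter for deciding what to throw away.
func (b *bloomFilter) MayContain(pieceID string) bool { return b.bits[b.index(pieceID)] }

// collectGarbage moves every piece file not covered by the filter into
// trashDir. Nothing is deleted here; a separate job would empty the trash
// after the grace period.
func collectGarbage(piecesDir, trashDir string, filter *bloomFilter) error {
	entries, err := os.ReadDir(piecesDir)
	if err != nil {
		return err
	}
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		pieceID := e.Name()
		if filter.MayContain(pieceID) {
			continue // satellite still references this piece, keep it
		}
		if err := os.Rename(filepath.Join(piecesDir, pieceID), filepath.Join(trashDir, pieceID)); err != nil {
			return err
		}
		fmt.Println("moved to trash:", pieceID)
	}
	return nil
}

func main() {
	// Placeholder paths and piece ID, purely for illustration.
	filter := newBloomFilter(1 << 16)
	filter.Add("AQ6AJITGXZUXCP67A4QL4ZCFCFFCFEOAQTU6I2LQJWBUSXNER4UA")
	if err := collectGarbage("storage/blobs", "storage/trash", filter); err != nil {
		fmt.Println("garbage collection failed:", err)
	}
}
```

Because the move is only a rename, a bad filter run can be undone by restoring the trash folder, which is exactly what the 7-day grace period is there for.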

3 Likes

Right now I have 97 GB of “trash” stored in an unpaid folder. This is a problem.

Where can I send the 15 cents you would be missing out on if it were stored for a whole month? People, this is beta software, jeez…

3 Likes

That’s why my node posts an Ethereum address. Right?

My node generates about $0.02 per hour… so $0.15 ≈ 7 hours of regular use at the current rate.

Where can I send my bill? The satellites don’t post an Ethereum address.

In reality, this unpaid trash storage is a big problem. The trash folder can easily consume quite a large percentage of the overall available storage space. And in that situation, a given node loses out on possible egress traffic during this beta testing period… since it has been noted in other threads that egress currently comes mostly from new data pieces.

It only stores a large amount of data if your node is offline a lot, so it is most likely your own fault.
For example: my first node has 300 GB in the trash bin! But it was offline for quite some time during extensive testing, so that is entirely my fault…
And since deleted data doesn’t get downloaded, you’re hardly missing any payment… It is only a “problem” if your node is full, because then the trash would take the space new data could use. But if your node isn’t even full, it doesn’t make any difference at all.

5 Likes

Math doesn’t seem to be your strong point, as you showed in the earlier post too. You get paid $1.50 per TB stored for a month, so 97 GB would be ~$0.15. And yet you’re talking as if you don’t get paid for any of your stored data.

…and that is only unpaid storage use for deleted pieces that didn’t get deleted right away. Usually deleted pieces are removed immediately; if not, the garbage collector deletes them within 7 days. You’re exaggerating the problem. It’s nowhere near 22.58% of a month of unpaid storage.

1 Like

An offline node doesn’t receive any new data pieces while offline, nor does it earn any paid egress traffic.

An unpaid trash folder which might end up consuming a very large percentage of any given storage node’s drive space is a rather serious problem.

Data piece ZZ gets deleted on day 1 … it is not removed until day 7 … that data piece occupies storage space for ~22% of a month but is not paid for.

But if the satellite deletes files and at the same time holds them for another 7 days, and the node can’t get new files for that space, that also sounds like cheating.

As @littleskunk already explained, this is necessary as a safety feature to prevent network failure. In my opinion, objecting to it is equivalent to asking us not to store on your hard drive any files that are needed to keep the system running correctly. Would you rather risk losing the entire network with no way to restore it than have a few full nodes experience a one-week delay before they can fill some small amount of disk space with new data? To build a resilient network, some tradeoffs need to be made, and in this case, those who would be most affected in the future are the ones that already did not run their nodes correctly.

This is a way to safely clean up files that are already marked as deleted by the satellites and that have already been taking up space on your disk, not earning you money, for a lot longer than one week. If our engineers thought it would be safe to delete these files with no safety delay period, they would have done so. If an alternate safe solution is found later, I am sure they will implement it.

I think that some users are exaggerating the potential impact of this. Just because they now have a significant amount of garbage data being deleted, data that had accumulated over a long span when we did not have garbage collection in place, doesn’t mean that you will see the same amount of data marked for deletion taking up space in the future, unless your node is offline a lot, in which case you should rather look into remedies for the excessive downtime on your node.

5 Likes

Just wait 7 days and it will not be a problem anymore. In fact, I don’t see the problem at all. Maybe you can explain what you would expect in this situation. Not moving the data into the trash folder in the first place? Just keeping the unpaid 97 GB for a few more months? You can disable garbage collection if you want to see how that works out.

100 GB for 7 days would be 16,800 GBh. That is the same as storing 23 GB for an entire month → $0.0345.
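To make that arithmetic explicit, here is a quick back-of-the-envelope check, assuming the $1.50 per TB-month storage rate quoted earlier in the thread and a 720-hour month (the small difference from $0.0345 is just rounding of the 23 GB figure):

```go
// Back-of-the-envelope check of the numbers in this thread, assuming the
// $1.50 per TB-month storage rate mentioned above and a 720-hour (30-day) month.
package main

import "fmt"

func main() {
	const (
		ratePerTBMonth = 1.50  // USD per TB stored for a full month (assumed rate)
		hoursPerMonth  = 720.0 // 30 days * 24 h
	)

	// 97 GB of trash held for the 7-day grace period, rounded to 100 GB above.
	trashGB := 100.0
	days := 7.0

	gbHours := trashGB * days * 24             // 16,800 GBh
	gbMonths := gbHours / hoursPerMonth        // ~23.3 GB stored for a full month
	payout := gbMonths / 1000 * ratePerTBMonth // ~$0.035

	fmt.Printf("%.0f GBh = %.1f GB-months = about $%.3f\n", gbHours, gbMonths, payout)
}
```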

An unpaid trash folder that ends up consuming none of a given storage node’s drive space is no problem at all. That is the statement we can agree on, and we can talk about what we can do to get to that target.

That is not correct. Data that gets deleted on day 1 will be deleted on day 1. Only if the storage node doesn’t receive the delete message does garbage collection kick in and move the piece to the trash folder for 7 days. Don’t blame garbage collection for cleaning up the missed delete messages.

Same deal. That is not correct. Don’t blame garbage collection for cleaning up the missed delete messages.

This conversation is going in the wrong direction. Garbage collection is cleaning up space that is unpaid. We should try to make sure garbage collection never gets such a huge amount of work. Everyone is welcome to help us. I can create a list of tests that we should run with storj-sim, and then we can go ahead and fix the possible issues we might discover. That is the approach I would like to see here.

3 Likes

OK, I thought this would be a permanent feature and that data would always sit there for 7 days.
If it works as described, then it is understandable and OK with me.

This is the only reasonable statement among the above rants from several posters.

The software is in Beta at the moment, and I’m pointing out a very basic problem with the process as currently implemented.

If, in the future, the software is released as version 1.0 and this problem ends up pushing multiple SNOs out of paid storage, there is going to be a big problem. Attacking the messenger doesn’t solve anything, and is unprofessional (IMO).

1 Like

Pointing out a problem has a different meaning for me. I prefer to explain the expected behavior vs. the actual behavior, followed by searching for the issue that is causing the difference. Please go on and explain it to me. Do we have a bug here? I have trouble following you. I still don’t see the problem.

What is your proposal?

I am not sure if you are talking about me now. If so, I am happy to stop the conversation right here and now. Otherwise, I will go on explaining what garbage collection is doing and ask some questions to understand what your proposal is.

3 Likes

There’s no bug in the traditional sense, meaning there’s no programming error.

There is a problem with the design of the garbage collection system as implemented. It is as follows:

  1. A given SN has a given allocated storage space. This space is a fixed number and not elastic.
  2. A given SN is paid for two items: Egress and Data at Rest over time.
  3. A new function called “Garbage Collection” has been implemented. This function collects all the data pieces that are currently unpaid due to errors (perhaps better called ingress exceptions) in the piece storage protocol.
  4. The “Garbage Collection” process continues to store these ingress exceptions for 7 days as a risk-management mitigation from the Storj network’s perspective.

The problem with the above is that the storage node continues to store these erroneous data pieces, unpaid, for 7 days. The node may or may not be “at fault” for the presence of the erroneous data pieces. However, if fault is to be assigned, it must lie with the protocol itself, which allowed the erroneous placement of the data pieces in the first place. Furthermore, the given Storage Node is being doubly penalized for storing erroneous data: through the continued lack of incoming data pieces as well as through decreased effective paid storage space.

In my particular case, my node was offline for approximately 10 hours towards the end of summer 2019 and has been online and fully functional since that time. How did my node accumulate 97 GB of erroneously placed data pieces if it was running without error for three entire months? I also see in my logs something like this:

2019-12-26T16:39:34.084Z|INFO|piecestore|upload started|{"Piece ID": "AQ6AJITGXZUXCP67A4QL4ZCFCFFCFEOAQTU6I2LQJWBUSXNER4UA", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "PUT"}|
2019-12-26T16:39:41.558Z|INFO|piecestore|uploaded|{"Piece ID": "AQ6AJITGXZUXCP67A4QL4ZCFCFFCFEOAQTU6I2LQJWBUSXNER4UA", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "PUT"}|
2019-12-26T16:40:28.668Z|INFO|piecestore|deleted|{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Piece ID": "AQ6AJITGXZUXCP67A4QL4ZCFCFFCFEOAQTU6I2LQJWBUSXNER4UA"}|

Is this data piece that was uploaded and then deleted within 1 minute part of that 97 GB?

The basic problem is that it seems there is now a function that could be used in such a way as to completely fill a given Storage Node with unpaid data at rest.

My purpose in posting is not to decide why (or fault), but just to point out the possibility.

What is the alternative? Don’t run garbage collection?

I can write down a bunch of storj-sim tests that would allow us to find the answer to that question. Anyone interested in executing them?

Negative. Garbage collection is printing out all the pieceIDs it catches. Grep for that pieceID next week and you will see that this piece was deleted without garbage collection.
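For anyone who would rather run a small program than grep, here is a minimal sketch that scans a storagenode log for a given piece ID; the log path below is a placeholder, not a path from this thread:

```go
// Minimal equivalent of "grep <pieceID> <logfile>": scan a storagenode log
// and print every line that mentions the piece ID.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	const pieceID = "AQ6AJITGXZUXCP67A4QL4ZCFCFFCFEOAQTU6I2LQJWBUSXNER4UA"

	f, err := os.Open("storagenode.log") // placeholder path
	if err != nil {
		fmt.Println("open log:", err)
		return
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		if line := scanner.Text(); strings.Contains(line, pieceID) {
			fmt.Println(line) // upload, delete, or garbage collection entries
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Println("read log:", err)
	}
}
```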

Disable garbage collection please. We can come together once we both agree that garbage collection is working fine.

2 Likes