storagenode/retain: do not remove bloomfilter on retain failure
I am not a coder, and I don't know much about how all this hangs together.
But what we have seen is that a node restart led to deletion of the bloomfilter.
Given the line above, is it safe to assume that a node restart resulted in a retain failure?
If so, why? Interruption, yes. Failure, no.
With the change to not remove the bloomfilter on failure, I hope we won't see a different bad outcome: if there is a real failure, caused by the bloomfilter itself or by other defects such as inaccessible files, the node keeps trying to process it over and over again and always errors out.
Again: if the processing is working, keep resuming after a node restart.
But if the processing is not working and it keeps quitting with the same error over and over again, not making any progress, then at some point it should stop resuming so that it does not waste IOPS.
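A rough sketch in Go of the rule described above, purely illustrative and not the actual storagenode code. Every name here (`RetainState`, `ErrInterrupted`, `ErrPieceUnreadable`, `MaxSameFailures`) is hypothetical, and the limit of three is an arbitrary assumption. An interruption never counts against the bloomfilter; only the same real error repeated several times in a row stops further resumes.

```go
package main

import "errors"

// Hypothetical sentinel errors, used only for this illustration.
var (
	// ErrInterrupted marks a run cut short by a node shutdown or restart.
	ErrInterrupted = errors.New("retain interrupted")
	// ErrPieceUnreadable stands in for a real failure, e.g. an inaccessible file.
	ErrPieceUnreadable = errors.New("piece file not accessible")
)

// MaxSameFailures is an assumed limit on identical consecutive failures.
const MaxSameFailures = 3

// RetainState is hypothetical bookkeeping that would be stored next to a bloomfilter.
type RetainState struct {
	LastError string // message of the most recent real failure
	Failures  int    // consecutive runs that ended with LastError
}

// Record updates the state after one processing run.
func (s *RetainState) Record(err error) {
	switch {
	case err == nil:
		// Success: clear any failure history.
		*s = RetainState{}
	case errors.Is(err, ErrInterrupted):
		// An interruption is not a failure: leave the counters untouched.
	case err.Error() == s.LastError:
		// Same real error as last time: no progress was made.
		s.Failures++
	default:
		// A different real error: start counting again.
		s.LastError = err.Error()
		s.Failures = 1
	}
}

// ShouldResume reports whether the bloomfilter should be picked up again
// after the next node start.
func (s *RetainState) ShouldResume() bool {
	return s.Failures < MaxSameFailures
}
```

With this rule a node that restarts often keeps resuming indefinitely, while a bloomfilter that always hits the same error is set aside after the third attempt instead of consuming IOPS forever.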
And again, my suggestion: do not delete the bloomfilter but move it to trash, so it can be recovered and inspected if necessary, at least for some time until GC finally deletes it.
We will continue to try and improve our code thanks to your feedback!
Keep in mind there are two conversations happening:
How the code is today.
How the code should be.
To solve your problem we can only discuss how the code is today. For how the code is today, it is a simple fact that if your node isn't able to run very long, it will not be able to complete many tasks. So, please don't be upset with answers that say this, they are simple facts. When I say something like "there's your problem", I am not saying that it is your fault, but I am identifying what seems to be unique about your situation that is causing more urgent complaints than from other node operators.
For how the code should be, we agree that we need improvements! We are working on making nodes better able to handle shutdown. There are a number of changes coming soon to hopefully improve this situation. The storage node software remains a work in progress. It's much better than it used to be but still has room for improvement. Thanks for your patience!