Trash does not go away in 7 days

thepaul · May 3, 2024, 1:19pm

I’ll take those, I guess.

While we would like to think that we are smart enough to foresee all possible bugs, it turns out we are still human. There wasn’t a fallback mechanism because the migration step is supposed to be extremely simple and quick (it consists of only three renames/moves), and in the unlikely event that it was interrupted, it could be fixed manually.

But somehow the migration seems to have been kicked off multiple times all at once, and in some cases deletions were already being processed before the migration, leaving behind directories in the old location. This seems to be because filewalker subprocesses open the blob store at the same time as the main process. It wasn’t caught because, unfortunately, our test systems don’t spawn the filewalker subprocesses in the same way.

We don’t have a fix yet. This situation is not considered ultra-high priority because the only data at risk of being lost is already in the trash. We definitely do see how it’s a problem for our SNOs, though, and we would like very much to fix it. The plan is to recognize the recursive directory problem and the unmigrated directories problem and correct them automatically on node start.

If the amount of data in the trash is causing you problems, you can delete it. Ideally you should keep any pieces that were trashed within the last 7 days. It may be helpful if you keep the data, just so we can test our fixes, but you’re under no obligation to do so.
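
If you do want to clear things out by hand while keeping the last 7 days, something like the following rough Python sketch could help. It is not an official tool, and it makes assumptions you should check first: that a file's modification time is a good enough stand-in for when the piece was trashed, and that you pass it the right trash directory for your node (the path in the usage note is a placeholder, not a real default). The function names are made up for this example. It only lists files unless you explicitly turn off the dry run.

```python
import os
import time
from pathlib import Path

SEVEN_DAYS = 7 * 24 * 60 * 60  # retention window in seconds


def old_trash_files(trash_root, max_age_seconds=SEVEN_DAYS, now=None):
    """Return files under trash_root last modified more than max_age_seconds ago.

    Assumption: mtime approximates the time the piece was moved to the trash.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    old = []
    for dirpath, _dirnames, filenames in os.walk(trash_root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.stat().st_mtime < cutoff:
                    old.append(path)
            except FileNotFoundError:
                # The node may have removed the file while we were walking.
                continue
    return sorted(old)


def purge(paths, dry_run=True):
    """Delete the given files and return how many were removed.

    With dry_run=True (the default) nothing is deleted; paths are only printed.
    """
    removed = 0
    for path in paths:
        if dry_run:
            print(f"would delete {path}")
        else:
            path.unlink(missing_ok=True)
            removed += 1
    return removed
```

For example, `purge(old_trash_files("/path/to/your/trash"))` prints what would be removed, and adding `dry_run=False` actually deletes those files. Stopping the node first is probably the safer choice, so the script and the node are not touching the same files at once.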