It seems that the new feature “save-state-resume GC filewalker” isn’t functioning as expected

Andrew · April 16, 2024, 5:47pm

Version: v1.101.3
Operating System: Ubuntu
Environment: Docker

It seems that the new feature “save-state-resume GC filewalker” isn’t functioning as expected. Both new databases (used_space_per_prefix.db and garbage_collection_filewalker_progress.db ) always remain empty (no records with satellite ID and prefix), even after the node was stopped. DBs files size stays 24576 bytes and modification time stays equal to the moment when files was originally created after updating the node. Is this the expected behavior? I thought in this version when node is being stopped filewalker should save sat id and prefix (last scanned directory) to DB.

Steps to reproduce:

Execute the command: sudo docker start storagenode.
Wait a few minutes/hours.
Execute the command: sudo docker stop -t 300 storagenode.

The last records in the log file are as follows:

2024-04-16T17:25:29Z INFO Got a signal from the OS: “terminated” {“Process”: “storagenode”}
2024-04-16T17:25:29Z INFO lazyfilewalker.used-space-filewalker subprocess exited with status {“Process”: “storagenode”, “satelliteID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “status”: -1, “error”: “signal: killed”}
2024-04-16T17:25:29Z ERROR pieces failed to lazywalk space used by satellite {“Process”: “storagenode”, “error”: “lazyfilewalker: signal: killed”, “errorVerbose”: “lazyfilewalker: signal: killed\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*process).run:85\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*Supervisor).WalkAndComputeSpaceUsedBySatellite:130\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:738\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”}
2024-04-16T17:25:29Z ERROR piecestore:cache error getting current used space: {“Process”: “storagenode”, “error”: “filewalker: context canceled”, “errorVerbose”: “filewalker: context canceled\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkSatellitePieces:74\n\tstorj.io/storj/storagenode/pieces.(*FileWalker).WalkAndComputeSpaceUsedBySatellite:79\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:747\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:58\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78”}
2024-04-16T17:25:29Z DEBUG tracing collector stopped {“Process”: “storagenode”}

After the node was stopped, files used_space_per_prefix.db-wal and used_space_per_prefix.db-shm still exist in the database directory (I believe both files should be removed by the node when it shuts down).

Is this a bug, or am I misunderstanding something?

Ambifacient · April 16, 2024, 6:39pm

You are seeming reporting info for the used-space filewalker, while the change is only for the garbage-collection filewalker. You will have to test again when there is an active gc-filewalker process.

The logs of “context cancelled” is from the result of you shutting down the node, causing the used-space filewalker to cancel what it’s doing, which is expected.

Andrew · April 16, 2024, 7:46pm

I’m not sure about this. As I can see in the source code on GitHub, the developers made similar changes for both filewalkers. Would be nice to get clarification from the team.

jammerdan · April 23, 2024, 3:13am

@Alexey

As I have an issue with used filewalker not finishing I was wondering the same: With resume feature for the used space filewalker it may help to get this filewalker to finish.
Can you confirm the resume feature is for the used file walker as well from 1.101.3 on?

And how to determine by the logs if filewalker is in fact resuming and not restarting from the scratch?

Alexey · April 23, 2024, 8:16am

As I was informed, it should store a BF on the disk and if the current GC is running - it will finish its work if not interrupted, in which case it will start with a new one from scratch.
So, I do not know what to say. If it would not interrupted it should finish, otherwise - it will use a new one, if exist, otherwise it is expected to run with the saved one.

jammerdan · April 23, 2024, 8:38am

No the question was about the used space filewalker.
If it should have a working top-resume implementation already or planned or anything.

Because we see this: storagenode/pieces: save-state-resume feature for used space filewalker

Alexey · April 23, 2024, 8:45am

It should too, but I do not see it in the 1.101.3

jammerdan · April 23, 2024, 8:55am

Not good. That feature is really needed to help when a used space filewalker does not finish for whatever reason.

Alexey · April 23, 2024, 8:58am

I think so, but it’s not merged, so… Please do not restart your node before it would be completed.

jammerdan · April 23, 2024, 9:10am

My problem is that the node finishes used space filewalker for AP1 and Saltlake but not for US1 and EU1 and then start from the scratch.

jammerdan · April 23, 2024, 9:38am

Is there a way to see some kind of ETA or for what version it is planned?
I think I have to turn off used filewalker completely until this feature is available.

Andrew · April 23, 2024, 10:17am

We don’t restart, but the darn auto-updater in Docker does it for us. I haven’t touched my nodes for 6 days since the update to 1.101.3, but the used-space-filewalker still hasn’t completed on many nodes. And now, some of them were restarted by the storagenode-updater because version 1.102.3 started to roll out.

jammerdan · April 24, 2024, 2:45am

Maybe @clement can tell us more when we can expect the used-spacae filewalker to be able to resume as well as he seems to be assigned to that.

github.com/storj/storj

storagenode/{pieces,blobstore}: save-state-resume feature for GC filewalker

committed 09:56PM - 02 Apr 24 UTC

profclems

+276 -77

The bloomfilter is now stored on the disk from this change https://review.dev.st…orj.io/c/storj/storj/+/12651, so we don't lose the bloomfilter on restart. But that means that when the bloomfilter on the disk is loaded and added to the retain queue, it will start scanning the directories all over again. This will continuously be the behavior till the node is able to complete a full scan which might never happen on a large node. In this patch, we make sure the node can resume from where it left off. We read the two-letter directory names (or prefixes) from the satellite blobs folder and sort them alphabetically before scanning. During the scan, the last scanned prefix is stored in the db. Updates https://github.com/storj/storj/issues/6708 Change-Id: Icb32cdc7dd49ef8ce44f6d771e4e33045078ed55

I have turned off used-space filewalker now on the nodes where it does not finish. That cannot be the solution and I hope the resume feature for used-space filewalker will fix that.

Also maybe an idea to make it configurable how often the space calculation filewalker should run. Is it required on every restart? No. Maybe it is enough to run it once a week. Probably no need to run it multiple times a day.

jammerdan · April 24, 2024, 3:00am

I don’t know exactly how to read this, but it looks like progress is happening.

https://review.dev.storj.io/c/storj/storj/+/12806

build verify is finished successfully: storj-gerrit-verify #12921 [Jenkins]

Whatever that means.

elek · April 24, 2024, 10:01am

Quick intro to the development process:

We merge all commits after +2 approval of two independent reviewers + green build
stroj-gerrit-verify is the first past of the build, we have an other round after the two reviews (pre-merge)
usually we cut release branches bi-weekly, unless there are urgent changes

Looks like the initial build is passed, so this patch seems to be ready to review. Now, we need two reviewers / all the suggested modification before a release cut. Because the preparation is the next release already started, the realistic expectations is including this one in one release later (104) – if we can get all the required changes done by that time.

But no promise. It depends on other factors as well, for example fixing / debugging other important issues / emergencies.

Also: if you are developers, I encourage you to review code, and feel free to add your comments in gerrit. Every comment can be helpful…

jammerdan · April 25, 2024, 4:05am

At least I thought I did:

elek · April 25, 2024, 11:48am

Fair point, but this is just because it was written by somebody with developer glasses (=me). Shame on me, I ignored these details:

Release cut happens usually at middle/early week. After that the release candidate is deployed to QA (you can see it if you join to the QA satellites or use the QA versioning server). Final deployment typically happens next week Monday (or later, if QA/devs are not fully confident).

BTW, you can see a huge number of integration tests in code itself (search for testplanet), which are automatically executed for every commit. They also implement full e2e test scenarios (but not on separated deployments).

Alexey · April 26, 2024, 7:39am

I thought It’s resolved?

Alexey · April 26, 2024, 7:43am

Seems there:

Alexey · April 26, 2024, 7:46am

Depends on does you have other load on this disk (e.g. Chia) and/or have a discrepancy.
If the answer is “Yes” on any of these questions - you must keep it turned on, unless you are ok with a discrepancy of course.