ERROR lazyfilewalker.gc-filewalker.subprocess failed to save progress in the database

This is somewhat related: I was lowering my node's max usable space, and whenever I restarted the node, the used space on the dashboard dropped, while in reality it stayed the same.

My problem is that with this much incoming data the disk could fill up in a day or two, but if I restart the node to change the config, it gets even more desynced and the node may think it has more free space than it really does.

For example, I reduced my 8TB node to 7.5TB, but the reported usage dropped from 6.78TB to 5.93TB after the restart. With the 7.5TB limit I would probably run out of real free space before the node realized it.

But if I restart it again, the figure might drop even further…

What should I do?

1 Like

I have the same problem. Each time I’ve restarted, the used space it sees (in the piechart) drops, this last time from 11.06TB to 10.55TB, not an insignificant amount. And then it also says 1.2TB in trash, which should mean that 11.75TB are used. But, in fact, df says 13TB are used.

The node says 2.63TB are free, df says 2.06TB are free on the disk.

So, in fact, at this rate, the node is going to be filled right up to the end of the disk and actively run out of room. It will even have consumed my 10% buffer (I allocated 14.5 TB out of 16) and beyond.

I am quite afraid now. Do I need to shut it down and leave it off long term rather than have this crazy ingress overflow the disk?

God knows the lazy filewalker is no help, I’ve had it run for 2 days (about the longest I’ve been able to keep it up without a restart, either intentional or automatic) and not finish even just Salt Lake. And I would run the non-lazy version but I can’t even see log messages to see its progress. I think my node is doomed.

EDIT: I guess I can drop the allocated space. Is that my only option? And what will that do considering I actually have more data stored than my new allocation would account for?

This is all docker on Linux Mint 21.3, btw.

1 Like

If the usage on the dashboard always drops after a restart and then slowly climbs back toward the actual usage, this usually means that the databases are not being updated.
I would recommend checking them

and making sure that the scan on startup is enabled and that it finishes successfully for each trusted satellite.
It's also possible that you have errors in the logs related to "database is locked"; in that case the databases will not be updated either.
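For reference, a minimal sketch of those two checks, assuming a Docker container named storagenode and the default config.yaml in the node's config directory (names and paths are placeholders for your own setup):

# any database-related errors in the recent logs?
docker logs storagenode 2>&1 | grep -E "ERROR|error" | grep -i "database" | tail -n 20

# the startup scan is on by default; make sure it hasn't been switched off
grep "piece-scan-on-startup" config.yaml
# expected: storage2.piece-scan-on-startup: true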

1 Like

But sadly the non-lazy filewalker has a bug where it doesn't log anything, right?

My DBs are on an NVMe SSD separate from the data disk, so speed/access should not be the problem.
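For context, a minimal sketch of how the lazy mode is toggled, assuming the flag name from recent storagenode versions (verify it against your own config.yaml before relying on it):

# in config.yaml: run the used-space/GC scans in the main process instead of the low-priority subprocess
pieces.enable-lazy-filewalker: false

# or as an extra argument appended to the docker run command
--pieces.enable-lazy-filewalker=false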

I think this should be a hot topic for the developers, because with this much incoming data, waiting multiple days for the filewalker to finish can easily eat up the "spare" space.

Some people have close to 1TB of ingress per day, which equals the 10% safety margin on a 10TB disk. With one restart the DB could be rolled back by a few days, meaning Storj would be behind by 2-3TB, and that could kill the whole node, if I understand correctly what happens when the disk gets completely full.

1 Like

Yes.

Yes, it could be, if it affected most nodes, which is not the case. So, unfortunately, you need to try to fix the issues with your setup. The developers are working on improvements anyway, though.

“I would recommend checking them”

Grepping logs for “malformed” yields no results. Running your test regardless means shutting down my node yet again and making the situation significantly worse, and I’m willing to bet my node that it would yield no positive results.

“It's also possible that you have errors in the logs related to database is locked; in that case the databases will not be updated either”

I’ve received the database is locked error 7 times in the last 12 hours. Is that sufficient to explain this issue? Are you saying that at the current level of ingress the average node is seeing less than that?

“and making sure that the scan on startup is enabled and that it finishes successfully for each trusted satellite”

You’ve got to be kidding. When it can’t even do one satellite in over 2 days, and at any time the watchtower/auto-updater can shut it down and restart the process, it’s virtually impossible to do as you suggest. And I really wonder how exactly it’s ever capable of calculating used space correctly when the used space increases multiple terabytes from the moment the process starts to when it finishes.

There is literally nothing wrong with my node atm, except for the issues listed here. You seem to think this issue is some rare edge case. I submit that I think it’s affecting everyone. The only difference I think in my case is that due to other issues I’ve had to do more than my normal number of restarts, but we’re talking stopping the node, rebooting, and restarting it, nothing more. That alone is causing my used space to drop by half a terabyte, and it’s cumulative.

I guess I just won’t bother doing anything. If the software winds up overfilling the drive and crashing the node, well, if you guys don’t care, I guess I won’t either.

Yes; if a new update rolled out today, there would probably be thousands of nearly full nodes affected.

For example, I had close to a TB missing after a restart, on a CMR disk dedicated to Storj, with NVMe for the DBs and a quite powerful CPU, so the filewalkers should have a nice environment to run in, not a slow Pi or something.

They do not always throw an error, which is why I suggested using that article.

Yes. And you need to improve the performance of the disk if that's possible, or move the databases to an SSD (or even to a USB stick, if you don't mind losing the stats if it dies).
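For illustration, a rough sketch of how the databases are commonly relocated on a Docker setup; the paths are placeholders and the mount/option follow the generally documented approach, so double-check against the official guide before applying it:

# stop the node, copy the *.db files to the SSD, then start with an extra mount and option:
docker run -d ... \
    --mount type=bind,source=/mnt/ssd/storagenode-dbs,destination=/app/dbs \
    storjlabs/storagenode:latest \
    --storage2.database-dir=/app/dbs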

I'm not. This is the only way to get the databases updated if you are experiencing a disk/trash usage discrepancy.

Unfortunately, only a few slow or overprovisioned setups see them (a Windows VM on VMware is one example). If you can improve your setup, please do.

Because your databases keep failing to update and are perhaps corrupted; you still have not checked them.
In version 1.105.x the lazy mode and the startup scan should affect nodes less; we improved how the data is stored, but at the expense of more RAM usage under load.

This depends very much on how you did your setup.

Nope. Sorry. Really only been doing this for fun and I like the project in theory, but this is very rapidly becoming not fun.

I think you misunderstand the situation. What I’m describing here can ONLY be a bug.

The title of this thread suggests that it's the lazy filewalker that's not saving progress, but that's not what I'm experiencing. Even when the filewalker isn't running, the Used Space in the piechart on the node page still increases as ingress comes in. It seems more or less reasonably accurate as long as you don't restart. But THAT figure, the Used Space that the node is publishing to the node page sans any completed filewalker, is what gets deducted from when the node is restarted.

Now, what possible legitimate reason could there be for that? My databases are not locking at the moment of shutdown. So, if I shut down the node when that piechart is reporting 11.2TB, and when I restart the piechart is reporting 10.5TB… why is the node not saving that 11.2TB figure from its previous run, instead of overwriting it with a lower figure? If it is fetching the result of the last successful filewalker or something, that's terrible design. It should keep the value of the most recent calculation, which would be from the point of the last shutdown. Where it gets this dramatically lower figure on restart, I have no idea. But simply restarting should not overwrite that figure with a clearly lower and incorrect value.

If you’re going to continue to insist that that’s working as intended, that just restarting the node SHOULD lower that used space value dramatically despite there being no db locking at the point of shutdown and nothing discernibly wrong on a node running an absolutely basic, vanilla docker-on-linux install, then yeah, this is no longer a fun experiment.

df is showing I have 1.1TB left free space on my disk, while the node page thinks there’s 1.74TB of free space. Used space filewalker for Salt Lake began 36 hours ago with no visible progress and plenty of idle CPU to spare. It’s a race. If the software can fix itself before this node implodes due to running out of real space, then fine, I’ll let it keep going. If it can’t, and reporting the problem does nothing but elicit “Your equipment must be bad” when I know that’s not the case, then let it burn. Sorry. Had enough.

(I mean, seriously, why would you not have code that runs an instant df and makes sure that the free space as perceived by the node CANNOT POSSIBLY be higher than the free space per df, under any circumstance? We’ve talked about this before, and I honestly cannot understand these design oversights.)

1 Like

BTW, yes, I do remember that the reason is that the container can't see outside itself to do the df. I just still think it should at least be an option in the docker setup to give the container access to the full disk device, with software that can take advantage of it if it has been set up that way. That's the part I can't understand why it hasn't been implemented.
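For what it's worth, the comparison can already be done by hand from the host; a minimal sketch, assuming the default dashboard port 14002 is published, jq is installed, and /mnt/storj is the data mount (the exact JSON field names may differ between versions):

# what the node believes
curl -s http://localhost:14002/api/sno/ | jq '.diskSpace'

# what the filesystem says
df -B1 /mnt/storj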

The usage is stored in memory and flushed to the databases; the databases are used as a cache. If the node cannot add usage to the database (because it's malformed, corrupted or locked), this usage will be gone after a restart. This is a separate issue from the unfinished filewalkers.
If the filewalkers fail too (their collected stat is lost unless there is a resume on the next start; it's not used to show usage until they update a database) and/or do not update the database (which can happen for the same reason: malformed, corrupted or locked databases), the stat will not be corrected, but the node can still show the last in-memory stat collected by the other functions. Until a restart.
So in both cases you can hit the same issue with the databases, but via different functions/processes, and end up with an incorrect usage stat.

In short: if your databases don't have issues, they will keep the progressing stat between restarts. Of course, something can be missed if the node was restarted abruptly (i.e. a FATAL error) or at a wrong time (between flush intervals); however, if only the stat since the last flush is lost, the discrepancy should not be so large, and the used space filewalker should fill this gap.
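As an aside on the flush interval mentioned above, a small hedged example; the option name is how I know it from the storagenode config, so verify it in your own config.yaml:

grep "cache-sync-interval" config.yaml
# e.g. storage2.cache-sync-interval: 1h0m0s  - how often the in-memory used-space cache is written to the databases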

I believe that your databases have issues, but you argue with me instead of checking. I'm trying to help; please check the logs for errors related to databases (anything matching "ERROR|error" and "database") and run the check script while the node is stopped to make sure that all databases are OK.
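For completeness, a minimal sketch of that check, assuming sqlite3 is installed on the host, the node is stopped, and the databases live under a placeholder path (adjust it to your setup):

for db in /mnt/ssd/storagenode-dbs/*.db; do
    echo "=== $db"
    sqlite3 "$db" "PRAGMA integrity_check;"
done
# every database should answer "ok"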

Since I restarted my node a few days ago to change the storage limit, my node has been off by 900GB, and today the day arrived: Storj thinks it has 300GB more free space than it has in real life. The DBs are on a separate drive, so I wonder whether the node is going to crash completely or not.

The restart was 3 days ago… I wonder if it will get back to the normal state.

It would be nice to have an indicator showing how much data the filewalker has already processed in its current iteration, so we could know whether it is safe to restart the node or not.

You may easily check that:

I am annoyed, but I am not refusing to run your check just for spite. I don’t want to do it because the act of doing the test is sure to make my situation even worse, once again reducing my used space and making it even more likely that my node will implode. I would do it if I thought it could possibly improve my chances of helping my node survive, but I don’t see how.

If I am right, and it is a bug, then running the test won’t help.

If I am wrong, and the database is corrupt and I fix it now, the issue still won’t be resolved and it will still require what looks like at least 3 days of filewalker activity just to do the Salt Lake satellite, at which point due to the very very high ingress levels lately I am sure to run out of space (I am below 1TB left of actual free space on the disk already). And as far as I know I cannot tell my node to JUST run filewalker and not continue conducting ingress.

At this point I think the only hope I have of my node not imploding is to try to get this filewalker run, which started 3 days ago and has not yet produced any other useful log message, to finish.

“Started” and “finished” messages aren’t helpful. We already knew how to get that. What I think he is asking for, and I would like the same, is some indication BETWEEN “started” and “finished” to, for example, tell me how much progress my Salt Lake used space filewalk has made in 3 days and if I can expect it to finish in one day or one week.

Considering the staggering mass of logs that a node otherwise produces, you have to understand that it’s very frustrating that what is potentially a critical function like this, the main thing any actual node operator ever bothers checking the logs for, produces nothing but “started” and “finished” (many days apart) for the lazy walker, and not even that for the non-lazy version.

1 Like

Solution for Linux:
When you redirect your DBs, you can use the folder name to insert a unique ID for the node.
This ID will appear in the output of 'ps' in the --info & --info2 parameters.

ps -ef | grep 'used-space-filewalker' | grep '/app/' | grep 'nodeid'

Output of ps:

root 11556 10062 7 10:04 ? 00:44:25 /app/storagenode used-space-filewalker
 --storage config/storage
 --info dbs-nodeid/piecestore.db
 --info2 dbs-nodeid/info.db
 --pieces config/storage
 --driver
 --filestore.write-buffer-size 128.0 KiB
 --filestore.force-sync=false 
 --log.output stderr
 --log.encoding json
 --lower-io-priority=true

Get the PID from the 'ps' output and use 'ls' with this PID, in this example 11556:

ls -l /proc/11556/fd

Output of ls:

total 0
lr-x------ 1 root root 64 Jun 13 15:20 0 -> 'pipe:[151847]'
l-wx------ 1 root root 64 Jun 13 15:20 1 -> 'pipe:[151848]'
l-wx------ 1 root root 64 Jun 13 15:20 2 -> 'pipe:[151849]'
lr-x------ 1 root root 64 Jun 13 15:20 3 -> /app/config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/jx
lrwx------ 1 root root 64 Jun 13 15:20 4 -> 'anon_inode:[eventpoll]'
lr-x------ 1 root root 64 Jun 13 15:20 5 -> 'pipe:[149965]'
lr-x------ 1 root root 64 Jun 13 15:20 54 -> /app/config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa
l-wx------ 1 root root 64 Jun 13 15:20 6 -> 'pipe:[149965]'

ukfu… is US1, and the filewalker is currently scanning subfolder 'jx'.

The subfolders are scanned in alphabetical order, but the satellites are processed in no particular order, so you need to check the logs to see which satellite is currently being scanned.
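A minimal sketch that chains the two steps above, assuming the filewalker subprocess is visible from where you run it (it is on a plain Docker host, since containers share the host kernel) and that only one such subprocess is running:

# PID of the used-space filewalker, then its open blobs directories
pid=$(pgrep -f 'used-space-filewalker' | head -n 1)
ls -l /proc/"$pid"/fd | grep blobs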

2 Likes

THANK you! That indeed seems very useful.

So here’s my result:

l-wx------ 1 qwinn qwinn 64 Jun 13 15:04 1 -> 'pipe:[49461]'
l-wx------ 1 qwinn qwinn 64 Jun 13 15:04 2 -> 'pipe:[49462]'
lr-x------ 1 qwinn qwinn 64 Jun 13 15:04 3 -> /app/config/storage/blobs/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa/tz
lrwx------ 1 qwinn qwinn 64 Jun 13 15:04 4 -> 'anon_inode:[eventpoll]'
lr-x------ 1 qwinn qwinn 64 Jun 13 15:04 5 -> 'pipe:[49464]'
lr-x------ 1 qwinn qwinn 64 Jun 13 15:04 54 -> /app/config/storage/blobs/pmw6tvzmf2jv6giyybmmvl4o2ahqlaldsaeha4yx74n5aaaaaaaa
l-wx------ 1 qwinn qwinn 64 Jun 13 15:04 6 -> 'pipe:[49464]'

If I'm interpreting what you've outlined correctly, that suggests it's at subfolder "yx"? If so, thank God, hopefully the end is in sight.

EDIT: Oops, no, it’s at “tz”. Okay. Not quite as good but at least it’s possible it’ll end within a day.

But alphabetical order here means: folders with a leading letter come first, followed by folders with a leading number, so it starts with a2 through zz, then goes from 22 through 7z.

Folder tz is #640 out of 1024.
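If it helps, a hedged sketch to compute that position yourself; it assumes the ordering described above (leading letters before leading digits) and that all 1024 prefix folders already exist, with <satellite-id> a placeholder for the satellite's blobs subfolder (adjust the path to how it looks from where you run it):

ls -1 /app/config/storage/blobs/<satellite-id>/ \
  | sed 's/^\([0-9]\)/~\1/' | LC_ALL=C sort | sed 's/^~//' | nl | grep -w 'tz'
# the '~' trick only pushes digit-leading folders to the end; prints e.g. "640  tz"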

1 Like

AH. Yuck. I’m up to v6 now but still, yeah, gonna be a while.

I’ve done some more thorough analysis of the logs during this, and it turns out I did in fact have more database locked messages than I previously noted.

But here’s the thing. I was previously running a job that used about 40% of my CPU. That job makes me a good bit of money, way way way more than Storj makes me. About 4 hours ago, just hoping to get through this crisis, I turned that job off. I haven’t gotten a database locked message since.
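Side note: a quick hedged way to count those, assuming a Docker container named storagenode (adjust the name and the time window to taste):

docker logs --since 12h storagenode 2>&1 | grep -c "database is locked"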

Which means that the database lock issue - which I have whined about before in its own thread titled "Database locking madness" - can be caused simply by running pretty much anything else on the CPU, which is an i9-10900 capable of 5.2GHz on 10 cores and 20 threads. That job didn't touch the Storj disk. No impact on the disk whatsoever. So apparently, my single Storj node requires a fully dedicated 10900 to run without nearly fatal errors.

And… no. I didn’t realize the CPU requirement was that brutally high. I cannot spare the entire processing power of a 10900, among the top tier of the 10th gen intel line that was released only 4 years ago in Q2 2020, one that can make me far more money servicing several other cryptos, to maintain a single Storj node.

If only having 40% usage to other processes is enough to cripple the Storj node this completely… then I’m no longer going to consider this a long term project to involve myself in. I’ll try to get this filewalker process finished today so the node doesn’t implode, and then I’ll let it run alongside far more important usage with the apparently inevitable constant database locking issues until I get to the I believe 10 month mark where I can do a graceful exit (I’m in my 7th month now), and then bow out.

Thanks for the help.

Would you mind adding it to that wiki?