Blobscache errors in log

support26 · July 16, 2024, 10:38pm

I’ve been monitoring the problematic node and now it is constantly resetting / restarting the used space value in the dashboard…

Checking the recent log entries, I see a lot of these errors…

I have restarted the node, but since then it keeps dropping the used space to zero and then tries to build it back up… currently showing 29.07MB after 1h23m uptime…

Any help appreciated (note - only this node - others are fine)

2024-07-16T18:32:19-04:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -11520}
2024-07-16T18:32:19-04:00	ERROR	blobscache	satPiecesTotal < 0	{"satPiecesTotal": -12032}
2024-07-16T18:32:19-04:00	ERROR	blobscache	satPiecesContentSize < 0	{"satPiecesContentSize": -11520}
2024-07-16T18:32:19-04:00	ERROR	blobscache	piecesTotal < 0	{"piecesTotal": -6144}
2024-07-16T18:32:19-04:00	ERROR	blobscache	piecesContentSize < 0	{"piecesContentSize": -5632}

Alexey · July 17, 2024, 3:49am

These errors are a consequence of databases that were not updated.
If the storagenode service/container is crashing, you need to figure out why and fix the issue.
Search for FATAL and Unrecoverable errors in your logs, they may explain the reason for the crash.

support26 · July 17, 2024, 12:18pm

That is what I have been trying to figure out and fix, but I can’t seem to get assistance from this forum to get it fixed.

I have found FATAL errors - but you did not want the logs posted as per my other post you replied to.

Alexey · July 18, 2024, 7:20am

Please post them here (I never asked you not to post them), just these FATAL/Unrecoverable errors.

support26 · July 18, 2024, 6:55pm

You responded “not necessarily” to my question for posting the logs.

This is what I found when I searched the log for FATAL. I do not know how to extract only the FATAL lines from the log, so I just used the find feature and copied the log entry line.

2024-07-16T20:05:47-04:00	FATAL	Unrecoverable error	{"error": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory", "errorVerbose": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:175\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:164\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
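For pulling such lines out of a log without a manual search, here is a minimal Python sketch. It assumes the tab-separated layout seen in the lines quoted in this thread (timestamp, level, message fields, then a JSON payload as the last field). The function name `fatal_lines` and the sample list are illustrative only, not part of the storagenode software.

```python
import json

def fatal_lines(lines):
    """Yield (timestamp, message, error) for FATAL / Unrecoverable log lines.

    Assumption: fields are tab-separated and the last field, when it
    starts with "{", is a JSON payload (as in the lines quoted above).
    """
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # not a log line in the assumed layout
        if parts[1] != "FATAL" and "Unrecoverable" not in line:
            continue
        payload = {}
        message_fields = parts[2:]
        if message_fields[-1].startswith("{"):
            try:
                payload = json.loads(message_fields[-1])
                message_fields = message_fields[:-1]
            except json.JSONDecodeError:
                pass  # keep the raw text as part of the message
        yield parts[0], " ".join(message_fields), payload.get("error", "")

# Two lines shaped like the ones posted in this thread (error text shortened).
sample = [
    '2024-07-16T18:32:19-04:00\tERROR\tblobscache\tpiecesTotal < 0\t{"piecesTotal": -6144}',
    '2024-07-16T20:05:47-04:00\tFATAL\tUnrecoverable error\t'
    '{"error": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory"}',
]
found = list(fatal_lines(sample))
for timestamp, message, error in found:
    print(timestamp, message, error)
```

To run it against a real log, pass an open file instead of `sample`, for example `fatal_lines(open("node.log"))`, where `node.log` is a placeholder path.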

Alexey · July 19, 2024, 7:39am

Perhaps a translation error, I’m sorry!

This error means the filesystem needs to be optimized. If you have already done that and still get this error, your disk subsystem is probably just slow. You may increase the timeout for this check so that it no longer crashes your node, with some drawbacks, of course:

You could increase it in 30s steps until the node stops crashing.
However, if you are forced to increase it beyond 5m0s (by the way, please check exactly which check failed: the writability check or the readability check), then you likely have a bigger problem with this disk; it may be dying. You would then also need to increase the writability check interval (it’s 5m0s by default).

Alexey · July 20, 2024, 6:51am

A post was merged into an existing topic: Fatal Error on my Node

Alexey · July 20, 2024, 6:47am

I mean that:

Disk usage discrepancy?

Optimizing the filesystem includes:

check and fix errors on the disk

disable 8dot3 for NTFS: NTFS Disable 8dot3name

disable atime (for NTFS: [Solved] Win10 20GB Ram Usage - #17 by arrogantrabbit; for Linux, please use the search here or on the internet)

defragment the disk if it is NTFS, and re-enable automatic defragmentation if you disabled it (it’s enabled by default)

Disable indexing (Windows only)

Move databases to SSD (for Windows: Move databases on Windows storagenode - #2 by Alexey, for docker: How to move DB’s to SSD on Docker)

If you have a managed UPS, enable the write cache (for Windows - in the disk volume Policy, you need to select both checkboxes)

Add more RAM, if possible, or add an SSD cache in front of the disk subsystem (this is possible on Windows too, but you need to use tiered storage).
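For the atime item on Linux, one way to see what a mount currently uses is to read the options column of `/proc/mounts`. The sketch below is only an illustration: the function name `atime_option` and the mount point `/mnt/storj` are hypothetical, and it only reports the option, it does not change anything.

```python
def atime_option(mounts_text, mount_point):
    """Report the atime-related option for mount_point.

    mounts_text is expected in /proc/mounts format:
        device mount_point fstype options dump pass
    Returns "noatime", "relatime", "strictatime", "default"
    (none of them listed), or None if the mount point is absent.
    """
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[1] != mount_point:
            continue
        options = fields[3].split(",")
        result = "default"
        for name in ("noatime", "relatime", "strictatime"):
            if name in options:
                result = name
                break
        # keep scanning: a later entry for the same mount point wins
    return result

# Sample text in /proc/mounts format; devices and paths are made up.
sample_mounts = (
    "/dev/sda1 / ext4 rw,relatime,errors=remount-ro 0 0\n"
    "/dev/sdb1 /mnt/storj ext4 rw,noatime 0 0\n"
)
print(atime_option(sample_mounts, "/mnt/storj"))
```

On a real system the text would come from `open("/proc/mounts").read()`. Anything other than `noatime` for the storage mount suggests that access times are still being written in some form.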

Since you have

then you need to increase the

Since you are increasing it by 30s, to 1m30s, you do not need to increase the writability check interval (it’s 5m0s by default, which is still longer than 1m30s).
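The arithmetic behind this advice can be sketched in a few lines of Python. This is only an illustration of the rule as described in this thread (raise the timeout in 30s steps; the interval needs to grow only once the timeout exceeds it). The helper names are hypothetical, and the duration handling is a simplified stand-in for the `1m30s` style of values, not the node's own parser.

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}

def parse_duration(text):
    """Convert a duration such as '1m30s' or '5m0s' to whole seconds."""
    parts = re.findall(r"(\d+)([hms])", text)
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(int(n) * _UNIT_SECONDS[u] for n, u in parts)

def format_duration(seconds):
    """Format seconds as minutes and seconds, e.g. 90 -> '1m30s'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs}s" if minutes else f"{secs}s"

def bump_timeout(timeout, interval="5m0s", step="30s"):
    """Return (new_timeout, interval_must_grow).

    Assumption taken from the thread: the check interval only has to be
    raised once the timeout becomes longer than the interval.
    """
    new_seconds = parse_duration(timeout) + parse_duration(step)
    return format_duration(new_seconds), new_seconds > parse_duration(interval)

print(bump_timeout("1m0s"))   # the case discussed above
print(bump_timeout("5m0s"))   # the point where the interval matters
```

With the default 1m0s timeout, one step gives 1m30s and the 5m0s interval can stay as it is. Only a timeout above 5m0s would call for a longer interval, and at that point the disk itself deserves a closer look.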