Further, this is the report from a chkdsk run:
Stage 1: Examining basic file system structure ...
128893184 file records processed.
File verification completed.
151879 large file records processed.
0 bad file records processed.
Stage 2: Examining file name linkage ...
4 reparse records processed.
128911258 index entries processed.
Index verification completed.
0 unindexed files scanned.
0 unindexed files recovered to lost and found.
4 reparse records processed.
Stage 3: Examining security descriptors ...
Security descriptor verification completed.
9037 data files processed.
Windows has scanned the file system and found no problems.
No further action is required.
22890989 MB total disk space.
19346498 MB in 128732185 files.
40976888 KB in 9039 indexes.
0 KB in bad sectors.
129340999 KB in use by the system.
65536 KB occupied by the log file.
3378165 MB available on disk.
8192 bytes in each allocation unit.
2930046719 total allocation units on disk.
432405176 allocation units available on disk.
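As a side note, the chkdsk space figures are self-consistent: total and available space are just allocation units multiplied by the cluster size (chkdsk "MB" are mebibytes). A quick check, with the numbers copied from the report above:

```python
# Sanity-check the chkdsk totals: NTFS sizes are derived from
# allocation units (clusters) times the cluster size.
BYTES_PER_UNIT = 8192        # "8192 bytes in each allocation unit"
TOTAL_UNITS = 2930046719     # "total allocation units on disk"
FREE_UNITS = 432405176       # "allocation units available on disk"
MIB = 1024 * 1024            # chkdsk reports mebibytes as "MB"

total_mb = TOTAL_UNITS * BYTES_PER_UNIT // MIB
free_mb = FREE_UNITS * BYTES_PER_UNIT // MIB

print(total_mb)  # 22890989 — matches "22890989 MB total disk space"
print(free_mb)   # 3378165 — matches "3378165 MB available on disk"
```

So the report itself is internally consistent; the disk is roughly 22.9 TB with about 3.4 TB free.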
I see you're using Windows…
Use Notepad++; it indexes a .log file quickly, so you can simply page through huge log files. It also has fast search/replace, syntax highlighting for .yaml and other programmatic formats, etc., etc., etc.
Your dashboard won't show correct values until the used-space-filewalker successfully finishes the scan for all trusted satellites and updates the database without any errors.
If you do not have FATAL or Unrecoverable errors, then the dashboard should be available.
If you have only filewalker errors so far, then you need to disable the lazy mode,
save the config and restart the node.
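A minimal sketch of that change in `config.yaml` (the relevant option is `pieces.enable-lazy-filewalker`; the rest of the file stays as your node already has it):

```yaml
# config.yaml — run the used-space filewalker in the main process
# instead of the low-priority "lazy" subprocess
pieces.enable-lazy-filewalker: false
```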
You may track the progress on the debug port with the /mon/ps endpoint, or with Resource Monitor by checking which folder in blobs is currently being processed.
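By default the debug endpoint listens on a random local port; to make it easy to query, you can pin it in `config.yaml` (`debug.addr` is the storagenode option for this; the port `127.0.0.1:5999` is just an example I picked):

```yaml
# config.yaml — pin the debug endpoint to a fixed local address
# (by default it listens on a random port on 127.0.0.1)
debug.addr: 127.0.0.1:5999
```

After a restart you can then open http://127.0.0.1:5999/mon/ps in a browser (or curl it) to see which processes, including filewalkers, are currently running.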
C:\Program Files\Storj\Storage Node\storagenode.log:392471:2024-07-16T20:05:47-04:00 FATAL Unrecoverable error
{"error": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory", "errorVerbose": "piecestore monitor: timed out after 1m0s while verifying writability of storage directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:175\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:164\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
The solution is to optimize your disk subsystem (check & fix, defragmentation, move the databases to the system drive/SSD, add an SSD as a storage tier with PowerShell…). Or increase the timeout for that exact check (accepting that the node may be disqualified because of the slowness/hardware issues of this disk, especially for the readability checks…).
Then you may add it, save the config and restart the node, either from the Services applet or from an elevated PowerShell:
Restart-Service storagenode
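If you do go the timeout route, the change is one line in `config.yaml`. A sketch, assuming the default 1m0s writability timeout seen in the log above (`storage2.monitor.verify-dir-writable-timeout` is the option for this check; there is a matching `storage2.monitor.verify-dir-readable-timeout` for the readability check):

```yaml
# config.yaml — raise the writability-check timeout from the default
# 1m0s; increase in modest steps (e.g. +30s at a time)
storage2.monitor.verify-dir-writable-timeout: 1m30s
```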
P.S. The writeability check failing only reveals that your node would lose most upload requests, because it's too slow to save pieces… However, if you are forced to increase the timeout for this check above 5m0s, that just indicates you have a much bigger issue with your hardware than crashes alone… You need to check this disk (S.M.A.R.T. in particular); it could be dying.
Uhm, you already have these errors. Seems like a good reason to me.
You may or you may not. Given the time-out errors you are having, your drive is either (1) too slow to begin with, (2) suffering hardware issues, or (3) overused. Is each of your nodes running on different hardware and different drives?
First and foremost, I think everyone wants you to have a nice experience with STORJ. On the other hand, people have been helpful while you seem to dismiss the possibility that any of your drives are faulty or too slow. How will we ever get to a solution then? (a question we all know the answer to). You have to do something about the time-out errors or this node will stay problematic; potential solutions have been provided more than once already.
Of course. Because the used-space-filewalkers didn't update the databases with the current usage.
When it's interrupted, it will not resume (all past progress is lost); it will start from scratch.
So, please, try to fix the underlying issues first (so that the node is not forced to crash).
this crash is deliberate: the node stops to try to save itself from disqualification in case your hardware has started to fail. The node doesn't know whether the issue is intermittent or permanent, so it stops to allow the Operator to decide what to do next.
So, you need to fix the disk or the configuration so that the writeability checks do not fail.
Try to optimize the filesystem
If this doesn't help and you still have crashes because of failed writeability/readability checks, then you may increase the related timeout by 30s, save the config and restart the node.