The hard drive usage is very high. Specifically, the service stopped because of it last night. What can be done to limit it? My other question is: what causes the decrease in occupied space, and why is the trash size so large? Is this normal? There are too many small files on the hard drive (millions), and moving them all at once is overloading it, leaving no real throughput.
Storj dices everything up into small files: it’s normal to see 3-4 million files per TB. And by default, any time it restarts (such as after an automatic upgrade every two weeks or so) it will rescan those files to find out how much space they use. Depending on how many files you have and the speed of your HDD… that scan can take a couple of minutes, a couple of hours, or a couple of days… and while it’s running your disk will sit at 100%.
Storj nodes also periodically get notices from the servers about which files they should be keeping (so they know which can be deleted)… so it’s also normal to have some trash hanging around. Each batch of trash is kept seven days before it’s deleted… in case there are any problems and the files need to be recovered. And every time files get moved from your regular space to trash… your used-space number goes down and trash goes up.
So everything you described so far sounds normal: and if you wait a day or two your HDD should go back down closer to idle. And unless more trash is created… the trash count will go down in a week.
Thank you for your response, I will wait patiently. This state has been ongoing for 7 days. I hope it gets resolved soon
Yeah, as roxor said. In addition, you can try the new feature that acknowledges the whole disk space, to avoid that file scanning on every restart,
aaaaaand the badger cache. Just search the forum for those phrases and you will find the topics.
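For reference, these are the config.yaml lines those two features are usually toggled with; a minimal sketch, so do verify the exact option names in the forum topics for your node version:

storage2.piece-scan-on-startup: false   # skip the used-space file scan on every restart
pieces.file-stat-cache: badger          # badger cache: keeps file metadata so scans don't re-stat millions of files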
Something is really off here. This is not normal. The service has crashed again with an error. The disk is at 100% usage, and the program cannot access it.
Could you please search for FATAL/Unrecoverable error somewhere earlier in your logs?
You may use the PowerShell:
sls "FATAL|Unrecoverable" "C:\Program Files\Storj\Storage Node\storagenode.log" | select -last 10
You likely have a readability or writeability check timeout. This usually means that the disk is too slow to read or write a small file. There are several ways to improve the speed of the disk:
- Perform a defragmentation
- Disable atime: [Solved] Win10 20GB Ram Usage - #17 by arrogantrabbit
- Disable NTFS 8dot3name (sample commands for this and atime are below)
- Add more RAM
- Use tiered storage and add an SSD as a cache
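For reference, a minimal sketch of the atime and 8dot3name tweaks using the built-in fsutil tool (run from an elevated prompt, and double-check against the linked topic before applying):

# stop NTFS from updating last-access timestamps on every read
fsutil behavior set disablelastaccess 1
# stop generating legacy short 8.3 file names for new files
fsutil behavior set disable8dot3 1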
Only the last few days, as I deleted the log content; it was almost 10 GB already.
Does regular defragmentation cause any problems?
Specifically, the entire disk is completely fragmented.
I would like to move the database to an SSD; could you help with that? It seems that due to the load it couldn’t read it, and that’s probably why it’s crashing.
Unfortunately, I’m not a programmer, but if there’s some guidance, I can follow it.
Regular defragmentation would keep your filesystem fast, so the node shouldn’t crash because of a failed check. However, in the photo of your logs there is an error that it cannot open a database. I would recommend checking your disk for errors and fixing them.
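On Windows that check is usually done with chkdsk; a minimal sketch, assuming the node’s data lives on drive D: (stop the node first):

# scan the volume and fix filesystem errors
chkdsk D: /f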
You do not need to be a programmer to configure the node.
To move databases to SSD you may follow this guide:
The hard drive is not showing any errors; it’s in 100% condition. In my opinion, the reason it couldn’t read the database is that it is under so much load at that moment. Do the highlighted files in the picture belong to the database? Should I copy these to the SSD drive?
Yes, all files with the *.db extension are databases. You need to copy/move them to the SSD while the node is stopped. You also need to specify a new path to the databases with the storage2.database-dir: parameter in your config.yaml file, save it, and start the node.
You should not have errors in your logs after that.
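A minimal sketch of those steps in PowerShell, assuming the data folder is D:\storagenode\storage and the new database folder is C:\Program Files\Storj\database (both paths are examples; substitute your own, and note the Windows service is normally named storagenode):

# stop the node before touching the databases
Stop-Service storagenode
# create the target folder and move every *.db file onto the SSD
New-Item -ItemType Directory -Force "C:\Program Files\Storj\database" | Out-Null
Move-Item "D:\storagenode\storage\*.db" "C:\Program Files\Storj\database\"
# after pointing storage2.database-dir at the new folder in config.yaml, start the node again
Start-Service storagenode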
is this correct?
storage2.database-dir: "C:\Program Files\Storj\database"
or
storage2.database-dir: C:\Program Files\Storj\database
Both should work. However, it’s better to have it with quotes, because the path contains a space.
Unfortunately, it didn’t work. If I make a modification in this line, the service fails to start and gives an error. I tried with and without quotes. The path is correct. Could it be a permissions issue? I will try putting it on another SSD drive that is not a system partition.
So it’s as if it can’t find the database files, and that’s why it can’t start. This must be the issue for sure.
Please make sure that the database files are in this folder. You also need to change the owner to the SYSTEM user; it should have write, delete, and edit permissions.
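If it helps, a minimal sketch of granting those rights from an elevated PowerShell, assuming the database folder used above:

# give the SYSTEM account full control of the folder and everything inside it
icacls "C:\Program Files\Storj\database" /grant "SYSTEM:(OI)(CI)F" /T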
Unfortunately, we were still unable to transfer it. The permissions are in order. The database is not visible. The service crashed again at 4 a.m. today.
Then, please post the error which you receive.
What was the reason? It should be a FATAL or Unrecoverable error.
I managed to transfer the database. There was a permission issue. The defragmentation is also complete. We’ll see how things change now. Disk usage has significantly decreased, and data traffic has started as well.
2024-10-19T15:18:43+02:00 WARN collector unable to delete piece {"Satellite ID": "12EayRS2…
It’s unbelievable that it always has some problem. Right now, it can’t delete files. The log size is 123 GB!!! It filled up the system drive. Is this program really that amateurish?
It can’t stop the service; it only works with force or restart. Unbelievable… unbelievable…
2024-10-19T15:12:03+02:00 WARN collector unable to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "SH6TQADNSVCSA7BQRGV24CVFVJVLDY6SZJTD3YWSMAB4SWURV3OQ", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:126\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:350\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).DeleteWithStorageFormat:329\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteSkipV0:375\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:111\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:68\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/collector.(*Service).Run:64\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:44\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
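A side note on the runaway log: these "unable to delete piece … file does not exist" messages are warnings (the piece is already gone), and at the default verbosity they can inflate storagenode.log very quickly. One way to tame it, sketched as a config.yaml line (verify the option for your node version), is to raise the log level:

log.level: error   # log only errors; the default info level is far chattier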