2nd node failed to start after restart

billium_g · October 5, 2024, 11:30pm

After restarting the Docker node I get an error:
2024-10-05T23:18:10Z INFO Current binary version {"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.113.2"}
2024-10-05T23:18:10Z INFO Version is up to date {"Process": "storagenode-updater", "Service": "storagenode"}
2024-10-05T23:18:10Z INFO Current binary version {"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.113.2"}
2024-10-05T23:18:10Z INFO Version is up to date {"Process": "storagenode-updater", "Service": "storagenode-updater"}
2024-10-05 23:18:11,909 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-10-05 23:18:11,911 INFO spawned: 'storagenode' with pid 56
2024-10-05 23:18:11,911 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2024-10-05T23:18:11Z INFO Configuration loaded {"Process": "storagenode", "Location": "/app/config/config.yaml"}
2024-10-05T23:18:11Z INFO Anonymized tracing enabled {"Process": "storagenode"}
2024-10-05T23:18:11Z INFO Operator email {"Process": "storagenode", "Address": "bgerth@gmail.com"}
2024-10-05T23:18:11Z INFO Operator wallet {"Process": "storagenode", "Address": "0x3eD2d2986543F1A0732e418b76569e74B128c450"}
2024-10-05T23:18:11Z ERROR failure during run {"Process": "storagenode", "error": "Error opening database on storagenode: database: piece_expiration opening file \"config/storage/piece_expiration.db\" failed: unable to open database file: no such file or directory\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabaseWithStat:407\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:384\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:379\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:354\n\tstorj.io/storj/storagenode/storagenodedb.OpenExisting:319\n\tmain.cmdRun:67\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:271", "errorVerbose": "Error opening database on storagenode: database: piece_expiration opening file \"config/storage/piece_expiration.db\" failed: unable to open database file: no such file or directory\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabaseWithStat:407\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:384\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:379\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:354\n\tstorj.io/storj/storagenode/storagenodedb.OpenExisting:319\n\tmain.cmdRun:67\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:271\n\tmain.cmdRun:69\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:392\n\tstorj.io/common/process.cleanup.func1:410\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:271"}
Error: Error opening database on storagenode: database: piece_expiration opening file "config/storage/piece_expiration.db" failed: unable to open database file: no such file or directory
storj.io/storj/storagenode/storagenodedb.(*DB).openDatabaseWithStat:407
storj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:384
storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:379
storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:354
storj.io/storj/storagenode/storagenodedb.OpenExisting:319
main.cmdRun:67
main.newRunCmd.func1:33
storj.io/common/process.cleanup.func1.4:392
storj.io/common/process.cleanup.func1:410
github.com/spf13/cobra.(*Command).execute:983
github.com/spf13/cobra.(*Command).ExecuteC:1115
github.com/spf13/cobra.(*Command).Execute:1039
storj.io/common/process.ExecWithCustomOptions:112
main.main:34
runtime.main:271
2024-10-05 23:18:11,968 INFO exited: storagenode (exit status 1; not expected)

Then, when I try to fix it, it says there is no disk space left:

/dev/sdd 7.4T 414G 7.0T 6% /mnt/storj02

I tried doing the log thing and no change. fsck.ext4 finds no errors.

Proxmox says no space issues either.
586GB used out of 7.93TB

This is running on Proxmox, in Ubuntu. This is my second node, on different ports. It was up for more than 25 days and now will not start.

I have the same issue with my 3rd node, but gave up on it, thinking there may be some conflict.

There are no permission issues.

Help! Looking for assistance with a resolution.

billium_g · October 6, 2024, 2:31am

I think my problem has to do with inodes: my 4TB drive has 245104640,
my 8TB drive says 1930240.
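
For anyone comparing their own drives: counts like these can be read with df -i, which reports inodes instead of blocks. A minimal sketch, assuming a Linux host whose df supports -i; inode_report is a made-up helper name, and /mnt/storj02 is the mount point from the df output earlier in the thread:

```shell
#!/bin/sh
# Print block usage and inode usage for the filesystem that holds a path.
# LC_ALL=C keeps the column headers in English.
inode_report() {
    LC_ALL=C df -h "$1"   # space used and free, in human-readable units
    LC_ALL=C df -i "$1"   # inodes total, used and free
}

# Example call (path taken from the thread):
# inode_report /mnt/storj02
```

If the inode usage is at or near 100% while df -h still shows free space, creating new files fails with a "no space left" error even though blocks are available.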

How do I fix this?

arrogantrabbit · October 6, 2024, 3:42am

Please format your logs properly with triple backticks; my eyes are bleeding trying to decipher that mess. The mess also caused you yourself to miss this line:

2024-10-05T23:18:11Z ERROR failure during run {"Process": "storagenode", "error": "Error opening database on storagenode: database: piece_expiration opening file \"config/storage/piece_expiration.db\" failed: unable to open database file: no such file or directory
\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabaseWithStat:407
\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:384
\tstorj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:379
\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:354
\tstorj.io/storj/storagenode/storagenodedb.OpenExisting:319
\tmain.cmdRun:67
\tmain.newRunCmd.func1:33
\tstorj.io/common/process.cleanup.func1.4:392
\tstorj.io/common/process.cleanup.func1:410
\tgithub.com/spf13/cobra.(*Command).execute:983
\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115
\tgithub.com/spf13/cobra.(*Command).Execute:1039
\tstorj.io/common/process.ExecWithCustomOptions:112
\tmain.main:34

Fix that.

How did you arrive at this conclusion?

Did you check everything else, like permissions, ACLs, etc.? What troubleshooting have you done so far? Without knowing what you did, what worked, what didn’t, and where you are stuck, it is not possible to provide any useful advice.

Alexey · October 6, 2024, 4:21am

Hello @billium_g,
Welcome back!

You need to check the --mount option for /app/config in your docker run command. It seems that either the path is wrong, the disk is unmounted, or you have a permissions issue. If you used the --user option in your docker run command, you need to change the owner of the data location to that user and group. If you always run the container with sudo, you shouldn’t use the --user option and need to change the owner to root.
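
These checks can be scripted on the host. A rough sketch only: check_storage_dir is a made-up helper, and it assumes its argument is the host directory that the docker run command mounts to /app/config, so that the database from the error sits at storage/piece_expiration.db beneath it:

```shell
#!/bin/sh
# Report where a host directory is mounted, who owns it, and whether the
# database named in the error exists under it.
check_storage_dir() {
    dir=$1
    if [ ! -d "$dir" ]; then
        echo "missing directory: $dir"
        return 1
    fi
    # The last column of POSIX df output is the mount point holding $dir.
    echo "mounted on: $(df -P "$dir" | awk 'NR==2 {print $NF}')"
    # Numeric uid:gid of the directory, to compare with the --user value.
    echo "owner: $(ls -ldn "$dir" | awk '{print $3 ":" $4}')"
    if [ -e "$dir/storage/piece_expiration.db" ]; then
        echo "piece_expiration.db: present"
    else
        echo "piece_expiration.db: missing"
    fi
}

# Example call; replace the argument with your own data location:
# check_storage_dir /mnt/storj02
```

If "mounted on" shows / rather than the expected disk, the disk is probably not mounted and the container is looking at an empty directory.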

billium_g · October 7, 2024, 2:15am

So the solution was in how the drive was formatted.
Originally I was using
sudo mkfs.ext4 -m 0 -T largefile4
Since these are all small files, it was running out of inodes. Searching the net, I found:
mke2fs -t ext4 -m 0 -i 65536 -I 128 -J size=128 -O sparse_super2

So far so good, and issue closed.
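
The difference between the two commands comes down to the bytes-per-inode ratio. In the stock mke2fs.conf, the largefile4 profile sets inode_ratio = 4194304 (one inode per 4 MiB), while -i 65536 asks for one inode per 64 KiB. A rough sketch of the arithmetic, assuming those values and ignoring the rounding mke2fs does per block group; estimate_inodes is a made-up helper, and the size is back-calculated from the 1930240 inodes reported for the 8TB drive:

```shell
#!/bin/sh
# Estimate the inode count mke2fs would create for a filesystem size and a
# bytes-per-inode ratio (the -i value, or a -T profile's inode_ratio).
estimate_inodes() {
    size_bytes=$1
    bytes_per_inode=$2
    echo $(( size_bytes / bytes_per_inode ))
}

# Size implied by 1930240 inodes at one inode per 4 MiB.
size=$(( 1930240 * 4194304 ))

estimate_inodes "$size" 4194304   # -T largefile4 -> 1930240
estimate_inodes "$size" 65536     # -i 65536      -> 123535360
```

So the same drive goes from about 1.9 million inodes to about 123 million, 64 times as many.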

Toyoo · October 7, 2024, 7:32pm

@ptdatta, this might be something for you; I recall you’ve been looking for tasks to code. Storage nodes now check the amount of free disk space, but they do not check the free inode count. While this is not a frequently encountered problem, it should be trivial to implement.
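
Until something like that exists in the node itself, an operator can approximate the check from the host. This is only a shell sketch, not how storagenode would implement it: has_free_inodes and the 100000 threshold are made up for illustration, and it assumes a stat that supports -f with the %d (free file nodes) and %c (total file nodes) format codes, as GNU coreutils does. Filesystems that report zero total inodes are treated as "not applicable":

```shell
#!/bin/sh
# Return 0 if the filesystem holding $1 has at least $2 free inodes,
# 1 if it has fewer, and 2 if the path cannot be inspected.
has_free_inodes() {
    path=$1
    min_free=$2
    # %d = free file nodes, %c = total file nodes.
    out=$(stat -f -c '%d %c' "$path") || return 2
    free=${out% *}
    total=${out#* }
    # A total of 0 means the filesystem exposes no inode limit to check.
    if [ "$total" -eq 0 ]; then
        return 0
    fi
    [ "$free" -ge "$min_free" ]
}

# Example call with a hypothetical threshold:
# has_free_inodes /mnt/storj02 100000 || echo "low on inodes (or path unreadable)"
```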

Alexey · October 8, 2024, 3:34am

Not all FSes have inodes, and this also wouldn’t be trivial for multiarch.

The default format options for ext4 also don’t seem to have this issue; perhaps only the custom ones do.

ptdatta · October 8, 2024, 3:41am

Ok, looking into it.