New node. Worked a few hours and now "no such file or directory"

Yes. On the very last line you can put --name nodeX storjlabs/storagenode:latest

(You guessed it, I called my node sequential numbers from 1 to 10 to… keep it simple :wink: )
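For example, the tail end of a run command looks like this (a sketch only: the WALLET/EMAIL/ADDRESS/STORAGE env vars are omitted, and the paths, port, and node name are placeholders, not your real values):

# sketch only; env vars omitted, paths/port/name are placeholders
docker run -d --restart unless-stopped \
    -p 28967:28967/tcp -p 28967:28967/udp \
    --mount type=bind,source="/mnt/disk1/identity",destination=/app/identity \
    --mount type=bind,source="/mnt/disk1/storj",destination=/app/config \
    --name node1 storjlabs/storagenode:latest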


Perfect, thanks.
For some reason I missed it; I had the name there.
Just my brain failing.

When you have 10+ nodes, is there a tool to show all their statuses on a single dashboard?
Flipping through 10+ tabs every once in a while will be daunting.

I'd rather see a list of them with stats.

There is a multinode dashboard that you can deploy. You can search this forum for that, but I found it complicated and I'm not sure if it's very stable, so I haven't really looked into it properly.
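If I remember the guide correctly, it deploys as just another container. A rough sketch from memory; the image name, port, and paths should all be double-checked against the official docs:

# rough sketch, assuming the storjlabs/multinode image and its default port 15002
docker run -d --restart unless-stopped \
    --user $(id -u):$(id -g) \
    -p 127.0.0.1:15002:15002 \
    --mount type=bind,source="/mnt/disk1/multinode-identity",destination=/app/identity \
    --mount type=bind,source="/mnt/disk1/multinode-config",destination=/app/config \
    --name multinode storjlabs/multinode:latest

Each node is then added through its web UI using an API key issued by that node.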

Also, don’t think you’re going to be deploying a lot of nodes in a hurry. The nodes take time to fill (what size are your HDDs and how big will each node be?), and all the nodes behind the same IP address share the ingress between themselves.

What most of us do is deploy one node, once that is 3/4 full we start another one (so that it’s vetted by the time the previous actually fills up), when that second one is 3/4 full deploy a third one and so on…

It will take time for your nodes to start generating any significant traffic or income. Probably over a year. So you’ll need to be patient.


Hello @suthekev ,
Welcome to the forum!

Hey! Thanks, appreciate it.

Still having issues.
I made a new identity off a new token, but I'm still running into issues after about 200GB is downloaded.

Then it goes into a restart loop with the following error:

2024-06-08T07:14:50Z ERROR failure during run {"Process": "storagenode", "error": "Error opening database on storagenode: database: satellites opening file "config/storage/satellites.db" failed: unable to open database file: no such file or directory\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:364\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:341\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:316\n\tstorj.io/storj/storagenode/storagenodedb.OpenExisting:281\n\tmain.cmdRun:65\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:393\n\tstorj.io/common/process.cleanup.func1:411\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:267", "errorVerbose": "Error opening database on storagenode: database: satellites opening file "config/storage/satellites.db" failed: unable to open database file: no such file or directory\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:364\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:341\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:316\n\tstorj.io/storj/storagenode/storagenodedb.OpenExisting:281\n\tmain.cmdRun:65\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:393\n\tstorj.io/common/process.cleanup.func1:411\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:267\n\tmain.cmdRun:67\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:393\n\tstorj.io/common/process.cleanup.func1:411\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:34\n\truntime.main:267"}
Error: Error opening database on storagenode: database: satellites opening file "config/storage/satellites.db" failed: unable to open database file: no such file or directory
storj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:364
storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:341
storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:316
storj.io/storj/storagenode/storagenodedb.OpenExisting:281
main.cmdRun:65
main.newRunCmd.func1:33
storj.io/common/process.cleanup.func1.4:393
storj.io/common/process.cleanup.func1:411
github.com/spf13/cobra.(*Command).execute:983
github.com/spf13/cobra.(*Command).ExecuteC:1115
github.com/spf13/cobra.(*Command).Execute:1039
storj.io/common/process.ExecWithCustomOptions:112
main.main:34
runtime.main:267
2024-06-08 07:14:50,267 INFO exited: storagenode (exit status 1; not expected)
2024-06-08 07:14:51,270 INFO gave up: storagenode entered FATAL state, too many start retries too quickly
2024-06-08 07:14:52,272 WARN received SIGQUIT indicating exit request
2024-06-08 07:14:52,272 INFO waiting for processes-exit-eventlistener, storagenode-updater to die
2024-06-08T07:14:52Z INFO Got a signal from the OS: "terminated" {"Process": "storagenode-updater"}
2024-06-08 07:14:52,278 INFO stopped: storagenode-updater (exit status 0)
2024-06-08 07:14:53,279 INFO stopped: processes-exit-eventliste

Can you tell us a bit about your setup?
What hardware? What drives? How are they connected? What O/S? Did you do anything to the machine during those 2 hours (restarts or reboots, for example)?

EPYC 7702P processor.
Using HBAs in IT mode to JBOD.
NetApp storage array.

SAS 14TB drives (I've tried multiple drives).
Formatted as ext4.
Mounted via fstab.
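For reference, the fstab entry is along these lines (the UUID here is a placeholder):

# /etc/fstab; find the real UUID with: sudo blkid
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/MySATA_01  ext4  defaults  0  2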

Using Ubuntu Desktop.

I did absolutely nothing during the 4 hours (this time) it took to reach 200GB other than refreshing the web dashboard. It ran uninterrupted. No reboots.

These drives haven’t had any issues doing other projects for the last 2 years.

This is really odd. It sounds like the Docker instance is losing access to the database files.
I'm out of other ideas of things to try; can you fsck the offending disk, perhaps?


Checking the file system and seeing whether the disk is full are the first steps, in my opinion.


I can still create files and copy files to the drive. The drive shows 2TB available.

I can also navigate to the Storj directory and physically see the files that "don't exist".

Not sure what’s going on.

Did you run a fsck?

Not yet, but having the same issues on multiple hard drives makes me think it's not that.

I just noticed further up the logs it shows:
2024-06-08T21:10:39Z INFO Configuration loaded {"Process": "storagenode-updater", "Location": "/app/config/config.yaml"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "storage.allocated-disk-space"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "console.address"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "server.address"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "contact.external-address"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "healthcheck.details"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "storage.allocated-bandwidth"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "operator.wallet"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "operator.email"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "server.private-address"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "operator.wallet-features"}
2024-06-08T21:10:39Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "healthcheck.enabled"}
2024-06-08T21:10:39Z INFO Anonymized tracing enabled {"Process": "storagenode-updater"}
2024-06-08T21:10:39Z INFO Running on version {"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.104.5"}
2024-06-08T21:10:39Z INFO Downloading versions. {"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2024-06-08T21:10:39Z INFO Configuration loaded {"Process": "storagenode", "Location": "/app/config/config.yaml"}

Maybe that's related? They're INFOs, not errors, but it seems concerning.

No, those are fine, don't worry. The storagenode-updater process also reads config.yaml at startup and sees a lot of parameters it doesn't understand.

I really am confused about what could be causing your issues. I agree that the same issue on multiple drives makes it less likely to be a drive error but… you never know! :wink:


Trying again on a SATA drive (instead of my SAS drives) in a different NetApp box.

Hoping maybe the third time's a charm.
Like every other time, it starts up OK at first.

Dashboard seems fine. QUIC OK. Etc.

Guess we'll see what happens in 2-6 hours when it reaches 200GB.

You need to run fsck for that drive, then check permissions. Especially if you used --user $(id -u):$(id -g) in your docker run command, the files should be owned by $(id -u):$(id -g).
You may check:
ls -l /mnt/storj/storagenode/storage
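Something like this, as a sketch (stop the node first; the container name, device, and paths are placeholders for your actual ones):

docker stop -t 300 01                      # "01" = your container name
sudo umount /mnt/storj/storagenode         # fsck must run on an unmounted filesystem
sudo fsck -f /dev/sdX1                     # /dev/sdX1 = the partition backing that mount
sudo mount /mnt/storj/storagenode          # remount (assumes the entry is in fstab)
sudo chown -R $(id -u):$(id -g) /mnt/storj/storagenode   # only if ls showed wrong ownership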

In that case it's a hardware issue. And you need to run fsck to make sure that the filesystem is clean (likely it is not). I hope that you dropped the idea of using mergerfs…

Tried on a SATA HDD. 16TB drive, 2TB node.
Lots of free space.
Crashed around 200GB… again.

This is now 3 different hard drives, all with identical results.

I did catch this in my logs:
2024-06-09T06:16:52Z ERROR failure during run {"Process": "storagenode", "error": "piecestore monitor: error verifying writability of storage directory: open config/storage/write-test35559395: no space left on device", "errorVerbose": "piecestore monitor: error verifying writability of storage directory: open config/storage/write-test35559395: no space left on device\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2.1:184\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func2:167\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:78"}
Error: piecestore monitor: error verifying writability of storage directory: open config/storage/write-test35559395: no space left on device

It gave this error prior to the restart loops that then say no such file or directory.
It's like it just entirely loses the passed-through mount point.

Is there a way to just run this natively on Ubuntu in a terminal rather than in a container?

Perhaps this device has issues?

Could you please check your --mount option in the docker run command?
By the way, how does this command look now?

It’s weird that it works initially and consistently fails after 200GB, though. On three different drives.
If it were a mount or permissions issue, surely it would fail from the very start?
I’m quite stumped…

Or there is a mistake in the --mount option (or -v was used), and it creates a Docker volume (which is deleted together with the container).
Another alternative is to use the mountpoint instead of the disk itself; in the end it's just a folder.
You may also have tried to use a disk path (i.e. /dev/sda1) instead of a mountpoint; there are a lot of ways to fail there.
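One quick way to see what the container actually got (a sketch; "01" is the container name used in the commands below):

docker inspect 01 --format '{{ json .Mounts }}'   # a correct bind shows "Type": "bind" with your host path as Source
docker volume ls                                  # an accidental named/anonymous volume would show up here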

This is my setup command (only ran once):

docker run --rm -e SETUP="true" \
    --user $(id -u):$(id -g) \
    --mount type=bind,source="/mnt/MySATA_01/ID01",destination=/app/identity \
    --mount type=bind,source="/mnt/MySATA_01/storj",destination=/app/config \
    --name 01 storjlabs/storagenode:latest

This is my run command:

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28901:28967/tcp \
    -p 28901:28967/udp \
    -p 14001:14002 \
    -e WALLET="removedwalletinfo" \
    -e EMAIL="removed.Email.ca" \
    -e ADDRESS="removed_info.ddns.net:28901" \
    -e STORAGE="2TB" \
    --user $(id -u):$(id -g) \
    --mount type=bind,source="/mnt/MySATA_01/ID01",destination=/app/identity \
    --mount type=bind,source="/mnt/MySATA_01/storj",destination=/app/config \
    --name 01 storjlabs/storagenode:latest \
    --operator.wallet-features=zksync
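While it's running, something like this would confirm whether the bind-mount source is still backed by the disk rather than by a folder on the root filesystem (paths taken from the commands above):

findmnt /mnt/MySATA_01                      # should show the expected /dev/sd* device and ext4
df -h /mnt/MySATA_01                        # free space of the filesystem actually backing the path
sudo du -sh /mnt/MySATA_01/storj/storage    # how much the node has really written there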