Additional HDD nodes are shown offline in the dashboard

Hello, Operator.

I have added a second node on another HDD in a computer that is already running a node, but the dashboard shows the new node as offline and I cannot work out why.

We have confirmed that port 28968 is accessible from the outside.

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28968:28967 \
    -p 100.100.0.0:14003:14002 \
    -e WALLET="0x000000000000000000000" \
    -e EMAIL="example@example.com" \
    -e ADDRESS="example.com: 28968" \
    -e STORAGE="1.5TB" \
    -v "/storj/identity/storagenode2":/app/identity \
    -v "/storj/file_storage":/app/config \
    --name storagenode2 storjlabs/storagenode:latest

Last 20 lines of the log:

2020-08-28T12:27:35.228Z INFO Configuration loaded {"Location": "/app/config/config.yaml"}
2020-08-28T12:27:35.315Z INFO Operator email {"Address": "example@example.com"}
2020-08-28T12:27:35.315Z INFO Operator wallet {"Address": "0x000000000000000000000"}
2020-08-28T12:27:36.539Z INFO Telemetry enabled
2020-08-28T12:27:36.546Z INFO db.migration Database Version {"version": 43}
2020-08-28T12:27:37.534Z INFO preflight:localtime start checking local system clock with trusted satellites' system clock.
2020-08-28T12:27:38.391Z INFO preflight:localtime local system clock is in sync with trusted satellites' system clock.
2020-08-28T12:27:38.391Z INFO bandwidth Performing bandwidth usage rollups
2020-08-28T12:27:38.392Z INFO Node 10000000000000000 started
2020-08-28T12:27:38.392Z INFO Public server started on 0.0.0.0:28967
2020-08-28T12:27:38.392Z INFO Private server started on 127.0.0.1:7778
2020-08-28T12:27:38.392Z INFO trust Scheduling next refresh {"after": "4h3m35.08540843s"}

I see a space between the : and 28968 in your run command. Did you run it with this space, or was that just from when you removed your address/ip?

Thanks for the comment.

The space was introduced when I edited the command for posting; it is not in the command I actually ran.

Okay. You should verify that your new identity was signed properly.

https://documentation.storj.io/dependencies/identity#confirm-the-identity
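For reference, the check on that page amounts to counting the certificate blocks in two files. Here is a minimal sketch, assuming the identity lives in /storj/identity/storagenode2 (the path from the run command above) and that the expected counts are 2 for ca.cert and 3 for identity.cert, as I recall the linked page stating; please verify both against the page itself. `count_certs` and `IDENTITY_DIR` are just names made up for this example.

```shell
# Count the PEM "BEGIN" markers in a certificate file.
count_certs() {
    grep -c BEGIN "$1"
}

# Assumed identity directory; override with IDENTITY_DIR=... if yours differs.
identity_dir="${IDENTITY_DIR:-/storj/identity/storagenode2}"

for f in ca.cert identity.cert; do
    if [ -f "$identity_dir/$f" ]; then
        printf '%s: %s\n' "$f" "$(count_certs "$identity_dir/$f")"
    else
        printf '%s: not found in %s\n' "$f" "$identity_dir"
    fi
done
```

If a file is reported as not found, the path is probably wrong rather than the identity itself, so it is worth double-checking the directory before doing anything else.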

Thank you.
I confirmed that the identity was signed properly.

I checked the logs for the first time in several hours and found an error. What does this mean?

2020-08-28T12:27:38.392Z	INFO	trust	Scheduling next refresh	{"after": "4h3m35.08540843s"}
2020-08-28T13:27:38.392Z	INFO	bandwidth	Performing bandwidth usage rollups
2020-08-28T13:27:38.445Z	ERROR	contact:service	ping satellite failed 	{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 1, "error": "ping satellite error: rpccompat: dial tcp: lookup asia-east-1.tardigrade.io on 100.100.0.0: no such host", "errorVerbose": "ping satellite error: rpccompat: dial tcp: lookup asia-east-1.tardigrade.io on 100.100.0.0: no such host\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-08-28T14:27:38.392Z	INFO	bandwidth	Performing bandwidth usage rollups
2020-08-28T14:27:38.446Z	ERROR	contact:service	ping satellite failed 	{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "attempts": 1, "error": "ping satellite error: rpccompat: dial tcp: lookup asia-east-1.tardigrade.io on 100.100.0.0: no such host", "errorVerbose": "ping satellite error: rpccompat: dial tcp: lookup asia-east-1.tardigrade.io on 100.100.0.0: no such host\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:152\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}


This suggests that the container cannot reach the outside internet. It is trying to resolve the address of the satellite, but the DNS query fails. Have you tried restarting your system?
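To see at a glance which hostnames fail to resolve and which DNS server the container is asking, the "no such host" errors can be pulled out of the log. This is only a sketch: `failed_lookups` is a helper name invented here, and the pattern is based solely on the error text quoted in this thread.

```shell
# Read storagenode log text on stdin and print each distinct failed DNS lookup,
# in the form "lookup <hostname> on <dns-server>: no such host".
failed_lookups() {
    grep -oE 'lookup [^ ]+ on [^ ]+: no such host' | sort -u
}
```

It could be used as `docker logs storagenode2 2>&1 | failed_lookups`, taking the container name from the run command above. If the DNS server shown is not one you expect, comparing it with the container's /etc/resolv.conf may help narrow things down.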

Rebooting the system did not solve the problem.
Another node running on the same machine is working fine, so could the cause be something else?

It turns out the identity path I checked was incorrect, and the check did not return the expected numbers.
I will get the identity again.

That probably just means it was not signed. You don’t need to regenerate it, you just need to do the signing part of the instructions.

STOP your node immediately and update your command as per the documentation:

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967 \
    -p 127.0.0.1:14002:14002 \
    -e WALLET="0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
    -e EMAIL="user@example.com" \
    -e ADDRESS="domain.ddns.net:28967" \
    -e STORAGE="2TB" \
    --mount type=bind,source="<identity-dir>",destination=/app/identity \
    --mount type=bind,source="<storage-dir>",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest

In my environment, --mount is not recognized and not available, so I use the previous -v to mount it instead.

After replacing the identity correctly, the storagenode2 container is now repeatedly restarting.

2020-08-28T21:36:20.558Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "context canceled"}
2020-08-28T21:36:20.752Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "context canceled"}
2020-08-28T21:36:20.826Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2020-08-28T21:36:21.144Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "context canceled"}
2020-08-28T21:36:21.218Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "error": "context canceled"}
2020-08-28T21:36:21.241Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "context canceled"}
2020-08-28T21:36:21.241Z	FATAL	Failed preflight check.	{"error": "system clock is out of sync: system clock is out of sync with all trusted satellites", "errorVerbose": "system clock is out of sync: system clock is out of sync with all trusted satellites\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check:96\n\tstorj.io/storj/storagenode.(*Peer).Run:712\n\tmain.cmdRun:204\n\tstorj.io/private/process.cleanup.func1.4:353\n\tstorj.io/private/process.cleanup.func1:371\n\tgithub.com/spf13/cobra.(*Command).execute:840\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:945\n\tgithub.com/spf13/cobra.(*Command).Execute:885\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.ExecCustomDebug:70\n\tmain.main:330\n\truntime.main:203"}
2020-08-29T01:14:49.019Z	INFO	Configuration loaded	{"Location": "/app/config/config.yaml"}
2020-08-29T01:14:49.039Z	INFO	Operator email	{"Address": "example@example.com"}
2020-08-29T01:14:49.039Z	INFO	Operator wallet	{"Address": "0x000000000000000001"}
2020-08-29T01:14:50.762Z	INFO	Telemetry enabled
2020-08-29T01:14:50.766Z	INFO	db.migration	Database Version	{"version": 43}
2020-08-29T01:14:51.198Z	INFO	preflight:localtime	start checking local system clock with trusted satellites' system clock.
2020-08-29T01:14:51.247Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "error": "rpccompat: dial tcp: lookup europe-west-1.tardigrade.io on 192.168.0.1:53: no such host", "errorVerbose": "rpccompat: dial tcp: lookup europe-west-1.tardigrade.io on 192.168.0.1:53: no such host\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-08-29T01:14:51.250Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "rpccompat: dial tcp: lookup saltlake.tardigrade.io on 192.168.0.1:53: no such host", "errorVerbose": "rpccompat: dial tcp: lookup saltlake.tardigrade.io on 192.168.0.1:53: no such host\n\tstorj.io/common/rpc.Dialer.dialTransport:211\n\tstorj.io/common/rpc.Dialer.dial:188\n\tstorj.io/common/rpc.Dialer.DialNodeURL:148\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).getSatelliteTime:110\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check.func1:67\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-08-29T01:14:51.377Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "error": "context canceled"}
2020-08-29T01:14:51.651Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "context canceled"}
2020-08-29T01:14:52.070Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "error": "context canceled"}
2020-08-29T01:14:52.645Z	ERROR	preflight:localtime	unable to get satellite system time	{"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "context canceled"}
2020-08-29T01:14:52.645Z	FATAL	Failed preflight check.	{"error": "system clock is out of sync: system clock is out of sync with all trusted satellites", "errorVerbose": "system clock is out of sync: system clock is out of sync with all trusted satellites\n\tstorj.io/storj/storagenode/preflight.(*LocalTime).Check:96\n\tstorj.io/storj/storagenode.(*Peer).Run:712\n\tmain.cmdRun:204\n\tstorj.io/private/process.cleanup.func1.4:353\n\tstorj.io/private/process.cleanup.func1:371\n\tgithub.com/spf13/cobra.(*Command).execute:840\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:945\n\tgithub.com/spf13/cobra.(*Command).Execute:885\n\tstorj.io/private/process.ExecWithCustomConfig:88\n\tstorj.io/private/process.ExecCustomDebug:70\n\tmain.main:330\n\truntime.main:203"}

Can you elaborate? Which operating system are you using? What is your Docker version?

CentOS 7
Docker version 1.13.1, API version 1.26 (minimum version 1.12)

You are risking your data by using -v instead of --mount. Your Docker version should be at least version 2. Is this Docker Community Edition?

I’ve updated the docker version and set up using --mount, but the container still keeps restarting.

Try this checklist

Please check your logs for the reason. I think you have used curly quotes somewhere instead of straight ones (you should use " instead of “ and ”), or a single dash (–) instead of double dashes (--).
Also, it’s possible that you have hit a problem with the firewall and Docker: https://forum.storj.io/tag/centos
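Curly quotes and en dashes are hard to spot by eye, so one option is to save the run command to a file and search it for anything that is not plain ASCII. A minimal sketch, assuming GNU or BusyBox grep; `find_non_ascii` is a name made up for this example, not a Storj or Docker tool.

```shell
# Print every line (with its line number) that contains a byte outside
# printable ASCII and whitespace, e.g. curly quotes or an en dash.
# Exit status is 0 if such a line was found, non-zero otherwise.
find_non_ascii() {
    LC_ALL=C grep -n '[^[:print:][:space:]]' "$1"
}
```

For example, `find_non_ascii run-node2.sh` (a hypothetical file holding the docker run command) would list the offending lines, and printing nothing means the file contains only plain ASCII.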

IMHO, if you added the second node as a second instance on the same OS, you should check that the new node does not use the same ports as the first instance, and also check that you have set up the correct port forwarding on the router for the new node.
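A quick way to double-check the port point is to list every `-p` mapping from both run commands and look for a host port that appears twice. This is a rough sketch under some assumptions: `duplicate_host_ports` is a helper invented here, it only understands the `hostPort:containerPort` and `ip:hostPort:containerPort` forms used in this thread, and it ignores the bind address, so the same port on two different IPs would still be reported.

```shell
# Given docker -p mappings as arguments, print any host port used more than once.
# The host port is the second-to-last colon-separated field of each mapping.
duplicate_host_ports() {
    printf '%s\n' "$@" | awk -F: 'NF >= 2 { print $(NF-1) }' | sort | uniq -d
}
```

For example, `duplicate_host_ports 28967:28967 127.0.0.1:14002:14002 28968:28967 100.100.0.0:14003:14002` covers the mappings of the two commands posted above; any port it prints is claimed by both nodes.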

Port forwarding is set up correctly, but the log still reports the error about the system clock being out of sync.
Another node on the same machine is working fine, so it’s probably a Docker issue.
I’ll give it some time.

I would suggest taking a look at the firewall issues with Docker on CentOS: