Docker container keeps restarting - Exec command error

Hi all,
I’ve just set up a Storj node on a CentOS 7 machine. Everything seems to go well, but when I check Docker, the container keeps restarting, roughly every 60 seconds.
So I checked the Docker logs and found this:

aug 11 10:53:43 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T10:53:43.672904148+02:00" level=info msg="ignoring event" container=10feed20305f7cb57bfb24332be25afef9059477f56d336

aug 11 10:53:46 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T10:53:46.945236442+02:00" level=error msg="Error setting up exec command in container storagenode: Container 10feed

aug 11 10:54:44 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T10:54:44.613516099+02:00" level=info msg="ignoring event" container=10feed20305f7cb57bfb24332be25afef9059477f56d336

Any tips on getting this fixed and up and running?

Cheers!

Hi @Borniet and welcome :slight_smile:

Could you post the command you use to get your docker container running? Also, is this all you have in the logs?

Thank you.

Hi Fadila, thanks for your reply!
This is the Storj command I use:

docker run -d --restart unless-stopped --stop-timeout 300 \
-p 28967:28967/tcp \
-p 28967:28967/udp \
-p 127.0.0.1:14002:14002 \
-e WALLET="0xxxxxxxxxxxxxxxxx" \
-e EMAIL=“borniet@mydomain.com” \
-e ADDRESS=“vpstorj.mydomain.com:28967" \
-e STORAGE=“1.8TB" \
--mount type=bind,source="/root/.local/share/storj/identity",destination=/app/identity \
--mount type=bind,source="/mnt/MyDisk/Storj",destination=/app/config \
--name storagenode2 storjlabs/storagenode:latest

As for the logs, they are full of these messages, with some of the errors above in between:

aug 11 22:22:18 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:22:18.628700163+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:23:19 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:23:19.425047009+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:24:20 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:24:20.334734506+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:25:21 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:25:21.238277958+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:26:22 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:26:22.035549303+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:27:22 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:27:22.861622483+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef
aug 11 22:28:23 vpstorj.mydomain.com dockerd[11014]: time="2021-08-11T22:28:23.780855342+02:00" level=info msg="ignoring event" container=d853a0402373868fa791575c2478f3668fc0b5bdbe568ef

Thanks!
Tried it with both " and ', but both give errors (path incorrect).

Hello @Borniet ,
Welcome to the forum!

Please give us the last 20 lines from the storagenode’s logs (not journalctl): How do I check my logs? - Node Operator

Thank you :wink:
Here’s the output, although I don’t think it is what you are looking for…

[root@vpstorj ~]# docker logs --tail 20 storagenode
      --log.caller                       if true, log function filename and line number
      --log.development                  if true, set logging to development mode
      --log.encoding string              configures log encoding. can either be 'console', 'json', or 'pretty'.
      --log.level Level                  the minimum log level to log (default info)
      --log.output string                can be stdout, stderr, or a filename (default "stderr")
      --log.stack                        if true, log stack traces
      --metrics.addr string              address(es) to send telemetry to (comma-separated) (default "collectora.storj.io:9000")
      --metrics.app string               application name for telemetry identification (default "storagenode")
      --metrics.app-suffix string        application suffix (default "-release")
      --metrics.instance-prefix string   instance id prefix
      --metrics.interval duration        how frequently to send up telemetry (default 1m0s)
      --tracing.agent-addr string        address for jaeger agent (default "agent.tracing.datasci.storj.io:5775")
      --tracing.app string               application name for tracing identification (default "storagenode")
      --tracing.app-suffix string        application suffix (default "-release")
      --tracing.buffer-size int          buffer size for collector batch packet size
      --tracing.enabled                  whether tracing collector is enabled
      --tracing.interval duration        how frequently to flush traces to tracing agent (default 0s)
      --tracing.queue-size int           buffer size for collector queue size
      --tracing.sample float             how frequent to sample traces

[root@vpstorj ~]#

@CutieePie It seems he has two nodes. Let’s fix the first one; it doesn’t work right now because of the use of forbidden characters in the docker run command.

@Borniet Please stop and remove the existing container:

docker stop -t 300 storagenode
docker rm storagenode

Open your docker run command in a plain text editor (nano in bash, for example) and replace all curly quotes “ ” with straight ones ", then run it.
Please do not use any word processor to compose your docker run command, including Notes; they replace straight quotes with curly ones, double dashes with long dashes, and so on. None of these fancy symbols are valid in the terminal. Use only plain text editors, preferably in the terminal: nano, vim, or vi (more complicated).
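If it helps, here is a quick way to spot and fix such characters from the shell. This is only a sketch: the file name `run-node.sh` is a hypothetical example for wherever you saved the command, and `grep -P` assumes GNU grep (which CentOS 7 ships).

```shell
# Hypothetical file holding the docker run command; adjust the name to yours.
FILE=run-node.sh

# Show any lines containing non-ASCII bytes (curly quotes, long dashes, ...):
grep -nP '[^\x00-\x7F]' "$FILE"

# Replace curly double quotes with straight ones, in place:
sed -i 's/“/"/g; s/”/"/g' "$FILE"
```

After the `sed` pass, the `grep` should print nothing; then the command can be run safely.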

Ok, thanks!!
One step further… It still restarts constantly, but now the logs show something useful. (I found out in the meantime that the previous output WAS indeed from the logs; it was just mangled by the quotes issue.)


[root@vpstorj ~]# docker logs --tail 20 storagenode

Error: piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory

2021-08-13T06:11:22.142Z INFO Configuration loaded {"Location": "/app/config/config.yaml"}

2021-08-13T06:11:22.157Z INFO Operator email {"Address": "bjorn@beheydt.be"}

2021-08-13T06:11:22.158Z INFO Operator wallet {"Address": "0x5494ac56086076B6EEcD25b0cc744D9F103c0f35"}

2021-08-13T06:11:22.654Z INFO Telemetry enabled {"instance ID": "12M7DbKjwUme5sn8gfCQusGDeFvmDbcergnopTFzADuKAFeeeAo"}

2021-08-13T06:11:22.698Z INFO db.migration Database Version {"version": 53}

2021-08-13T06:11:23.388Z INFO preflight:localtime start checking local system clock with trusted satellites' system clock.

2021-08-13T06:11:24.174Z INFO preflight:localtime local system clock is in sync with trusted satellites' system clock.

2021-08-13T06:11:24.175Z INFO bandwidth Performing bandwidth usage rollups

2021-08-13T06:11:24.176Z INFO Node 12M7DbKjwUme5sn8gfCQusGDeFvmDbcergnopTFzADuKAFeeeAo started

2021-08-13T06:11:24.176Z INFO Public server started on [::]:28967

2021-08-13T06:11:24.176Z INFO Private server started on 127.0.0.1:7778

2021-08-13T06:11:24.177Z INFO trust Scheduling next refresh {"after": "6h37m8.962618175s"}

2021-08-13T06:11:24.179Z ERROR services unexpected shutdown of a runner {"name": "piecestore:monitor", "error": "piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory", "errorVerbose": "piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1.1:131\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1:128\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

2021-08-13T06:11:24.179Z ERROR nodestats:cache Get pricing-model/join date failed {"error": "context canceled"}

2021-08-13T06:11:24.180Z ERROR bandwidth Could not rollup bandwidth usage {"error": "context canceled"}

2021-08-13T06:11:24.181Z ERROR gracefulexit:chore error retrieving satellites. {"error": "satellitesdb: context canceled", "errorVerbose": "satellitesdb: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*satellitesDB).ListGracefulExits.func1:152\n\tstorj.io/storj/storagenode/storagenodedb.(*satellitesDB).ListGracefulExits:164\n\tstorj.io/storj/storagenode/gracefulexit.(*service).ListPendingExits:89\n\tstorj.io/storj/storagenode/gracefulexit.(*Chore).Run.func1:53\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/storj/storagenode/gracefulexit.(*Chore).Run:50\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:40\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

2021-08-13T06:11:24.181Z ERROR gracefulexit:blobscleaner couldn't receive satellite's GE status {"error": "context canceled"}

2021-08-13T06:11:24.182Z ERROR collector error during collecting pieces: {"error": "context canceled"}

Error: piecestore monitor: error verifying location and/or readability of storage directory: open config/storage/storage-dir-verification: no such file or directory

means that you did not setup the storagenode: Storage Node - Storj Docs

I did… But I tried it again:

[root@vpstorj ~]# docker run --rm -e SETUP="true" \
>     --mount type=bind,source="/root/.local/share/storj/identity",destination=/app/identity \
>     --mount type=bind,source="/mnt/MyDisk/Storj",destination=/app/config \
>     --name storagenode storjlabs/storagenode:latest
2021-08-13T21:27:09.698Z	INFO	Configuration loaded	{"Location": "/app/config/config.yaml"}
Error: storagenode configuration already exists (/app/config)
[root@vpstorj ~]# 

Maybe you did, but then removed the data?
If so, and if your node was online, it will now be disqualified; you would need to create a new identity, sign it with a new authorization token, and start with clean storage.

If you did not remove the data and the identity is the same, then you can remove config.yaml from the data location and repeat the setup step.
You should run the setup step only once!
If the verification file is missing, it should concern you, because it is used to protect the node from disqualification due to a temporary misconfiguration.

Thanks!!! Deleting the config.yaml file and repeating the setup step did the job :wink:

Bjorn

1 Like