Need help with relaunched node: tls peer certificate not signed

gesus13 · April 9, 2023, 10:09pm

hey there. my linux operating system got nuked. i saved the identity folder, and the 11.26 TB of data are safe on a second drive.

did a fresh install now and docker is back up and running.
status: offline
quic: misconfigured
bandwidth usage, average disk storage, total disk space and payout details show correctly as before!
→ although networking was not touched and everything runs under the same ip / subnet / firewall again.

if i run docker logs storagenode, i get the output

2023-04-09T21:59:52.495Z        ERROR   contact:service ping satellite failed   {"Process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 10, "error": "ping satellite: failed to ping storage node, your node indicated error code: 0, rpc: tcp connector failed: rpc: tls peer certificate verification: not signed by any CA in the whitelist: CA cert", "errorVerbose": "ping satellite: failed to ping storage node, your node indicated error code: 0, rpc: tcp connector failed: rpc: tls peer certificate verification: not signed by any CA in the whitelist: CA cert\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:149\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:102\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

help? node ID: 127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh
it was running for about 500 days already and i don't want to lose the data.

i am not a frequent linux user, just to set expectations. any help appreciated!

Stob · April 9, 2023, 10:45pm

Hi @gesus13
Also not a linux user but this error usually means the identity files are not all present. There should be 6 files in the folder.
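
A quick way to check this from the shell, as a minimal sketch. The helper name `count_identity_files` and the example path are assumptions for illustration. The certificate counts in the comment (2 in ca.cert, 3 in identity.cert) are what the Storj identity instructions describe for a signed identity, to the best of my knowledge, so treat them as an assumption too.

```shell
#!/bin/sh
# Hypothetical helper: count the regular files directly inside an identity folder.
count_identity_files() {
    find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Example path (an assumption; point this at your own identity folder).
IDENTITY_DIR="${IDENTITY_DIR:-/home/storj/identity/storagenode}"

if [ -d "$IDENTITY_DIR" ]; then
    # A complete, signed identity should give 6 here.
    echo "files found: $(count_identity_files "$IDENTITY_DIR")"
    # Certificates per chain file (assumed: 2 for ca.cert, 3 for identity.cert).
    grep -c BEGIN "$IDENTITY_DIR/ca.cert" "$IDENTITY_DIR/identity.cert"
fi
```

If the file count is 4, the two timestamped certificate files are probably the ones missing.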

gesus13 · April 9, 2023, 11:28pm

thanks for your reply stob. from earlier, i had

ca.cert
ca.key
identity.cert
identity.key

but i found two extra files in my backed-up identity folder. now i end up with the following message:

2023-04-09T23:23:35.208Z        ERROR   services        unexpected shutdown of a runner {"Process": "storagenode", "name": "piecestore:monitor", "error": "piecestore monitor: error verifying location and/or readability of storage directory: node ID in file (127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) does not match running node's ID (12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME)", "errorVerbose": "piecestore monitor: error verifying location and/or readability of storage directory: node ID in file (127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) does not match running node's ID (12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME)\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1.1:139\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1:133\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-04-09T23:23:35.211Z        ERROR   contact:service ping satellite failed   {"Process": "storagenode", "Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "attempts": 1, "error": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup us2.storj.io: operation was canceled", "errorVerbose": "ping satellite: rpc: tcp connector failed: rpc: dial tcp: lookup us2.storj.io: operation was canceled\n\tstorj.io/common/rpc.HybridConnector.DialContext.func1:190"}

is there a way i can ‘align’ my node-id with the id in the cert?

Stob · April 9, 2023, 11:30pm

Your identity files look fine now. You need to look at folder/user permissions based on the first error and check DNS resolution on your node for the second error.

gesus13 · April 9, 2023, 11:32pm

Thanks Stob. I am afraid that it's not a filesystem error but, as the message suggests and as i interpret it, an identity mismatch encryption error… probably:

node ID in file (127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) does not match running node's ID (12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME)

i wonder how i can teach my node to use the identity that i have in my identity files…?

Stob · April 9, 2023, 11:35pm

I re-read the errors!

error verifying location and/or readability of storage directory: node ID in file (127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) does not match running node’s ID (12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME)

Did you create a new identity on the fresh docker install??? This should not be done. Stop the node ASAP, delete the new identity files you created, point docker to the correct identity folder (old identity).

gesus13 · April 9, 2023, 11:47pm

i did not create a new authorization token
i did not create a new identity
i preserved the old identity files (dated from 2020)
/mnt/storjdata is present, persistently mounted, and reflects a dedicated disk sdb1. i verified rw access.

i did:

Copy the 6 identity files to /home/storj/identity/storagenode
Start the storagenode container with the command below

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967/tcp \
    -p 28967:28967/udp \
    -p 14002:14002 \
    -e WALLET=" [redacted, unchanged] " \
    -e EMAIL=" [redacted, unchanged] " \
    -e ADDRESS=" [redacted, unchanged] " \
    -e STORAGE="39TB" \
    --mount type=bind,source="/home/storj/identity/storagenode",destination=/app/identity \
    --mount type=bind,source="/mnt/storjdata",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest

same error:

Error: piecestore monitor: error verifying location and/or readability of storage directory: node ID in file (127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) does not match running node's ID (12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME)

can i somehow match the ‘software node ID’ with the ‘identityfile node ID’?

i stopped the node again to avoid automatic deletions.

gesus13 · April 9, 2023, 11:53pm

hehe… fixed it…

renamed the config.yaml to _old_config.yaml

then ran the setup command again:

docker run --rm -e SETUP="true" \
    --user $(id -u):$(id -g) \
    --mount type=bind,source="/home/storj/identity/storagenode",destination=/app/identity \
    --mount type=bind,source="/mnt/storjdata",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest

then ran the container with the command above… now it's all nice and green again <3

thanks for your help Stob! you made me think in the right direction

Alexey · April 10, 2023, 4:35am

Actually, it looks more like you destroyed your old 127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh node. Your backed-up identity is 12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME, and this data does not belong to it.

So, if I’m right, your backed-up identity 12CHMvgAEve2QdqznZhrgFuD7VFkHB2xkgHPcJhd15uM5QTjGME will be disqualified for missing data.
On the other hand, if this node has never run before, it will survive, but it will likely slowly remove the foreign data (belonging to 127FNR2y3r1AYSvyBPeMP439bW6fZRHuNL5516Wr5GkPtkLaUmh) over the next few weeks.

This is why we recommend moving your identity to the disk with the data, so that you do not accidentally mix it up with another one later, as in your case.
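
Keeping the identity next to the data could look roughly like the sketch below. The helper name `copy_identity` and the target folder `/mnt/storjdata/identity` are assumptions based on the paths in this thread, not an official procedure; stop the node before copying.

```shell
#!/bin/sh
# Hypothetical helper: copy an identity folder to a new location and
# verify that the copy is identical (diff -r exits non-zero on any difference).
copy_identity() {
    src="$1"
    dst="$2"
    mkdir -p "$dst" && cp -a "$src/." "$dst/" && diff -r "$src" "$dst"
}

# Example (assumed paths, adjust to your setup):
#   copy_identity /home/storj/identity/storagenode /mnt/storjdata/identity
# Then change the identity mount in the regular docker run command to:
#   --mount type=bind,source="/mnt/storjdata/identity",destination=/app/identity
```

With both on the same disk, the identity and its data always travel together, so a reinstall cannot pair the data with a different identity by accident.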

snorkel · April 10, 2023, 6:46am

39TB? Why assign so much space to a single node? Make 1 node per HDD. It’s safer.

gesus13 · April 10, 2023, 12:21pm

hi alexey, i think i (temporarily) created a new ID because i failed to import the previous ID while running the setup command. through my fix i imported the previous ID again.

to avoid identity problems, i only have one identity and only one location to store it. thanks for the tip of storing my identity on the data partition - will do that when i fine-tune the new installation (config.yaml, docker without root, watchtower, monitoring…)

according to the dashboard and the logs, the data remained available and the node is operational again.

gesus13 · April 10, 2023, 12:24pm

hi snorkel

it's an old backup server with 60 tb disks in raid - therefore the 39TB are already protected against disk loss. let's hope that the other components remain up and running

Alexey · April 11, 2023, 4:57am

You must never run the setup command for an old node; you may destroy it if you use a wrong identity and/or path to the data. It's simple: never run the setup command if the node has worked before.
If you provide the correct paths to your identity and data in your regular docker run command, it will just start. If you messed something up, it will crash to protect you from disqualification.
Removing/renaming the config and forcibly running the setup command just forfeits all protection, and now you are on your own: if the paths were correct and you provided a path to the correct identity, then it will survive (however, in this case you didn’t need to run setup at all, it should have just worked); otherwise it will be disqualified.

Please make sure it’s the correct one: you may check your old logs, the NodeID must be the same.
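
For comparing IDs in logs, here is a minimal sketch that pulls both node IDs out of the "does not match" error quoted earlier in this thread. The helper name `mismatch_ids` is made up for illustration, and the pattern only covers that one message format.

```shell
#!/bin/sh
# Hypothetical helper: read log lines on stdin and, for every line containing
# the piecestore monitor mismatch error, print the two node IDs it mentions.
mismatch_ids() {
    sed -n 's/.*node ID in file (\([^)]*\)) does not match running node[^ ]* ID (\([^)]*\)).*/file=\1 running=\2/p'
}

# Example usage against the running container or a saved log file
# (the log file name is an assumption):
#   docker logs storagenode 2>&1 | mismatch_ids | sort -u
#   mismatch_ids < old-storagenode.log | sort -u
```

The `file=` value is the ID recorded with the stored data and the `running=` value is the ID of the identity the container was started with; for a healthy node the error does not appear at all.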