Swapping nodes - only specified storage node can settle order

Hello dear SNOs,

I believe I may have messed up my certificate files.
I swapped the nodes: the old, trusted node from the weak computer moved to the server with RAID, while the newborn node from the RAID server moved to the weak computer.
I copied the certificate files and copied the data, but something went wrong, and now I only get errors on both nodes.

My question would be: is there any way to know what the right combination of registered e-mail + certificate files is?

I mean, email1@email.com works with the certificate files ca.11111.cert, while email2@email.com works only with the certificate files ca.22222.cert? If I have mixed up the files, how should I know which files belong to which registered e-mail? :expressionless: Because it looks like that is where my problem is.

Thank you.

P.S. The strange thing is that both nodes were running for more than ±20 hours with no problems, and only later did they get stuck.

2020-01-08T19:07:14.599Z INFO orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 finished
2020-01-08T19:07:14.599Z ERROR orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 failed to settle orders {“error”: “order: failed to receive settlement response: only specified storage node can settle order; order: sending settlement agreements returned an error: EOF”, “errorVerbose”: “group:\n--- order: failed to receive settlement response: only specified storage node can settle order\n\tstorj.io/storj/storagenode/orders.(*Service).settle:304\n\tstorj.io/storj/storagenode/orders.(*Service).Settle:195\n\tstorj.io/storj/storagenode/orders.(*Service).sendOrders.func2:174\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57\n--- order: sending settlement agreements returned an error: EOF\n\tstorj.io/storj/storagenode/orders.(*Service).settle.func2:276\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}

You can examine the public keys and the signatures on the certs using the openssl verify command set.

Something like this may work:

openssl verify -CAfile ca.cert -no_check_time identity.cert

to check the consistency of the signatures on the identity and the CA.
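
As a rough sketch of what I mean (assuming the numbered backups like ca.11111.cert sit next to the active files; adjust the names to yours), you could also dump the public keys or fingerprints and compare them between the numbered backups and the files a node is currently using:

openssl x509 -in ca.cert -noout -pubkey
openssl x509 -in ca.11111.cert -noout -pubkey
openssl x509 -in ca.cert -noout -fingerprint -sha256

If two files print the same public key (or fingerprint), they belong to the same identity. Note that openssl x509 only reads the first certificate in a file that contains a chain, which is fine for this comparison.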

The old node's certificates would have an older creation date, which would help you match them to the correct node.

The certificates don’t include any date information. So, the only dates available for inspection are filesystem dates. Depending on the copy/move methodology and underlying filesystem attributes, it’s possible that the certificate file creation dates may be rather misleading.
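
If you do want to look at those filesystem dates anyway, something like this would show them (just keep in mind, as said above, that a copy or move can reset them):

stat ca.cert identity.cert

On many filesystems the “Birth” field is empty, and the Modify/Change times may only reflect the last copy, not when the identity was originally created.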

@anon27637763 thank you, but this command did not work out :expressionless:

nodex@nodex:~/cert$ openssl verify -CAfile ca.cert -no_check_time identity.cert

Command ‘openssl’ not found, did you mean:

command ‘openssl’ from deb openssl

Try: sudo apt install <deb name>

nodex@nodex:~/cert$ sudo apt install openssl
[sudo] password for nodex:
Reading package lists… Done
Building dependency tree
Reading state information… Done
openssl is already the newest version (1.1.1-1ubuntu2.1~18.04.5).
The following packages were automatically installed and are no longer required:
linux-headers-4.15.0-66 linux-headers-4.15.0-66-generic linux-headers-4.15.0-70 linux-headers-4.15.0-70-generic
linux-image-4.15.0-66-generic linux-image-4.15.0-70-generic linux-modules-4.15.0-66-generic linux-modules-4.15.0-70-generic
linux-modules-extra-4.15.0-66-generic linux-modules-extra-4.15.0-70-generic
Use ‘sudo apt autoremove’ to remove them.
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

Unsure what the problem might be…

If the package has been installed, the binary executable should be present; it's included in the package's file list.

Perhaps there’s something amiss in your particular installation.

You could try:

sudo which openssl

and

type openssl

as a start to ensure that the executable is somewhere in the system path.

Just to clarify, the cert files with numbers in them are the backups from before the certs were signed by Storj. You need to have the correct identity.cert, identity.key and ca.cert for your node to function.
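
One quick sanity check, if it helps (this is the same idea as the signed-identity verification step from the docs, run from the directory your cert files are in): count the certificates inside the files. As far as I remember, a signed ca.cert should contain 2 certificates and a signed identity.cert should contain 3:

grep -c BEGIN ca.cert
grep -c BEGIN identity.cert

This won't tell you which e-mail a file belongs to, but it will at least tell you whether you are looking at a signed set or at the pre-signing backups.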

I know this won’t help you much, but please be aware that as long as your node is using a wrong cert it is collecting data for that wrong identity. Meaning your correct node will later be missing that data as it was sent elsewhere. In short, don’t keep your nodes online with this situation.

@BrightSilence thank you. I know that I have to have 6 files, and all 6 files were moved. I've tried to replace them a few times, but it looks like they are not being accepted anymore. Could it be that even the right certificates no longer work with these nodes? I've replaced all the files, it worked for ±20-40 minutes, and then the same errors appeared again. The new node does not matter, but the old one… it would be very sad to lose it.

@anon27637763 it all looks OK.

nodex@nodex:~$ sudo which openssl
[sudo] password for nodex:
/usr/bin/openssl
nodex@nodex:~$ type openssl
openssl is /usr/bin/openssl
nodex@nodex:~$

If you are certain you have the right identity in place now, you're likely running into the problem that your node has unsent orders from both identities, because you ran the node with the wrong identity before. I'm not entirely sure whether this will resolve itself or whether it will stay stuck like this.

It's possible this could be fixed by emptying the unsent_order table once. You'll miss out on payouts for orders that weren't yet sent to the satellite, but if that revives your node, it may be worth it. Before doing something like that, though, someone from Storj should probably say whether it is a good idea, as it's a rather extreme solution.
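
If someone from Storj does confirm it, a rough sketch of what I mean would be something like the following, assuming the unsent orders live in the unsent_order table of orders.db in the storage directory (file and table names may differ between versions), with the node stopped and the database backed up first:

# stop the node and make a copy of the database before touching it
sqlite3 /path/to/storage/orders.db "DELETE FROM unsent_order;"
# then start the node again

But again, don't run this without a backup and a confirmation that it's safe.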

@BrightSilence I am totally confused by the identities now. It would be so good to have a way to get the node ID or registered e-mail out of the identity files.

In the beginning I thought I had made a mistake on both nodes. But now I think maybe I made the mistake only on the small (new) node, and that it affected the old node as well, if the small node was running with the same identity as the big/old node. At that moment I just did not realize this and started messing things up even more.

The new node does not matter at this point, so I'm now looking into the old one (moved to the big server). The situation looks like this:

Got new error:

2020-01-09T18:52:59.092Z WARN retain failed to delete piece {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “MMJ2QDIYUKSYR3NCOXYLA76XJ6PZUXTONURL67N63A3RABG6WYTQ”, “error”: “pieces error: file does not exist”, “errorVerbose”: “pieces error: file does not exist\n\tstorj.io/storj/storagenode/pieces.(*Store).MigrateV0ToV1:392\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:304\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces.func1:388\n\tstorj.io/storj/storagenode/storagenodedb.(*v0PieceInfoDB).WalkSatelliteV0Pieces:108\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:468\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:364\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:220\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-01-09T18:52:59.092Z DEBUG retain About to delete piece id {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “6L5FQUBOADMHV5H2IAP4EPWIWSEVJMVPZ35ZCLKJYUHVMZF43BVQ”, “Status”: “enabled”}
2020-01-09T18:52:59.092Z WARN retain failed to delete piece {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Piece ID”: “6L5FQUBOADMHV5H2IAP4EPWIWSEVJMVPZ35ZCLKJYUHVMZF43BVQ”, “error”: “pieces error: file does not exist”, “errorVerbose”: “pieces error: file does not exist\n\tstorj.io/storj/storagenode/pieces.(*Store).MigrateV0ToV1:392\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:304\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces.func1:388\n\tstorj.io/storj/storagenode/storagenodedb.(*v0PieceInfoDB).WalkSatelliteV0Pieces:108\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:468\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:364\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:220\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-01-09T18:52:59.092Z DEBUG retain Deleted pieces during retain {“num deleted”: 0, “Retain Status”: “enabled”}
2020-01-09T19:06:07.732Z DEBUG gracefulexit:chore checking pending exits
2020-01-09T19:06:07.732Z DEBUG gracefulexit:chore no satellites found
2020-01-09T19:06:08.332Z DEBUG version allowed minimum version from control server is: v0.28.0
2020-01-09T19:06:08.333Z INFO version running on version v0.28.4

Strange silence for ±45min.

2020-01-09T19:34:21.620Z DEBUG orders cleaning
2020-01-09T19:34:21.857Z DEBUG orders cleanup finished {“items deleted”: 698}
2020-01-09T19:45:26.184Z DEBUG gracefulexit:chore checking pending exits
2020-01-09T19:45:26.184Z DEBUG gracefulexit:chore no satellites found
2020-01-09T19:45:26.782Z DEBUG version allowed minimum version from control server is: v0.28.0
2020-01-09T19:45:26.782Z INFO version running on version v0.29.3
2020-01-09T20:00:26.184Z DEBUG gracefulexit:chore checking pending exits
2020-01-09T20:00:26.184Z DEBUG gracefulexit:chore no satellites found
2020-01-09T20:00:26.781Z DEBUG version allowed minimum version from control server is: v0.28.0
2020-01-09T20:00:26.781Z INFO version running on version v0.29.3
2020-01-09T20:15:26.184Z DEBUG gracefulexit:chore checking pending exits
2020-01-09T20:15:26.184Z DEBUG gracefulexit:chore no satellites found
2020-01-09T20:15:26.782Z DEBUG version allowed minimum version from control server is: v0.28.0
2020-01-09T20:15:26.782Z INFO version running on version v0.29.3

Looks like the garbage collector is trying to clean up pieces that aren't there. That's strange, but the upside is that those pieces don't need to be there anyway. The best you can do now is make very sure your node has the correct data and identity. It's possible some pieces were uploaded to the other node that should have gone to this one, so you may be missing some data. If it hasn't been running like this for too long, the impact may be survivable; if you're missing too many pieces, it may get your node disqualified. Not much you can do about it except wait and see, I think.

@BrightSilence I scrolled back through my e-mail and checked when I got the invitation for this node. I'm 95% sure I have the correct identity, but I have no idea how to prove or disprove it.

After ±1 hour of silence there was again some cleanup, and then errors again:

2020-01-09T20:15:26.782Z INFO version running on version v0.29.3
2020-01-09T20:30:26.184Z INFO bandwidth Performing bandwidth usage rollups
2020-01-09T20:30:26.184Z DEBUG gracefulexit:chore checking pending exits
2020-01-09T20:30:26.184Z DEBUG gracefulexit:chore no satellites found
2020-01-09T20:30:26.430Z DEBUG contact:endpoint pinged {“by”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “srcAddr”: “35.195.106.82:4067”}
2020-01-09T20:30:26.526Z DEBUG contact:endpoint pinged {“by”: “118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW”, “srcAddr”: “78.94.240.180:49716”}
2020-01-09T20:30:26.807Z DEBUG version allowed minimum version from control server is: v0.28.0
2020-01-09T20:30:26.807Z INFO version running on version v0.29.3
2020-01-09T20:30:26.939Z DEBUG contact:endpoint pinged {“by”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “srcAddr”: “104.198.145.213:37294”}
2020-01-09T20:30:27.740Z DEBUG contact:endpoint pinged {“by”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “srcAddr”: “35.229.177.202:3583”}
2020-01-09T20:31:06.100Z DEBUG orders cleaning
2020-01-09T20:31:06.173Z DEBUG orders cleanup finished {“items deleted”: 732}
2020-01-09T20:33:19.221Z DEBUG orders sending
2020-01-09T20:33:19.303Z INFO orders.12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S sending {“count”: 895}
2020-01-09T20:33:19.303Z INFO orders.118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW sending {“count”: 7640}
2020-01-09T20:33:19.303Z INFO orders.121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6 sending {“count”: 1757}
2020-01-09T20:33:19.303Z INFO orders.12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs sending {“count”: 1491}
2020-01-09T20:33:19.451Z ERROR orders.118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW rpc client error when receiveing new order settlements {“error”: “order: failed to receive settlement response: only specified storage node can settle order”, “errorVerbose”: "order: failed to receive settlement response: only specified storage node can settle order\n\tstorj.io/storj/storagenode/orders.(*Service).settle:329\n\tstorj.io/storj/storagenode/orders.(*Service).Settle:220\n\tstorj.io/storj/storagenode/orders.(*Service).sendOrders.func2:199\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-01-09T20:33:19.471Z ERROR orders.12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs rpc client error when receiveing new order settlements {“error”: “order: failed to receive settlement response: only specified storage node can settle order”, “errorVerbose”: "order: failed to receive settlement response: only specified storage node can settle order\n\tstorj.io/storj/storagenode/orders.(

You’re not seeing traffic anymore? Could you check the web dashboard and see if it mentions anything about disqualification? Perhaps also have a look at audit scores there.

Thank you!
Problem “solved” at both nodes :smiley:

Case closed.

damn… :wink:

That's what I feared… the damage was already done. Sorry to see that. I hope you'll take it as a lesson learned, try again and start a new node.

I'm an optimist!
I also like this project and really like the idea of decentralization.
So two newborn nodes were started yesterday :slight_smile:
