VERY slow graceful exit from eu-north and now disqualified

This is a subset of the message I just sent via a support ticket. It looks to me like I am being penalized for other node operators with bad or unreliable connectivity/nodes. I have replaced the graceful exit receipt IDs with xxx.

I have been working to decommission this node (12XHzxeALiNcJdkF7QPn4LDhJMMu12DwGQkggEoeJxVQ44TZiiL) for nearly 2 months. All of the other satellites finished without issue within a few days. Europe North, however, has been moving at less than a percent a week for the last few weeks. When I check the logs, the errors I see are TCP timeouts. There are no disk errors, and my internet connection has 1Gb/s of unmetered upstream available with only a few Mb/s being consumed.

What it looks like to me is that I am being penalized for the poor networks of some of the eu-north nodes. Below I have pasted a small portion of the output of the graceful exit status command and the last several lines from my storagenode log, along with some basic network diagnostic tests (ping and tracert) showing how bad the connections seem to be. I can upload the entire log if you would like to see it as well.

C:\Users\byte>"C:\Program Files\Storj\Storage Node\storagenode.exe" exit-status --identity-dir "C:\Users\byte\AppData\Roaming\Storj\Identity\storagenode" --log.output stderr
2022-09-04T09:08:45.721-0500 INFO Anonymized tracing enabled {"Process": "storagenode"}
2022-09-04T09:08:45.731-0500 INFO Identity loaded. {"Process": "storagenode", "Node ID": "12XHzxeALiNcJdkF7QPn4LDhJMMu12DwGQkggEoeJxVQ44TZiiL"}

Domain Name                        Node ID                                              Percent Complete  Successful  Completion Receipt
us2.storj.io:7777                  12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo   100.00%           Y           xxxxxxx
saltlake.tardigrade.io:7777        1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE   100.00%           Y           0a47304502203550728377436d3db07916aa3103fb74186f11b3218d9a1780330a8264260b50022100b46275f42cafe9a049051919394579b5469001204abc3463f7ddcf7e9b9adabd12207b2de9d72c2e935f1918c058caaf8ed00f0581639008707317ff1bd0000000001a20c87af2a718960f25d61d04e47f36e25d97a567ffc9c38b716f37d5f100000000220b08f982e2970610b6ffee7d
ap1.storj.io:7777                  121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6  100.00%           Y           xxx
us1.storj.io:7777                  12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S  100.00%           Y           xxxxx
eu1.storj.io:7777                  12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs  100.00%           Y           xxxx
europe-north-1.tardigrade.io:7777  12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB  36.78%            N           xxxx

2022-09-03T11:30:02.028-0500 ERROR piecetransfer failed to put piece {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "LVPBPYZULSE6XBRHN4DNAQ5PUON76J4PG76EKLYDCTVNCJBETJJA", "Storagenode ID": "127qK3Mk2xEtahbnmuH8csTYWeViw3XcXvxDdfhq6xqQDM6yzTN", "error": "ecclient: failed to dial (node:127qK3Mk2xEtahbnmuH8csTYWeViw3XcXvxDdfhq6xqQDM6yzTN): piecestore: rpc: tcp connector failed: rpc: dial tcp 72.24.148.254:28967: i/o timeout", "errorVerbose": "ecclient: failed to dial (node:127qK3Mk2xEtahbnmuH8csTYWeViw3XcXvxDdfhq6xqQDM6yzTN): piecestore: rpc: tcp connector failed: rpc: dial tcp 72.24.148.254:28967: i/o timeout\n\tstorj.io/uplink/private/ecclient.(*ecClient).PutPiece:219\n\tstorj.io/storj/storagenode/piecetransfer.(*service).TransferPiece:148\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run.func3:97\n\tstorj.io/common/sync2.(*Limiter).Go.func1:43"}
2022-09-03T11:30:06.878-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "IIN5W3NAA757UNCROII36DWHPQ54DTEQZKVKSULA3FTCNMLHI2GA", "Storagenode ID": "14yqn8PGw6iiAtz8oUDs4qPednvBsD1EvPo3ubyPG3BXfJiFR5"}
2022-09-03T11:30:08.486-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "AYFOBGQPRCD3AU4W5LBYVCVJJ3ONX2FPO5U32MNGX2QDNPBKSFJA", "Storagenode ID": "12ceUCe4dvgLDmAKvrLKDTSkPrTTDUoRsc5XSvS5a14uVxdfGwo"}
2022-09-03T11:30:19.839-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "4653BWOURY4YUUPMYYIWA753ENOFOTC7C527VSX7N34DECTZRVCQ", "Storagenode ID": "12AYesWYS5uJ2ss7xC9uWHRujMR3YzxTerWVA6SbYDXNkjSWUzH"}
2022-09-03T11:30:21.236-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "JA4T7HZE5JRL57BLD34G3UXTVDG5BIQHJIMIZDBXS3RL2M2VVJTA", "Storagenode ID": "1ewiCa3G9VXgbEkXaLkdgcYFSn4ii939LqKZ8FS2nwADezhKm7"}
2022-09-03T11:30:49.724-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "SGG3LR545RGS7I7DYDHZ5GGJBK35L7NTQXISGBU33T2X3VA3POXQ", "Storagenode ID": "12a7Un8pzALd8NBLffRrN626yFtPKXhqrUDJJqJTXapCpWdcaXU"}
2022-09-03T11:33:25.491-0500 ERROR piecetransfer failed to put piece {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "FET2YIS25UAKEKPWH2CIY2UYYTLF6CBYC5A5NBGSHCTX5POYGNTQ", "Storagenode ID": "1dt6N8f5rXSERZAy8g2wvdk2fSphAvrXZ773G2qrSmPbSLRW9t", "error": "ecclient: upload failed (node:1dt6N8f5rXSERZAy8g2wvdk2fSphAvrXZ773G2qrSmPbSLRW9t, address:93.51.20.175:28967): protocol: expected piece hash; context deadline exceeded; EOF", "errorVerbose": "ecclient: upload failed (node:1dt6N8f5rXSERZAy8g2wvdk2fSphAvrXZ773G2qrSmPbSLRW9t, address:93.51.20.175:28967): protocol: expected piece hash; context deadline exceeded; EOF\n\tstorj.io/uplink/private/ecclient.(*ecClient).PutPiece:242\n\tstorj.io/storj/storagenode/piecetransfer.(*service).TransferPiece:148\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run.func3:97\n\tstorj.io/common/sync2.(*Limiter).Go.func1:43"}
2022-09-03T11:33:26.357-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "ZH33KUU67QHJ54JOOCSBBXLTDKOHEDFD7IHWB6YNKH7J4VM3MSMA", "Storagenode ID": "12qcjBQd4sfxZRQtciYp66GyskzKFMVQCBe7wciDidvXT4w9szd"}
2022-09-03T11:33:31.735-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "FET2YIS25UAKEKPWH2CIY2UYYTLF6CBYC5A5NBGSHCTX5POYGNTQ", "Storagenode ID": "12VJtJkSyY2yPQvGKcuArYfxTvgURU9RS1QFL8WC21E7bJXT12w"}
2022-09-03T11:34:31.230-0500 INFO piecetransfer piece transferred to new storagenode {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "Piece ID": "LVPBPYZULSE6XBRHN4DNAQ5PUON76J4PG76EKLYDCTVNCJBETJJA", "Storagenode ID": "1ZEkHhz3REMX9Xq77bfekXiVkDdHNoV18nSuv9iYAWEXYjHy4f"}
2022-09-03T11:34:31.477-0500 ERROR gracefulexit:chore graceful exit failed. {"Satellite ID": "12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB", "reason": "OVERALL_FAILURE_PERCENTAGE_EXCEEDED"}

C:\Users\byte>ping 72.24.148.254

Pinging 72.24.148.254 with 32 bytes of data:
Request timed out.

Ping statistics for 72.24.148.254:
Packets: Sent = 1, Received = 0, Lost = 1 (100% loss),
Control-C
^C
C:\Users\byte>tracert -d 72.24.148.254

Tracing route to 72.24.148.254 over a maximum of 30 hops

1 <1 ms <1 ms <1 ms 172.16.253.1
2 * * * Request timed out.
3 10 ms 7 ms 7 ms 172.102.50.236
4 2 ms 2 ms 3 ms 74.40.3.17
5 12 ms 7 ms 8 ms 45.52.201.121
6 * * * Request timed out.
7 58 ms 42 ms 42 ms 4.69.219.210
8 42 ms 42 ms 43 ms 4.35.245.194
9 * * * Request timed out.
10 * * * Request timed out.
11 * * * Request timed out.
12 * * * Request timed out.
13 * * * Request timed out.

C:\Users\byte>ping 8.8.8.8

Pinging 8.8.8.8 with 32 bytes of data:
Reply from 8.8.8.8: bytes=32 time=2ms TTL=117
Reply from 8.8.8.8: bytes=32 time=3ms TTL=117
Reply from 8.8.8.8: bytes=32 time=2ms TTL=117
Reply from 8.8.8.8: bytes=32 time=2ms TTL=117

Ping statistics for 8.8.8.8:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 2ms, Maximum = 3ms, Average = 2ms

C:\Users\byte>tracert -d 95.51.20.175

Tracing route to 95.51.20.175 over a maximum of 30 hops

1 <1 ms <1 ms <1 ms 172.16.253.1
2 * * * Request timed out.
3 4 ms 4 ms 3 ms 172.102.50.208
4 2 ms 3 ms 2 ms 74.40.3.17
5 4 ms 4 ms 2 ms 45.52.201.125
6 * * * Request timed out.
7 3 ms 3 ms 3 ms 154.54.47.213
8 13 ms 14 ms 13 ms 154.54.3.214
9 24 ms 25 ms 25 ms 154.54.44.170
10 26 ms 25 ms 25 ms 154.54.46.178
11 24 ms 24 ms 24 ms 154.54.11.182
12 38 ms 38 ms 38 ms 193.251.128.15
13 109 ms 109 ms 109 ms 193.251.151.34
14 * * * Request timed out.
15 * * * Request timed out.
16 * * * Request timed out.
17 141 ms 141 ms 141 ms 80.50.159.46
18 * * * Request timed out.
19 * * * Request timed out.
20 * * * Request timed out.
21 * * * Request timed out.
22 ^C
C:\Users\byte>ping 95.51.20.175

Pinging 95.51.20.175 with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 95.51.20.175:
Packets: Sent = 3, Received = 0, Lost = 3 (100% loss),
Control-C
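
For anyone wanting to check how a full log compares with the short excerpt in this post, here is a minimal Python sketch that tallies piece transfer outcomes. The two message strings ("failed to put piece" and "piece transferred to new storagenode") are taken from the log lines quoted above. The function names `tally_transfers` and `failure_share` are hypothetical, and this is not an official Storj tool. It only counts log lines, so it may not match how the satellite itself counts failures.

```python
from collections import Counter

# Message fragments copied from the piecetransfer log lines in this post.
SUCCESS_MARKER = "piece transferred to new storagenode"
FAILURE_MARKER = "failed to put piece"


def tally_transfers(lines):
    """Count successful and failed piece transfers in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        # Skip unrelated entries such as the gracefulexit:chore summary line.
        if "piecetransfer" not in line:
            continue
        if FAILURE_MARKER in line:
            counts["failed"] += 1
        elif SUCCESS_MARKER in line:
            counts["transferred"] += 1
    return counts


def failure_share(counts):
    """Return failed / (failed + transferred), or 0.0 when nothing was counted."""
    total = counts["failed"] + counts["transferred"]
    return counts["failed"] / total if total else 0.0
```

To use it, open the storagenode log file, pass the file object to `tally_transfers`, and print the result of `failure_share` on the returned counts. Note that a piece that fails once and later succeeds on another node (as with piece FET2YIS2... above) is counted in both buckets.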

Your node has been disqualified for 126,287 failed piece transfers, which is more than 10% of total pieces (734,429), so this is not down to a single bad node.
Your node made more than 3 attempts to transfer each failed piece, to more than 3 different nodes. Your node had 58,799 failed piece transfers on Saltlake as well, but fortunately that exit finished successfully.
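
To put those numbers in proportion, here is a small Python sketch of the arithmetic. The piece counts and the 10% threshold come from the reply above. The assumption that the percentage is simply failed transfers divided by total pieces is mine, and the function names are hypothetical; the satellite's actual calculation may differ.

```python
# Threshold quoted in the support reply (more than 10% of total pieces).
THRESHOLD_PERCENT = 10.0


def failure_percentage(failed, total):
    """Failed transfers as a percentage of total pieces (assumed formula)."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * failed / total


def exceeds_threshold(failed, total, threshold=THRESHOLD_PERCENT):
    """True when the failure percentage is above the threshold."""
    return failure_percentage(failed, total) > threshold


# Figures from the support reply for the Europe North satellite.
pct = failure_percentage(126_287, 734_429)
print(f"{pct:.1f}% failed, over threshold: {exceeds_threshold(126_287, 734_429)}")
```

Under that assumption the failure rate works out to roughly 17.2%, well above the 10% limit, which matches the OVERALL_FAILURE_PERCENTAGE_EXCEEDED reason in the log.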

The disqualification is permanent and not reversible. I'm sorry about that.

Support has been looking at my logs for weeks now with no information back. It really seems quite unfair that I have had this node running since before the service launched and I am DQ'd for what looks to be application coding, bad nodes, and/or international network hop issues, when only 154GB of what was almost 7TB has not been fully exited. I have not identified a single read error on my side and my network is fine. I have another node running that has 99% availability, which is what this node had until I triggered the graceful exit and EU North began dropping.

The disqualification is due to too many failed transfers (more than 10%) during graceful exit (GE), not to availability.
I asked the team to take another look at your case. I hope you still have this node and its data.
