Alexey
October 15, 2023, 1:33am
Disqualification during Graceful Exit happens because your node failed to transfer more than 10% of its pieces (each piece gets 5 transfer attempts to different nodes before it is considered failed).
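To make the rule concrete, here is a minimal sketch of it in Python. The two constants come from the numbers above; the function names and the way results are passed in are hypothetical, and the satellite's real accounting may differ in details.

```python
# Illustration only: these names are hypothetical, not the satellite's code.
MAX_ATTEMPTS_PER_PIECE = 5   # from the post: 5 attempts per piece
FAILED_LIMIT_PERCENT = 10    # from the post: more than 10% failed => disqualified


def piece_failed(attempt_results):
    """Return True if a piece used up all its attempts without success.

    attempt_results is a list of booleans, one per attempt, in order.
    """
    attempts = attempt_results[:MAX_ATTEMPTS_PER_PIECE]
    return len(attempts) == MAX_ATTEMPTS_PER_PIECE and not any(attempts)


def is_disqualified(failed_pieces, total_pieces):
    """Return True if strictly more than 10% of the pieces failed."""
    if total_pieces == 0:
        return False
    # Integer comparison avoids floating-point rounding at the boundary.
    return failed_pieces * 100 > total_pieces * FAILED_LIMIT_PERCENT
```

With these assumptions, a node that fails 10 pieces out of 100 is still fine, while 11 out of 100 crosses the limit.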
It’s not deployed yet, but it should be deployed soon:
QA finished. We tested what happens if a node starts a graceful exit while the old code is active and is currently transferring pieces the old way. With the deployment, such nodes will stop transferring pieces, and if they started graceful exit more than 30 days ago, it will finish right away.
It will be enabled with the next satellite deployment early next week.
Pac:
2023-10-11T23:32:47Z ERROR piecetransfer failed to put piece {"process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Piece ID": "3M5STYLAPU4QEIIZMHXSBICTOL2QRP42PYOAFZGYQTIBH5I3NCBA", "Storagenode ID": "12A2ntyfgmDMmGDvpxwCy2ZDWgxB38MYNedn6rGYkYrPJYHMDBD", "error": "ecclient: upload failed (node:12A2ntyfgmDMmGDvpxwCy2ZDWgxB38MYNedn6rGYkYrPJYHMDBD, address:175.158.56.27:28968): protocol: write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection; write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection; piecestore: piecestore close: write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection", "errorVerbose": "ecclient: upload failed (node:12A2ntyfgmDMmGDvpxwCy2ZDWgxB38MYNedn6rGYkYrPJYHMDBD, address:175.158.56.27:28968): protocol: write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection; write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection; piecestore: piecestore close: write tcp 172.17.0.5:56956->175.158.56.27:28968: use of closed network connection\n\tstorj.io/uplink/private/ecclient.(*ecClient).PutPiece:244\n\tstorj.io/storj/storagenode/piecetransfer.(*service).TransferPiece:148\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run.func3:100\n\tstorj.io/common/sync2.(*Limiter).Go.func1:49"}
2023-10-11T23:32:47Z ERROR gracefulexit:chore.1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE@saltlake.tardigrade.io:7777 failed to send notification about piece transfer. {"process": "storagenode", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "error": "EOF", "errorVerbose": "EOF\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run.func3:105\n\tstorj.io/common/sync2.(*Limiter).Go.func1:49"}
2023-10-11T23:32:47Z ERROR gracefulexit:chore worker failed {"process": "storagenode", "error": "gracefulexit: context canceled while waiting to receive message from storagenode", "errorVerbose": "gracefulexit: context canceled while waiting to receive message from storagenode\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run:90\n\tstorj.io/storj/storagenode/gracefulexit.(*Chore).AddMissing.func1:82\n\tstorj.io/common/sync2.(*Limiter).Go.func1:49"}
These are communication errors, so your router likely wasn’t able to handle that many parallel transfers.
This is another reason why we want to change the complicated Graceful Exit.
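If you want to check how often such errors show up in your own log, a rough sketch like the following may help. It assumes each log line ends with a JSON object containing an "error" field, as in the lines quoted above; the marker strings are taken only from those three lines, and the function names are hypothetical.

```python
import json
from collections import Counter

# Substrings seen in the quoted log lines; an assumption, not a complete list.
NETWORK_MARKERS = (
    "use of closed network connection",
    "context canceled",
    "EOF",
)


def extract_error(line):
    """Return the "error" field from the JSON part of a log line, or None."""
    start = line.find("{")
    if start == -1:
        return None
    try:
        fields = json.loads(line[start:])
    except json.JSONDecodeError:
        return None
    return fields.get("error")


def count_network_errors(lines):
    """Count log lines per marker, using the first marker that matches."""
    counts = Counter()
    for line in lines:
        error = extract_error(line)
        if error is None:
            continue
        for marker in NETWORK_MARKERS:
            if marker in error:
                counts[marker] += 1
                break
    return counts
```

You could feed it the lines of your node's log file and see whether the counts spike while Graceful Exit is running.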