I’ve just realized that a few days after I started Graceful Exit in July, my node was disqualified on the Saltlake satellite, and I don’t know why. Could anyone help me?
Besides, I would like to finalize the Graceful Exit as soon as possible and, apart from that satellite, it’s still pending for the europe-north-1 one. Does anyone know why it’s still pending?
So the reason is likely that your node has failed more than 10% of transfers.
As @Stob suggested, you need to search for transfer errors and errors like “file not found” (search for “graceful”/“piecetransfer” and “ERROR”/“failed”).
To calculate the number of failed transfers, you need to filter the logs by “piecetransfer” and group them by PieceID and “ERROR”/“transferred”. Please note: the node will attempt to transfer each piece at least 5 times before it is considered a failed transfer, so you need to count only the failed transfers that have 5 attempts for the same PieceID.
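The counting described above could be sketched roughly like this in Python. This is only a sketch under assumptions: it assumes each relevant log line contains the word “piecetransfer”, a level such as “ERROR”, and a field written as `"Piece ID": "<id>"`. The function name `summarize_transfers` and the log path passed on the command line are hypothetical, so adjust the pattern to whatever your own logs actually look like.

```python
import re
import sys
from collections import defaultdict

# ASSUMPTION: the piece identifier appears in the log line as "Piece ID": "<id>".
PIECE_ID = re.compile(r'"Piece ID":\s*"([^"]+)"')


def summarize_transfers(lines, max_attempts=5):
    """Return (transferred_pieces, failed_pieces) from graceful exit log lines."""
    errors = defaultdict(int)   # PieceID -> number of ERROR lines seen
    transferred = set()         # PieceIDs with at least one successful transfer
    for line in lines:
        if "piecetransfer" not in line:
            continue
        match = PIECE_ID.search(line)
        if match is None:
            continue
        piece = match.group(1)
        if "ERROR" in line:
            errors[piece] += 1
        elif "transferred" in line:
            transferred.add(piece)
    # A piece counts as failed only after max_attempts errors; pieces that
    # were eventually transferred are left out (a cautious choice of this sketch).
    failed = {p for p, n in errors.items()
              if n >= max_attempts and p not in transferred}
    return len(transferred), len(failed)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python count_transfers.py /path/to/node.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        ok, bad = summarize_transfers(log)
    print(f"transferred: {ok}, failed: {bad}")
```

Pieces with fewer than 5 errors are ignored on purpose, since they may still be retried.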
You may also give me a NodeID and I can provide you with exact numbers, however I cannot provide a reason - it’s stated only in your logs.
Thanks for your answer as well, @Alexey. Since I’m not an expert, here is the NodeID: 12oV5bdzCtFrxTequKDkkTBjH8Pd4Cf62sEfDNBPAJUUDuTT5zo. With the information you’ve provided, I’ll try to search the logs.
In any case, it’s clear to me that I’ve lost the held amount on that satellite. Well, it’s a pity!
It has transferred 10,820 pieces and failed to transfer 2,235 pieces, a pretty high fail rate I would say. Either your node has corrupted data, or your connection was terribly bad.
For example, for US1 it has transferred 494,039 pieces and 0 failed.
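To put those numbers next to the 10% limit, here is a quick back-of-the-envelope check in Python. It assumes the rate is failed pieces divided by all attempted pieces (transferred plus failed). That is my reading, not a confirmed formula, and `failure_rate` is just an illustrative name.

```python
def failure_rate(transferred, failed):
    """Share of attempted pieces that failed (ASSUMED definition of the rate)."""
    total = transferred + failed
    return failed / total if total else 0.0


saltlake = failure_rate(10_820, 2_235)
us1 = failure_rate(494_039, 0)

print(f"Saltlake: {saltlake:.2%}")  # 17.12%
print(f"US1:      {us1:.2%}")       # 0.00%
print("over the 10% limit:", saltlake > 0.10)
```

Even if the rate were measured against transferred pieces only, Saltlake would still come out well above 10%.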
Thanks for the info. The problem was with my connection. I had issues when my telco provider changed my public IP. Thanks for the support. Cheers!
I’m facing the same issue on a small node (ID: 1G7CA8T8NwUYLibFqR85TUpXEXKKbwFDLz93srFMPgRDpcvzFj) that I’m currently graceful-exiting: It got disqualified a few days ago on Saltlake:
Disqualification during Graceful Exit happens because your node failed to transfer more than 10% of pieces (the node makes 5 attempts to transfer a piece to different nodes before the transfer is considered failed).
it’s not deployed yet, it should be deployed soon:
These are communication errors, so likely your router wasn’t able to handle a lot of parallel transfers.
This is another reason why we want to change the complicated Graceful Exit.
Another node got disqualified this morning which feels unfair if it’s because my router can’t handle the load… Especially as I’m trying to help the network here by gracefully exiting. Is there anything I can do to slow down those transfers and stop nodes that are exiting from being disqualified?
I guess you could try changing these parameters:
# number of concurrent transfers per graceful exit worker
# graceful-exit.num-concurrent-transfers: 5
# number of workers to handle satellite exits
# graceful-exit.num-workers: 4
Thanks for your unfailing help, as always @Alexey.
Tried with 2 and 2. Still had quite a lot of errors.
I’m now down to 1 and 1, and although logs look better than with default values, there are still many errors. RaspbianOS is pretty chill right now (load average: 0.36, 0.26, 0.28 and an average constant upload of 2MiB/s). I don’t know what else I can do…
This graceful-exit mechanism seems pretty unreliable, I’m glad we’re moving to the shiny new one soon!
Too bad I misunderstood that it wasn’t live yet… that’s my bad.