Graceful Exit Guide

You have to be a healthy node to finish graceful exit. Getting disqualified at 99% does mean you are not a healthy node. The definition of healthy and unhealthy has not changed.

Move the lower-paid garbage from my storage node to your storage node. That would allow me to accept more data that generates a lot of download traffic. I could maximize my income while you get pushed to 0 traffic.

1 Like

The definition of “healthy” has not changed. But the way it is checked did.
A node with, say, 99.9% correct pieces and 0.1% corrupted/missing will pass regular audits and be counted as “healthy” by satellites, at least for some time.
But even a single corrupted piece can, and probably will, fail the GE process, so the same node is not “healthy” for GE. Refer to the doc: storj/docs/blueprints/storagenode-graceful-exit/protocol.md at fe8d556b4ef04299cc83c0e2faf7eebd31f5634b · storj/storj · GitHub - a single “bad” piece is enough to fail the entire GE, at least for one satellite.

How are you supposed to do such a trick with the option I described above? Did you not understand what I wrote?

You will only get exactly the SAME data back on your node, and you will pay a LOT to get it back.

This is correct.

Not correct. I never ever said that. Again, the rules for getting disqualified are still the same. You are a good or a bad node depending on how many pieces you have lost. A few pieces are fine. Too many are not acceptable. Calling graceful exit doesn’t change anything about that definition.

Are you sure?
Refer to the doc: storj/docs/blueprints/storagenode-graceful-exit/protocol.md at fe8d556b4ef04299cc83c0e2faf7eebd31f5634b · storj/storj · GitHub - a single “bad” piece is enough to fail the entire GE, at least for one satellite.

1 Like

Where does it say that a single bad piece would fail the GE? Please quote the specific part, because I don’t see it on that page. In fact, it ends with the question “how many failed is too many?” suggesting pretty clearly that 1 failure is not fatal.

Also

That’s directly from someone in the know. Unless you have specific documentation or code that contradicts that, I would say you can trust this.

Here:

and here

If the node is missing a piece, it will fail the transfer request each time the satellite asks for that particular piece, until it triggers the max threshold of retries and ExitFailed. The satellite will retry failed pieces a few times in case of transient errors, but if the file containing the piece is absent or the data in it is damaged, then no matter how many tries the satellite gives the node, all of them will fail.
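The retry behaviour described above can be sketched in a few lines. This is an illustration only, not Storj’s actual code; the names `MAX_RETRIES`, `transfer_piece`, and `attempt_piece`, and the retry limit itself, are all hypothetical:

```python
MAX_RETRIES = 5  # hypothetical per-piece retry threshold

def transfer_piece(piece_present: bool) -> bool:
    """Pretend transfer: succeeds only if the node still has the piece."""
    return piece_present

def attempt_piece(piece_present: bool) -> str:
    for attempt in range(1, MAX_RETRIES + 1):
        if transfer_piece(piece_present):
            return "TransferSucceeded"
    # A transient error might succeed on a later attempt, but a missing
    # piece fails every single try, so the threshold is always reached.
    return "ExitFailed"

print(attempt_piece(piece_present=True))   # → TransferSucceeded
print(attempt_piece(piece_present=False))  # → ExitFailed
```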

The same applies to one or a few corrupted pieces: if the node does not know that one of its stored pieces has been corrupted, it can send such a piece to another (replacement) node. BUT the satellite asks both nodes (exiting and replacing) to calculate hashes of each transferred piece and compare them to the reference hash stored in the DB. If a single bit in a stored piece is altered, no matter for what reason, the hashes will never match after the transfer, on every attempt. And the satellite will disqualify the node for sending incorrect data during GE.
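The hash comparison can be demonstrated directly. This is a generic sketch using SHA-256; Storj’s actual piece-hashing scheme may differ, and `piece_hash` is a hypothetical name:

```python
import hashlib

def piece_hash(data: bytes) -> str:
    # Illustration only: Storj's real piece-hashing scheme may differ.
    return hashlib.sha256(data).hexdigest()

original = b"piece data stored on the exiting node"
reference = piece_hash(original)  # the hash the satellite keeps in its DB

# Flip a single bit of the stored piece.
corrupted = bytearray(original)
corrupted[0] ^= 0x01

assert piece_hash(original) == reference          # intact piece verifies
assert piece_hash(bytes(corrupted)) != reference  # one flipped bit never matches
```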

“How many failed is too many” is about transient errors, like when a node cannot transfer a piece on the first try (but can still do it on the x-th try) or is doing it too slowly.

2 Likes

I think you misinterpreted that, but I would agree that the wording is confusing.

If the storage node has failed too many transfers overall, failed the same piece over a certain threshold, or has sent incorrect data, the satellite will send an ExitFailed message.

This sentence has 2 parts.

If the storage node has failed too many transfers overall the satellite will send an ExitFailed message.

You need to fail multiple pieces to fail the exit.

failed the same piece over a certain threshold, or has sent incorrect data

Clarification of what it means to fail a piece.

The way they’re mushed together is confusing. But I think this is the intended interpretation.

I’m not sure about your second quote. I actually think that should say TransferFailed and not ExitFailed. In other places, the document refers several times to a failed-pieces count and a threshold for it. But now we’re talking about a difference in interpretation, so maybe someone could clarify the actual intention.
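Under the interpretation above, the satellite’s decision could be sketched like this. Every name and threshold here (`PER_PIECE_RETRY_LIMIT`, `OVERALL_FAIL_LIMIT`, `piece_failed`, `exit_failed`) is a hypothetical illustration, not Storj’s actual values or code:

```python
PER_PIECE_RETRY_LIMIT = 5  # hypothetical "same piece over a certain threshold"
OVERALL_FAIL_LIMIT = 10    # hypothetical "too many transfers overall"

def piece_failed(retries: int, sent_incorrect_data: bool) -> bool:
    """A piece counts as failed if it was retried past the per-piece
    threshold or the node sent incorrect data for it."""
    return retries > PER_PIECE_RETRY_LIMIT or sent_incorrect_data

def exit_failed(pieces: list[tuple[int, bool]]) -> bool:
    """ExitFailed only when too many pieces have failed overall."""
    failed = sum(1 for retries, bad in pieces if piece_failed(retries, bad))
    return failed > OVERALL_FAIL_LIMIT
```

Under this reading, a single corrupted piece counts as one failed piece but does not by itself trigger ExitFailed; the competing reading would fail the exit on the first incorrect transfer.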

1 Like

In 5 weeks I’ll have to move to a different location where I cannot operate the node. I’ve been a node operator for 3 months. How can I do a graceful exit right now? It looks like the rules just force me to be disqualified.

3 months is less than the minimum required to initiate a graceful exit. You can still start the process, and it will inform you of the date when you can start it successfully.

I have tried to exit from one satellite:

Domain Name                      Node ID                                             Percent Complete  Successful  Completion Receipt
satellite.stefan-benten.de:7777  118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW  0.00%             N           N/A

But I didn’t receive any dates, using both commands:

storagenode.exe exit-satellite --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stderr
storagenode.exe exit-status --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stderr

Please redirect the log output to stdout instead of stderr, or do not redirect it at all, and take a look at your logs.

Redirection does not help:

I:\storj-node>storagenode.exe exit-satellite --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stdout
2020-03-08T08:10:31.323+0300    INFO    Configuration loaded from: I:\storj-node\config.yaml
2020-03-08T08:10:31.356+0300    INFO    Node ID: 19VMAP19VrPNySLytKbn362phZQqTGKDGwWgCqEsd8JYg5i7w3
Please be aware that by starting a graceful exit from a satellite, you will no longer be allowed to participate in repairs or uploads from that satellite. This action can not be undone. Are you sure you want to continue? y/n
: y
Domain Name                       Node ID                                              Space Used
satellite.stefan-benten.de:7777   118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW   119.1 GB
saltlake.tardigrade.io:7777       1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE   4.9 GB
asia-east-1.tardigrade.io:7777    121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6  108.0 GB
us-central-1.tardigrade.io:7777   12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S  121.7 GB
europe-west-1.tardigrade.io:7777  12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs  143.2 GB
Please enter a space delimited list of satellite domain names you would like to gracefully exit. Press enter to continue:
satellite.stefan-benten.de:7777

Domain Name                      Node ID                                             Percent Complete  Successful  Completion Receipt
satellite.stefan-benten.de:7777  118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW  0.00%             N           N/A

I:\storj-node>

What does this command show?

storagenode exit-status

If it is executed just after exit-satellite, it displays the same status.

I:\storj-node>storagenode.exe exit-satellite --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stdout
2020-03-08T17:57:20.883+0300    INFO    Configuration loaded from: I:\storj-node\config.yaml
2020-03-08T17:57:20.916+0300    INFO    Node ID: 19VMAP19VrPNySLytKbn362phZQqTGKDGwWgCqEsd8JYg5i7w3
Please be aware that by starting a graceful exit from a satellite, you will no longer be allowed to participate in repairs or uploads from that satellite. This action can not be undone. Are you sure you want to continue? y/n
: y
Domain Name                       Node ID                                              Space Used
satellite.stefan-benten.de:7777   118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW   119.1 GB
saltlake.tardigrade.io:7777       1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE   4.9 GB
asia-east-1.tardigrade.io:7777    121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6  108.0 GB
us-central-1.tardigrade.io:7777   12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S  121.7 GB
europe-west-1.tardigrade.io:7777  12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs  143.2 GB
Please enter a space delimited list of satellite domain names you would like to gracefully exit. Press enter to continue:
satellite.stefan-benten.de:7777

Domain Name                      Node ID                                             Percent Complete  Successful  Completion Receipt
satellite.stefan-benten.de:7777  118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW  0.00%             N           N/A

I:\storj-node>storagenode.exe exit-status --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stdout
2020-03-08T17:57:41.811+0300    INFO    Configuration loaded from: I:\storj-node\config.yaml
2020-03-08T17:57:41.845+0300    INFO    Node ID: 19VMAP19VrPNySLytKbn362phZQqTGKDGwWgCqEsd8JYg5i7w3

Domain Name                      Node ID                                             Percent Complete  Successful  Completion Receipt
satellite.stefan-benten.de:7777  118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW  0.00%             N           N/A

I:\storj-node>

But after about 10 minutes I get a different result:

I:\storj-node>storagenode.exe exit-status --identity-dir "I:\identity\storagenode" --config-dir "I:\storj-node" --log.output stdout
2020-03-08T17:57:11.939+0300    INFO    Configuration loaded from: I:\storj-node\config.yaml
2020-03-08T17:57:11.974+0300    INFO    Node ID: 19VMAP19VrPNySLytKbn362phZQqTGKDGwWgCqEsd8JYg5i7w3
No graceful exit in progress.

It looks like you haven’t been running your node long enough for a graceful exit; 6 months is the minimum amount of time. If you view your log when you start a graceful exit, you will see that it tells you the date until which you need to run your node.
They put these rules into place so people couldn’t just run a graceful exit right away to get all their escrow back immediately.
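The date the log reports can be estimated from the node’s start date. This sketch assumes the 6-month minimum mentioned above; the exact rule and how the satellite computes the date are assumptions here, and `earliest_exit_date` is a hypothetical helper:

```python
from datetime import date

MINIMUM_MONTHS = 6  # assumed minimum node age for graceful exit

def earliest_exit_date(node_start: date) -> date:
    # Naive month arithmetic, for illustration only.
    month = node_start.month - 1 + MINIMUM_MONTHS
    year = node_start.year + month // 12
    return node_start.replace(year=year, month=month % 12 + 1)

print(earliest_exit_date(date(2019, 12, 8)))  # → 2020-06-08
```

So a node started about 3 months ago would have to keep running roughly 3 more months before the exit could begin.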

There is nothing in the log at the time the commands are executed. And yes, I need a graceful exit as soon as possible, but I have only been operating the node for 3 months. I am also using the minimum of 500 GB. So is there any chance I can make it without DQ? In 4 weeks I will not be able to operate the node anymore for at least 6 months (I have to move to a different location).

It doesn’t happen right away; it takes time for the exit to start. If you watch your logs, it will show up after a bit. But after only 3 months of running a node you will not be able to run a graceful exit.
You cannot force a graceful exit to happen early; 6 months, and 6 months only, is the minimum amount of time. They can’t bend the rules for one person, or they would need to do it for everyone.

If you’re going to be moving and it takes 4 weeks, and they haven’t added the uptime DQ, you would be fine. But if they do, you will be DQed for sure. Other than that, you don’t really have much of a choice.

1 Like

After 4 weeks, I will be down for about 6 months. What is the minimum amount of availability before DQ? My friend was DQed for long downtime.

Well, I highly doubt 6 months of downtime won’t get you DQed; I’m pretty sure the uptime DQ will be back in place within that amount of time. Your only real option would be to move your node to a datacenter for the 6 months if you want to save it. Other than that you don’t have a choice: either you risk getting DQed if they put the uptime DQ into place, or you move the node to a datacenter to host it while you are gone for the 6 months. Or just start over when you get back after 6 months.

What does the DQ actually mean for me as a user? Will I be able to participate in the StorJ network after that, but with a different node?