Speed of graceful exit

beli · January 22, 2020, 9:42am

Hey!

Is there a possibility to speed up the graceful exit?
At the moment more than 80% of my bandwidth is unused, yet the graceful exit transfers only happen every few minutes.

Greetings
beli

nerdatwork · January 22, 2020, 10:54am

Welcome to the forum @beli!

Personally, I would like to know why you opted for graceful exit. Are you leaving Storj, or do you just want to create a new node?

Odmin · January 22, 2020, 11:08am

Please look into this post:

beli · January 22, 2020, 12:18pm

Raising graceful-exit.num-concurrent-transfers did it for me. I think many of the GE threads were blocked by very slow transfers.

This would also explain why I get many more transfers after a storagenode restart.

littleskunk · January 22, 2020, 12:23pm

I would like to collect some feedback about graceful exit. Basically I need a pi4 and/or a slow internet connection as a guinea pig. We have to balance some config settings. The target is to get max throughput on slow nodes without overwhelming them. That should be a good baseline we can use as default setting.

In addition to that, we can also collect recommended settings for other setups.

The 3 config options with the current default values are:

graceful-exit.chore-interval: 15m0s
graceful-exit.num-concurrent-transfers: 1
graceful-exit.num-workers: 3

Increasing the number of concurrent transfers should speed up the process, but don’t push it too high because that might increase the error rate. On a slow node something like 5 should be fine to start. A faster node should be able to handle 20 or more concurrent connections.

For tweaking the interval please use Stefan’s satellite because it is currently running with a higher order batch size. The other satellites should send you fewer orders, which should lead to more frequent breaks. Please don’t go below 1m0s for now.
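A minimal sketch of how proposed overrides could be checked against the guidance above before editing the config. The three option names and default values are the ones quoted in this post; the helper names `duration_seconds` and `check_settings` are hypothetical and not part of the storagenode software.

```python
import re

# Defaults as quoted in the post above.
DEFAULTS = {
    "graceful-exit.chore-interval": "15m0s",
    "graceful-exit.num-concurrent-transfers": 1,
    "graceful-exit.num-workers": 3,
}

_UNIT_SECONDS = {"h": 3600.0, "m": 60.0, "s": 1.0}


def duration_seconds(text):
    """Convert a duration such as '2m30s' into seconds (h, m and s units only)."""
    parts = re.findall(r"(\d+(?:\.\d+)?)(h|m|s)", text)
    # Reject anything the simple pattern did not consume completely.
    if not parts or "".join(number + unit for number, unit in parts) != text:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def check_settings(overrides):
    """Merge overrides into the defaults and return (settings, warnings)."""
    settings = {**DEFAULTS, **overrides}
    warnings = []
    if duration_seconds(settings["graceful-exit.chore-interval"]) < 60:
        warnings.append("chore-interval is below the 1m0s floor requested in this thread")
    return settings, warnings


if __name__ == "__main__":
    settings, warnings = check_settings({"graceful-exit.num-concurrent-transfers": 5})
    for key, value in settings.items():
        print(f"{key}: {value}")
    for warning in warnings:
        print("warning:", warning)
```

The only rule encoded here is the 1m0s lower bound for the interval; the concurrency values (5 to start on a slow node, 20 or more on a fast one) are left to the operator because the thread gives no exact cutoff between "slow" and "fast".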

littleskunk · January 22, 2020, 12:27pm

Next time this happens please don’t restart. Show us the /mon/ps and /mon/funcs output of your storagenode. Guide is here: Guide to debug my storage node, uplink, s3 gateway, satellite

My expectation is that the storage node will kill the slow connections. Let’s check that and fix it if we are missing a timeout.

Odmin · January 22, 2020, 12:34pm

Is it possible to simulate GE on storj-sim? I would like to help and have all the necessary equipment and knowledge, but I plan to be a long-term, resilient network member and would not want to do GE on my good nodes.

beli · January 22, 2020, 12:37pm

How do you want to receive the debug information?

beli · January 22, 2020, 12:43pm

Additionally, I keep getting the following error. I hope I’m interpreting it correctly that the target is out of space.

2020-01-22T12:41:59.997Z        ERROR   gracefulexit:chore      failed to transfer piece.       {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "protocol: piecestore internal: pieces error: filestore error: truncate E:\\STORJ\\temp\\blob-723862426.partial: There is not enough space on the disk.", "errorVerbose": "protocol: piecestore internal: pieces error: filestore error: truncate E:\\STORJ\\temp\\blob-723862426.partial: There is not enough space on the disk.\n\tstorj.io/drpc/drpcwire.UnmarshalError:26\n\tstorj.io/drpc/drpcstream.(*Stream).HandlePacket:130\n\tstorj.io/drpc/drpcmanager.(*Manager).manageStreamPackets:263"}

littleskunk · January 22, 2020, 12:44pm

Yes but simulating slow storage nodes is difficult on localhost. I had to insert a few sleep statements to create a slow node. I could show you how to do it if you need it.

For reference this is what we have tested. You will notice that my focus was to create a robust graceful exit without any loopholes. Let me know if something is missing.

Node selection
1. (Risk: 1x3) A graceful exit node is not getting selected for uplink uploads.

2. (Risk: 1x2) A graceful exit node is not getting selected for repair uploads (should be the same query).

3. (Risk: 1x1) A graceful exit node is getting selected for downloads.

4. (Risk: 2x2) The target node is getting selected in a way that 2 pieces of the same segment are not stored on the same IP. See V3-3011: Enhance the repair process to ensure we are not repairing pieces to nodes on the IP if the other nodes on that IP are storing pieces of that segment (WAITING FOR TEST PLAN)

Orders
5. (Risk: 1x1) Graceful exit traffic can be submitted to the satellite.

6. (Risk: 1x2) Graceful exit traffic is shown on the satellite payment report but unpaid.

7. (Risk: 1x2) Graceful exit traffic is shown on the storagenode CLI and Web dashboard.

8. (Risk: 1x3) Graceful exit node can still submit normal download traffic and will get paid for it.

Queue
9. (Risk: 2x2) Upload 3 files. Corrupt a piece of the second file. Graceful exit should repair that one first because it is closer to the repair threshold.

Hold back amount
10. (Risk: 1x2) Satellite payment report has a flag about finished graceful exit.

11. (Risk: 1x3) Data science generates a payment CSV file that pays the hold back amount including the hold back amount of the current month.

Storage node concurrency limit
12. (Risk: 1x3) A graceful exit node with a fast 1 GBit/s connection wants to run maybe 20 concurrent transfers to avoid waiting for slow nodes.

13. (Risk: 1x2) A graceful exit node with a slow 10 MBit/s connection wants to run only 1 concurrent transfer to finish the transfers one by one.

Cheater
14. (Risk: 2x3) A graceful exit node is getting audits.

15. (Risk: 2x3) A graceful exit node can get disqualified. Graceful exit should stop if the node gets disqualified. (It makes no sense to waste bandwidth if the result will be no payment because of disqualification. Alternative: Finish graceful exit and pay the storage node a percentage based on the number of files it has transferred vs lost. NoGo: Finish graceful exit and don't pay the storage node. I would call that a bug!)

16. (Risk: 3x3) The graceful exit node has dropped a piece and tries to hide that by uploading a different piece to the target storage node

a. (Risk: 2x3) Graceful exit node submits correct uplink signature but a wrong storage node signature.

b. (Risk: 2x3) Graceful exit node submits a different uplink signature and a matching storage node signature.

c. (Risk: 2x3) Graceful exit node submits correct uplink signature and a self signed storage node signature (signed by the wrong node)

d. (Risk: 2x3) Graceful exit node submits a self signed uplink signature and a matching storage node signature.

e. (Risk: ?x?) Any other combination that might be missing?

f. (Risk: 2x3) Graceful exit node is running a second storage node that will return a signed hash even if it hasn't received the piece. Graceful exit node returns random errors until it hits the manipulated storage node.

g. (Risk: 3x3) Same game but this time the graceful exit node is mixing the errors with successful uploads for a different pieceID. Fail on the pieceID that was dropped, succeed on one of the other 99 pieces, go offline so that the other 98 orders expire, request a new set of 100 orders, repeat until hitting the manipulated storage node.

17. (Risk: 2x3) A bad uplink has uploaded a correct piece + hash but it doesn’t fit in terms of Reed-Solomon. The graceful exit node has stored the uplink signature as a proof for audit and repair. Which signature does the target node store?

Not a cheater
18. (Risk: 2x2) A graceful exit node can hit a bad target node (no space left, clock out of sync, database corrupted etc), return an error, retry, hit a few other bad nodes, retry a few more times and finally succeed. At the end no penalty.

19. (Risk: 2x2) A graceful exit node can submit 1 success, go offline so that the other 99 orders expire, come back online and still finish graceful exit without getting a penalty for the 99 expired orders. A storage node can do that multiple times in a row and still get no penalty.

20. (Risk: 3x2) A graceful exit node is not getting any penalty for our bugs. If a segment doesn't have the hash validation flag the graceful exit node is allowed to return "sorry I don't have that piece". Remove it from the pointer and don't retry.

21. (Risk: 2x3) A graceful exit node had a bit flip. The piece is corrupted. Is the graceful exit node checking the piece hash itself or will it send the piece to the target node? The target node will have to refuse the upload because the piece hash is not matching the uplink signature. The graceful exit node will retry and we will detect that as a cheater.

Satellite concurrency
22. (Risk: 1x2) A segment gets deleted and a new segment with the same path gets uploaded while it is queued for graceful exit on the satellite. The satellite shouldn’t create an order.

23. (Risk: 2x2) A segment gets deleted and a new segment with the same path gets uploaded while it is queued for graceful exit on the storage node. Storage node returns an error that it doesn’t have the piece anymore.

24. (Risk: 2x3) A segment gets deleted and a new segment with the same path gets uploaded while it is queued for graceful exit on the storage node. Storage node has the piece (missed the delete command) and will return a success. Satellite should notice that the segment doesn’t exist anymore and drop it.

25. (Risk: 3x3) A segment was updated by the repair job or audit job while it is queued for graceful exit on the satellite.

26. (Risk: 3x3) A segment was updated by the repair job or audit job while it is queued for graceful exit on the storage node. Storage node returns success.

27. (Risk: 2x3) Graceful exit updates the pointer first and the repair or audit job has to deal with it.

Status proof
28. (Risk: 1x3) Storage node has finished graceful exit, satellite confirms it by sending a signed message to the storage node. The storage node stores that signature somewhere (not only in the logfile). We forget to pay the storage node for some reason and the storage node opens a support ticket and appends the satellite signature as a proof.

Edit: Sorry for the format. You might want to copy it into a text editor with line wrapping.
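Test case 9 in the list above (transfer the segment closest to the repair threshold first) can be pictured as a priority queue. This is only a sketch of the ordering idea, not the satellite's queue implementation; the function name, the piece counts and the threshold value are hypothetical.

```python
import heapq


def transfer_order(healthy_pieces, repair_threshold):
    """Yield segment names, the one closest to the repair threshold first.

    healthy_pieces maps a segment name to its number of healthy pieces.
    Ties are broken alphabetically by segment name.
    """
    heap = [(healthy - repair_threshold, name) for name, healthy in healthy_pieces.items()]
    heapq.heapify(heap)
    while heap:
        _, name = heapq.heappop(heap)
        yield name


if __name__ == "__main__":
    # Three uploaded files; one piece of the second file was corrupted.
    segments = {"file1": 80, "file2": 79, "file3": 80}
    print(list(transfer_order(segments, repair_threshold=35)))
```

With the hypothetical numbers above, file2 has the smallest margin over the threshold and is therefore transferred first.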

littleskunk · January 22, 2020, 12:47pm

Just post them here. We can move it into a different thread later.

littleskunk · January 22, 2020, 12:51pm

Yes, that is not your fault. Don’t worry, you are allowed to fail a few transfers. You get a penalty if you fail to transfer the same piece multiple times. You get disqualified if that happens on multiple pieces.
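The rule described here can be sketched as a small counter. The class name and both limits are hypothetical: the post only says "multiple times" and "multiple pieces", so the actual thresholds used by the satellite are not known from this thread.

```python
from collections import Counter


class TransferFailureTracker:
    """Track failed transfers per piece and derive a coarse status."""

    def __init__(self, max_failures_per_piece, max_failed_pieces):
        # Both limits are assumptions; the thread does not state real values.
        self.max_failures_per_piece = max_failures_per_piece
        self.max_failed_pieces = max_failed_pieces
        self.failures = Counter()

    def record_failure(self, piece_id):
        self.failures[piece_id] += 1

    def failed_pieces(self):
        """Pieces that failed at least max_failures_per_piece times."""
        return [
            piece_id
            for piece_id, count in self.failures.items()
            if count >= self.max_failures_per_piece
        ]

    def status(self):
        failed = len(self.failed_pieces())
        if failed >= self.max_failed_pieces:
            return "disqualified"
        if failed > 0:
            return "penalty"
        return "ok"


if __name__ == "__main__":
    tracker = TransferFailureTracker(max_failures_per_piece=3, max_failed_pieces=2)
    tracker.record_failure("piece-a")
    print(tracker.status())
```

A single failed transfer, like the out-of-space error reported above, leaves the status at "ok" in this model.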

beli · January 22, 2020, 1:12pm

arg
Body is limited to 32000 characters; you entered 110520.

https://alpha.transfer.sh/HP8kl/mfuncs
https://alpha.transfer.sh/9z4h0/mps

littleskunk · January 22, 2020, 1:29pm

  [2865949657098692379] storj.io/storj/storagenode/gracefulexit.(*Chore).Run() (elapsed: 26m14.91758481s)
   [3145588689140490362] storj.io/storj/storagenode/gracefulexit.(*Chore).Run.func1() (elapsed: 26m14.917553802s)
    [4916635892071877584] storj.io/storj/storagenode/gracefulexit.(*Worker).Run() (elapsed: 26m14.917099326s)
     [686115803749859474] storj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces() (elapsed: 4m8.872534887s)
      [872541825111058129] storj.io/storj/storage/filestore.(*Dir).WalkNamespace() (elapsed: 4m8.872525263s)
       [1058967846472256784] storj.io/storj/storage/filestore.(*Dir).walkNamespaceInPath() (elapsed: 4m8.872514343s)

Thank you. This looks good so far. Graceful exit has been running for 26 minutes but none of the transfers are stuck. Some of them are slow but they should all finish or time out. This is the expected behavior.

If you see a drop in throughput you might want to increase the concurrency a bit to compensate for a few slow connections. There is an option to decrease the timeout but I would not recommend doing that. A successful slow transfer is better than a fast timeout with a retry and the risk of getting disqualified for being too aggressive.

Odmin · January 22, 2020, 1:30pm

Thanks, I agree with you. I will just try to move one storage node from localhost to real slow hardware (RPi) and will simulate only throughput and internet connection quality.

beli · January 22, 2020, 1:58pm

What about all of these errors? They occur very often.

ERROR   gracefulexit:chore      failed to transfer piece.       {"Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "error": "write tcp 10.0.3.3:42022->78.XX.XX.189:7777: write: connection reset by peer", "errorVerbose": "write tcp 10.0.3.3:42022->78.XX.XXX.189:7777: write: connection reset by peer\n\tstorj.io/drpc/drpcstream.(*Stream).RawFlush:241\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:269\n\tstorj.io/common/pb.(*drpcSatelliteGracefulExitProcessClient).Send:1345\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).transferPiece:275\n\tstorj.io/storj/storagenode/gracefulexit.(*Worker).Run.func2:111\n\tstorj.io/common/sync2.(*Limiter).Go.func1:41"}
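To see which failure dominates, the log can be tallied by the innermost error message. A rough sketch, assuming every relevant line ends with a JSON object containing an "error" field as in the two lines quoted in this thread; the function names are hypothetical.

```python
import json
from collections import Counter


def error_reason(line):
    """Return the innermost error message of a graceful exit ERROR line, or None."""
    if "ERROR" not in line or "gracefulexit:chore" not in line:
        return None
    start = line.find("{")
    if start == -1:
        return None
    fields = json.loads(line[start:])
    # Errors are wrapped as "outer: inner: innermost"; keep the last part.
    return fields.get("error", "").rsplit(": ", 1)[-1] or None


def tally(lines):
    """Count graceful exit transfer errors by innermost message."""
    return Counter(reason for reason in map(error_reason, lines) if reason)
```

Calling `tally` on an open logfile (for example `tally(open(path))`, with `path` pointing at your own storagenode log) would group the two errors above under "There is not enough space on the disk." and "connection reset by peer".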

beli · January 22, 2020, 2:51pm

After raising the values there are currently no more transfers.

https://alpha.transfer.sh/UcLHc/mfuncs
https://alpha.transfer.sh/F4kIz/mps

Current values:
graceful-exit.num-concurrent-transfers: 25
graceful-exit.chore-interval: 2m30s
graceful-exit.num-workers: 10

littleskunk · January 22, 2020, 2:54pm

 [548304328105843134] storj.io/storj/uplink/ecclient.(*ecClient).PutPiece("node: 1ut1rtbj") (elapsed: 37m0.366134875s)
  [3482868822966642259] storj.io/storj/uplink/piecestore.(*LockingUpload).Commit() (elapsed: 6m9.5905966s)
   [2011481416130107222] storj.io/storj/uplink/piecestore.(*BufferedUpload).Commit() (elapsed: 6m9.59059222s)
    [757699962305026799] storj.io/storj/uplink/piecestore.(*Upload).Commit("node: 1ut1rtbj") (elapsed: 6m9.588272873s)
 [8641129188856984759] storj.io/storj/uplink/ecclient.(*ecClient).PutPiece("node: 1eTQmtD9") (elapsed: 36m28.291353721s)
  [8045020309783779756] storj.io/storj/uplink/piecestore.(*LockingUpload).Commit() (elapsed: 20m51.887246963s)
   [6573632902947244719] storj.io/storj/uplink/piecestore.(*BufferedUpload).Commit() (elapsed: 20m51.887241323s)
    [1661465087124498525] storj.io/storj/uplink/piecestore.(*Upload).Commit("node: 1eTQmtD9") (elapsed: 20m51.885270684s)

Awesome. That is what I need to open an issue. Thank you.
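For anyone who wants to scan their own /mon/ps dump for long-running calls like the ones above, here is a rough sketch. It assumes each span line has the shape `[id] function(args) (elapsed: <duration>)` as in the excerpts in this thread; the function names and the 10 minute threshold in the example are hypothetical.

```python
import re

# Longer unit names come first so that "ms" is not read as "m" followed by "s".
_DURATION_PART = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|h|m|s)")
_UNIT_SECONDS = {
    "ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3,
    "s": 1.0, "m": 60.0, "h": 3600.0,
}
_SPAN_LINE = re.compile(r"^\s*\[(\d+)\]\s+(.*)\s+\(elapsed: ([^)]+)\)\s*$")


def elapsed_seconds(text):
    """Convert an elapsed value such as '37m0.366134875s' into seconds."""
    parts = _DURATION_PART.findall(text)
    if not parts:
        raise ValueError(f"unsupported duration: {text!r}")
    return sum(float(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def slow_spans(dump, threshold_seconds):
    """Return (seconds, function) pairs above the threshold, slowest first."""
    found = []
    for line in dump.splitlines():
        match = _SPAN_LINE.match(line)
        if not match:
            continue
        seconds = elapsed_seconds(match.group(3))
        if seconds > threshold_seconds:
            found.append((seconds, match.group(2)))
    return sorted(found, reverse=True)


if __name__ == "__main__":
    sample = """
 [548304328105843134] storj.io/storj/uplink/ecclient.(*ecClient).PutPiece("node: 1ut1rtbj") (elapsed: 37m0.366134875s)
  [3482868822966642259] storj.io/storj/uplink/piecestore.(*LockingUpload).Commit() (elapsed: 6m9.5905966s)
"""
    for seconds, function in slow_spans(sample, threshold_seconds=600):
        print(f"{seconds:8.1f}s  {function}")
```

A long elapsed time alone does not prove a span is stuck, so treat the result as a list of candidates to look at rather than as a verdict.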

littleskunk · January 22, 2020, 2:54pm

Could you please upload your logfile as well?

Vadim · January 22, 2020, 4:20pm

How does GE work? Does the satellite give only 1 IP to upload a piece to, or a list of IPs the piece can be uploaded to?