Very slow graceful exit process

For some reason the process is extremely slow. Only one satellite is left to exit: europe-north-1.tardigrade.io:7777.
The node is at 13.33% now.
It seems strange to me that transfers only happen in short bursts, and I don’t understand whether I can speed this up.
Here is what the transfer graph looks like.

Parameters related to the GE are:
graceful-exit.chore-interval: 1m0s
graceful-exit.min-download-timeout: 0m30s
graceful-exit.num-concurrent-transfers: 25
graceful-exit.num-workers: 4

I would be extremely glad to get any help.
Thanks!

Hi PocketSam,

I’ve inquired with the team; we’ll get you some helpful info soon.

I guess the node ID may be useful for collecting troubleshooting information.
1gE3ZWT3gTp1JiQKmQnVvdPshTbtzEJ5hYijPieegd2YUeH6zq

there was an event on the network some 17 days ago… i was also stuck with GE on europe-north at 56% for two weeks.
so i complained about it and within like 12 hours it had finished the remaining 3.7TB.
seems a bit too exact to be a coincidence…

@PocketSam
i’m sure your GE will pick up speed in the near future :smiley:

@Knowledge
and for storjlabs, this seems like it might be affecting multiple GEs…
maybe somebody should check up on that.


Hello, @Knowledge. Any news on my request? :slight_smile:
If I can provide any additional information to simplify troubleshooting, I would be glad to do so.

So, I talked to LittleSkunk, one of the engineers here at Storj, and after looking at your issue he noted the following…

"I think it works as designed. The storage node will ask the satellite for a list of transfers. Than it starts transferring the pieces one by one with a configurable concurrency. Don’t increase that concurrency too high because of the risk to get disqualified.

At the end of the list the storage node waits for the last transfer to finish before requesting the next list.

So there will be a natural pause between these batches. All it takes is one slow target node that needs minutes to finish or time out the last transfer.

I think the timeout is configurable as well, so the storage node could shorten the time. Again, with the risk of getting disqualified if the timeout is too short to finish the transfer in time."

So, based on what he is saying, you may have a bottleneck there with slow target nodes. Adjusting the settings “may” speed things up slightly, but at the risk of being DQ’d.
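
To make that batching behavior concrete, here is a rough Go sketch of the pattern LittleSkunk describes (my own illustration, not the actual storagenode code; the batch count, piece count, and durations are made up). Each batch only completes when its slowest transfer finishes or times out, which is exactly what shows up as the idle gaps between bursts in your transfer graph.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// transferPiece stands in for uploading one piece to its new target node;
// the duration simulates how long that target node takes to accept it.
func transferPiece(id int, d time.Duration) {
	time.Sleep(d)
	fmt.Printf("piece %d transferred after %v\n", id, d)
}

// fetchTransferList stands in for asking the satellite for the next list of
// transfers; it returns made-up per-piece durations, with one slow straggler.
func fetchTransferList() []time.Duration {
	durations := make([]time.Duration, 10)
	for i := range durations {
		durations[i] = 50 * time.Millisecond
	}
	durations[len(durations)-1] = 2 * time.Second // one slow target node
	return durations
}

func main() {
	const numConcurrentTransfers = 25 // graceful-exit.num-concurrent-transfers

	for batch := 1; batch <= 3; batch++ {
		pieces := fetchTransferList()

		sem := make(chan struct{}, numConcurrentTransfers)
		var wg sync.WaitGroup
		for i, d := range pieces {
			wg.Add(1)
			sem <- struct{}{} // blocks once the concurrency limit is reached
			go func(id int, d time.Duration) {
				defer wg.Done()
				defer func() { <-sem }()
				transferPiece(id, d)
			}(i, d)
		}

		// The next list is only requested after the last transfer in this
		// batch finishes (or times out), so one slow target node stalls the
		// whole batch and shows up as a pause in the transfer graph.
		wg.Wait()
		fmt.Printf("batch %d done, requesting next list\n", batch)
	}
}
```

At least in this simplified model, raising graceful-exit.num-concurrent-transfers only widens the burst; it does nothing about the stall at the end of each batch, which may be why tweaking the settings rarely seems to help.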


my storage and such wasn’t a bottleneck and i tried changing basically all the values… nothing seemed to make it deviate from how it worked.

but my GE did take like 3 weeks or more… but it was a big node.
also keep in mind europe-north will be one of the largest data stores, so it’s natural that it will be the last remaining satellite to finish.

i wouldn’t bother changing the values, i tried so much initially… and my final conclusion was the effects were mostly imagined… lol it seems like it has some default, min and max values and it won’t deviate much from them.

but maybe in some cases it can help… i spent a good week tinkering with settings and yeah… ended up at the defaults because it didn’t seem to do anything anyway… and i like defaults, because they are usually defaults for a reason.

My europe-north storage volume was about 3 TB at the beginning of GE. The node had transferred 0.8 TB in two weeks. That seems too slow to me, but for now I have no option other than to wait. :frowning:
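If my math is right, that is roughly 0.8 TB / 14 days ≈ 57 GB per day, so the remaining ~2.2 TB would take another five to six weeks at this pace.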

the only comfort i can give is that, from what i experienced, it was very erratic.
some days it would hardly get rid of any data even though it was still doing stuff, when i checked the logs…
and then other days it would get rid of multiple TBs of data within a day or hours…
i chalked that up to it maybe being IO related… when it hit a cluster of larger pieces it would move faster, and much slower when dealing with small pieces…

and there are a lot of small pieces.

Is the node also serving normal customer traffic and getting paid for it during graceful exit?

Hi Toyoo,

As per LittleSkunk, you do continue to get download requests and storage payout, just no new uploads to your node. Similar to being suspended.
