Too many requests not finishing

Storgeez · July 12, 2019, 3:48pm

Only getting the below error messages since yesterday:

2019-07-12T15:46:20.820Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T15:46:20.907Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T15:46:22.055Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T15:46:22.393Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T15:46:23.990Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T15:46:24.506Z ERROR piecestore upload rejected, too many requests {"live requests": 11}

I got 2.8MB ingress in total while these errors were showing, so basically no ingress. It seems like all the transfers are pending but none are actually active (no disk activity).
Also, the container never shuts down with the graceful shutdown (-t infinity).
Almost none of the transfers ever get canceled; they all stay in the queue for hours, piling up.

Adavan · July 12, 2019, 3:55pm

@Storgeez You can try raising this value (start at ~15-20). I have actually set mine to 50.

Storgeez · July 12, 2019, 3:56pm

The problem is not that I cannot have enough transfers; the problem is that none of them finish. I tried 3000 and I had 3000 pending transfers, with no traffic going in or out. On container restart, there is traffic for a very short while and then nothing further. 3.5MB ingress at the moment.

littleskunk · July 12, 2019, 4:10pm

The used space calculation needs a few minutes. After that it should run fine. We are working on a better used space calculation but that will take more time.

Storgeez · July 12, 2019, 4:12pm

It was running overnight. After I restarted it yesterday, it amassed 3k pending transfers; I reduced it when I saw that today. But as I mentioned, there is no network activity and no disk activity. It is just collecting pending transfers and, weirdly enough, none are getting canceled…
I tried creating a file from within the container, and that seems to have gone through.

I’m out of clues here.

Storgeez · July 12, 2019, 7:45pm

It doesn’t seem to be a permission issue since I’m having some uploads but only a few MB since this manifested.

Storgeez · July 12, 2019, 8:11pm

2019-07-12T19:43:20.535Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:21.116Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:23.097Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:24.169Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:28.292Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:31.708Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:33.632Z INFO piecestore uploaded {"Piece ID": "CZ5BFBEST3FDF33IVE4HPEYNEAIJASIWQBXNWB4J4BJOS2I3X6JQ", "SatelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT"}
2019-07-12T19:43:35.048Z INFO piecestore upload started {"Piece ID": "I5YIWDLUKPSCH6F5D464J7DLK3SFKNU47G32UXODMHJM32FY7YMQ", "SatelliteID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT"}
2019-07-12T19:43:38.605Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:39.631Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:39.995Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:40.712Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:41.294Z ERROR piecestore upload rejected, too many requests {"live requests": 11}
2019-07-12T19:43:43.727Z ERROR piecestore upload rejected, too many requests {"live requests": 11}

KernelPanick · July 13, 2019, 10:50am

How do you detect pending xfers?

Storgeez · July 13, 2019, 1:55pm

I detected pending transfers by seeing in the log that they started and never finished.
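That manual check can be roughly automated. The following is only a sketch in Go, assuming the log lines look like the ones posted earlier in this thread (a message such as "upload started" or "uploaded" followed by a JSON object with a "Piece ID" key). The function name `PendingUploads` is made up for illustration, and any other terminal messages the node may log for a piece are not handled here and would need to be added.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// PendingUploads (hypothetical helper) returns the piece IDs that have an
// "upload started" line but no later "uploaded" line in the given log lines.
func PendingUploads(lines []string) []string {
	pending := map[string]bool{}
	for _, line := range lines {
		// Assumption: the structured fields start at the first '{'.
		i := strings.Index(line, "{")
		if i < 0 {
			continue
		}
		var fields map[string]interface{}
		if err := json.Unmarshal([]byte(line[i:]), &fields); err != nil {
			continue // not a line with parseable JSON fields
		}
		id, ok := fields["Piece ID"].(string)
		if !ok {
			continue // e.g. the "too many requests" lines carry no piece ID
		}
		head := line[:i]
		switch {
		case strings.Contains(head, "upload started"):
			pending[id] = true
		case strings.Contains(head, "uploaded"):
			delete(pending, id)
		}
	}
	ids := make([]string, 0, len(pending))
	for id := range pending {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	// Two lines taken from the log excerpt posted above.
	lines := []string{
		`2019-07-12T19:43:33.632Z INFO piecestore uploaded {"Piece ID": "CZ5BFBEST3FDF33IVE4HPEYNEAIJASIWQBXNWB4J4BJOS2I3X6JQ", "Action": "PUT"}`,
		`2019-07-12T19:43:35.048Z INFO piecestore upload started {"Piece ID": "I5YIWDLUKPSCH6F5D464J7DLK3SFKNU47G32UXODMHJM32FY7YMQ", "Action": "PUT"}`,
	}
	for _, id := range PendingUploads(lines) {
		fmt.Println("still pending:", id)
	}
}
```

A piece that was started only moments before the log was captured will also show up as pending, so the result is only meaningful for entries that are old enough.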

I have figured out what is wrong. The recent update broke NFS compatibility. Probably the dreaded file lock incompatibility that’s well known with NFS.

I know remote storage isn’t supported, but it would be helpful if developers could notify when something might break unsupported stuff that people are using. Nearly lost my node because of it.

littleskunk · July 13, 2019, 1:55pm

Like this: storj/storagenode/piecestore/endpoint.go at d52f764e54731603cb669dbab0ae48ddde5566cf · storj/storj · GitHub

Same for downloads and deletes. They are not getting limited but they will increase and decrease the counter.
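To illustrate the counter behaviour described here, a minimal Go sketch follows. It is not the code from endpoint.go: the `Endpoint` type, its fields and its method names are hypothetical, and the exact comparison the real node uses is an assumption. It only shows how requests that never finish keep the counter high, so that every new upload is rejected even though nothing is actually transferring.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ErrTooManyRequests mirrors the log message seen in this thread.
var ErrTooManyRequests = errors.New("upload rejected, too many requests")

// Endpoint is a hypothetical stand-in for the piecestore endpoint.
type Endpoint struct {
	maxConcurrent int32 // plays the role of storage2.max-concurrent-requests
	live          int32 // requests currently in flight, of any kind
}

// Upload counts the request and rejects it when the counter exceeds the limit.
func (e *Endpoint) Upload(work func()) error {
	n := atomic.AddInt32(&e.live, 1)
	defer atomic.AddInt32(&e.live, -1)
	if n > e.maxConcurrent {
		return fmt.Errorf("%w {\"live requests\": %d}", ErrTooManyRequests, n)
	}
	work()
	return nil
}

// Download is counted but never rejected (deletes would behave the same way).
func (e *Endpoint) Download(work func()) {
	atomic.AddInt32(&e.live, 1)
	defer atomic.AddInt32(&e.live, -1)
	work()
}

// Live reports the current value of the counter.
func (e *Endpoint) Live() int32 { return atomic.LoadInt32(&e.live) }

func main() {
	e := &Endpoint{maxConcurrent: 1}
	started := make(chan struct{})
	release := make(chan struct{})
	done := make(chan struct{})

	// A download that hangs (for example on slow storage) holds the counter up.
	go func() {
		e.Download(func() { close(started); <-release })
		close(done)
	}()
	<-started

	fmt.Println(e.Upload(func() {})) // rejected while the download is stuck

	close(release) // the stuck request finally completes
	<-done
	fmt.Println(e.Upload(func() {})) // accepted once the counter has dropped
}
```

In this model, raising the limit only postpones the rejections if the in-flight requests never complete, which matches what Storgeez reported after trying 3000.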

littleskunk · July 13, 2019, 2:01pm

Support something that is unsupported? It is called unsupported for a good reason…

That makes no sense. The database locking was implemented from the beginning. If you nearly lost your node you will most likely lose it anyway and it would be time to change your setup. Don’t blame us for breaking unsupported setups. You can blame us for breaking supported setups ofc.

Storgeez · July 13, 2019, 2:06pm

I’m not talking about database locking, but file locking.

But your response is pretty much what I expected.

I nearly lost it because of the update. It worked for 4 months up to this point.

littleskunk · July 13, 2019, 2:17pm

We haven’t added any additional locking!

That is no proof! An empty storage node will work on all supported setups. At some point the storage node has collected too much data and the unsupported setups start failing for different reasons.

If you like we can talk about the problems you get with NFS and about possible solutions. If you prefer to blame the update for it I would suggest we just wait for the final disqualification, fix your setup and retry. Your decision.

ThermicYeti · July 14, 2019, 3:48am

I’m having this same issue after the update to 0.14.13, and I’m not using NFS at all. I really don’t want to get disqualified before I can fix this, as you suggested to Storgeez.

Alexey · July 14, 2019, 2:14pm

You shouldn’t be, as long as you do not lose audits.
If you are referring to the “rejected” messages, then it’s a different story.
You can adjust your settings by adding these lines to the config.yaml (it’s in the data folder of your storagenode; on Windows it is better to use Notepad++, on MacOS TextNote; don’t use Notepad or Notes on those platforms):

# Maximum number of simultaneous transfers
storage2.max-concurrent-requests: 7

Add these lines at the end of the config.yaml and play with the number a bit. You have to stop the node, make the change, then start the node again.

jackserippl · July 19, 2019, 4:28pm

I found a link with a tutorial to fix this:
https://beyond.lol/storj-upload-rejected-too-many-requests-upload-tuning/

Alexey · July 19, 2019, 7:12pm

There is nothing to fix. If your setup can handle more requests, just increase it.

jackserippl · July 19, 2019, 7:31pm

I am sorry, I used the wrong word. It’s just a tutorial on how to increase the value.

Alexey · July 19, 2019, 7:35pm

No problem!
By the way, welcome to the forum, @jackserippl!

jackserippl · July 19, 2019, 7:37pm

Thank you, I will take a look around the forum.