Hi! What does this error mean? Do I have to increase the file limits for the user running the node?
Set fs.file-max = 16777216 (it was 1048576) in /etc/sysctl.conf and applied it with sysctl -p.
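For anyone who wants to replicate it, a minimal sketch of that change (same value as above; run the commands as root or with sudo):

# appended to /etc/sysctl.conf:
fs.file-max = 16777216
# then reload and verify:
sudo sysctl -p
cat /proc/sys/fs/file-max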
Seems fine now. Will monitor whether it comes up again, as I did not increase the per-user limits. I run the nodes as services (one node per user) on Ubuntu and have 7 nodes on the same machine (each on a different /24): 4 nodes on one storage array and 3 nodes on another. Each storage array is 12 HDDs of 4 TB; both storage arrays are RAID 60.
So, that did not work at all. I had to go to /etc/systemd/system.conf and set DefaultLimitNOFILE=16777216,
then reboot, since the line session required pam_limits.so
was not previously in /etc/pam.d/common-session and the change could not apply without it.
Monitoring whether that will solve the problem…
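To summarize, these are the lines described above (a reboot, or at least a re-login, is needed for them to take effect):

# /etc/systemd/system.conf
DefaultLimitNOFILE=16777216

# /etc/pam.d/common-session
session required pam_limits.so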
Not sure how long this has been going on, but it obviously caused issues on some nodes.
I can say for sure that it is not related to v1.66.1.
It seems this solves it. Hope it helps anyone who experiences this particular problem, too. Cheers!
@svet0slav Thanks for posting this issue and the solution! This helped my node that suffered from the same problem.
Hello @svet0slav,
for security and scope-limiting purposes, you should not change the default LimitNOFILE system-wide; instead, apply it to the systemd service that runs your storagenode and suffers from insufficient resources.
You can add this line: gateway-st/gateway.service at 6acd5fd2fdd6ffeaeb145b13f388226dad9af44c · storj/gateway-st · GitHub
and potentially even dial it down (the default is 1k; this setting bumps it up by a factor of 64!).
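The linked line sets LimitNOFILE in the unit's [Service] section. A minimal sketch of what that would look like in a storagenode unit file, assuming the value implied by the 64x remark (1024 × 64 = 65536):

[Service]
# per-service open-file limit, overrides the system default for this unit only
LimitNOFILE=65536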
Makes sense. Thanks!
Stefan’s way is better, please check it. I had indeed forgotten about this method. Since the updater only updates the core files, not the service file itself, and does not change it on update, his method is better here. For other services that do update their service file, e.g. MySQL, it is not recommended, otherwise you would have to update the service file after every update. All of this applies if you run the nodes as services. I’m not sure how this could solve the problem for those who use the container versions with Docker… Can this be applied per container somehow?
For Docker containers you can specify the limit at runtime using the --ulimit flag: --ulimit nofile=262144:262144
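As a sketch, the flag goes on the docker run command line before the image name (the usual node flags and mounts are omitted here; a full command is quoted further down the thread):

# soft:hard open-file limit for the container
docker run -d --ulimit nofile=262144:262144 --name storagenode storjlabs/storagenode:latest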
If you have systemd service files that get overridden, you can at any time create an override folder for them, i.e. something like /etc/systemd/system/storagenode.service.d/override.conf,
and put the changes in there. You can achieve this as well using the following command: systemctl edit storagenode.service
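A sketch of what such a drop-in could contain; systemctl edit opens the file for you and systemd merges it with the packaged unit (the value here is just the one from the earlier example):

# /etc/systemd/system/storagenode.service.d/override.conf
[Service]
LimitNOFILE=65536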
This has to be done for the user running the node.
Not really. With this approach you have to run systemctl daemon-reload
and also restart the service for it to take effect. Your previous sentence describes the process which will make it work without having to do so every time, just the first time, no?
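For completeness, the commands in question (assuming the service name used earlier in the thread):

sudo systemctl daemon-reload
sudo systemctl restart storagenode.service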
Thanks for your help! We appreciate it! Cheers!
@svet0slav I applied @stefanbenten’s suggestion when he posted it, and so far, it’s all good.
That is correct, but it should be sticky and fine.
This should be a permanent change, and it is the same as the manual step outlined before.
Hi All,
Just received these messages:
2024-01-14T08:56:29Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-01-14T08:56:30Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-01-14T08:56:31Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-01-14T08:56:32Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-01-14T08:56:33Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
Tried to set up @stefanbenten’s flag, --ulimit nofile=262144:262144, but it seems it didn’t help. Is this a docker run flag?
Yes, it’s a docker flag, so it should be provided before the image name.
Thx @Alexey. Strange, but it didn’t work for me.
This is the docker run I used, and it seems to be correct:
docker run -d --restart unless-stopped --stop-timeout 300 \
  -p 28967:28967/tcp \
  -p 28967:28967/udp \
  -p 14002:14002 \
  -e WALLET="0xDXXXXXXXXXX" \
  -e EMAIL="XXX@XXX" \
  -e ADDRESS="XXXXXX:28967" \
  -e STORAGE="11TB" \
  --user $(id -u):$(id -g) \
  --mount type=bind,source="/volume2/storj/ID2/storagenode/",destination=/app/identity \
  --mount type=bind,source="/volume2/storj/DATA2",destination=/app/config \
  --ulimit nofile=262144:262144 \
  --name storagenode storjlabs/storagenode:latest
but currently I have to restart the node twice per day due to:
2024-02-05T05:02:40Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:41Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:42Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:43Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:44Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:45Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:46Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:47Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:48Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:49Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
2024-02-05T05:02:50Z INFO http: Accept error: accept tcp [::]:28967: accept4: too many open files; retrying in 1s {"process": "storagenode"}
… etc
I put --ulimit nofile=262144:262144
higher up, right after -d.
Will see if it helps. I also deleted all images and re-pulled a fresh image.
Is there a way to check whether it is applied or not?
Yes
docker exec -it storagenode sh -c 'ulimit -n'
However, if you still see this error, then you need to increase it to more than 1048576
(that is the default value), not less.
Try 2097152
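As a sketch, the check from above plus the suggested higher value (the flag is applied by recreating the container with the new docker run command):

# check the limit the running container actually got
docker exec -it storagenode sh -c 'ulimit -n'
# then recreate the container with the higher limit:
#   --ulimit nofile=2097152:2097152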
Thx @Alexey, good idea, will try.
For the 2097152, @Alexey, I got:
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error setting rlimits for ready process: error setting rlimit type 7: operation not permitted: unknown.
I also tried --ulimit nofile=1597152:1597152
and --ulimit nofile=1297152:1297152,
all with the same result.
Then you need to increase the ulimits for the Docker daemon or system-wide, as @svet0slav did.
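A couple of sketches of how that could be done for the Docker daemon itself; these are not from this thread, so verify them against the Docker and systemd documentation for your system.

Option 1, a drop-in for the Docker daemon's own unit:

  # /etc/systemd/system/docker.service.d/override.conf
  [Service]
  LimitNOFILE=2097152

followed by sudo systemctl daemon-reload and sudo systemctl restart docker.

Option 2, default ulimits for all containers in /etc/docker/daemon.json:

  {
    "default-ulimits": {
      "nofile": { "Name": "nofile", "Soft": 2097152, "Hard": 2097152 }
    }
  }

followed by a restart of the Docker daemon.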