Node goes down/restarts every 10-15 minutes; thread allocation error in logs

You can do that, of course; however, when the container reaches the limit, it will be killed by the Docker daemon. Usually RAM is consumed when the disk cannot keep up, though. My nodes don't consume much even under load, but my setup is simple: 1 disk/1 node.
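
For reference, the cap is applied when the container is started. A minimal sketch, assuming the usual storjlabs/storagenode image and a container named "storagenode" (the 1 GB / 768 MB values and the placeholder for the other parameters are illustrative only):

    # -m is the hard limit, --memory-reservation the soft one;
    # add your usual mounts, ports and -e parameters where indicated.
    docker run -d --name storagenode \
        -m 1g --memory-reservation 768m \
        <usual mounts, ports and -e parameters> \
        storjlabs/storagenode:latest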

Exactly my situation: two out of three are full, and the third is almost full as of now (it wasn't when I posted, but it has 1.8 TB in the trash, so I guess it will have more than 13 GB of free space in the allocation).

Jackpot… USB + 2.5" + SMR on top = fatality. In this case you need more empty nodes to split the load (if that's possible), some caching on top, or a reduced concurrency (which unfortunately will affect customers). Not a wide choice, I agree.

Yes, this is why we have a forum: to get feedback. And I agree that "works fine for me" is not an answer, but it helps to gather the missing information, like what you provided above about using a USB 2.5" SMR disk. The only way to make it worse would be to also format it to BTRFS and have only 1 GB of total RAM, like on a Raspberry Pi 3. My setup is meant to mimic an average one: a Windows node, Docker for Windows nodes, and I also had a Pi 3, but it's currently not booting; the SD card died and I only have remote access to that location.
So we are grateful for your feedback!

We suggest this on the forum, along with possible workarounds. SMR disks can work, because not all SMR disks are made equal, and if you have caching resources they can perhaps work no worse than CMR.
We prefer not to put hardware limitations in our docs; each setup is individual. So we try to help on the forum instead.

The truth is that the Pi just hangs otherwise, requiring a power cycle afterward. So it's better to kill the container and restart it than to switch the power off, especially if it's a remote location.

Oh, I'm sorry if you thought that's what I meant! We are not blaming anyone; we are stating that yes, storagenode can run on many variations of hardware, but if the hardware is not enough to run a node, then unfortunately it perhaps cannot be used.
I know that we have even had nodes running on a router:

So, the lower limits are truly low.

It is better to leave it at the default (0), which means no limit/automatic. If you set this value, it is the exact maximum number of accepted concurrent requests.
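
For illustration, assuming the setting discussed here is the storage2.max-concurrent-requests option in the node's config.yaml (the config path and the value 30 are placeholders), a minimal sketch:

    # Inspect the current value (config path is a placeholder):
    grep max-concurrent-requests /mnt/storj/storagenode/config.yaml
    # storage2.max-concurrent-requests: 0    <- default: no limit / automatic
    # To cap it, set e.g.:
    # storage2.max-concurrent-requests: 30
    # then restart the node so it picks up the change:
    docker restart storagenode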

Yes, but it's not advertised to the satellites, so your node will still be suggested to customers because it's not full… and the customer will get a "node is overloaded" response with a cancellation. If the resulting number of working nodes is less than the required minimum, the upload (or download) will fail.

I have to agree. However, at the moment we were unable to foresee all possible combinations. And you already found a workaround by limiting the number of concurrent requests. Well, that should be accepted as an answer for the stress test too.

This stat doesn't exist, and it's hard to track and confirm. Performance depends on many factors, including but not limited to: the disk type, how the disk is connected, the filesystem, how much RAM, which OS, how the node is running (as a VM on a heterogeneous platform, like a Windows VM on a Linux hypervisor, or in a container, or on bare metal), how many nodes or other processes compete for the same resources, the network connection to your router (WiFi, wire, …), your internet connection, upstream and downstream bandwidths, the proximity of the customer to your location, etc. This information is simply not collectable in a provable way. And we need something that is both measurable and resistant to cheating.

On exhausting its resources, the node can react in only one way: crash and be restarted. Not a good solution, I agree, but I do not see anything better so far.
I'd like to hear more ideas.

the true answer is

and we are trying to adapt the software to help even setups with slow disks. Just not as fast as we would all like.

It's in the docs, but only for the RPi, because otherwise this small device will simply hang until a power cycle; it's a tradeoff between hanging and restarting.
This would be bad advice for any other setup, because this flag is a hard limit: as soon as the container reaches it, it is immediately killed. In the case of storagenode, that means losing the progress of the filewalkers and trash cleaning (partially fixed, but WIP) and leaving temp files uncleaned. This flag is a much more unclean termination than the software crashing and restarting.
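
If you do use the flag, you can at least tell a hard memory kill apart from an ordinary crash after the fact (container name "storagenode" assumed):

    # OOMKilled=true means Docker/the kernel killed the container for
    # exceeding -m; false means the process exited or crashed on its own.
    docker inspect -f 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' storagenode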

Hm, that's an idea, thanks!

Please check:


I take back everything I said, I’ll ban myself for 2 months, thanks for listening!

The ideal way would be for the node to have some way to kick back and say “I’m busy, you’d probably have better luck with another node”. Having that request go to an overloaded node would just result in the upload/download hanging for way longer than a user would probably consider to be acceptable.

Short of that, one thing that could help in the short term is that an overloaded node could report that it is full to some of the satellites. This would theoretically kill any ingress traffic to that node from that satellite.


Sorry, but my experience over several days doesn't match. I experimented with docker run -m "1024m" --memory-reservation="768m". What I saw is that the node will approach the 1 GB limit, then stay at 99-99.9% of the assigned RAM for a period, but without any indication of a problem in the logs, and no crashes. I even tried reducing it further to 768 MB. A few hours later, the node is still running, sitting at 99% RAM use. So it looks like Docker enforces the RAM constraint by denying the node software's requests for more RAM, not by simply killing the container. This is just my own observation; I'm no Docker expert.
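
For anyone who wants to reproduce the observation, this is roughly how it can be watched (container name "storagenode" assumed):

    # snapshot of current usage vs. the configured limit
    docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}' storagenode
    # confirm the limits actually applied to the container (values in bytes)
    docker inspect -f 'limit={{.HostConfig.Memory}} reservation={{.HostConfig.MemoryReservation}}' storagenode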

On this 4 TB node (the one with the 2.5" USB SMR HDD), RAM only fills up when the node reports free space available. If I set -e STORAGE="x.xTB" with x slightly smaller than the space already used, RAM consumption is much, much lower. So it looks like this puny 2.5" SMR drive just can't handle serving download and upload requests at the same time, even with 2 other nodes online reporting free space: RAM fills up, and there are many (mostly) lost races in the logs. But with the Docker memory constraint it's at least stable (and doesn't cause my server to use swap). This is the reason I'm so adamant that it belongs in the main documentation, too.

I see these options for myself:

  • keep running like this, losing most races
  • reduce storage parameter so no free space reported
  • take down node
  • migrate to faster disk
  • bring more nodes online to share the load

The last two options would be better for Storj, but I'm not inclined to commit further resources as long as the node software remains "ungraceful" when it comes to handling overload. The juice was barely worth the squeeze even before stress testing started triggering server problems.

This looks like a great idea! It certainly did the trick for me (see above). No protocol change needed; it would just need a way to identify the overloaded state (too high a percentage of recent races lost?).

I think this would be counterproductive for the network at large. Instead of fixing performance issues, a node operator could sweep them under the rug indefinitely. Why fix anything if things work "just fine", aka "don't crash"?

I prefer the node crashing and dying in a spectacular explosion instead; this will catch the attention of the node operator, and they will fix the performance bottleneck.

Any added complexity and round trips degrade quality. And for what reason? To keep potato nodes online? What's the point of nodes on life support that can only answer 10% of requests? The accounting overhead of maintaining the node is still the same.

No, let weak nodes die. Some people will be upset; let them be upset. The goal is not to please potato farmers, but to build a resilient, reliable network. There is no need for compromises.

BTW this also applies to strong nodes behind potato routers. Bad network equipment – you are out.

I have to disagree completely with you there.

What mattventura suggested makes perfect sense. Imagine a scenario where the network is under attack. Instead of triggering repair because the number of nodes fell below the repair threshold after they blew up in a fiery mushroom cloud, we could save the resources used for that repair and handle actual client requests.

It's not an "I'm overloaded, thou shalt not have my segments!" It's more of an "I'm busy right now; you should try offering your segment to another node and check back with me in 15 minutes." If the client manages to hit the 80 required segments without this problematic node, all is well. If not, instead of returning the "but 80 required" or whatever error, it can try uploading a segment once more. What's the worst that can happen: the client waits 1 second longer for their upload to complete?

That's the satellite's job to mitigate.

A lot of clients will be waiting a lot longer on a daily basis, because a lot of potato nodes will always be overloaded, with no plans to fix them.

I understand. I'm not saying this won't work. I'm saying it will solve a short-term, nonexistent problem at the expense of long-term network performance and quality.

Why are we trying to save potato nodes, exactly? There is already an oversupply of nodes. Why not use this opportunity to purge the useless weak ones? You know, go full Darwin on them.

This will incentivize operators to fix performance bottlenecks. Concocting engineering solutions to pay them money in spite of shit performance won't.

I’m not advocating for saving potato nodes. On the contrary, boot every single one of them off the network tomorrow.

I'm advocating for your node to stay online when it gets overloaded (i.e., when something abnormal happens).

This is exactly what it says if you use the concurrency limit; unfortunately, it says it to the customer. Perhaps the node should report this to the satellite instead, but it shouldn't flap like "I'm overloaded! Oops, no, I'm OK. Oops, I'm overloaded again!", otherwise the satellite will rate-limit it.

Right now something similar already happens, though regarding availability rather than storage: offline nodes are removed from the hot cache. And your node will likely be treated as offline if it has crashed. However, the online checks happen less frequently than the node could crash, so the satellite may not notice it (except, perhaps, as a falling online score?).

Then that's nice; perhaps it's something new in the latest versions of Go and/or Docker, because I didn't see such behavior earlier, the container usually got abruptly killed.

This basically means the node is full, so of course the RAM usage will be lower: your node will be reported as full to the satellites and the ingress will stop. No ingress, no RAM used for buffers.

I would wait until it is confirmed by others. However, I will suggest that others who have this problem try it out too, thanks!

Such an attacker would need an account, and the number of RPS is limited. It's difficult to implement such an attack.

Yes, but it will be useless if it cannot accept or serve a piece, complaining "I'm overloaded" to the customer…

:cry: Why.

You have inspired people to improve the status quo. You are a leader. You should do more of that, not less.

I sort of have that. I've set up my system so that it allocates disk space on the drive with some phony files whenever it notices too many "upload rejected" entries in the node logs. Then the node, in a natural way, signals to the satellite that it is "full" and not accepting more uploads.
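
Roughly, a simplified sketch of the idea (the log phrase is taken from the rejection entries mentioned above; the path, ballast size, threshold, and container name are placeholders, not my exact setup):

    #!/bin/sh
    # Count recent "upload rejected" entries; if there are too many, allocate a
    # ballast file on the same filesystem as the node's storage so the node sees
    # less free disk space and reports itself as full; remove it when load drops.
    BALLAST=/mnt/storj/ballast.bin
    REJECTED=$(docker logs --tail 2000 storagenode 2>&1 | grep -c "upload rejected")

    if [ "$REJECTED" -ge 50 ]; then
        fallocate -l 50G "$BALLAST"
    else
        rm -f "$BALLAST"
    fi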

I think I should patent this method, it’s so Rube Goldbergy in nature.

It is not actually a defense for potato hardware (so far this hardware, when left alone, has been capable of handling almost all the traffic peaks I've seen); it usually triggers when I'm using the same hardware for my personal purposes, using up I/O that would otherwise go to the storage node.


Two things could be done; unfortunately, both require a restart of the node:
Concurrency setting for uploads → this goes to the customer
Report as full → this goes to the satellite

If both could be changed dynamically at runtime, the values could change either based on load (storagenode: add load monitoring · storj/storj@beddca4 · GitHub), on the success rate from @BrightSilence's script, or, if you pull data from the debug port, on anything else you like, such as the number of uploads, the number of downloads, or other running processes; a sketch of such an input signal is below.
That would make the node a bit smarter.
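
For example, the input signal could be as simple as a recent success ratio pulled from the node log (a rough sketch; the log phrases and numbers follow the usual success-rate scripts and may differ from your log format, and the container name is assumed):

    # Crude recent upload success ratio from the last 5000 log lines.
    TAIL=5000
    OK=$(docker logs --tail "$TAIL" storagenode 2>&1 | grep -c "uploaded")
    BAD=$(docker logs --tail "$TAIL" storagenode 2>&1 | grep -cE "upload (canceled|failed|rejected)")
    TOTAL=$((OK + BAD))
    [ "$TOTAL" -gt 0 ] && echo "recent upload success: $((100 * OK / TOTAL))%"
    # A controller could lower the concurrency limit (or grow a ballast file)
    # and restart the node whenever this ratio drops below a chosen threshold.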

Another idea: if the customer's client has to request new nodes from the satellite because too many uploads broke, maybe it could transmit some ID from the node to the satellite, so the satellite knows which nodes have failed.