All of a sudden: Out of memory

All of a sudden the node was no longer reachable.
The log showed:

dockerd[416]: runtime: out of memory: cannot allocate 1395810304-byte block (1248329728 in use)
dockerd[416]: fatal error: out of memory
dockerd[416]: runtime stack:
dockerd[416]: runtime.throw(0x22e619c, 0xd)
etc.

It seems this triggered systemd to restart Docker:

systemd[1]: docker.service: Failed with result 'exit-code'.
systemd[1]: docker.service: Service RestartSec=2s expired, scheduling restart.
systemd[1]: docker.service: Scheduled restart job, restart counter is at 1.
systemd[1]: Stopped Docker Application Container Engine.
systemd[1]: Starting Docker Application Container Engine...

So the system almost recovered on its own, but it did not work: the node was still not reachable. I manually stopped and restarted the container, to no avail. In the end I had to remove the container and recreate it.

Any ideas how to catch and resolve such a failure and restart the container automatically?
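
For reference, what I mean by restarting automatically: a restart policy in the run command, roughly like this sketch (not my exact command), handles a container that exits, but here dockerd itself crashed:

docker run -d --restart unless-stopped --name storagenode storjlabs/storagenode:latest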

Generally, the main times the storagenode will use larger amounts of memory are when it is doing the filewalker or when the HDD cannot keep up with the IO demand.

2 Likes

I just love those silent errors and how it appears that the node is running correctly while in reality it is down.

In my case it seemed that the Docker has crashed so I wonder if this might help: Keep containers alive during daemon downtime | Docker Documentation
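
If I read that page correctly, live-restore is a daemon-level setting; a minimal sketch of enabling it (assuming the default config file location):

# /etc/docker/daemon.json -- create it if it does not exist yet
{
  "live-restore": true
}

# then reload the daemon configuration
sudo systemctl reload docker   # or send SIGHUP to dockerd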

1 Like

Run ZFS and ECC RAM, and don't forget to turn on RAM scrubbing in the BIOS.
And you of course need plenty of RAM, and an SSD for SLOG.

But do that and your system will be stable as a rock… as long as the software doesn't contain errors, of course…
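
(Adding an SSD as SLOG is a one-liner in ZFS; pool and device names here are just examples:

zpool add tank log /dev/sdb1
)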

So, you’re suggesting rebuilding the entire system? That’s not very helpful. ECC memory will do absolutely nothing for out-of-memory errors, and ZFS is pretty much the biggest memory hog among file systems, so that can only make it worse. And SSDs really should not be necessary at all.

@jammerdan usually high memory usage is related to IO bottlenecks. What kind of HDD are you using (avoid SMR, as you probably know already)? Is there other stuff running that might impact IO performance? Also make sure nothing else is gobbling up memory. Well-performing nodes shouldn’t use more than double-digit to low triple-digit megabytes of memory.

Have a check on IO wait for your CPU. That’s usually a pretty good indicator that there is an IO bottleneck. Also check that it is actually the node that uses a lot of RAM.
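
For example (a rough sketch; iostat comes with the sysstat package):

iostat -x 5                # avg-cpu %iowait plus per-disk utilisation, refreshed every 5 seconds
docker stats storagenode   # memory and CPU of the container, to confirm it is really the node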

No, it is not an SMR disk. Now that I think of it, all the logs had vanished afterwards. Maybe log rotation together with high disk usage from the storagenode could be a problem.
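
If it is the container's own log that keeps growing, capping it in the run command is one option (a sketch, sizes are just examples):

docker run -d ... --log-opt max-size=10m --log-opt max-file=5 storjlabs/storagenode:latest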

The free -m output for this node looks like this right now:

              total        used        free      shared  buff/cache   available
Mem:           1990         785         120          17        1083        1563
Swap:           995          23         971

Certainly not a great amount of RAM, but it’s an HC2, so there is no way to change that. I wonder where the swap space is located; maybe I can either move or increase it.
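
To see where the swap actually lives, something like this should do:

swapon --show       # lists active swap devices/files and their sizes
cat /proc/swaps     # same information, straight from the kernel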
For now I have added the live-restore option for Docker so hopefully the containers remain active even if Docker dies.

You may want to follow some of the steps outlined for rpi here.

Specifically, adding the memory parameter to your run command (you can double it to --memory 1600m) and adding cgroup memory support.
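
Roughly along these lines (a sketch; on a Raspberry Pi the cgroup flags go into /boot/cmdline.txt, the HC2’s boot configuration may differ):

docker run -d --memory 1600m ... storjlabs/storagenode:latest

# kernel command line additions to enable the memory cgroup, then reboot:
cgroup_enable=memory cgroup_memory=1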

Thanks, yes, I already have the --memory option in my run command.

I just learned that there is zRAM configured. I am not sure whether that is really helpful or whether its size should be that large. And it seems that there is no additional swap space.
So I might try to reduce the zRAM and add additional swap space on the SD card.
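
Creating a swapfile is straightforward; a minimal sketch (size and path are just examples, and heavy swapping will wear the SD card):

sudo fallocate -l 1G /swapfile    # or: dd if=/dev/zero of=/swapfile bs=1M count=1024
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab   # make it survive reboots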

Now, after reducing the zRAM and creating a swapfile, it looks like this after a reboot with the storagenode already running:

              total        used        free      shared  buff/cache   available
Mem:           1990         253        1024           6         712        1696
Swap:          1422           0        1422

That is not too bad and should be totally sufficient normally. Nothing else is running on that little node.

That should be OK, I think. For monitoring it might be useful to keep an eye on iotop. If you don’t have it on your system, you can run it through Docker too.

docker run --rm -ti --privileged --net=host --pid=host esnunes/iotop:latest

That’s what I use from time to time on Synology as it doesn’t really allow me to install it.

1 Like

That’s interesting. I will check it out. Thanks.

Netdata + Prometheus scraping from a different PC. Then you can see what happens directly before a freeze.
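
A minimal scrape job on that other PC could look roughly like this (a sketch, assuming Netdata’s default port 19999; replace <node-ip> with the node’s address):

# prometheus.yml on the monitoring machine
scrape_configs:
  - job_name: 'netdata'
    metrics_path: '/api/v1/allmetrics'
    params:
      format: [prometheus]
    static_configs:
      - targets: ['<node-ip>:19999']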

2 Likes