Out of Memory issue (memory leak?)

dariop84 · April 12, 2020, 5:54pm

Hi All,

I have a brand new node (4 cores / 16 GB RAM) where the storagenode container keeps consuming memory until it reaches the limit and gets killed by the OOM killer…

I’m using the latest version (v1.1.1) of storjlabs/storagenode:beta (is it right to use the beta?)

The machine is freshly installed and super clean, just CentOS 7.7.1908 (Core) + ntpd + docker, nothing else.

I saw that several people had the same issue in June 2019, but it was fixed back then, so any idea why it is still happening?

deathlessdd · April 12, 2020, 6:45pm

Hi and welcome.
Without some logs and more detail about how your node is set up, there’s no telling what the issue is. Is your system caching all data in RAM before writing it to the hard drive because the drive can’t keep up? Run

docker stats

to get an idea of what is going on with your node.
The issues in June were mostly related to Raspberry Pis with low RAM. A system with 16 GB of RAM should never come anywhere close to running out of memory.
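For anyone following along, a one-shot snapshot is often easier to paste into a thread than the live view. The sketch below is only an illustration: `flag_high_mem` is a hypothetical helper name, and the 40 percent threshold in the usage comment is an arbitrary value, not a Storj recommendation. It assumes input lines of the form `<name> <percent>%`, which `docker stats --no-stream --format` can produce.

```shell
# Print the names of containers whose memory share exceeds a threshold.
# Reads lines of the form "<name> <percent>%" on stdin; $1 is the threshold in percent.
flag_high_mem() {
  awk -v limit="$1" '{
    pct = $2
    sub(/%/, "", pct)            # drop the trailing percent sign
    if (pct + 0 > limit + 0) print $1
  }'
}

# Usage on a host with a running Docker daemon (threshold is an arbitrary example):
#   docker stats --no-stream --format '{{.Name}} {{.MemPerc}}' | flag_high_mem 40
```

Running the snapshot a few times over several hours shows whether usage keeps climbing or levels off.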

dariop84 · April 12, 2020, 7:07pm

Hi, thanks!

Well, there is not much more to say about the setup: it’s a VM in a vSphere 6.7 cluster, and the only “special” thing is that I’m using an NFS mount to store the data. The performance of the NFS share is much higher than what my node’s workload needs, so it should not be an issue at all.

This is the docker stats output:

CONTAINER ID   NAME          CPU %   MEM USAGE / LIMIT     MEM %    NET I/O          BLOCK I/O   PIDS
b89860a76b94   watchtower    0.00%   6.504MiB / 15.51GiB   0.04%    1.35kB / 0B      0B / 0B     9
a9ef8a5b9877   storagenode   1.40%   7.394GiB / 15.51GiB   47.67%   11.1GB / 320MB   0B / 0B     67

This is the RAM usage:

[root@storj01 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:          15885       12509        2781           8         593        3073
Swap:          6555           0        6555

I don’t see any error in the storagenode container logs, just a ton of:

INFO piecestore upload canceled… “Action”: “PUT”, “error”: “context canceled”, “errorVerbose”:
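As a rough aid for reading the `free -m` figures above, the sketch below turns the `Mem:` row into a one-line summary. `mem_summary` is a hypothetical helper, and it assumes the six-column layout shown in this post (total, used, free, shared, buff/cache, available); other versions of `free` may order the columns differently.

```shell
# Summarise the "Mem:" row of `free -m` output read from stdin.
# Assumed columns: total used free shared buff/cache available.
mem_summary() {
  awk '/^Mem:/ {
    printf "used=%dMiB (%.1f%%) cache=%dMiB available=%dMiB\n", $3, 100 * $3 / $2, $6, $7
  }'
}

# Usage on a live host:
#   free -m | mem_summary
```

With the numbers posted here, buff/cache is small compared with "used", so most of the memory is held by processes rather than by the kernel page cache.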

deathlessdd · April 12, 2020, 7:14pm

It looks more like an issue with running on vSphere. I don’t have much experience with vSphere, but I’m willing to bet it’s using RAM as a cache. NFS also comes to mind: it has never been recommended for use with a node.
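To confirm what kind of filesystem actually backs the data directory, something like the following can help. It is a minimal sketch that assumes GNU coreutils `stat`; `fs_type` is a hypothetical helper name and `/mnt/storagenode` in the usage comment is a placeholder path, not a path from this thread.

```shell
# Print the filesystem type that backs a given path (GNU coreutils stat).
# An NFS mount is reported as "nfs".
fs_type() {
  stat -f -c %T "$1"
}

# Usage (replace the placeholder with your own data directory):
#   fs_type /mnt/storagenode
```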

nerdatwork · April 12, 2020, 7:29pm

iops · April 12, 2020, 7:55pm

My issues went away when I started storing the data on an HDD. This is a SQLite issue rather than a Storj one. My performance is far better, and therefore payments are better, after ditching NFS shares.

dariop84 · April 12, 2020, 9:00pm

Hi All,

OK, I have moved the data to an iSCSI (over 10G) datastore; let’s see if things are better!

Odmin · April 12, 2020, 9:22pm

Please move to iSCSI ASAP; NFS has big issues with SQLite and consumes far more resources than iSCSI. Also, please reserve all RAM on your VM.

dariop84 · April 13, 2020, 8:29am

Looks like iSCSI solved the issue.
Thanks!