I have an Ubuntu 18.04 VM with 4 vCPUs (Core i7), 8GB of dedicated RAM, and a dedicated 10TB HDD, where I’m running the Docker storagenode. I would like to understand how memory is managed, because the storagenode crashes 3-4 times per day with this error:
In Linux, the OOM killer doesn’t necessarily kill the process that is at fault, as that is difficult to establish. It kills whichever process happens to request more memory at the instant none is left, even if that process is not memory-intensive in general.
I have never seen storagenode take more than a few hundred megabytes of memory, so I’d risk a guess that the storage node is just a victim of some other process. Try figuring out which specific process eats memory on your machine; top/htop might be helpful.
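The kernel log records which process the OOM killer picked and why, so that’s the quickest way to confirm whether storagenode was the culprit or just the victim. A minimal sketch (the `storagenode` process name is taken from this thread; everything else is standard tooling):

```shell
# show recent OOM-killer activity and which process was sacrificed
dmesg -T | grep -i -E 'out of memory|killed process'

# snapshot of the current top memory consumers, by resident set size
ps -eo pid,rss,comm --sort=-rss | head -n 10

# overall free/used memory, including swap
free -h
```

If `dmesg` names a different process than storagenode in the "Killed process" lines, that points at the real memory hog.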
Looking at my container log, I found these warning messages:
level=warning msg="Your kernel does not support swap memory limit"
level=warning msg="Your kernel does not support cgroup rt period"
level=warning msg="Your kernel does not support cgroup rt runtime"
Linux storagenode 5.3.0-53-generic #47~18.04.1-Ubuntu SMP Thu May 7 13:10:50 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
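The commonly suggested fix for the "swap memory limit" warning on Ubuntu is to enable swap accounting on the kernel command line. A sketch of what that looks like (your existing `GRUB_CMDLINE_LINUX` options may differ and should be preserved):

```shell
# /etc/default/grub — add swap accounting to the kernel command line
GRUB_CMDLINE_LINUX="cgroup_enable=memory swapaccount=1"

# then regenerate the GRUB config and reboot for it to take effect
sudo update-grub
sudo reboot
```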
but instead of solving the problem, two more warnings appeared:
level=warning msg="Your kernel does not support swap memory limit"
level=warning msg="Your kernel does not support cgroup rt period"
level=warning msg="Your kernel does not support cgroup rt runtime"
level=warning msg="Your kernel does not support cgroup blkio weight"
level=warning msg="Your kernel does not support cgroup blkio weight_device"
I have moved my node to the Windows CLI version with no Docker, and the problem is still the same. But on Docker the node restarts automatically, while on Windows the service just stops and then needs a manual restart!
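As a workaround for the manual-restart annoyance, a Windows service can be told to restart itself on failure. A sketch using `sc.exe` from an elevated prompt (the service name `storagenode` is assumed; comments use `#` for readability, as in PowerShell):

```shell
# retry up to three times, 60 s after each failure; reset the failure
# counter after a day (86400 s)
sc.exe failure storagenode reset= 86400 actions= restart/60000/restart/60000/restart/60000

# also trigger the recovery actions on non-crash stops with errors
sc.exe failureflag storagenode 1
```

This doesn’t fix the underlying OOM problem, but it gives you the same automatic-restart behavior Docker provides.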
I have tried many setups, from RAID0 on a QNAP connected via iSCSI, to a RAID0 array connected directly to the VM; then I moved to a single HDD, and now I’m trying the Windows storagenode CLI connected directly to an HDD.
I did all these tests using rsync to move my data; I have about 7TB used.
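For reference, the usual rsync migration pattern for a node this size is to copy while the node is still running, repeat until the delta is small, and only stop the node for a short final pass. A sketch (the paths are examples, not from this thread):

```shell
SRC=/mnt/old/storagenode/   # example source path; trailing slash matters
DST=/mnt/new/storagenode/   # example destination path

# first pass: copy while the node is still running (can take days for 7TB)
rsync -aP "$SRC" "$DST"

# repeat the pass above until it finishes quickly, then stop the node
# and run one final pass that also deletes files removed in the meantime
docker stop -t 300 storagenode
rsync -aP --delete "$SRC" "$DST"
```

This keeps the downtime to the length of the last pass rather than the whole copy.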
Yes, but all with the same HDD, right? I’m wondering if it’s an SMR model, which could contribute to write delays, which could in turn lead to high memory use. Although it doesn’t exactly explain the numbers you are seeing, it would be nice to exclude it as a possibility if it doesn’t apply.
No, I have also tried different HDDs. I checked the models: they are all WD Red and WD Red Pro, with no SMR.
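For anyone wanting to double-check, the exact model string can be read with smartctl and compared against WD’s published SMR/CMR lists (the device name `/dev/sdb` is an example; smartmontools must be installed):

```shell
# print the drive's exact model string; look the model number up on the
# manufacturer's SMR/CMR list to confirm the recording technology
sudo smartctl -i /dev/sdb | grep -i 'model'
```

The distinction matters because some WD Red models of the same capacity exist in both SMR and CMR variants, so the marketing name alone isn’t enough.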
Now my disk is close to full, so I will start a new node and see if things change.