Node randomly restarting. Possibly high memory usage

I received several uptime robot emails today informing me that my node had gone down and then back up. The node logs don’t say anything about it shutting down, but it does show when it starts back up.
I checked memory usage, and at one point my Raspberry Pi (2 GB version, running storage node 1.9.5) indicated that the storage node was using 980MB of memory, which is higher than normal if I'm not mistaken.

Anybody else experiencing anything similar today?
I’m wondering if certain customer usage patterns can cause a node to do this.
I used the following command to check memory usage:

ps axo rss,comm,pid \
| awk '{ proc_list[$2] += $1; } END \
{ for (proc in proc_list) { printf("%d\t%s\n", proc_list[proc],proc); }}' \
| sort -n | tail -n 10 | sort -rn \
| awk '{$1/=1024;printf "%.0fMB\t",$1}{print $2}'

The memory usage seems to be slowly creeping up over time. As I finish writing this post it's now at 612MB and climbing.
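
A simple way to keep an eye on the trend (assuming the container is named storagenode, as in a default setup) would be something like:

# print one docker stats snapshot for the container every 60 seconds;
# the container name is an assumption -- adjust if yours differs
watch -n 60 docker stats --no-stream storagenode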

Check your dmesg output for bad sectors.
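
Something along these lines should surface them (the search terms are only examples, not an exhaustive list):

# scan the kernel log for common I/O and sector error messages
dmesg | grep -iE 'i/o error|bad sector|blk_update_request|ata.*error'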

I would also recommend checking your databases.
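
For a quick check, each database file can be tested with sqlite3 while the node is stopped; a minimal sketch, with an example path you would adjust to your own storage location:

# integrity check of a single database file (path is an example only)
sqlite3 /path/to/storage/bandwidth.db "PRAGMA integrity_check;"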

The database integrity check was “ok” for all the db files.
The storagenode log had several hundred lines that said either

ordersdb error: database is locked
or
bandwidthdb error: database is locked

In the dmesg I didn’t see anything about bad sectors but I’ll do a more complete search later. I did however see some messages that said:

[2196758.706831] Out of memory: Kill process 11607 (storagenode) score 499 or sacrifice child
[2196758.710086] Killed process 11607 (storagenode) total-vm:1806856kB, anon-rss:973880kB, file-rss:14888kB, shmem-rss:0kB

Anyways, last night I stopped the storagenode, removed the container and restarted it with a lower storage limit, so the node is full right now and seems to be working ok. It’s only using 35MB of memory right now. I’ll open it back up to ingress when I have more time to watch it.

It kinda sounds like your disk is too slow.

You could try setting concurrent requests very low, e.g. to 1. Keep in mind this will also greatly affect the ingress speed.

When I had the OOM killer on the Pi it wasn't accompanied by locked db errors, so it's probably not a failing disk like I had.

Sorry, I am not being much help.

I agree, this seems to be an issue with I/O wait. Do you know if your disk drive is an SMR drive? The pi 2 only has USB 2.0, which could contribute to the problem. Also, the pi 2 (and pi 3) share a single USB bus for the USB ports and ethernet port, which puts the node at further disadvantage.
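
If smartmontools is installed, you can read the exact drive model and then look up whether it is SMR; a sketch, where /dev/sda is an example device and some USB enclosures need the extra -d sat option to pass SMART commands through the bridge:

# show drive identity (model, serial, firmware); device path is an example
sudo smartctl -i /dev/sda
# some USB-to-SATA bridges require this variant instead
sudo smartctl -d sat -i /dev/sda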

I would advise against a concurrent requests setting as low as 1. This option should really only be used as a last resort, as it is not good for the network or your node's profitability. If you do try the max concurrent requests setting, I would suggest starting in the 50-70 range and lowering it if needed. This could work as a stop-gap measure, allowing your node to finish uploads/downloads, while only limiting new requests when I/O gets high enough to cause a problem.
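
If you do experiment with it, the option lives in the node's config.yaml (it should already be there, commented out, with 0 meaning unlimited); a sketch of the change, assuming the container is named storagenode:

# in config.yaml, uncomment/set the limit, e.g.:
#   storage2.max-concurrent-requests: 60
# then restart the container so it takes effect
docker restart storagenode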

I haven’t heard of anyone using a pi2 for a storagenode, only pi3 or better. It is possible that it is just not powerful enough for this task.

–edit–
I misread the OP as using a pi2.


The drive is an external Western Digital “My Book” 2TB drive. Linux sees it as model WD20EZRZ, which I believe is CMR, not SMR.
The Pi is a model 4 (2 GB memory version), but I think the I/O theory is still a possibility and I will try to investigate it. Also, I agree that a request limit of 1 is way too low. I will do more testing when the weekend arrives. I appreciate the theories and suggestions.

Re-read your post and now I see that you are using a pi4, not pi2. I run my node on a rock64, which is similar. I had some issues with I/O wait under high network load, leading to OOM events. I eventually determined that the external drive was underpowered (bus powered portable USB3 drive) and suffering from internal resets. So it wouldn’t drop off the system, but definitely caused some sort of I/O interruptions. I solved this by adding a powered USB hub. Is the My Book you are using externally powered?
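
If you want to check for that same kind of reset on your end, dmesg is a reasonable place to look; the patterns below are just examples, since the exact message text varies by kernel:

# list USB reset/disconnect events with human-readable timestamps
dmesg -T | grep -iE 'usb.*(reset|disconnect)'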

It has its own power adapter that plugs into the wall.


Hi,

I currently have the same problem as described at the beginning of this thread.

I have a storage node running for 8 months and have not had this problem so far.

My node is running in Docker on a Raspberry Pi 3 (1 GB model) with an external hard drive connected by USB.

root@host0:/home/pi # uname -a
Linux host0 5.4.51-v7+ #1333 SMP Mon Aug 10 16:45:19 BST 2020 armv7l GNU/Linux

root@host0:/home/pi # lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description: Raspbian GNU/Linux 10 (buster)
Release: 10
Codename: buster

I am getting multiple restarts a day, and this may affect my node. Could it be a problem with the latest version (v1.9.5)?

I have checked the database files for errors with sqlite and everything is correct. The hard drive appears to be in good health.
Any help is welcome.

Thanks in advance!

Are you using BTRFS? I am, and I read that it's not recommended for a storage node, so I might try switching my drive to EXT4. It seems odd that this started near the time of v1.9.5, but my drive is also getting fuller, so maybe I actually do have an SMR drive and I'm only now seeing the effects, since the drive is probably having to reuse previously written disk space.

Hello @angelsantillana94,
Welcome to the forum!

I have a raspi3 on the 1.9.5
The usage so far:

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT    MEM %               NET I/O             BLOCK I/O           PIDS
7cb299f04dc8        storagenode         26.02%              30.12MiB / 800MiB    3.77%               0B / 0B  

What is the FS type on your drive?
I use ext4 for it

uname -a
Linux raspberrypi 4.19.97-v7+ #1294 SMP Thu Jan 30 13:15:58 GMT 2020 armv7l GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Raspbian
Description:    Raspbian GNU/Linux 10 (buster)
Release:        10
Codename:       buster

Hi,

Thank you for your answers.

I am using XFS filesystem.

root@host0:/home/pi # lsblk -fm | grep -i hd
└─sda2      xfs           6bb8b9ca-4818-4495-b4df-467e09783b03  486.9G    90% /mnt/hd     4.6T             brw-rw----

The model of my hard drive is a Seagate STGX5000400 5 TB. My node has been at 95% capacity for a few months (with 10% extra for overhead), but so far I have had no problems…

Thanks!

I forgot to mention that, to keep my Raspberry Pi from dying, I have limited the memory of the Docker container. I don't think this is a good solution, though, and it may affect node performance.

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
615ecb0068e8        storagenode         13.71%              396.4MiB / 400MiB     99.09%              14.3GB / 13.4GB     0B / 0B             208
46295dcca2f8        watchtower          0.00%               3.363MiB / 925.9MiB   0.36%               204kB / 1.17kB      0B / 0B             10
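
For reference, the cap is applied with Docker's --memory flag when the container is created; a rough sketch only, with an example value and the usual storagenode run options (ports, mounts, environment, image) left as a placeholder:

# recreate the container with a hard memory limit (800m is just an example)
docker run -d --name storagenode --memory=800m \
    <your usual ports, mounts, environment and image arguments>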

High memory usage is usually related to corrupted databases or a slow disk subsystem.
There is nothing else that would increase it.

You can search for OOMKill in the system logs

journalctl | grep -i oom

The 400MiB limit is honestly too low for the storagenode; I have an 800MiB limit. The raspi does not have enough resources to run much else normally, so I use it only for the storagenode, netdata and watchtower; there is no GUI.
It also serves as an SSH gateway to my network (with fail2ban), but that is rarely used.

I attach the logs:

Aug 13 04:51:20 host0 kernel: [146001.860877] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=e1594331988b64162234174a4634334d04b468cfeb726e7a26616b4341938eda,mems_allowed=0,task=storagenode,pid=20095,uid=0
Aug 13 04:51:20 host0 kernel: [146002.056357] oom_reaper: reaped process 20095 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 13 19:36:38 host0 kernel: [199119.565380] storagenode invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Aug 13 19:36:38 host0 kernel: [199119.565489] [<8027a454>] (dump_header) from [<8027a7cc>] (oom_kill_process+0xe8/0x390)
Aug 13 19:36:38 host0 kernel: [199119.565516] [<8027a6e4>] (oom_kill_process) from [<8027b42c>] (out_of_memory+0xfc/0x400)
Aug 13 19:36:38 host0 kernel: [199119.566050] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 13 19:36:38 host0 kernel: [199119.566843] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=e1594331988b64162234174a4634334d04b468cfeb726e7a26616b4341938eda,mems_allowed=0,task=storagenode,pid=24363,uid=0
Aug 13 19:36:38 host0 kernel: [199119.762305] oom_reaper: reaped process 24363 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 13 20:49:32 host0 kernel: [203495.068042] python invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Aug 13 20:49:32 host0 kernel: [203495.068148] [<8027a454>] (dump_header) from [<8027a7cc>] (oom_kill_process+0xe8/0x390)
Aug 13 20:49:32 host0 kernel: [203495.068173] [<8027a6e4>] (oom_kill_process) from [<8027b42c>] (out_of_memory+0xfc/0x400)
Aug 13 20:49:32 host0 kernel: [203495.068665] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 13 20:49:32 host0 kernel: [203495.069296] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=531550dba5ee1da3e7bb383abd84ca303a2b95ad054e609bbd824125ecc7efd5,mems_allowed=0,task=storagenode,pid=8462,uid=0
Aug 13 20:49:32 host0 kernel: [203495.327046] oom_reaper: reaped process 8462 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 13 22:04:16 host0 kernel: [207978.893318] storagenode invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Aug 13 22:04:16 host0 kernel: [207978.893410] [<8027a454>] (dump_header) from [<8027a7cc>] (oom_kill_process+0xe8/0x390)
Aug 13 22:04:16 host0 kernel: [207978.893428] [<8027a6e4>] (oom_kill_process) from [<8027b42c>] (out_of_memory+0xfc/0x400)
Aug 13 22:04:16 host0 kernel: [207978.894067] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 13 22:04:17 host0 kernel: [207978.894526] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=e1594331988b64162234174a4634334d04b468cfeb726e7a26616b4341938eda,mems_allowed=0,task=storagenode,pid=10300,uid=0
Aug 13 22:04:17 host0 kernel: [207979.099676] oom_reaper: reaped process 10300 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 14 22:18:32 host0 kernel: [ 6104.950394] containerd-shim invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-999
Aug 14 22:18:32 host0 kernel: [ 6104.950473] [<8027b1ec>] (dump_header) from [<8027b8cc>] (oom_kill_process+0x140/0x188)
Aug 14 22:18:32 host0 kernel: [ 6104.950487] [<8027b78c>] (oom_kill_process) from [<8027c448>] (out_of_memory+0x120/0x340)
Aug 14 22:18:32 host0 kernel: [ 6104.950808] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 14 22:18:32 host0 kernel: [ 6104.951279] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=storagenode,pid=2752,uid=0
Aug 14 22:18:32 host0 kernel: [ 6105.232557] oom_reaper: reaped process 2752 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 14 23:40:51 host0 kernel: [11044.838820] containerd-shim invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=-999
Aug 14 23:40:51 host0 kernel: [11044.838899] [<8027b1ec>] (dump_header) from [<8027b8cc>] (oom_kill_process+0x140/0x188)
Aug 14 23:40:51 host0 kernel: [11044.838913] [<8027b78c>] (oom_kill_process) from [<8027c448>] (out_of_memory+0x120/0x340)
Aug 14 23:40:51 host0 kernel: [11044.839240] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 14 23:40:51 host0 kernel: [11044.839714] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=storagenode,pid=5844,uid=0
Aug 14 23:40:51 host0 kernel: [11045.004738] oom_reaper: reaped process 5844 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Aug 15 00:46:16 host0 kernel: [14969.243341] sshd invoked oom-killer: gfp_mask=0x400dc0(GFP_KERNEL_ACCOUNT|__GFP_ZERO), order=1, oom_score_adj=-1000
Aug 15 00:46:17 host0 kernel: [14969.243484] [<8027b1ec>] (dump_header) from [<8027b8cc>] (oom_kill_process+0x140/0x188)
Aug 15 00:46:17 host0 kernel: [14969.243510] [<8027b78c>] (oom_kill_process) from [<8027c448>] (out_of_memory+0x120/0x340)
Aug 15 00:46:17 host0 kernel: [14969.243968] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Aug 15 00:46:17 host0 kernel: [14969.244782] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/,task=storagenode,pid=7304,uid=0
Aug 15 00:46:17 host0 kernel: [14969.633139] oom_reaper: reaped process 7304 (storagenode), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

About the databases, I have checked them and apparently no errors have been found.

root@host0:/home/pi/storj # find /mnt/hd/storage/ -maxdepth 1 -iname "*.db" -print0 -exec sqlite3 '{}' 'PRAGMA integrity_check;' ';'
/mnt/hd/storage/info.dbok
/mnt/hd/storage/bandwidth.dbok
/mnt/hd/storage/orders.dbok
/mnt/hd/storage/piece_expiration.dbok
/mnt/hd/storage/pieceinfo.dbok
/mnt/hd/storage/piece_spaced_used.dbok
/mnt/hd/storage/reputation.dbok
/mnt/hd/storage/storage_usage.dbok
/mnt/hd/storage/used_serial.dbok
/mnt/hd/storage/satellites.dbok
/mnt/hd/storage/notifications.dbok
/mnt/hd/storage/heldamount.dbok
/mnt/hd/storage/pricing.dbok

Regarding memory, I increased the limit to 800MB… although until now it was consuming between 30 and 50MB on average.

In the node logs I have found some errors like the following:

> 2020-08-15T08:37:34.161Z ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb error: database is locked", "errorVerbose": "bandwidthdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:672\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:409\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:996\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:56\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}

I’m not sure if these errors were already showing up before the memory problem.

https://forum.storj.io/search?q=bandwidth%20database%20is%20locked
could be

or

These are known problems and the solution is to just let it run; it resolves itself after a while (or after a restart).

However, it could affect memory usage: records that are supposed to be written to the database are collected in memory instead. If there is not enough memory (or, in the case of a Docker container, the limit has been reached), the container will be killed with an OOM.

So, I should reformulate my sentence to “either problems with writing to the database or a slow disk subsystem”.

I have rebooted the node several times before and the problem was not solved. This time I have deleted the container and the Docker image and recreated them.
The database locked errors still persist, but the node appears to be working properly.

In the meantime, I'm checking my hard drive in case it has a problem. So far I have not found any errors.

I will wait and see how the node behaves over the next few hours.

Thanks!


Update. After temporarily moving the customer data to another machine, I got the drive switched over to ext4 from BTRFS, and the node seems to be working fine so far. The node has been up for about 20 hours and is currently using 58MB of memory.

Using this command:

sudo iostat -x 2 5

My drive I/O utilization is only going up to about 10%. Before, with BTRFS, it was at 100%. I don't know for sure if BTRFS was the actual problem or if some other unknown variable was causing the I/O issues. Time will tell.