Hello,
This is a quick question: my node is always under high load, but CPU usage is low. Swap is fully used, but RAM is only 5/8 used. Does this indicate that the bottleneck is the RAM?
Thank you!
Swap usage and disk usage.
Does "steal 120642" imply that you are using a VM?
No, I'm not. It's a simple desktop PC. Ubuntu + Docker, all ext4.
While RAM usage is only "5/8", the rest of the RAM is used up by buffers + cache. These two will usually grow to use as much RAM as they can, and in the case of nodes this helps by keeping the HDDs' metadata in RAM.
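You can see this split directly on Linux; a quick check (a sketch assuming a standard procfs):

```shell
# The kernel counts page cache as part of "used" RAM, but Buffers and
# Cached are reclaimable: they are dropped automatically the moment
# programs actually need the memory. MemAvailable estimates how much
# RAM is really free for new allocations.
grep -E '^(MemTotal|MemAvailable|Buffers|Cached):' /proc/meminfo
```

`free -h` shows the same thing in its "buff/cache" and "available" columns.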
Whether this is a bottleneck depends. Are you experiencing a lot of lost races? "Database is locked" errors? HDDs pinned at 100%? Very high iowait? Any other symptoms of an overloaded HDD?
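One rough way to check for a pinned disk without extra tools is the kernel's own counters (a sketch assuming /proc/diskstats and sdX device names; `iostat -x 1` from the sysstat package shows the same data more readably):

```shell
# Field 13 of /proc/diskstats is the total time (ms) the device has
# spent doing I/O since boot. Sample it twice, a second apart: a disk
# accumulating close to 1000 ms of busy time per wall-clock second is
# running at ~100% utilization.
awk '$3 ~ /^sd[a-z]+$/ { printf "%s: %d ms busy since boot\n", $3, $13 }' /proc/diskstats
```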
While more RAM is not a solution per se for these issues, the RAM metadata cache helps avoid unnecessary operations on the hard drive, therefore freeing IOPS for other activities.
To me it seems the system could benefit from more RAM, as most of it is eaten up by the "used" segment. The more HDD capacity you have designated to Storj, the more useful extra RAM (or some other method of metadata caching) becomes. If the capacity designated to the nodes is low, then it probably won't be a big issue either way.
I didn't check the logs for lost races; I just see that after the test data started, the load is high. Previously it was about 2-3, now it is always around 10.
I will investigate more when I can, thank you.
Do you use any alternate config for the node's file buffer? Like filestore.write-buffer-size?
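For reference, that is set in the storagenode's config.yaml (hypothetical value shown, not a recommendation; the default, as I recall, is 128 KiB, and a larger buffer batches each upload into fewer, larger disk writes at the cost of more RAM per transfer):

```yaml
# config.yaml (storagenode) — hypothetical tuning example:
# buffer more of each piece upload in RAM before flushing to disk.
filestore.write-buffer-size: 4.0 MiB
```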
I would say it also depends on the swappiness setting. If it's set to prefer RAM rather than putting data in a swap file, and the swap file is still full, then it seems there is not enough RAM.
As a side note, isn't it recommended to have the swap file at least the size of the RAM?
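You can check the current setting in a second (the default on most distributions is 60; the lowered value below is a hypothetical example, not a recommendation):

```shell
# Current swappiness: higher values make the kernel swap more eagerly,
# lower values make it prefer dropping page cache instead.
cat /proc/sys/vm/swappiness

# To make the kernel prefer cache reclaim over swapping (hypothetical value):
#   sudo sysctl vm.swappiness=10
```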
If you are using atop, you will see a high "wait" percentage on the CPUs. That means the CPU isn't really doing anything except waiting for disk I/O. That inflates the "load" calculation in top and htop, even though the CPU is basically idle.
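If you don't have atop handy, the same number can be pulled straight from /proc/stat (a sketch; on the aggregate "cpu" line, field 6 is iowait jiffies):

```shell
# Print the iowait share of all CPU time since boot.
# /proc/stat "cpu" line fields: user nice system idle iowait irq softirq ...
awk '/^cpu / { total = 0
               for (i = 2; i <= NF; i++) total += $i
               printf "iowait: %.1f%% of CPU time since boot\n", 100 * $6 / total }' /proc/stat
```

Note this is an average since boot; tools like `vmstat 1` show the live "wa" percentage instead.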
Are the disks ext4 formatted? Having more RAM is nice because it can hold more disk data in cache, especially the metadata, since operations like filewalkers are so heavy on lookups. However, there's not much explicit control over how the cache is used, and it's not really necessary (things like vfs_cache_pressure haven't made a noticeable difference for me).
You can mount your drives with noatime. This will shave off a teeny bit of I/O.
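For example, in /etc/fstab (hypothetical UUID and mount point; relatime, the usual default, still writes access times occasionally, while noatime never does):

```
# /etc/fstab — hypothetical entry
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /mnt/storj1  ext4  defaults,noatime  0  2
```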
This is because part of the load metric is iowait. As the node has more activity due to the test data, the number of IOPS to the drive increases. Because hard drives are rather slow, IOPS can queue up quite easily.
Therefore, the reason you are seeing an increase in load is simply that there are more IOPS queued up for your hard drive, waiting their turn because the drive is not fast enough to handle all the requests.
As you were looking for a "bottleneck": the "bottleneck" causing the increased load is your hard drive, not RAM. Having said that, as I explained before, having more RAM can help reduce the number of IOPS on your drive, which in turn reduces iowait and the overall load. Keep in mind that there is a limit to how much RAM is beneficial; at some point adding more stops helping.
The disks are usually at 100% usage and aren't SMR. Surely it's that, thank you.