A node with 1,7GB RAM usage at the moment, tell me what to check (v1.17.4)

Ruskiem · December 4, 2020, 9:18am

Hi. My node suddenly maxed out the available RAM, and the storj process is eating most of it.
The node is online and operates normally, I guess; the log shows normal work.
Trash is: 1,43GB.
Unsent orders are only 10 files.
Suspension & audit are 100% on all satellites.

It’s Win10 Pro, so it’s the GUI version.

Toyoo · December 4, 2020, 9:46am

I’ve observed elevated RAM usage when the disk was too slow to handle the traffic. Check your IO.

Ruskiem · December 4, 2020, 10:06am

The HDD seems to be OK; it’s an 8TB Ultrastar under warranty. Besides, there’s no big traffic at the moment.

SGC · December 4, 2020, 11:58am

iowait / disk latency is usually the cause… doesn’t take much… even 20% iowait on avg can cause stuff like this… the best way i’ve found to identify problems is looking at the individual disk latency.
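For anyone on Linux who wants a quick number for iowait, here is a minimal sketch. It assumes the standard `/proc/stat` layout, where the first line starts with `cpu` and the fifth counter is iowait. The function names are made up for illustration, and it does not apply to a Windows node like the one in the original post.

```python
import time

def parse_cpu_line(line):
    """Return the first eight counters of the aggregate 'cpu' line:
    user, nice, system, idle, iowait, irq, softirq, steal."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # guest/guest_nice are skipped because they are already counted in user/nice
    return [int(value) for value in parts[1:9]]

def iowait_percent(before, after):
    """Share of CPU time spent in iowait between two samples, in percent."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait counter

def sample_iowait(interval_seconds=5.0, path="/proc/stat"):
    """Read /proc/stat twice and report iowait over the interval (Linux only)."""
    with open(path) as stat_file:
        first = parse_cpu_line(stat_file.readline())
    time.sleep(interval_seconds)
    with open(path) as stat_file:
        second = parse_cpu_line(stat_file.readline())
    return iowait_percent(first, second)
```

Calling `sample_iowait(5)` on a Linux host returns the average iowait over those five seconds. It is a system-wide figure, so per-disk latency still has to be checked separately.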

from what i’ve seen, been testing this out a bit… the ram usage is allocated by the storagenode for caching, it will take a while for the RAM utilization to go down…

and even tho i try to improve it, sometimes it will still demand a significant amount of memory. however, some people run with very little memory, so it might be possible that the storagenode will find another way if it doesn’t have extra memory to use…

i wouldn’t recommend that tho, but that’s more personal preference, cannot say if it actually can do any harm… i’ve been tracking this kind of behavior over 2-3 months and it still keeps happening… may also be related to node activity… it seems to come and go, and for me it’s pretty irrelevant as my server has plenty of memory to allocate if need be.

however it does seem to be closely linked to disk activity, and storj recommends 1GB per node, even if it only uses 50-150MB for at least 95% of the operational time.

not really a big surprise that the storagenode requires memory… are you running 512B sector sizes?

my SSD, which is designed as a memory swap drive, has some interesting facts about the differences between 512B sectors vs 4K sector sizes on the hdd / ssd.
it is required to keep a table to keep track of the data, and for 512B sector sizes (which is what i use) if i was using it for swap it would require something like 24GB RAM to keep track of it, while with 4K sector sizes the required RAM would only be 3GB…

something similar might happen with the storagenode, if it’s keeping track of the file locations on the hdd, these tables might also take 8 times the memory for 512B hdds…
which may explain why my storagenode at times goes above 1GB in utilization.
but the hdds i got only support 512B sectors.
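The 8x figure (24GB vs 3GB) follows directly if the table holds one entry per sector. A minimal sketch of that arithmetic is below. The capacity and entry size are hypothetical values picked for illustration, not taken from any drive's documentation, so only the ratio is meaningful. Whether the storagenode keeps any per-sector table at all is speculation in this thread, not something confirmed.

```python
def mapping_table_bytes(capacity_bytes, sector_bytes, entry_bytes):
    """Size of a table that holds one fixed-size entry per sector."""
    sectors = capacity_bytes // sector_bytes
    return sectors * entry_bytes

# Hypothetical values for illustration only
CAPACITY_BYTES = 10**12   # a 1 TB device
ENTRY_BYTES = 8           # assumed size of one table entry

table_512 = mapping_table_bytes(CAPACITY_BYTES, 512, ENTRY_BYTES)
table_4k = mapping_table_bytes(CAPACITY_BYTES, 4096, ENTRY_BYTES)

# 4096 / 512 = 8, so the 512B table is 8 times larger
ratio = table_512 / table_4k
print(f"512B table: {table_512} bytes, 4K table: {table_4k} bytes, ratio: {ratio}")
```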

ofc it’s not all disadvantage with running 512B, since the possible iops on smaller data blocks is much higher, where 4k would only be able to do 1/8th of the same iops if the data blocks are less than 512B.
not that i have any use for this i think…

anyways it might be an interesting comparison to do.

deathlessdd · December 4, 2020, 12:16pm

Well, there’s your problem: it’s Windows.
Although a slow hard drive is usually the cause of high RAM usage. Data gets stored in RAM when the hard drive can’t keep up.

Have you checked out how delayed the drive is when it’s being used?

Ruskiem · December 4, 2020, 12:32pm

Hmm, thank you both, SGC and deathlessdd.
The HDD’s latency seems to be normal.
This HDD has only been in use for about 1 year.
So I think the solution should be sought in some storagenode software optimization.
I rarely see this much RAM used; it is a fairly new problem over a period of 12 months.

Edit: All I want is to free up the occupied RAM. I will restart, but I’m waiting for questions first; maybe I can run some checks for the investigation, in case the restart destroys the situation and I can’t reproduce it later.

SGC · December 4, 2020, 1:59pm

yesterday my primary 14.5TB storagenode was at 500-600MB ram utilization.
today it’s at 350MB. my iowait on the server is about an avg of 1.5% on my 7 day graph; the 1 day graph and 1 hour graph are closer to an avg of 1% iowait

and my hdd’s peak latency is 20ms, with an avg of 5ms, running proxmox / debian linux

and then since it’s actually two RAIDZ arrays working in sync, the avg read latency will be lower than that. this also doubles the iops, and all writes are cached on an IoMemory PCIe SSD and sequentially written to the array.

i think it’s unrealistic for most to have better numbers than that.
and tho i have had issues with hdd’s from time to time and other issues, it’s been a persistent thing that my storagenode utilizes variable amounts of memory, but it doesn’t really surprise me given the size of what it’s keeping track of.

my system runs for long periods, and the ram utilization will go down again by itself whenever it finishes what it’s doing i guess… just takes a long time…

my OS SSD didn’t seem at all too happy at the moment, so maybe that could affect the storagenode ram utilization.

P.S
i did have some 9 hours of downtime recently due to switching ISP.
that caused my storagenode to delete 130GB which ended up in trash… i have seen that kind of behavior of high trash and high memory a few times… not sure if they are related at all…

just a note, takes a long time to really get a sense of what is connected when it can take weeks between the behavior showing up.
so no clue if that even could be related, but it’s possible… something to keep an eye out for…
would be nice to know, if the wide range of memory usage is a sign of an issue or simply a standard behavior when working with large amounts of data…

i’m leaning towards the latter.

Alexey · December 4, 2020, 9:47pm

High RAM usage usually means that:

the drive is slow, or it’s a network-attached drive. Even iSCSI has high latency, and NFS and SMB are not fully compatible;
the database(s) are corrupted.

Please make sure that you use locally connected drives, do not use btrfs (see Topics tagged btrfs), upgrade to the latest version if you use Unraid, and check that the disk is not SMR.

Then check your databases: https://support.storj.io/hc/en-us/articles/360029309111-How-to-fix-a-database-disk-image-is-malformed-
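The linked article is the procedure to follow. As a rough sketch of the same idea, SQLite's built-in `PRAGMA integrity_check` can be run over every `*.db` file in the node's storage folder with Python's standard `sqlite3` module. The function names and the example path are assumptions for illustration, and the node should be stopped before checking.

```python
import sqlite3
from pathlib import Path

def check_database(db_path):
    """Run PRAGMA integrity_check on one file, opened read-only.

    A healthy database returns the single row "ok".
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]

def check_all(storage_dir):
    """Check every *.db file in storage_dir and map file name -> result lines."""
    results = {}
    for db_file in sorted(Path(storage_dir).glob("*.db")):
        try:
            results[db_file.name] = check_database(db_file)
        except sqlite3.DatabaseError as exc:
            # Raised, for example, when the file is not a valid SQLite database
            results[db_file.name] = [f"error: {exc}"]
    return results

# Usage (the path is a placeholder; point it at your own storage folder):
# for name, lines in check_all(r"D:\storagenode\storage").items():
#     print(name, "->", lines)
```

Any result other than `['ok']` is a reason to go through the repair steps in the article.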

Ruskiem · December 4, 2020, 9:51pm

I see no signs of database corruption in storagenode.log.
Sooo I don’t know; it’s 1,9GB now, hmmm…
The HDD is in a normal PC, it’s SATA3.

Alexey · December 4, 2020, 9:51pm

Please, just check it

Ruskiem · December 4, 2020, 9:53pm

lol, you replied at the moment I was editing. I see no errors or warnings in the log, just normal INFO piecestore download traffic, hmm

Alexey · December 4, 2020, 9:57pm

I can repeat: please, just check it. It’s much faster than writing another argument for why not.

Ruskiem · December 4, 2020, 9:58pm

Oh, you mean the link you gave. I didn’t see it at first, sorry. I will check it and report back later, thx

Alexey · December 5, 2020, 3:24pm

4 posts were split to a new topic: I can’t install the second node on the same PC, can you help?

clapsyourhand · December 5, 2020, 3:28pm

Alexey, interesting… thanks for your time
btw
when I have a static IP, will your.ddns.tld:28968 be 192.168.0.120 or my static IP?

I’m reinstalling Ubuntu from scratch to get a clean state.
I will post my full step-by-step guide here.