@littleskunk riddle me this…
i’m wondering whether this is normal peak RAM usage, because it will spike into the GB range at times
and here is a comprehensive look at the system statistics
no current utilization or iowait
weekly iowait also looks fine… i forget what i was doing when it averaged a peak of 18%, most likely a scrub of my pool, so that’s to be expected. the start of the graph is the system boot; it takes a few hours to warm up the ARC
but like you can see no considerable iowait
24 hour proxmox cpu graph also looks fine in regard to iowait
was running a find command a little while back… that caused the little spike at the end of the graph
still no iowait…
checking the logs with successrate.sh also looks fine
./successrate.sh storagenode-2020-09-23.log storagenode-2020-09-24.log
========== AUDIT ==============
Critically failed: 0
Critical Fail Rate: 0.000%
Recoverable failed: 0
Recoverable Fail Rate: 0.000%
Successful: 2593
Success Rate: 100.000%
========== DOWNLOAD ===========
Failed: 2
Fail Rate: 0.003%
Canceled: 65
Cancel Rate: 0.114%
Successful: 57139
Success Rate: 99.883%
========== UPLOAD =============
Rejected: 0
Acceptance Rate: 100.000%
---------- accepted -----------
Failed: 0
Fail Rate: 0.000%
Canceled: 9
Cancel Rate: 0.036%
Successful: 24657
Success Rate: 99.964%
========== REPAIR DOWNLOAD ====
Failed: 0
Fail Rate: 0.000%
Canceled: 0
Cancel Rate: 0.000%
Successful: 20497
Success Rate: 100.000%
========== REPAIR UPLOAD ======
Failed: 0
Fail Rate: 0.000%
Canceled: 1
Cancel Rate: 0.029%
Successful: 3432
Success Rate: 99.971%
========== DELETE =============
Failed: 0
Fail Rate: 0.000%
Successful: 62558
Success Rate: 100.000%
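(for reference, the rates the script prints are just successful over total attempts; a quick sanity check of the DOWNLOAD line above:)

```shell
# download success rate = successful / (failed + canceled + successful)
awk 'BEGIN { printf "%.3f%%\n", 57139 / (2 + 65 + 57139) * 100 }'
# → 99.883%
```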
i could try to go through individual disk latency, but can’t really see anything of note…
pool: tank
state: ONLINE
scan: scrub repaired 0B in 0 days 15:25:56 with 0 errors on Wed Sep 16 13:26:48 2020
config:
NAME                                         STATE     READ WRITE CKSUM
tank                                         ONLINE       0     0     0
  raidz1-0                                   ONLINE       0     0     0
    ata-HGST_HUS726060ALA640_AR11021EH2JDXB  ONLINE       0     0     0
    ata-HGST_HUS726060ALA640_AR11021EH21JAB  ONLINE       0     0     0
    ata-HGST_HUS726060ALA640_AR31021EH1P62C  ONLINE       0     0     0
  raidz1-2                                   ONLINE       0     0     0
    ata-TOSHIBA_DT01ACA300_531RH5DGS         ONLINE       0     0     0
    ata-TOSHIBA_DT01ACA300_99PGNAYCS         ONLINE       0     0     0
    ata-TOSHIBA_DT01ACA300_Z252JW8AS         ONLINE       0     0     0
  raidz1-3                                   ONLINE       0     0     0
    ata-HGST_HUS726060ALA640_AR31051EJS7UEJ  ONLINE       0     0     0
    ata-HGST_HUS726060ALA640_AR31051EJSAY0J  ONLINE       0     0     0
    ata-TOSHIBA_DT01ACA300_99QJHASCS         ONLINE       0     0     0
logs
  fioa2                                      ONLINE       0     0     0
cache
  fioa1                                      ONLINE       0     0     0
errors: No known data errors
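(if i did want to go through per-disk latency, ZFS can print it directly; assuming a reasonably recent ZFS, 0.7+, for the `-l` flag, run on the live system:)

```shell
# per-vdev read/write wait times for every disk, refreshed every 5 seconds
zpool iostat -vl tank 5

# or latency histograms for the whole pool
zpool iostat -w tank
```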
hmmm, looks like the first spike was the scrub on the 16th, and i cannot remember what i did to cause the other one… doubt it’s relevant… tho last time it took days before my “recorded” storagenode memory usage dropped back down to its usual 70-90 MB…
i suppose it could still be the spike from the 20th…
guess i’ll just have to wait and see if it comes back in the future… ah, the 20th is the boot… lol, i had it reversed. until the ARC and L2ARC take over, the io load on the pool is a bit heavy…
and ofc the storagenode also boots right with the server…
just added a new PCIe SSD specifically to try and limit my iowait, because my old setup of dual SATA SSDs got overworked so badly that it ended up with 120 ms latency, which then affected the HDDs’ latency.
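(for anyone curious, splitting one PCIe SSD between SLOG and L2ARC is just two commands; the partition names here are the ones from my pool, yours will differ:)

```shell
# dedicate one partition to the ZFS intent log (SLOG)…
zpool add tank log /dev/fioa2
# …and one to the L2ARC read cache
zpool add tank cache /dev/fioa1
```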
but ever since i got that working 4 days ago, my numbers have been great… ofc i haven’t really put any serious load on the system yet… just running 2-3 VMs, but i have tested up to 9 without any issues aside from not having enough RAM
i just think it’s weird that i’ve got 1.4 GB of memory allocated to the storagenode… i need to get my netdata fixed and get my storagenode moved into a container so i can better monitor the memory utilization over weeks and months… netdata is pretty crappy for long-term numbers… but nice for the gritty details, when it wants to work…
so what is my storagenode’s memory usage supposed to be, and is it stable or does it vary widely? i guess my question is… does it increase with node size and activity level… i suppose it would…
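(until the container move happens, even a dumb shell sampler would give usable long-term numbers; the process name "storagenode" and the idea of cron-ing it are my assumptions:)

```shell
#!/bin/sh
# print total resident memory (KiB) of all processes matching a name
sample_rss() {
  ps -o rss= -C "$1" | awk '{ sum += $1 } END { print sum + 0 }'
}

# one sample; run this from cron every minute, redirected to a file,
# to build up a long-term memory log that outlives netdata's retention
echo "$(date -Is) $(sample_rss storagenode) KiB"
```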
the system has 48 GB… so it’s not really a concern that it uses a bit of extra RAM now and then, and tho netdata says 90% used, it’s more like 85% if you ask proxmox, and the ARC is 21 GB of that, which the system will drop immediately if something requests more memory than is free…
so it’s not like the storagenode can chew through it quickly… especially without any noticeable ingress
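(the ARC size is easy to confirm straight from the kstats on Linux; the little parsing helper is just my sketch:)

```shell
# the arcstats kstat is three columns (name, type, value);
# the "size" row is the current total ARC footprint in bytes
arc_gib() { awk '$1 == "size" { printf "%.1f GiB\n", $3 / 2^30 }' "$1"; }

# on the live box:
#   arc_gib /proc/spl/kstat/zfs/arcstats
```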