TL;DR
Questions…
Why does this happen…?
How do I solve it?
So I was seeing some latency issues from the SATA SSDs I was using as my SLOG device, so I bought an enterprise SSD, which seems to be working great…
I use netdata quite a bit on my server because it gives me access to a ton of information that would otherwise be difficult to get at. Even though I kinda hate netdata for being a crappy program, that doesn’t mean it’s not a useful tool… so I like to see it working correctly.
The latency issues caused by my SLOG are all gone. However, I’m not exceedingly well versed in Linux yet, having only been using Proxmox / Debian for some 7 months now. I know my way around and am not too afraid of doing deep dives to try to understand why something isn’t working,
but it’s not always easy to figure out where to begin.
Of course it doesn’t always happen. Sometimes it seems like it’s working correctly, just maybe faster than what netdata can display, and then when it really gets up to speed the latency is low enough and the throughput high enough that some of netdata’s math goes wrong… just kinda guessing…
But like this… a few minutes later it correctly shows 500 kB/s writes, which seems like a perfectly reasonable number…
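In case anyone wants to check the same thing on their own box, the way I’ve been sanity-checking netdata’s numbers is to compare them against the kernel’s own counters, which is where netdata’s disk charts come from anyway. The device name fioa is just mine, and iostat needs the sysstat package:

# raw per-device counters that netdata's disk charts are built from
grep fioa /proc/diskstats
# second opinion on throughput and latency (await), one sample per second
iostat -x fioa 1

If iostat and netdata disagree wildly at the same moment, that points at netdata’s math; if they agree, the weird numbers are coming from the kernel side.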
The device is a “Lenovo” io3 Enterprise Value, which is really an SX300 card. To make it work with Debian I installed the community-made open-source drivers (an update of the original driver package); they seem to work without any other issues though.
Of course it’s possible the issue is related to this, but I like to think it’s somehow related to netdata’s crappy programming… because it’s usually netdata that gives me grief rather than the SSD drivers, which have worked flawlessly AFAIK.
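For what it’s worth, this is roughly how I check that the community VSL driver is actually the one loaded and isn’t complaining in the kernel log. The module name iomemory_vsl4 is what the 4.x community package uses on my box; yours may differ:

# confirm the out-of-tree VSL module is loaded and check its version
lsmod | grep -i iomemory
modinfo iomemory_vsl4 | grep -iE 'version|filename'
# look for driver messages or complaints in the kernel log
dmesg | grep -iE 'iomemory|fioinf'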
The device is from the Fusion ioMemory series, a 1.6TB PCIe 2.0 x8 model, because that was the max my motherboard supported and I didn’t want any incompatibility issues or potentially poor performance due to poor compatibility.
It works great; I’ve tested it to a few GB/s and even up to 4 GB/s, but when that’s coming from cache it’s not really that impressive; I believe HDDs get close to that range too when reads are served from cache.
Of course one will run into PCIe bus limitations at some point… anyway, it all works.
I’m using the ioMemory VSL drivers.
fio-status output
fio-status -a
Found 1 VSL driver package:
4.3.7 build 1205 Driver: loaded
Found 1 ioMemory device in this system
Adapter: ioMono (driver 4.3.7)
1600GB Enterprise Value io3 Flash Adapter, Product Number:00D8431, SN:11S00D8431Y050EB58T005
ioMemory Adapter Controller, PN:00AE988
Product UUID:8f616656-45e4-5109-a790-6f766c059382
PCIe Bus voltage: avg 12.15V
PCIe Bus current: avg 0.68A
PCIe Bus power: avg 8.21W
PCIe Power limit threshold: 24.75W
PCIe slot available power: 25.00W
PCIe negotiated link: 8 lanes at 5.0 Gt/sec each, 4000.00 MBytes/sec total
Connected ioMemory modules:
fct0: 07:00.0, Product Number:00D8431, SN:11S00D8431Y050EB58T005
fct0 Attached
ioMemory Adapter Controller, Product Number:00D8431, SN:1504G0637
ioMemory Adapter Controller, PN:00AE988
Microcode Versions: App:0.0.15.0
Powerloss protection: protected
PCI:07:00.0, Slot Number:53
Vendor:1aed, Device:3002, Sub vendor:1014, Sub device:4d3
Firmware v8.9.8, rev 20161119 Public
1006.00 GBytes device size
Format: v501, 1964843750 sectors of 512 bytes
PCIe slot available power: 25.00W
PCIe negotiated link: 8 lanes at 5.0 Gt/sec each, 4000.00 MBytes/sec total
Internal temperature: 43.31 degC, max 47.74 degC
Internal voltage: avg 1.01V, max 1.01V
Aux voltage: avg 1.80V, max 1.81V
Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
Active media: 100.00%
Rated PBW: 5.50 PB, 99.97% remaining
Lifetime data volumes:
Physical bytes written: 1,455,959,980,072
Physical bytes read : 941,636,499,552
RAM usage:
Current: 786,365,440 bytes
Peak : 803,504,640 bytes
Contained Virtual Partitions:
fioa: ID:0, UUID:94d66bf0-2410-43fe-a33b-ef602e135305
fioa State: Online, Type: block device, Device: /dev/fioa
ID:0, UUID:94d66bf0-2410-43fe-a33b-ef602e135305
1006.00 GBytes device size
Format: 1964843750 sectors of 512 bytes
Sectors In Use: 853700546
Max Physical Sectors Allowed: 1964843750
Min Physical Sectors Reserved: 1964843750
The drive is formatted to about 60% of its capacity for performance (over-provisioning) reasons. I’m not sure if I was supposed to leave the rest unformatted, or format it to 100% and then keep the partitions to a max of 60% of the total; I wasn’t able to find anything on that, so I assume it doesn’t matter.
I used fdisk to make the partitions on the virtual drive that the SSD firmware creates to interact with the host bus.
I reformatted the drive to a 512-byte sector size because the rest of my main pool is running 512B; I wasn’t able to run it at 4K without errors being thrown, because the rest of the pool was 512…
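For anyone curious what I mean by matching the sector size, these are the kinds of checks I did. The pool name rpool is just a placeholder for whatever your pool is called:

# logical and physical sector sizes the virtual drive reports to the kernel
blockdev --getss /dev/fioa
blockdev --getpbsz /dev/fioa
# ashift the existing pool was created with (9 = 512-byte, 12 = 4K sectors)
zdb -C rpool | grep ashift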
One idea I had is that maybe the virtual partition the SSD firmware creates isn’t supposed to be split into smaller partitions; I haven’t really tested that theory, and I kinda doubt it will lead anywhere…
If anyone has any ideas about what’s going on with netdata, I would be very interested in hearing them…
Currently the device is attached as SLOG and L2ARC… I can detach it without any issue, but I’m planning to start running my OS from it, and at that point completely removing or reformatting the card becomes a bit more problematic…
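For completeness, detaching and re-attaching it really is just a couple of zpool commands. The pool name tank and the partition names are only examples; substitute your own:

# see which vdevs are currently the log (SLOG) and cache (L2ARC) devices
zpool status tank
# remove them (this is safe while the pool is online)
zpool remove tank /dev/fioa1
zpool remove tank /dev/fioa2
# add them back later
zpool add tank log /dev/fioa1
zpool add tank cache /dev/fioa2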
I know I should have a dual-boot option… but I currently don’t, and since I don’t have (and don’t plan on making) it a mirrored setup, I will need some sort of secondary boot into a backup copy of the OS in case of failure.
So that will wait for another day… if anyone is up for the challenge, I’m all ears…
I have been told that it might be some sort of disconnect between the drivers and the OS, but I’m not sure I buy that; they seem to work just fine…
Also, if you happen to have a ridiculously fast SSD, how does it behave in netdata? Can netdata figure out how to show its super low latency? Could it be the parallel access to my SSD that makes it go all crazy? Plenty of questions, not many answers, lol.
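If anyone with a similarly fast card wants to compare numbers outside of netdata, a short run with fio (the benchmarking tool, not the Fusion-io fio-* utilities) gives a latency figure that doesn’t depend on netdata’s charts at all. The file path here is just an example target on the card; point it at a scratch file you can afford to thrash, and drop --direct=1 if the target filesystem doesn’t support direct I/O (older ZFS doesn’t):

# 4K random-read latency test, single queue depth, direct I/O, 30 seconds
fio --name=lat-test --filename=/mnt/fastpool/fio.test --size=1G \
  --rw=randread --bs=4k --ioengine=libaio --iodepth=1 --direct=1 \
  --runtime=30 --time_based --group_reporting

The clat percentiles in the output are the per-I/O completion latencies, so if those look sane while netdata shows nonsense, the card is fine and the charting is the problem.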
So, let me sum up again: