What are your IOPS with no ingress but during filewalking?

redgyuf · June 21, 2024, 9:44am

Two of my nodes run under Unraid with the DBs on an NVMe cache; the disks run in separate pools.

My CPU is: Intel® Core™ i5-11400
16GB RAM, but only about 60% in use.
Disk1: 8TB ST8000NM0055, XFS, pieces.enable-lazy-filewalker: true
Disk2: 8TB ST8000VE001, XFS, pieces.enable-lazy-filewalker: false

Both of my nodes are full and I’m curious why filewalking takes so long. Yesterday both were restarted for maintenance, so the filewalker started on Jun 20 at 15:24, and both drives have been at 100% utilization ever since.

Disk1

Disk2

It seems that without the lazy filewalker the IOPS are about 50% better (I will probably test disabling it on Disk1 as well).

So the question is: what are your IOPS numbers? I just don’t know whether I have a bad setting or XFS is just bad. Do you have any idea how I could optimize my system (without losing data)?

Roxor · June 21, 2024, 11:33am

100-ish IOPS from 7200 RPM HDDs is totally reasonable. I’d leave your filewalkers on default (lazy) as you always want what the node is doing to take priority over any background housekeeping chores.

If your 8TB drives are full you probably have 25-30 million files… and the filewalker is checking about 100 of them every second: it’s just going to take a while.
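
To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a steady rate of 100 files checked per second and the 25-30 million file estimate from above; the function name `scan_days` is just for illustration, and a real run will vary with caching and other disk load.

```python
def scan_days(file_count: int, files_per_second: float) -> float:
    """Days needed to visit file_count files at a steady rate."""
    seconds = file_count / files_per_second
    return seconds / 86_400  # seconds per day


# Estimates from the post: 25-30 million files, ~100 files checked per second.
for files in (25_000_000, 30_000_000):
    days = scan_days(files, 100)
    print(f"{files:,} files at 100 files/s: about {days:.1f} days")
```

Under those assumptions a full pass lands at roughly three days, so a walk that is still running a day after a restart is not unusual.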

snorkel · June 21, 2024, 11:54am

Here:
https://forum.storj.io/t/tuning-the-filewalker/19203/221?u=snorkel

See also the posts above and below it. I ran many tests.

pangolin · June 21, 2024, 7:45pm

You just found the reason for calling it lazy…

Mitsos · June 21, 2024, 8:11pm

People are greatly misunderstanding how IO priorities work. I’ve explained it before, but I’ll have another go.

Normal filewalker: “Gimme gimme gimme gimme!!!” Uses up all of the disk’s IOPS, numbers go up => it must be great! No IOPS left for the actual storagenode => lower ingress => less stored data. Your disk simply isn’t fast enough and loses races.

Lazy filewalker: “Can I please read some data from the disk?” Host system prioritizes from high priority (actual storagenode process) to low priority (lazy filewalker) and answers “no, not right now, I’m trying to write a file, check back in a bit”. Lower IOPS, numbers go down => it sucks!!! Most IOPS go to the actual storagenode process => higher ingress => more stored data. Your disk promptly reacts to reads/writes from clients, everyone is happy.

As long as the storagenode process isn’t running in a VM (= host system doesn’t understand that a spawned sub-process is actually lower priority), then the best way is to enable the lazy mode. Lower numbers do not mean anything. Context of those numbers is everything.

When a drive eventually (hopefully) fills up, it can’t write any more data. The only IOPS required are reads, so unless you have the luckiest node on the planet and it stores pieces of every single file on the network that is being requested by a client, it will have plenty of “free” IOPS for the lazy filewalkers. In this instance, the reply to the “Can I please read some data from the disk?” question is an “Absolutely, here you go!”.

TL;DR: Lazy is better. Enable it and forget about it.

redgyuf · June 21, 2024, 8:58pm

Yeah, of course, but I did not know the size of the difference, and since many of us have had trouble with incorrect used-space info, it is good to know now.

redgyuf · June 21, 2024, 9:15pm

I see. However, my nodes are currently full, so there is close to zero disk usage from ingress and about 2-3GB egress per node.

So it seems lazy is just slow and does not really check whether the disk is actually in use, because in my case, with almost no disk usage, both nodes should be relatively close to each other.

Also, based on node stats, the node with the non-lazy filewalker had slightly fewer download/upload cancellations.

So maybe my Disk1 is weaker, or lazy isn’t working as effectively as it should, or my 400GB/node ingress was not enough to feel the difference.

Mitsos · June 21, 2024, 9:22pm

It’s not a matter of whether lazy is working better or not. The only thing that changes is how the filewalker is requesting IOPS from the host. If the host can (and it should) handle high/low priority processes, then it’s the host that is slowing down the filewalker because the disk is busy.

A simpler way of thinking about it is that lazy only works on free (= idle) IOPS. If you don’t idle the disk, then there isn’t anything to give to the lazy filewalker (exaggeration for dramatization).

Even if a client is downloading/uploading at 1KB/s, that means that the disk is being used by a higher priority process. Lazy will still wait. It’s not a matter of how much ingress or how many connections are currently being served.

I’m oversimplifying it for explanation purposes. The host system knows how to properly prioritize it based on the load, the current position being read, the disk’s readahead cache, if metadata is cached in RAM, and other things.
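
A toy model of the “free IOPS” idea, in the same oversimplified spirit. It assumes the disk has a fixed budget of IOPS per second, the high-priority process is always served first, and the low-priority task gets whatever is left. The function name `lazy_share` and all the demand numbers are hypothetical, and a real I/O scheduler is not a simple subtraction like this: it also weighs seek position, readahead and cached metadata.

```python
def lazy_share(disk_iops: int, node_demand: list[int]) -> list[int]:
    """Per-second IOPS left for a low-priority task after the node is served."""
    return [max(0, disk_iops - demand) for demand in node_demand]


# Hypothetical per-second IOPS demand from the storagenode process.
busy_node = [90, 100, 120, 80]  # node still taking ingress
full_node = [10, 5, 0, 15]      # full node, only occasional reads

print(lazy_share(100, busy_node))  # [10, 0, 0, 20]
print(lazy_share(100, full_node))  # [90, 95, 100, 85]
```

In this model the low-priority task starves only while the higher-priority demand stays at or above the disk’s budget; on a mostly idle disk it gets nearly everything.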

Alexey · June 22, 2024, 3:04am

XFS could be a little slower for the filewalker processes, as some have reported (see Topics tagged xfs), but that is mostly based on the speed of rsync from XFS (the closest process to what the filewalker is doing), so I cannot confirm it 100%.

You may try to test: