After a few months of testing and then migrating all my nodes, the constant 100% activity on all drives is gone, IO delay in Proxmox is much lower, and the filewalker only takes a few hours on a 7TB node.
After diagnosing that read speeds topped out at about 300 KB/s per drive while running Storj (other workloads seemed fine in the same setup) while write speeds stayed close to bare metal, I took a ~2TB node as a test subject and controlled the conditions so that it would receive minimal Storj ingress during the tests.
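For anyone who wants to reproduce the measurement, a quick way to watch per-drive read throughput and utilization on the Proxmox host is iostat (this assumes the sysstat package is installed; the 5-second interval is arbitrary):

```
# Extended per-device stats every 5 seconds:
# rkB/s = read throughput, %util = how busy the drive is
apt install sysstat   # only if iostat isn't already there
iostat -x 5
```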
Filewalker “speed” on a ~2TB node:
~6 days - Disk passed through to a Windows VM, formatted as NTFS with a 4K cluster size, default cache/aio options in Proxmox, iothread enabled, indexing disabled in Windows.
14 hours - Same disk, but attached via USB passthrough instead. This is what made me realize the problem must have emerged or worsened on its own (maybe due to updates or something) and had to lie in the interaction between disk passthrough and the rest of the stack, since in the past one way I kept nodes from dropping was to move drives from USB on a bare-metal machine to passthrough in a VM.
36 minutes - Same setup as the 6-day test, but with the disk as an LVM-backed virtual disk instead of passthrough. Theory confirmed.
30 minutes - Same, but with writeback cache in Proxmox.
27 minutes - Same, but with writethrough cache in Proxmox (see the config sketch below).
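For reference, a minimal sketch of what the disk line looks like in the VM config with that last combination; the VM ID, storage name, volume name and size here are placeholders, not my actual values:

```
# /etc/pve/qemu-server/<vmid>.conf (placeholder values)
# LVM-backed virtual disk with writethrough cache and an iothread,
# the fastest combination from the tests above
scsihw: virtio-scsi-single
scsi0: local-lvm:vm-100-disk-0,cache=writethrough,iothread=1,size=2000G
```

Cache mode and iothread can also be changed per disk from the VM's hardware options in the web UI.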
Other things I tried that had negligible impact on performance, made it worse, or had other big drawbacks I won't get into:
ReFS formatting, aio=native with LVM/SCSI, increasing the cluster size, exFAT formatting, and turning off device caching in Windows.