Tuning the filewalker

After installing 6GB of RAM in the other Synology DS220+, the filewalker took 2h 40min for 7.6TB, so about 0.35 h/TB. That is almost half the time of the first setup. The only differences:

  • it uses an Exos X18 16TB instead of an X16 16TB;
  • the test was done after a fresh restart of the system.

FWIW, in my past experience with Synology (not with Storj, but with multiple, often concurrent, Time Machine targets, which have IO characteristics very similar to the filewalker's: a detailed scan at the beginning of the backup where all files inside the 5-8TB disk image are traversed, followed by random sync writes), I've noticed that adding memory up to about 24GB provides a drastic, incremental performance improvement, but beyond that, through 32GB, the returns diminish. The experience was the same on cheap Celeron models such as the DS918+ as well as on proper appliances like the DS1618+.

This is to say that 6GB sounds like very little. Of course, if you had 2GB before, you will see drastically better results with 6GB, but significant improvements will continue through 16GB, with 24GB being the sweet spot.

Another data point: my current NAS that runs Storj has 32GB of RAM, and I don't notice any performance issues when starting my (albeit small, 2.5TB) node. I learned about the filewalker being a thing from this forum :slight_smile:


Has nobody noticed? Millions of small files are very big trouble in any filesystem.

No matter how much memory you have, if the node restarts after a power loss, it will take many hours to check all the files after the restart. No other file-related software I know of checks all files after a restart.

No, it isn’t. This blanket statement is false.

If customers want to store millions of small files, you will end up with millions of small files. You can't control what customers do. (And no, bunching small files into big blobs is counterproductive; it would just move the problem to random access inside a blob, this time without benefiting from metadata caching.)

This filewalker problem is a made-up issue. Just don't run a storage node that you expect to serve terabytes of data on a potato. As I said above, I felt no effects of the filewalker on my NAS. None. I learned about it from this forum. A properly configured filesystem can handle millions and billions of files.
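If you want to see what that startup scan roughly amounts to on your own hardware, here is a minimal Python sketch. It only approximates a metadata-only used-space walk (it is not the node's actual code), and the blobs path below is just a placeholder; point it at your own storage directory.

```python
#!/usr/bin/env python3
"""Rough approximation of a used-space scan: walk every directory
entry and read only metadata (sizes), never the file contents.
The default path is a hypothetical example; pass your own."""
import os
import sys
import time

def walk_metadata(root: str) -> tuple[int, int]:
    """Return (file_count, total_bytes) using directory metadata only."""
    files = 0
    total = 0
    stack = [root]
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        files += 1
                        # served from the inode/dentry cache when warm
                        total += entry.stat(follow_symlinks=False).st_size
        except PermissionError:
            continue
    return files, total

if __name__ == "__main__":
    root = sys.argv[1] if len(sys.argv) > 1 else "/volume1/storj/storage/blobs"
    start = time.monotonic()
    files, total = walk_metadata(root)
    elapsed = time.monotonic() - start
    print(f"{files} files, {total / 1e12:.2f} TB, scanned in {elapsed:.1f} s")
```

Run it once right after a reboot and then again immediately: the difference between the two runs is essentially the metadata cache doing its job, which is why more RAM (or a metadata cache device) changes the picture so much.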


For the record, we are still learning what «properly configured» means. We know some file systems are better than others, we know to discourage SMR drives unless extra effort is made, and we know having more RAM helps.

Sure, you can get a not-potato system with hundreds of GB of memory and lots of SSDs and you'll probably never experience a problem with hosting a Storj node, but it won't be economical. At some point you'll need to figure out the trade-off between fast enough and cheap enough. Knowing exactly how to configure a system is very useful, because it allows you to use cheaper hardware. Studying this "imaginary" bottleneck is exactly what pushes that knowledge forward.


I'm not even sure about the SMR part. SMR drives should only be bad at writing; reading shouldn't be a problem, and a node doesn't spend a critical amount of time writing anyway.

Perhaps it can be a problem if people use the drives for other tasks at the same time, but I don't see a storage node alone creating enough write load for there to be an issue.

Of course, never use an SMR drive in a RAID5/6 array.


Per my experiments, only a small part of I/O time goes to actual reads (and writes) of the content. The rest is (assuming the SNO follows the most basic setup) database and file metadata updates. What's more, some SMR drives seem to perform additional maintenance writes even when the drive is only being read from, some kind of impromptu defragmentation, which makes the operation even slower.

The extra effort I wrote about means things like moving the databases to different storage, which shifts some I/O to other units, or adding caches that help the drive defer some writes in the hope of reducing write fragmentation.
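If you do move the databases (the `storage2.database-dir` setting in the node's config.yaml is what points the node at the new location, if I remember the option name right), a quick way to confirm the move actually took effect is to check that the database directory and the piece data no longer sit on the same block device. A small sketch, with purely hypothetical paths:

```python
#!/usr/bin/env python3
"""Check whether the node databases and the piece data live on
different devices. Paths are hypothetical examples; adjust to your setup."""
import os

DB_DIR = "/mnt/ssd/storj-databases"        # where the databases were moved to
DATA_DIR = "/volume1/storj/storage/blobs"  # piece data on the big HDD

db_dev = os.stat(DB_DIR).st_dev
data_dev = os.stat(DATA_DIR).st_dev

if db_dev == data_dev:
    print("Databases and piece data share one device; DB I/O still hits the HDD.")
else:
    print("Databases are on a separate device; the HDD only sees piece and metadata I/O.")
```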

Yes. That being said, I don't have any real-world experience with these drives myself, but I've read about many shenanigans like this that they pull.

I feel like if the firmware devs had taken some time to make sure to stay out of the user's way, the SMR approach might not have had its reputation destroyed like it has. Reading user data should always come first; fix up the internal structure later.

Yeah, they probably do read data first and serve it back to the user, and only then, taking advantage of the sectors still being in cache, write them to a different place. The problem is that this assumes the drive is mostly idle, and Storj nodes usually aren't.

I do have high hopes for host-managed SMR drives; I've read that btrfs already has decent support for them in new kernels. They aren't available on the consumer market at competitive prices yet, though.

They will probably disappear soon, when the new 20TB to 50TB HDDs enter the market. I don't see any big HDDs using SMR tech either. It was just a temporary solution for squeezing more space out of limited-capacity hardware. RIP SMR!

Uh, I'm not sure it's temporary. WD announced 22 TB CMR drives together with 26 TB SMR drives. If they can reliably convert any CMR drive into a corresponding SMR drive with 20% more capacity, they'll do so.


I haven't really noticed the filewalker running with 14 nodes on a 120TB ZFS system, but I have 256GB of RAM and a metadata special device, which probably helps a lot.

It was brutal on my previous windows machine with just bare drives.

On a cold boot, my TrueNAS node running an iSCSI thin LUN at 72% capacity runs the filewalker for approximately an hour, with 3 nodes hitting it with some overlap as well. I say this to show that it isn't even that highly optimized at this point, and it still seems quite strong. It's an 8-disk array of mirror pairs. None of the data is cached in RAM on a cold boot, so it runs the longest this way. Subsequent container reboots go much faster once the metadata is in RAM. Eventually I will move it to native ZFS with TrueCharts / Kubernetes, but I'm starting with a fresh node to be extra cautious…

I upgraded my NASes to 18GB of RAM and now the filewalker doesn't bother me anymore.


It does not matter how much memory you have; it will still crawl millions of small files after a restart and take just as long. And that is the problem, since we need regular restarts.

It does matter for ZFS, as all the metadata it will crawl is stored in RAM, L2ARC, or a metadata special device if you have them. Only if the metadata isn't there will it hit the spinning rust.

More free RAM usually helps even with ext4. The only problematic filesystems are exFAT, NTFS on Linux, and BTRFS (except the BTRFS used in Synology; they noticeably optimized it). ZFS can be slow without tuning and much more RAM.

What would you recommend for Windows 10, 20GB RAM, and a Storj node with 10TB of data on NTFS?
After a reboot or node update, the Storj service gives the HDD a hard time: 99% utilization for over 10 hours, during which the HDD is shaking and noisy.

I set processor scheduling to favor background services, but it didn't make any significant change.

I'm hesitant to turn off the filewalker, as it may have undesirable consequences. Would you recommend it?