Thanks. And is defragmentation at all bad for a Storj node, or is it good to do every now and then?
It must be performed on any NTFS disk for it to work normally, and there is a default defragment job for each disk (it's enabled by default). However, it runs with a low priority (when the disk is not used), so in most cases it is rarely performed on a disk used by the node, due to the almost constant activity.
It's advisable to run it manually in this case if you have issues either with a filewalker or the databases, especially if you have failed writeability/readability checks (Unrecoverable errors) in your logs.
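For reference, this is roughly how you can run it by hand from an elevated PowerShell (a minimal sketch; the drive letter D is just a placeholder for the node's disk):

```powershell
# Analyze first to see how fragmented the volume is (D is a placeholder)
Optimize-Volume -DriveLetter D -Analyze -Verbose

# If the report shows heavy fragmentation, run an actual defragmentation.
# -Verbose prints progress and the final fragmentation report.
Optimize-Volume -DriveLetter D -Defrag -Verbose
```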
It helps, but not too much.
Thanks for the answer.
I never do defragmentation. With the current Storj workload and the largest available HDDs, we have 4-5 million deleted files every day and the same number of new files coming in. Do you really think defrag can keep up with that?
Both Alexey and I have told you what 'analyze' does… it reads the MFT. And by reading the MFT (in order to determine how fragmented a filesystem is) it caches its contents. The OS decides what to keep in RAM: instead of caching an entire file, by reading the MFT contiguously it merely caches a 'handle' to the files it sees as it goes along - only the information in each 1 KB MFT file record (without the file data, unless the file was small enough, roughly under ~527 bytes, to be resident in the record, and not even all the generic metadata attributes - just the extents, i.e. where the file starts and sits on the disk) - not the data within the files, as it would when reading/writing them.

Compare that to the much slower alternative of touching every file by having the OS manually traverse files/directories, like the filewalker does. Reading the MFT is far more efficient than a file-walk, and it pushes that info into an in-memory cache called the metafile cache, then quickly pushes it out to virtual memory as more data is sequentially read from the MFT… So no, you won't see your little 4 GB of RAM used to the max: some will stay in RAM, some will sit in standby, and some gets pushed deeper into the swap file (but it'll increase your commit size).
When the OS API is called with the name - just the name/path - it can and will intercept it from this cache, and if your OS is on an SSD, then even if the entry is older and it has to reach deep into your swap file to get it, it's still far faster than pulling it off an HDD. You can use tools to understand how your RAM works; you can actually watch the 'metafile' cache stats that Windows works with in real time.
So it's not quite like: 'Oh, OS API, you want this file xxxooo.iso? I happen to have that in memory.' But it can and will say to the filewalker: 'You just want the date, time and size? No problem, that's here in memory (be it active memory, standby memory or virtual - it'll pull it from the SSD) - here you go… and I've got lots more index info on other files still in RAM here - what's next?' In fact, it's really only the directory entries that storagenode needs cached, and if you're clever enough to only touch those - to cache them often enough to keep them in the active RAM cache - you could be having exponential filewalking fun! Because those entries contain everything the filewalker wants to know.
You can marvel at this… try a program called 'Everything' - it makes Windows Search look like a piece of sh#t, cuz it only re-interprets the MFT. And a good starting point for understanding your cache would be checking out some Sysinternals apps, like RamMap.
Hope that helps clear up the wizardry.
My 2 cents for today
Also, if it's running on a schedule, it will only operate when it thinks the operator is away - i.e. no mouse movement. If run on demand, it'll grind it out faster.
jeesh… about 6 cents given today
Thx!
And do you think it should be performed periodically? (that "analyze")
I mean, if files are being deleted in the millions daily, the cache becomes obsolete quickly, right?
Is there some other way to do that MFT caching, just like the Windows "analyze" does, but with some 3rd-party program? So it could be configured to run, say, every other day to refresh the cache? (Sorry, I know nothing about caches.)
Well, I know that a .ps1 script can be set up in Task Scheduler to run that Windows tool.
But then again, maybe some 3rd-party tool can do it even better than Windows, I don't know?
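For reference, scheduling it with just the built-in tools looks roughly like this (a rough sketch; the task name, schedule and drive letter are all just examples):

```powershell
# Run the built-in "analyze" every other day at 03:00 to re-warm the MFT cache.
# Task name, schedule and drive letter are placeholders - adjust to taste.
$action  = New-ScheduledTaskAction -Execute 'powershell.exe' `
    -Argument '-NoProfile -Command "Optimize-Volume -DriveLetter D -Analyze -Verbose"'
$trigger = New-ScheduledTaskTrigger -Daily -DaysInterval 2 -At 3am
Register-ScheduledTask -TaskName 'Analyze D (MFT warm-up)' -Action $action -Trigger $trigger `
    -User 'SYSTEM' -RunLevel Highest
```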
Also, I would like to say that yes, @pangolin, defragmentation does not seem to help much.
But if the MFT is fragmented all over the 14 TB disk, then keeping it together should help!
I would also give a virtual gold chest to know how to do it - how to successfully reserve something like 100 GB for the MFT - because I saw no command that would force Windows to listen. Defragmenting the MFT alone is a thing in UltraDefrag 7.14, but Windows seems to reserve only a few MB, while in reality the MFT takes, for example, 66 GB on a full 16 TB disk (UltraDefrag shows me that), and any new files written after defragmenting it land all over the disk again.
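The only built-in knob I know of is the MFT zone reservation, and even that only reserves room for future MFT growth - it won't move or shrink the MFT you already have. A hedged sketch (the exact meaning of the value differs between Windows versions, so check the fsutil docs for yours):

```powershell
# Show the current MFT zone reservation (run elevated; the setting is system-wide)
fsutil behavior query mftzone

# Values 1-4, where 1 is the default and larger values reserve a bigger zone for
# future MFT records. This does NOT move or defragment the existing MFT,
# and a reboot is required for the change to take effect.
fsutil behavior set mftzone 2
```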
In the long run it might be better to go the ZFS path. It seems to be the only filesystem that (almost by coincidence) has exactly the settings we want for Storj: we can keep all metadata + the databases on SSD while all other files stay on HDD, and for the RAM cache we can set exactly what we want cached. I will move my remaining NTFS nodes to ZFS.
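For anyone curious, this is roughly what that looks like on the ZFS side (a sketch only - the pool/dataset/device names and the values are placeholders, not a recommendation):

```sh
# Add a mirrored SSD "special" vdev that will hold all filesystem metadata:
zpool add tank special mirror /dev/disk/by-id/ssd1 /dev/disk/by-id/ssd2

# Optionally also keep small blocks (e.g. the node databases) on the special vdev
# (applies to newly written data only):
zfs set special_small_blocks=16K tank/storagenode

# Keep the RAM cache (ARC) focused on metadata for the piece files:
zfs set primarycache=metadata tank/storagenode/blobs

# A commonly used record size for the piece data:
zfs set recordsize=256K tank/storagenode
```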
I just analyzed my NTFS disks and one of them is at 39% fragmentation. Do I defrag it, or does it not matter?
BTW, on Windows Server 2022 there is a GUI for Tiered storage:
and, reportedly, you can use Windows Server for 3 years for free (a half-year trial + extending it 5 times with the slmgr /rearm command)
I will be testing this setup and will report how it works…
It would be performed automatically, unless you have explicitly disabled it.
However, if you disable it, you may then have performance issues when the drive becomes almost full.
I just added RAM to my servers and even NTFS is working stably under heavy load.
It would help to speed up the filesystem if you do this, at least if you already have issues either with a failed filewalker and/or writeability/readability checks.
You may also report how long the filewalker takes afterwards.
For that you need to find all finished filewalkers in your logs and note their times somewhere so you can compare them later.
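Something like this could collect them (just a sketch: the log path is the Windows installer default, and the 'filewalker … completed/finished' pattern is an assumption - check what your version of the node actually prints and adjust it):

```powershell
# Pull every line that looks like a finished filewalker run out of the node log
# and append it to a text file for later comparison.
# The path and the pattern are assumptions - adjust them to your setup and log wording.
Select-String -Path 'C:\Program Files\Storj\Storage Node\storagenode.log' `
    -Pattern 'filewalker.*(completed|finished)' |
    ForEach-Object { $_.Line } |
    Out-File -FilePath "$env:USERPROFILE\filewalker-times.txt" -Append
```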
You likely switched the policy to give more priority to background services, am I correct? Or do you use Windows Server?
I would try to do it, but my disks are 16 TB each; one disk will take like 12 hours.
It is Windows 10, I haven't switched anything - just 17 nodes and 56 GB of RAM. I would add more, but the Z170 can only support up to 64 GB. I found one app that can set the minimum and the maximum, and I see that by default the max is 0, which means unlimited; I also increased the minimum.
I am also looking into whether it is possible to use page files to store more; my page files are on NVMe, so they will work fast.
This is amusing…
I’ll save you some effort, and give you 2 cents of guidance to ponder.
But in the future, I’d suggest you understand the documentation, and what processes are in play.
Tiered space… yes, it will hold 1 TB of the most-used files in the hot tier!!! Woot! Super awesome! Not… There will be no popular files being requested over and over for upload from your node to Storj that aren't already held in cache and thus on offer. So no… bad luck for you. It's a complete waste of SSD space: 7.5% of your capacity offering super-fast retrieval for anything Storj may want uploaded, winning a few races… but rarely, if that. Why?
Because… in order to keep those files in the 'hot' tier, every 4 hours your OS is going to grind over your entire disk - yes, the whole thing - to build a 'heat map' of the most popular files to move to the faster tier. So… your server is going to go NON-STOP, reading every file on the filesystem, every 4 hours. All the while… oh, here comes another filewalker, walking ALL the files at the same time - oh jeesh'us - WTF, right? … it'll never finish. Then guess what happens: it constantly pulls data off the standard tier, pushes it to the faster tier, and pushes the replaced data back to the standard tier…
YOU’RE ABOUT TO DISCOVER REAL INSANE FRAGMENTATION.
Cuz you've made a system bound and determined to do that 24/7 - just to f# s#it up… 24/7. MS recommends at least a 20% allocation, just to alleviate some of the pain you will endure. Are you prepared with an SSD-speed equivalent - a RAID 10 HDD array with REAL substantial width - as the standard tier? If not, you're going down with this ship, straight to the bottom.
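If you want to see the culprit with your own eyes, it lives in Task Scheduler (at least that's where it sits on my boxes - check yours):

```powershell
# The tasks Windows uses to rebuild the heat map and shuffle data between tiers:
Get-ScheduledTask -TaskPath '\Microsoft\Windows\Storage Tiers Management\' |
    Select-Object TaskName, State

# Disabling the optimization stops the constant churn - but then the tiering
# does nothing useful for you either:
Disable-ScheduledTask -TaskPath '\Microsoft\Windows\Storage Tiers Management\' `
    -TaskName 'Storage Tiers Optimization'
```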
Sorry to be a d#ck, but I'd rather prevent you from banging your head against a wall than worry that I might have offended you. And others could learn from this.
Seriously, you've been told that new test data and future customer use are focusing on a 30-45 day TTL; pointlessly tiering that data is a desperate, never-ending plunge into a hellscape.
Please just report back that you've thought of a better plan, or that you at least understand the concepts of adjusting a write-back cache, pinning, using an overlay, or anything else in context.
Maybe start checking into a storage bus cache, if you’ve got a lot of equipment to burn.
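For what it's worth, pinning is the one tiering feature that does make some sense here: pin a few specific, hot files (e.g. the node databases) to the SSD tier instead of letting the heat map thrash everything. A rough sketch - the tier name and paths are placeholders:

```powershell
# Pin the node databases to the SSD tier of a tiered Storage Spaces volume.
# 'SSDTier' and the paths are placeholders - adjust them to your setup.
Get-ChildItem 'D:\storagenode\storage\*.db' | ForEach-Object {
    Set-FileStorageTier -FilePath $_.FullName -DesiredStorageTierFriendlyName 'SSDTier'
}

# /G runs the storage-tier optimization so the pinned files actually move
# (the scheduled "Storage Tiers Optimization" task calls the same thing).
defrag.exe D: /G
```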
My 2 cents
The gist of your message is correct.
The tiered storage could help a little bit, but at the expense of constant IO on your disks.
However, is it possible to configure it to work like ZFS? I think so - I just do not know how. I know how to configure tiered storage and things like that, but how do I make it work as close to ZFS as possible?
You think right! There are.
If it is possible to make Windows move only metadata to the fast tier, please enlighten me. I am not aware of such an option.