Fluctuation of audit scores, recovery

Hello!

Two weeks ago we had a prolonged power outage in my area due to storms. Unfortunately, it happened while defragmentation was running on my node, which led to corruption of the MFT. I was able to recover the system using CHKDSK, but I suspect that some parts of the MFT may still be damaged.

Since then, my Audit Score has been fluctuating — sometimes it improves slightly, then drops again. Is there any way to fully restore it? Any suggestions for stabilizing node reputation would be greatly appreciated!

Windows Server 2022, 1 HDD / 20 TB (Exos), NTFS

Thanks in advance!

I think all you can do is wait, and cross your fingers. The audits are probably occasionally requesting a .sj1 file that got damaged, which decreases your score - but as long as you stay over 96% you won’t get disqualified.

Although you can’t repair things yourself… data is constantly being slowly deleted and new data is uploaded… so statistically over time your damaged files would probably be deleted anyways. So… your score should improve.
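
To picture why the score wobbles and then slowly creeps back up, here is a rough back-of-the-envelope simulation. It is not the satellite’s actual reputation formula; the memory factor, damaged-piece fraction, and churn rate are made-up illustrative numbers.

```python
# Rough illustration (NOT the satellite's exact reputation formula) of why the
# score can hover and slowly recover: each audit hits a damaged piece with some
# probability, the score behaves like an exponentially weighted average of
# outcomes, and normal churn slowly deletes the damaged pieces.
import random

LAMBDA = 0.999          # assumed memory factor per audit (illustrative only)
DQ_THRESHOLD = 0.96     # disqualification threshold mentioned above

def simulate(days=90, audits_per_day=3000, damaged_fraction=0.02,
             daily_turnover=0.003, seed=42):
    """Return the simulated audit score at the end of each day."""
    rng = random.Random(seed)
    score, history = 1.0, []
    for _ in range(days):
        for _ in range(audits_per_day):
            hit_damaged = rng.random() < damaged_fraction
            outcome = 0.0 if hit_damaged else 1.0
            score = LAMBDA * score + (1 - LAMBDA) * outcome
        # churn: a small share of all pieces (damaged ones included) gets deleted
        damaged_fraction *= (1 - daily_turnover)
        history.append(score)
    return history

if __name__ == "__main__":
    for day, s in enumerate(simulate(), start=1):
        if day % 15 == 0:
            flag = "  <-- below DQ threshold!" if s < DQ_THRESHOLD else ""
            print(f"day {day:3d}: score {s:.4f}{flag}")
```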

I’m hoping for the same. Thank you very much for the response!

@havri1 Let me share a little real-life story..

I’ve had a similar situation on a node running on some potato hardware about a year back.
It experienced a hard crash with power failure, likely corrupting a few files.
I’ve since moved it to another server with UPS and ZFS, and even migrated it to hashstore (why not test on it now that it was somewhat doomed anyway :wink: )

However, it’s still hovering at about the same score it was after the crash.

Here’s the last 90 days:

Today it holds 5.5 TB of data and receives about 3000 audits/day, so there would be plenty of opportunity for disqualification if it were going to happen.

You might be just as lucky.. perhaps this can bring some peace of mind :slight_smile:

Thanks for the info! Let’s hope they hold up and improve over time, especially if the corrupted content does actually get deleted. I’ve started to second-guess whether I should keep a 20TB node running like this. I’m planning for the long term. Right now, it’s 5 months old and has about 3TB of data on it. I’ll report back later with updates — for now, I’m leaving it as is. (It just refreshed and improved again, but we both know… it’s not that simple.)

Of course, now is the time to consider mitigation strategies for another such occurrence.
Perhaps you might wish to consider a UPS? :slight_smile:

Unfortunately, it’s not a solution in this case. I do use a UPS. A defragmentation process takes a very long time. The uninterruptible power supply has limited capacity, and if it reaches a critical charge level, it shuts down the computer—even if defragmentation is in progress. My current setup can bridge roughly 25–30 minutes (140W). The power outages caused by the storm lasted 12 hours, 14 hours, and 7 hours, respectively.
Scheduled electrical maintenance typically occurs twice a year; it’s announced in advance, and I always prepare for it, of course. The storm (here in Hungary) was unexpected and caused significant damage in many areas—nationwide.
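
For anyone sizing a UPS for this kind of load, here is a quick estimate along the lines of the numbers above (140 W, roughly 25–30 minutes of bridge time). The battery capacity, usable fraction, and inverter efficiency below are assumptions you would replace with your own UPS specs.

```python
# Back-of-the-envelope UPS runtime estimate using the numbers from the post.
# Battery capacity and inverter efficiency are illustrative assumptions only;
# plug in your own UPS specifications.
def runtime_minutes(battery_wh: float, load_w: float, inverter_eff: float = 0.85,
                    usable_fraction: float = 0.8) -> float:
    """Approximate bridge time in minutes for a given load in watts."""
    usable_wh = battery_wh * usable_fraction * inverter_eff
    return usable_wh / load_w * 60

if __name__ == "__main__":
    load = 140  # W, as mentioned above
    for wh in (72, 100, 200, 400):
        print(f"{wh:4d} Wh battery -> ~{runtime_minutes(wh, load):5.1f} min at {load} W")
```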

The situation is slowly improving, and I hope it won’t get any worse or result in the node being excluded.

Why do you need to defragment?

Exactly. Don’t use windows. Don’t use NTFS. Don’t defragment anything.

A UPS is a good idea, just make sure the timeouts and delays are configured properly to allow the system to shut down fully before power is pulled. The default ones may not be sufficient. PowerChute is horrific trash, use something else. In the same vein, Back-UPS units are also horrific garbage, so I guess no harm there. I strongly recommend replacing them with a Smart-UPS. They are an entirely different class of product and it shows.

On the other hand, note that Windows is known to corrupt the filesystem even on a clean shutdown (easy to check: schedule a boot-time chkdsk and reboot. On my PC I always used to see garbage and corruption).
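
If you want to automate that kind of check, here is a minimal sketch using the standard `fsutil dirty query` command. The exact output wording can vary between Windows versions and locales, so the parsing is best-effort.

```python
# Minimal sketch: check whether an NTFS volume is flagged "dirty" (i.e. Windows
# thinks it needs a chkdsk) using the standard `fsutil dirty query` command.
# Output wording can vary by Windows version/locale, so the parsing below is a
# best-effort assumption. Run from an elevated prompt.
import subprocess
import sys

def volume_is_dirty(volume: str = "C:") -> bool:
    result = subprocess.run(
        ["fsutil", "dirty", "query", volume],
        capture_output=True, text=True, check=True,
    )
    out = result.stdout.lower()
    return "not dirty" not in out   # "Volume - C: is NOT Dirty" when clean

if __name__ == "__main__":
    vol = sys.argv[1] if len(sys.argv) > 1 else "C:"
    print(f"{vol} dirty bit set: {volume_is_dirty(vol)}")
```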

For node performance (and by that I mean for the node to not be noticeable) you should have enough RAM to cache all the metadata, and use a filesystem and OS willing to do so. Nothing else will work. Windows and NTFS are not such an OS and filesystem. When you find yourself defragmenting disks, something has already gone wrong. It makes no sense to do so; it’s like attempting to fix a missing limb with a band-aid.

I agree. I’ve just purchased this UPS, which is classified in a higher category than Smart. For that level of capacity, I couldn’t find a better price-to-performance ratio. It wasn’t cheap. The server that runs the node also serves other purposes, and I’m most comfortable using Windows—switching to another OS would be too big of a task for me at this time. I’m sure it’s doable, just not by me, unfortunately.

As for defragmentation, I perform it to keep the node speed satisfactory. My first node, after three months, became so fragmented that it couldn’t handle its tasks properly and kept failing. Currently, defragmentation is set up to run a quick process whenever the disk reaches 1% fragmentation. It usually takes about an hour and happens once or twice a week.

But are you saying it’s unnecessary?

The issue mentioned above is actually more complex. I used software (O&O Defrag Server) that also defragmented the MFT. I even read about this here on the forum. Unfortunately, I experienced an unexpected power outage right in the middle of the process, which was supposed to take around 4 days. The interruption may have caused system damage, which CHKDSK only partially repaired.
RAM: 384GB - Windows Server 2022.
I welcome any advice! (I’m using PrimoCache software to accelerate the HDDs.)

I have not defragmented a hard drive in over 30 years. It never really provided any real improvement compared to the time, risk, and HDD wear involved.

If you’re running into speed issues, especially at 1% fragmentation, add more memory to the server.

Oh, you already have a fair bit.
What else runs on it?

Torrent, databases, Pi Node (Docker), Plex server, and the Storj node.

I’d look at CPU load, memory utilisation, and network load.
Disk fragmentation is very unlikely to be your problem (but it was the cause of your corruption).

Unfortunately, for NTFS it’s necessary, at least the default automatic tasks, especially if you use a piecestore backend on the node. This has been confirmed by many people here; however, tools that defragment the MFT can be dangerous.

Did you disable the automatic default tasks too? Because if so, the node will have performance issues.

This :up_arrow: usually accounts for 99% of the fragmentation.
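
If you want to verify those default tasks are still in place, here is a small sketch. It assumes the usual built-in task path \Microsoft\Windows\Defrag\ScheduledDefrag, which may differ on some systems.

```python
# Quick check that the built-in Windows maintenance defrag task was not disabled.
# The task path below is the usual default (\Microsoft\Windows\Defrag\ScheduledDefrag);
# adjust it if your system differs.
import subprocess

TASK = r"\Microsoft\Windows\Defrag\ScheduledDefrag"

def scheduled_defrag_status() -> str:
    result = subprocess.run(
        ["schtasks", "/Query", "/TN", TASK, "/FO", "LIST"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return f"task not found or query failed: {result.stderr.strip()}"
    return result.stdout.strip()

if __name__ == "__main__":
    print(scheduled_defrag_status())
```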

Hopefully you’re using PrimoCache ONLY for READ acceleration; if you have ANY WRITE cache, a crash would mean corruption of EVERYTHING in that portion of the cache. Hence your current audit stats. Even a 64 GB write cache would be: 64 GB / 4% = 1,600 GB (~1.56 TB), and you’d be disqualified in less than an hour after a re-boot, as the entire amount of that cache would be gone.
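
Here is a worked version of that arithmetic, using the ~4% loss tolerance implied by the 96% threshold mentioned earlier in the thread (the exact tolerance figure is an assumption here):

```python
# Worked version of the arithmetic above: with roughly a 4% loss tolerance
# (the 96% audit-score threshold mentioned earlier in the thread), losing an
# entire write cache on a crash only stays within tolerance if the node stores
# enough data that the lost cache is a small enough fraction of it.
def min_stored_gb(cache_gb: float, tolerance: float = 0.04) -> float:
    """Minimum stored data (GB) for a lost cache to stay within the tolerance."""
    return cache_gb / tolerance

if __name__ == "__main__":
    for cache in (16, 32, 64):
        need = min_stored_gb(cache)
        print(f"{cache:3d} GB write cache lost -> need > {need:,.0f} GB "
              f"(~{need / 1024:.2f} TB) stored to stay under 4% loss")
```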

Keep with occasionally defragmenting the MFT.

You have more than enough RAM to cache the entirety of the MFT: 3 TB at an average 192 KiB blob size would be ~16,777,216 files, so at 1 KiB per record that’s 16 GB (5.33 GB/TB).
Go figure out and ensure that your Windows server is using large cache mode.
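
For reference, the same sizing math as a small script; the 192 KiB average blob size and 1 KiB per MFT record are taken from the post above, not measured values:

```python
# Reproduction of the sizing math above: how much RAM it takes to cache the
# whole MFT, assuming ~192 KiB average blob size and 1 KiB per MFT record
# (both figures taken from the post, not measured values).
KIB, TIB = 1024, 1024**4

def mft_cache_bytes(stored_tib: float, avg_blob_kib: float = 192,
                    record_bytes: int = 1 * KIB) -> tuple[float, float]:
    """Return (number of files, bytes of MFT records to cache)."""
    files = stored_tib * TIB / (avg_blob_kib * KIB)
    return files, files * record_bytes

if __name__ == "__main__":
    files, mft = mft_cache_bytes(3)          # ~3 TB node from the post
    print(f"~{files:,.0f} files -> ~{mft / 1024**3:.1f} GiB of MFT records")
    print(f"that's about {mft / 1024**3 / 3:.2f} GiB of MFT per TiB stored")
```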

I also use Windows Servers, among other OSes on a cluster of vSan nodes.
I set my MFT on a pre-constructed SSD image within the NTFS volume itself, while all large record data remains on the HDD. This technique forces all small records to be accessed with full acceleration, and more than DOUBLES the HDD IOPS of the NTFS volume (for any usage pattern whatsoever, since half of every record access NTFS makes involves touching the MFT, which now resolves near instantly; the HDD head stays where it is and continues reading data contiguously without having to reference the MFT for every next block). I use a 4k cluster size, with a max of 16 TB per node/HDD; anything higher would be counterproductive for Storj node use, IMO. This manually perforated MFT of 1 MB contiguous segments reaches ~160 GB, of which 86 GB is the actual MFT (5.375 GB of MFT entries x 16 [TB] = 86 million files); the remainder lets all files with records <16k reside either inside the MFT records themselves (<4k) or within these 1 MB segments in between the contiguous portions of the MFT, thus forcing all >1 MB data segments onto the HDD itself. It’s not actually much different from the ZFS setups many here use, just less rigid.
If I ever ever ever need to defragment, it would only be to defrag files smaller than 16k-32k, since that only pulls and sorts data onto the NVMe (from the NVMe area or the HDD area, one way), so it’s quick and efficient. The 4k cluster size is also chosen to match native 4Kn drives and NVMe, and it additionally allows the use of compression, which gives a segmented advantage: you can flag specific directories for easier block manipulation when it comes to any fragmentation. In the case of Storj, think of the trash directory and having advanced control to pre-defrag future trash removal as advanced cleaning/maintenance of the main data. Obviously useful for other things too, since you can defrag only files flagged for compression, much like by record size.
BTW, it kicks ZFS ass. I also have ZFS servers with all the bells and whistles employed, and it’s certainly useful for everything as well.
Not here to argue, not here to make a guide/tutorial or anything. Just giving a HINT of what’s possible if you actually think outside the box.

Other ideas:
You might consider employing CAS, Intel’s old, now-abandoned open-source cache software; it’s free but very stable. It’s a far better choice than trying to get any productivity out of PrimoCache, and much, much safer.

You could also try V-locity; it’s excellent for real-time fragmentation prevention and background defragmentation. Its ounce of prevention is worth a pound of cure. It’s also useful in clustered file systems, as it can be configured both as a client and as part of a network server head end that coordinates the various participating nodes.

You could also use ReFS; it will produce far less fragmentation for this use case.

After about 4 TB, NTFS will have scattered 200 MB segments randomly everywhere, and yes, people often must defrag, as the aforementioned record-access HDD head movement makes accessing anything unbearably slow when you didn’t plan ahead.

Good luck!

25 cents,
Julio

I have a 60-second delay configured, and that hasn’t caused any issues so far. But if I only use it for read acceleration, it’s not effective because the number of requests is unpredictable—it doesn’t consistently read the same files. At least that’s what I’ve noticed.

What you wrote was very helpful, thank you. I’m interested in moving the MFT to the SSD, I’ll look into that.

Is NTFS on Windows Server really this bad today? Does it need to defragment SSDs and NVMe drives, or just hard drives?
I haven’t worked on MS Server for many years.

Basically, Windows’ built-in defragmenter works in the background. Third-party programs can be more efficient, but in some cases they may also be risky — as in my case. SSD drives don’t need to be defragmented (in fact, programs usually prevent it), only optimized. This is essentially the TRIM operation.
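
If you want to confirm that TRIM is actually enabled, here is a minimal sketch using the standard `fsutil behavior query DisableDeleteNotify` command (a value of 0 means delete notifications/TRIM are enabled; the output wording may vary by Windows version):

```python
# Small sketch: check whether TRIM (delete notifications) is enabled on Windows
# via the standard `fsutil behavior query DisableDeleteNotify` command.
# DisableDeleteNotify = 0 means TRIM is enabled; exact wording can differ a bit
# between Windows versions.
import subprocess

def trim_status() -> str:
    result = subprocess.run(
        ["fsutil", "behavior", "query", "DisableDeleteNotify"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

if __name__ == "__main__":
    print(trim_status())
```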

They are good enough if you didn’t disable these tasks:

And regarding SSDs, it’s not needed, because they work physically differently, so fragmentation doesn’t affect them; usually you don’t need to do anything with them. They can even do TRIM themselves internally.