Windows Node keeps crashing

I didn't do a typical defragmentation, but an MFT record defrag. The usual defrag in Windows doesn't do that, as far as I know.

This makes much more sense – defragmenting the MFT saves IOPS on metadata fetches. However, with enough RAM this also becomes irrelevant once all the metadata has been cached.

Yes, I meant the rotational HDD without the DBs.
The DBs can accelerate fragmentation, on both ext4 and NTFS.
That's why the dashboard sometimes takes so long to load.

As we figured out in my other post, it's best to install the program (because of the orders and logs) plus the DBs on a spare SSD, right at the start of the node.
I would add that to the hardware recommendations.
The minimum requirements are something else.
But maybe this is the reason Pi 4 SD cards die?

Unfortunately under Windows it’s not so effective.

This is not necessary, you may change it later.

Partially. But in the case of a docker setup you can configure all parameters beforehand during the setup process: just provide all the needed parameters after the image name, and they will be added to your config.yaml file automatically.
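As a rough sketch of what that can look like (the mount paths and values below are placeholders; storage2.database-dir and log.output are existing config.yaml options):

```sh
# One-time setup run; the flags after the image name end up in config.yaml.
# Paths are examples only -- adapt them to your own layout.
docker run --rm -e SETUP="true" \
    --mount type=bind,source=/mnt/ssd/identity,destination=/app/identity \
    --mount type=bind,source=/mnt/hdd/storagenode,destination=/app/config \
    --mount type=bind,source=/mnt/ssd/dbs,destination=/app/dbs \
    --name storagenode storjlabs/storagenode:latest \
    --storage2.database-dir=/app/dbs \
    --log.output=/app/config/node.log
```

After that, the regular `docker run -d ...` start picks those values up from config.yaml.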

To have less wear on the SD card you need to reduce logging to the SD card or move the logs to the HDD, disable swap on the SD card, and so on. However, they wear out anyway.
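For example, on a Raspberry Pi something like this sketch would cover the swap and log parts (assuming Raspberry Pi OS with the stock dphys-swapfile service and a data HDD mounted at /mnt/storagenode; both are just assumptions for illustration):

```sh
# Turn swap off and keep it disabled across reboots (Raspberry Pi OS).
sudo dphys-swapfile swapoff
sudo systemctl disable --now dphys-swapfile

# Send the node log to the HDD instead of the SD card by setting
# log.output in config.yaml (the path is an example):
#   log.output: "/mnt/storagenode/node.log"
```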

Then there is me, having fun with the 16 GB MFT for 8 TB of data.
And 8 GB of RAM.

IIRC from my Windows days, by default desktop Windows versions optimize for applications and restrict the system cache size. There is an option somewhere to “optimize for applications” vs “optimize for system cache” (or something similar-sounding) that shifts the balance somewhat.

Ideally, however, one should look at the Performance Monitor metric “Memory\System Cache Resident Bytes” and increase the limit if it happens to be constrained and is causing a lot of churn. (In recent versions of Windows a bug was fixed that allowed the cache to grow indefinitely – which, ironically, is what we want here, where we deal with a massive number of small files, some of which are actually stored in the MFT to begin with due to their size.)

This is the API to set it: SetSystemFileCacheSize function (memoryapi.h) - Win32 apps | Microsoft Learn, and this is a Microsoft-provided app that uses the API: https://www.microsoft.com/en-us/download/details.aspx?displaylang=en&id=9258
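For reference, here is a minimal sketch of driving that API directly. The privilege handling is reduced to the essentials, a 64-bit build is assumed, and the 1 GB / 16 GB values are arbitrary examples, not recommendations:

```c
// Minimal sketch: query and raise the system file cache limit.
// Requires an elevated process; link with advapi32.lib.
#include <windows.h>
#include <stdio.h>

// SetSystemFileCacheSize needs SeIncreaseQuotaPrivilege enabled on the
// process token, so enable it first.
static BOOL EnablePrivilege(LPCWSTR name)
{
    HANDLE token;
    TOKEN_PRIVILEGES tp = { 0 };

    if (!OpenProcessToken(GetCurrentProcess(),
                          TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY, &token))
        return FALSE;

    tp.PrivilegeCount = 1;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
    if (!LookupPrivilegeValueW(NULL, name, &tp.Privileges[0].Luid)) {
        CloseHandle(token);
        return FALSE;
    }

    BOOL ok = AdjustTokenPrivileges(token, FALSE, &tp, 0, NULL, NULL)
              && GetLastError() == ERROR_SUCCESS;
    CloseHandle(token);
    return ok;
}

int main(void)
{
    SIZE_T curMin = 0, curMax = 0;
    DWORD flags = 0;

    // Show the current limits first.
    if (GetSystemFileCacheSize(&curMin, &curMax, &flags))
        printf("current: min=%zu max=%zu flags=0x%lx\n",
               (size_t)curMin, (size_t)curMax, (unsigned long)flags);

    if (!EnablePrivilege(L"SeIncreaseQuotaPrivilege")) {
        printf("could not enable SeIncreaseQuotaPrivilege (run elevated)\n");
        return 1;
    }

    // Example values only: 1 GB minimum, 16 GB hard maximum.
    SIZE_T newMin = (SIZE_T)1 << 30;
    SIZE_T newMax = (SIZE_T)16 << 30;

    if (!SetSystemFileCacheSize(newMin, newMax, FILE_CACHE_MAX_HARD_ENABLE)) {
        printf("SetSystemFileCacheSize failed: %lu\n",
               (unsigned long)GetLastError());
        return 1;
    }

    printf("system file cache limits updated\n");
    return 0;
}
```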

I would say that you can dive deeper. There is an explicit setting on each HDD: do you want to cache writes or not.
For our case, if you have a managed UPS (so the machine can be shut down safely before it runs out of juice), you may enable write caching. If not, you must not enable it.
Otherwise you get the consequences: lost data, corrupted databases and, in the edge case, disqualification. Simple.

No, this is entirely different. We were talking about increasing the size of the system file cache to fit the MFT in its entirety to eliminate metadata read seek time. There is no increase in the risk of data loss.

Separately, you still need a UPS. Windows does not perform all writes synchronously; that would be horrible. Enabling per-disk write caches just increases the amount of data that will be lost in the event of power loss. And often that amount is insignificant in the context of a storage node. It’s very hard to corrupt the MFT itself – there is a second copy. But this is not what we are discussing.

Which kind of node has 16 GB of RAM to spare for cache? Even with 32 GB of RAM I don't have that.
Sounds like PrimoCache with extra steps.

It's a 17 GB MFT for 8 TB of data in my case.

I use PrimoCache with 512 GB–2 TB SSDs and no RAM caches, because those would be lost in case of power loss.

PrimoCache would effectively result in a similar performance boost – frequently used blocks, including those belonging to the MFT, will end up cached on the SSD.

My node has 32 GB (the max my old motherboard/processor supports), out of which 9 GB is used by services, so the rest goes to the filesystem cache. I get a 99% cache hit ratio. I consider my hardware measly ancient crap (circa 2012) and feel slight embarrassment discussing it. I’m sure most people run nodes on modern hardware that is not only beefier but more power efficient.

Remember, we are not building new hardware to run node on, we use what’s already deployed.

On the other hand, if your node is already resource constrained by hardware, why are you even considering running windows on it in the first place? There are much more lightweight storage-focused OSes.

The cache warms up within 10 minutes on node start due to file walker. I would not worry much about the cache persistence.

My box has 16 GB of ECC RAM for 25 TB+ worth of nodes. The RAM was cheap, I got old used chips for almost pennies; IIRC I spent half of a single month of Storj revenue. A good investment too, it made a huge difference to the win ratio and profit.

I wish I could fit more RAM, but this box won’t accept it; the CPU’s too old.

BTW, storage nodes could be optimized more regarding memory use. I don’t know how much RAM is needed by NTFS, but for default ext4 setups 1 GB of RAM can hold the metadata of 1 TB worth of nodes. Optimized ext4 can go as low as 500 MB of RAM per 1 TB worth of nodes. I’ve got a back-of-a-napkin estimate that it should be possible to write code so that only 200 MB of RAM would be necessary per 1 TB of nodes, regardless of the underlying file system. It’s a lot of work, though; it’s easier to just get more RAM or an SSD for caching.
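To illustrate where a figure like 1 GB per 1 TB can come from, here is a rough reconstruction; the per-TB piece count (and thus the implied ~250 KB average piece size) is purely my illustrative assumption, while 256 bytes is the default ext4 inode size:

$$
4\times10^{6}\ \text{pieces/TB} \times 256\ \text{B/inode} \approx 1\ \text{GB/TB}
$$

Formatting with 128-byte inodes, for example, roughly halves that, which is one way to land in the ~500 MB per TB territory.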

That would be awesome, especially for NAS users with limited RAM expandability.

The thing here is, at the current Storj scale it’s not worth the effort. You’d probably need 2-4 weeks of a dedicated, experienced developer’s time to get it right, probably twice that if you include testing for reliability. And the design would not be as flexible as the current one, so developing new features would also be impacted. All that to help nodes which are just a small slice of the network: they’re not among the thousands of nodes small enough not to be affected yet, but they’re too cheap to just get more RAM and actually become a large node.

It could probably make sense when Storj gets to the exabyte scale and gets the unit profitability right, because then spending time on getting such slices of the network right will be profitable.

There must be improvements to consider in the near future, to be implemented in the Storj software, taking into account that the HDDs used for Storj are getting bigger and bigger, the number of pieces stored is huge, and the speed, I/O capacity and seek times aren’t increasing.

The node crashed again last night and I decided to gracefully exit the whole 8 TB node. I will rebuild it on a Linux and ext4 basis. NTFS and Windows are just poor and will always create problems.

I strongly disagree. I’ve got 14 nodes, all on the Windows GUI, and I’m very happy with how it works. I tried docker 3 years ago, and it was always a pain to set up or edit. With Windows I have no problems under Win 10, other than that Win 10 likes to do whatever it wants with updates, no matter what the user says, but I managed to deal with it and it’s smooth now.

edit: I tested it on different computers with different parts, Intel and AMD, always smooth with NTFS if you have at least 2.5 GB of RAM per node.

edit2: also, I never defragmented a single HDD; all I have to do is watch whether they are online and pay the internet bills on time :>
Thanks to the services option, like daki82 mentioned, it makes the node restart-proof.

Because the second node runs on my gaming PC: 16 GB of RAM for the system and other things, the other 16 for gaming.
There is no way to switch to another OS that fits it all.