PC speed improvement

You keep repeating this unsubstantiated claim.

If you set up the node wrong, yes.

Filewalker is your friend. It touches all
metadata and thus pre-warms caches on start, so that your node can start returning data faster and win more races. If there were no filewalker, running stat on every file at start would be a prudent thing to do for the same reason: to ingest metadata into the cache.
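
The stat-every-file idea can be sketched in a few lines of Python. This is a hypothetical illustration, not Storj code; `prewarm_metadata` is a name I made up:

```python
import os

def prewarm_metadata(root: str) -> int:
    """Recursively stat every entry under `root`, forcing the
    filesystem to pull its metadata into the OS cache.
    Returns the number of entries touched."""
    touched = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                os.stat(os.path.join(dirpath, name))
                touched += 1
            except OSError:
                pass  # entry vanished mid-walk; ignore it
    return touched
```

Pointing it at the blobs directory once on boot approximates what the filewalker's metadata pass does for the cache.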

If your filewalker runs for days, you don't have enough RAM or you have a misconfigured filesystem. You should not run the node until you fix that.

As a point of reference, the filewalker on my 10TB node takes about 3 minutes, and it is CPU-bound, not stuck in IO wait. I've posted performance graphs before.


There is a lazy filewalker on start anyway; that's enough.
No need for this crazy filewalker that runs for days.

What unsubstantiated claim? What's unsubstantiated here? That it scans for DAYS when it holds a full 7TB? If you have settings that are NOT wrong and let it finish in under 30 minutes, then I'm all ears.
I have 14 nodes, 14 HDDs; the classic filewalker was always wearing out my HDDs.
STORJ only recently came to its senses and allowed turning it off, replacing it with the lazy filewalker. Why do you think they did THAT?

Edit: Don't be arrogant, read the forum, it's common knowledge. For example, my 7TB node was under the filewalker for more than 24 h for sure; I don't remember exactly, but it was done within 48 h. So, two days. Now that I have it disabled it's over in 30 minutes or an hour, maybe two.

I’ve just explained that with properly configured hardware it won’t run for days.

This one:

The claim that the filewalker is somehow wearing out your HDD, along with the contradicting claim that the lazy filewalker somehow does not 🙂

An HDD doing what it has been designed to do is not "wearing out". Lazy filewalker or not, the wear is the same.

I did, in multiple posts. You can review them if you want; no reason to repost.

Well, shouldn't that tell you something? That you keep replicating the same bad setup? I have three nodes, and the filewalker is not a problem on any of them.

Because they got tired of reading complaints on the forums from people who can't figure out how to configure storage and yet try to be service providers.

Read the forum, and read my edit…
We're DONE talking, arrogant one. The guy should try for himself; just remember, @pauliv, to set
"storage.allocated-disk-space: " to 80-90% of your real free space, not more.

Dude. It is common knowledge that if you don't have enough memory available to cache metadata, you will end up in that situation. This also means you will have extra seek latency associated with the metadata read on every download, and will be losing more races than necessary. You have a bad setup. Fix it. The filewalker is not a problem.

Look at one of my posts with IO graphs over time. You'll see there are about 3 minutes of metadata reads at 100% CPU core usage, with IOPS gradually diminishing to zero over this time as the metadata pages are ingested into the RAM cache.

Then subsequent reads are spending no time looking up data and go straight to download.

If the filewalker did not exist, I would have stat-ed every file on start to emulate this behavior and prewarm the caches. I do that for other services anyway (minio).

You lost me here. If you disabled it, what does take 30 min?

I think I have enough RAM, but I'll do an experiment and add 2-3 times more and see. If it helps with the classic filewalker just like you described, I'll give you credit. But RAM is always more than enough at start, about 2.5GB per node; 1 node, 1 HDD. And RAM usage doesn't go above 60-70%, which is why I'm concluding the RAM is enough. But I don't know; maybe going to 4-5GB will really make a big difference at node start. I'll try and let you know. For now, I know that turning off the classic filewalker works wonders for HDD usage at start, like night and day.

Now the initial scan of the STORJ HDD is over in 30 minutes; the 100% usage of the HDD is often over in 10-15 minutes. There is still some scan, just not the stupid classic filewalker that goes one by one over every file on every node restart, I believe.

A. 2.5GB of RAM for a 7TB node is still low.
The best approximation of the RAM needed for an optimal machine setup is 1GB of RAM per 1TB of node.
When I was using only 2GB of RAM, the FW indeed ran for days; after upgrading to 18GB, it runs for an hour or two. I haven't checked it in months, because I keep it off as well. I don't see any benefit from keeping it on, but I'd recommend running it at least once a year.
My HDDs are CMR, ext4, SATA connections.

B. The services that run at start are the FW (if it's ON) and the garbage collector. You can't turn the latter off.

C. What you see as memory used is the memory used by programs, which is usually a few hundred MB or a few GB. But the rest of the RAM is not free; it's occupied by buffers and cache, which help a lot in speeding things up. So unused RAM doesn't happen with storage nodes.
You can check my findings in the Tuning the Filewalker thread.
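
To illustrate the buffers/cache point, here's a small Python sketch that parses `/proc/meminfo`-style output. The sample numbers are made up for the example; on a real Linux box you would read the actual `/proc/meminfo`:

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style lines into {field: value_in_kB}."""
    out = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            out[key.strip()] = int(fields[0])  # first field is the kB value
    return out

# Made-up sample: a 16GB machine that looks "almost out of memory".
sample = """MemTotal:       16384000 kB
MemFree:          512000 kB
Buffers:          300000 kB
Cached:         12000000 kB
MemAvailable:   12500000 kB"""

info = parse_meminfo(sample)
# "Free" looks tiny, but most of the RAM is reclaimable cache doing useful work:
cache_pct = 100 * (info["Buffers"] + info["Cached"]) / info["MemTotal"]
```

Here roughly 75% of RAM is buffers and cache, which is exactly the healthy state for a storage node: metadata lookups are served from memory instead of the disk.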

Please keep it civilised, and don't attack the person; just debate the subject of discussion. It's one of the terms of conduct here on the forum. It's very easy to let your sentiments dictate your responses, but as I learned the hard way in my younger days, the best way to speak on forums is to be productive and to counter the subject of discussion, not the persons discussing it. Speaking on forums the right way has a learning curve, as many things do; just be open to improving. This is for everyone here, not just you in particular. I made many mistakes over the years, and my best achievement is that I learned from them, from other forum members, and from friends wiser than me.


Here is my task manager.
Disk 1 and Disk 2 hold data for storage nodes, loaded at 100%. At the same time, RAM usage for the SN is up to 50MB.


RAM doesn't help as much on Windows as it does on Linux.


I noticed that my first node has an MFT of 16GB,
and 8GB of RAM.
Defragging is terribly slow. The swap file is used a lot.

I would suggest, if this is not your Storj-only PC, depending on these questions:

Are there free RAM slots? 2? 4?
How much RAM can the mainboard manage?
Dual channel?
Use CPU-Z to check if you don't know.
Is there a 3.5" slot with SATA + power available? Maybe NVMe?
Budget?

I would go for 2x16GB (or 2x8GB) of RAM and a Samsung SSD of 2TB (or 1TB/512GB), then do more overprovisioning and leave 256GB free if you want to go with PrimoCache.

Then put the databases, the orders folder, and the log file folder (consider setting the log level to error) of both nodes on that SSD. Also the swap file, since your OS drive seems to be an HDD as well.
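
A rough sketch of what that might look like in the node's `config.yaml` (the paths are placeholders, and option names can vary between versions, so check them against your own config before changing anything):

```yaml
# Hypothetical example paths; point them at your SSD drive/mount.
storage2.database-dir: "D:\\storj-ssd\\node1\\db"       # sqlite databases off the HDD
log.level: error                                        # keep the log small
log.output: "D:\\storj-ssd\\node1\\storagenode.log"     # log file on the SSD too
```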

The logs! Maybe they are bloated.

Stop the nodes. Delete/rename the log files (storagenode.log), then start the nodes.

Oh boy. Your GPU uses your RAM too.
Maybe you have only one 8GB stick?

Test the new RAM first with memtest.
Maybe start it with 2GB of RAM each.

After that I'd run UltraDefrag on all drives.

Then you can slap PrimoCache on top, if that's not enough.

My opinion, no guarantee.

Make it 2GB per TB. My 8TB node has a 16GB MFT.

I'm not really sure whether this metadata depends on sector size, but maybe 512e drives result in bigger metadata/filesystem/MFT (I'm confused by all these new terms ;)) and 4Kn in smaller, because there are fewer sectors. The 1GB per TB of space was reported by others.
Is there a way to check its size with a command? Or do you just take the number of files and multiply by something?
How exactly can you calculate it for Linux systems, with ext4?

Ummm. In Windows NTFS, UltraDefrag shows the size after analysing the drive.
Maybe 1:1 for Linux
and 2:1 for Windows?

I don't even know if ext4 has an MFT equivalent…

ext4 has inodes which, by default, take 256 bytes per file or directory, with some additional small amount stored in directories for file names. With typical Storj usage this means about 1 GB of inode data for each 1 TB of stored blobs.

If you plan ahead, you can reduce ext4 inodes to 128 bytes, then it’s half a gig per terabyte.
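
The 1 GB-per-TB figure falls out of simple arithmetic. As a back-of-the-envelope sketch (the ~256 KB average piece size is my assumption, not a measured Storj value):

```python
# Rough ext4 inode overhead per TB of blobs.
# ASSUMPTION: average stored piece is ~256 KB.
TB = 10**12
avg_piece_bytes = 256 * 1024
files_per_tb = TB // avg_piece_bytes          # roughly 3.8 million pieces

for inode_size in (256, 128):                 # default vs. shrunk inodes
    overhead_gb = files_per_tb * inode_size / 10**9
    print(f"{inode_size} B inodes: ~{overhead_gb:.2f} GB per TB of blobs")
```

With 256-byte inodes that lands right around 1 GB per TB, and halving the inode size halves it, matching the figures above.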

Random googling suggests that a single MFT entry is 1 KiB, but that's surprising, so I'm not sure about it. Also, this number would suggest that file walking would be quite a lot slower than on ext4.