Recommended GB of RAM / TB stored?

I was reading somewhere about a ratio of 0.5–1 GB of RAM per TB stored, possibly related to ext4 inodes. I’ve been searching for nearly an hour, and I cannot seem to find where I saw this.
Can anyone help with a link, and/or an explanation of where this ratio applies the most?

That seems excessive. The RAM usage of the node itself is related more to the amount of incoming traffic than to the data stored. But I’d say it’s impossible to make such a general declaration to begin with. If you have slow storage, you’re going to need more RAM. File systems such as ZFS may need more RAM as well, same as certain other RAID solutions. If I had to give a number, I’d say 2GB per node to be on the safe side, though 1GB per node probably works as long as you have a minimum of around 4GB of RAM in the system. But these numbers really depend on many variables.

Please search for my posts on the topic, especially the low i/o storage thread. Sorry, too tired to look for it now.


In @Toyoo’s thread the assumption was that you want the entire file system cached in RAM to speed up the walkers and the general behavior of your node. Based on the median piece size and the size of inodes, that worked out to roughly 1GB of RAM per 1TB of stored data.

I have three nodes (HDDs) on a single machine using EXT4 storing a total of 31 TiB.
I have 6GB of RAM.

free -h
               total        used        free      shared  buff/cache   available
Mem:           5.8Gi       506Mi       134Mi       1.0Mi       5.2Gi       5.0Gi
Swap:          4.0Gi       288Mi       3.7Gi

Indeed. Plus I was assuming a specific average piece size, which has gone down a lot since that time.

So, effectively, default-formatted ext4 requires 256 bytes per file for the inode and ~60 bytes per file for the directory entry (file name + 8 bytes of overhead), other costs being negligible. So 1 million pieces ≈ 320 MB of metadata.

mke2fs -I 128 reduces the inode size to 128 bytes, making 1 million pieces’ metadata weigh around 190 MB.

Now divide 1 TB by the average piece size to get the number of pieces, multiply by the per-piece metadata cost, and you get your numbers per TB stored.
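The estimate above can be sketched in a few lines. This is just an illustration of the arithmetic from this thread (256-byte inodes by default, 128 with mke2fs -I 128, ~60 bytes per directory entry); the function name and parameters are made up for the example:

```python
# Rough sketch of the per-piece metadata estimate described above.
# Assumed figures (from this thread, not measured): 256-byte inodes
# (128 with `mke2fs -I 128`) plus ~60 bytes per directory entry.

def metadata_mb_per_million_pieces(inode_bytes=256, dirent_bytes=60):
    """Estimated ext4 metadata (in MB) for one million pieces."""
    bytes_per_piece = inode_bytes + dirent_bytes
    return 1_000_000 * bytes_per_piece / 1e6  # bytes -> MB

print(round(metadata_mb_per_million_pieces()))               # ~316 MB (the "≈ 320 MB" above)
print(round(metadata_mb_per_million_pieces(inode_bytes=128)))  # ~188 MB (the "~190 MB" above)
```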

I recommend studying the ext4 documentation here: ext4 Data Structures and Algorithms — The Linux Kernel documentation. It’s pretty well-written.


What’s the average piece size now?
Found it:

4.34MB segment size.
29 pieces/segment.
0.15MB piece size.
320 bytes of RAM per piece.
2.133GB RAM/TB of pieces.
For a 24TB drive (the recommended maximum node size), if you allocate 23TB, you will need 49GB of RAM. Wow!
That’s 46GiB, as RAM capacity is usually measured.
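The arithmetic above can be checked with a short sketch, using this thread’s assumed figures (4.34 MB segments, 29 pieces per segment, ~320 bytes of metadata per piece):

```python
# Worked version of the numbers above; all inputs are this thread's
# assumptions, not measured values.

segment_mb = 4.34
pieces_per_segment = 29
bytes_per_piece = 320  # inode + directory entry, per earlier posts

piece_mb = segment_mb / pieces_per_segment           # ≈ 0.15 MB per piece
pieces_per_tb = 1_000_000 / piece_mb                 # 1 TB = 1e6 MB
ram_gb_per_tb = pieces_per_tb * bytes_per_piece / 1e9

print(round(piece_mb, 2))         # 0.15
print(round(ram_gb_per_tb, 2))    # ≈ 2.14 GB of metadata per TB stored
print(round(ram_gb_per_tb * 23))  # ≈ 49 GB for a 23 TB allocation
```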


No longer true. @littleskunk was testing RS codes with 16 pieces per segment, and it seems (almost?) all the recent test data is like that.


So that will give 1.18GB of RAM per TB of storage;
28GB for a 23TB node.
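The updated figures can be sketched the same way, swapping in the 16 pieces per segment mentioned above (again, an assumption from this thread, not a measured value):

```python
import math

# Same per-TB estimate, but with 16 pieces per 4.34 MB segment.
segment_mb = 4.34
pieces_per_segment = 16
bytes_per_piece = 320  # metadata per piece, per earlier posts

piece_mb = segment_mb / pieces_per_segment            # ≈ 0.27 MB per piece
ram_gb_per_tb = (1_000_000 / piece_mb) * bytes_per_piece / 1e9

print(round(ram_gb_per_tb, 2))          # ≈ 1.18 GB per TB
print(math.ceil(ram_gb_per_tb * 23))    # ≈ 28 GB for a 23 TB node, rounding up
```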


I’ve been running an 18TB node on a Windows computer with 16GB of RAM for 3 years without any issues?

Given that I’m now learning that a Windows setup is supposedly so much worse than Linux, how can it be that I’ve been doing just fine with what, based on this thread, is way less RAM than I should be using!?

Surely this makes scaling incredibly expensive?


I don’t know your setup, so I can’t reason about it, sorry. And even if I did, given you use Windows, I wouldn’t be able to help much, as I don’t use it myself.

In general though, if you can make sure that metadata fit in RAM (or a SSD-based cache), then it is much easier to ensure a fast file walker. But if a slow file walker was never a problem for you, then that’s great, you don’t need it. By fast I mean ~10 minutes per TB of pieces, which is reasonably achievable exactly if metadata of all pieces are placed on a medium faster than HDD.

The biggest and most visible influence of RAM is on walker speed, especially the startup piece scan. For a node storing 20TB, this scan, which we all call The File Walker, can take a few hours on a high-RAM system and a few days on a low-RAM one. You can see my test results in the Tuning the Filewalker thread, somewhere in the last posts.
I run 2x7TB nodes on a Synology with 1GB of RAM. I moved the databases to a flash drive and reduced the log level. They run just fine, and all the walkers finish, even the startup one, but… it can take a week or more when they are full.
But I disabled lazy mode and the startup scan on all my nodes. I don’t have to run them on each restart, so the lack of RAM is not that big of a deal now.
If you also want to do something else on your system, you want it to be responsive. Here again, more RAM helps a lot.

Well, technically, we seem to be obsessing about RAM in order to make the filewalkers run as quickly as possible. But they are not strictly necessary for the node to run just fine (provided no database issues have occurred). It doesn’t matter if they take two hours or two days to run.
This may chime in with @lookisbenfb’s experience.

Well, it seems that you really do need to run the filewalkers from time to time, and for the huge nodes we have today, which will only get bigger, you need a lot of RAM to run them in a few days. Imagine the recent testing becoming the usual load. Add a piece scan of 20TB on top of that and you have an HDD at full usage 24/7, with the walker finishing in weeks.
