Choosing the best-fit configuration for hosting 10+ nodes

Hello there.

  1. I have 10 IPs from different /24 subnets.
  2. The server will run on Proxmox with 10 × 8 TB HDDs (one HDD per node, connected via passthrough).
  3. OS Debian + 6 GB RAM per node.
    Please advise on the best-suited filesystem (ext4, ZFS, or another). Also, if running ext4, are there any recommended tunings?

Read the Storj Node Operator Terms and Conditions.

See section 5 Restrictions.

1 Like

You’ve got lots of RAM, so you shouldn’t need to do anything special for 10 nodes. If they’re new, perhaps try vanilla ext4 with the new hashstore flags (hashstore aims to reduce small random IO). It’s experimental, but apparently it’s working well for Select operators.

If you have a pair of SSDs (whole devices or partitions), another option many operators run is the standard piecestore setup on ZFS, specifically to use a special metadata vdev. Because Storj stores millions of .sj1 files per TB, it really benefits from having all that metadata on flash.
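
As a rough sketch of that layout (pool name, device paths, and the small-blocks cutoff below are all placeholders, not recommendations from this thread):

    # One node's pool: a single 8 TB HDD for data, plus a mirrored
    # "special" vdev built from two SSD partitions to hold metadata.
    zpool create -o ashift=12 node01 \
      /dev/disk/by-id/ata-HDD_FOR_NODE01 \
      special mirror /dev/disk/by-id/nvme-SSD_A-part1 /dev/disk/by-id/nvme-SSD_B-part1

    # Optionally let small blocks land on the SSDs as well (size to taste,
    # and make sure the SSD partitions are large enough for it).
    zfs set special_small_blocks=16K node01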

Oh, and it’s pretty common to keep your node databases on flash too. Many people already have an SSD as their boot device with lots of spare space: may as well redirect the DB files there.
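
For example, the standard docker run just gains one extra mount plus one flag after the image name. Everything below is a placeholder sketch (wallet, addresses, paths), and the storage2.database-dir option should be double-checked against the current storagenode documentation:

    # One node with its SQLite databases redirected to an SSD-backed path.
    docker run -d --restart unless-stopped --stop-timeout 300 \
      --name node01 \
      -p 28967:28967/tcp -p 28967:28967/udp -p 14002:14002 \
      -e WALLET="0xYOUR_WALLET" \
      -e EMAIL="you@example.com" \
      -e ADDRESS="your.external.address:28967" \
      -e STORAGE="7TB" \
      --mount type=bind,source=/mnt/node01/identity,destination=/app/identity \
      --mount type=bind,source=/mnt/node01/data,destination=/app/config \
      --mount type=bind,source=/mnt/ssd/dbs/node01,destination=/app/dbs \
      storjlabs/storagenode:latest \
      --storage2.database-dir=/app/dbs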

2 Likes

You want to start another holy war?

Yes, there are.

4 Likes

Nothing there prevents anyone from running 28 nodes on the same host, whether it has access to 28 addresses in different subnets or not.

This of course directly defeats the whole reason the subnet load management was implemented, but it is far from the only issue with the ToS. The Storj ToS is of very low quality; I feel secondhand embarrassment just reading it. For example, that careless duplicated copy-paste has been sitting there for at least a couple of years, as has the incoherent numbering. No wonder it’s hard to take such a ToS seriously.

  1. Do it all in Docker (preferably with a single docker-compose file or similar) rather than VMs. If you want to run a VM, run a single VM and run the Docker containers inside it. Running multiple full OSes just means higher resource usage and more administrative overhead for no real gain. (See the compose sketch after this list.)
  2. You don’t need 6GB per node.
  3. If you are going to run ZFS, don’t make 10 separate zpools. Do something like a RAIDZ across them. You pay a small price in capacity, but you don’t lose an entire node if a single drive dies, and the slightly lower capacity won’t matter until the array starts to get full, which will take a while. Turn on some form of compression (even just zle) so you don’t waste space on the partially filled last record of files larger than one recordsize, even though the data itself isn’t very compressible.
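
For the first point, a docker-compose.yml sketch might look something like this. The public IPs, wallet, and paths are placeholders, and every node needs its own identity, storage directory, and external address; node02 through node10 just repeat the pattern:

    services:
      node01:
        image: storjlabs/storagenode:latest
        restart: unless-stopped
        ports:
          # Bind this node's traffic to its own public IP (placeholder).
          - "198.51.100.11:28967:28967/tcp"
          - "198.51.100.11:28967:28967/udp"
        environment:
          WALLET: "0xYOUR_WALLET"
          EMAIL: "you@example.com"
          ADDRESS: "198.51.100.11:28967"
          STORAGE: "7TB"
        volumes:
          - /mnt/node01/identity:/app/identity
          - /mnt/node01/data:/app/config
      # node02 .. node10: same block again, each with its own public IP,
      # identity directory, and data directory.

Then a single "docker compose up -d" brings them all up together.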
2 Likes

Hello.
I would say keep it as simple as possible and use what you know best.
If that’s Windows, then use Windows, but without Docker; for example, I have 100+ nodes on Windows.
If Linux is best for you, use it.

4 Likes

Re: the file system: especially when running the current, older piecestore method with millions of files, it can be very useful to accelerate metadata with an SSD. For ZFS that would be a special vdev or L2ARC. Synology has a custom btrfs metadata-cache option. LVM can do it too (lvmcache).
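
For the L2ARC route, for instance, something along these lines (pool name and device path are placeholders):

    # Add an SSD as L2ARC and restrict it to caching metadata only.
    zpool add tank cache /dev/disk/by-id/nvme-YOUR_SSD
    zfs set secondarycache=metadata tank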

The hashstore, which I don’t have personal experience with, should reduce the load on metadata because there aren’t millions of files anymore.

The forums have solid posts (by Alexey and others) with ext4 tuning recommendations.
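
As a starting point, the commonly suggested tweaks look something like this (device and mountpoint are placeholders; see those posts for the full reasoning):

    # Drop the 5% root-reserved blocks, since this disk holds only node
    # data and no system files.
    tune2fs -m 0 /dev/sdX1

    # /etc/fstab entry: skip atime updates to avoid a metadata write on
    # every read.
    # /dev/sdX1  /mnt/node01  ext4  defaults,noatime  0  2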

For ZFS I use these settings (commands below):

  • compression=on (lz4)
  • secondarycache=metadata
  • redundant_metadata=some
  • atime=off
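
In command form that would be roughly the following (dataset name is a placeholder; redundant_metadata=some needs a reasonably recent OpenZFS release):

    zfs set compression=lz4 tank/storj
    zfs set secondarycache=metadata tank/storj
    zfs set redundant_metadata=some tank/storj
    zfs set atime=off tank/storj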
2 Likes