Storj Node expansion

Hello All!

Hate to open a new topic for this as I expect this to be a quick question that is easily answered, but here goes.

I currently have 2 nodes running as VMs (2 cores, 4 GB RAM, 3 TB of storage each), and they are both just about full. I’m considering spinning up a 3rd node with the same specs, but I’m not sure that makes sense. Would exiting these nodes and spinning up one VM with the combined resources of all three make more sense, given that the traffic would otherwise be split between the three nodes?

OR would it make more sense to find a way to expand the space each of the two current nodes has? Either way, I’m going to move the storage nodes to their own smaller ESXi host, so I have options and can build something and get it ready before moving them over.

Edit: I guess this leads to another question: what are people doing to future-proof their setups? I don’t want to be here again in another 6 months or a year trying to expand again. I do have lots of hardware and can build out a server with a RAID array, but whatever I do now should last for at least a year.

Thanks!
Chess

If they are VMs, you should easily be able to increase the attached storage/resources?

Spinning up more VMs is a waste of RAM and disk space for the OS, IMHO (assuming they are all running on the same physical machine).

If these are running on the same physical hardware you may as well have a single VM with a larger disk attached.
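For example, after growing the virtual disk in ESXi, here is a minimal sketch of picking up the new space inside a Linux guest (assuming the node’s data lives on an ext4 partition /dev/sdb1; the device names are just placeholders):

```sh
# rescan the device so the guest sees the resized disk
echo 1 > /sys/class/block/sdb/device/rescan
# grow partition 1 of /dev/sdb to fill the new space (growpart is in cloud-guest-utils)
growpart /dev/sdb 1
# grow the ext4 filesystem online to fill the partition
resize2fs /dev/sdb1
```

Afterwards you’d still need to raise the node’s allocated storage in its config.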

Thank you for the quick reply!

They are currently running on different hosts, but I’m downsizing my setup from 3 hosts to 1 as I don’t need my VMs to be highly available; the power cost is not worth it. But I do want to move the nodes to their own hardware.

So you are thinking I should spin down one of the hosts and then just expand the other to cover all of the space I have available? Or keep the two and expand them both?

Separate nodes make sense if you use one HDD per storage node.
Since those are VMs, I’m pretty sure the storage is not divided across separate HDDs. So in this case: one big node, assuming you have redundancy on your storage.


Currently they are on separate 3 TB drives, but that does not give me any protection against a failed drive. I would like a setup that can survive at least one failed drive. Should I consider a RAID card with a RAID 5 setup?

Please do not use RAID 5 with big SATA disks.


RAID 6 or 10, then.

Thank you for the suggestions. I’ll take it all away and then consider my options.

RAID 5 is efficient with 3 or more drives, but a 3.5" drive draws about 5 W, so for 3 TB of capacity you would be more efficient with two 3 TB drives in RAID 1 (about a 10 W configuration) than with seven 500 GB drives in RAID 5 (about 35 W, plus CPU). Or just don’t use RAID 5, as Alexey suggests, and cross your fingers: it works most of the time, and you save 5 W, or about 44 kWh/y, per drive.
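To spell out that last figure: 5 W × 24 h × 365 days ≈ 43.8 kWh, so each drive you leave out of the build saves roughly 44 kWh per year.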

Consider software RAID as well, which doesn’t require purchasing additional hardware. md-raid on Linux is very mature and widely used in production.
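For instance, a minimal md-raid sketch (hypothetical device names; adjust for your disks and distro):

```sh
# create a 4-disk RAID6 array (survives two simultaneous drive failures)
mdadm --create /dev/md0 --level=6 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde
# watch the initial resync
cat /proc/mdstat
# put a filesystem on the array
mkfs.ext4 /dev/md0
# record the array so it assembles on boot (file location varies by distro)
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
```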


All your nodes will sit behind a single IP address, so you won’t earn more money; instead you might increase availability (uptime).

But if all 2–3 VMs are on the same host and the host goes down, your reputation will go down as well, for all nodes.
If you have multiple nodes and somehow have shared storage (but dedicated HDDs), then it might make sense to have more than one node.


No, that is good advice. One can argue about what counts as a “big” drive, however 🙂 Just a few TB takes a really long time to rebuild. But there are different kinds of SATA drives as well; some are better suited to running 24/7. I normally replace my drives after just a few years.

I just want to share all the possibilities so you can make a weighted decision.
I have had negative experiences with RAID5 a few times: the whole array was lost during a rebuild.
I was Chief of Operations at the time, and we had dozens of branches across the country with a distributed database (Sybase) stored on such RAID5 arrays. A couple of times the RAID died during a rebuild, and we were forced to unload the failed branch’s database from the central office. In other cases there was a backup, so recovery was much quicker than an unload.
As a result we replaced RAID5 with RAID10 everywhere, and those problems happened much less often.


I do run a ZFS array for my file server, but I never considered that for Storj. I’m not too experienced with md-raid. Any guides you’d recommend I take a look at before I jump in?

I’m not too worried about the money. I understand that all three nodes, and any others on my IP range (and region too, I think), share the same traffic allocation, so at the end of the day having one or three might not really matter; I should get about the same total traffic as I do now with two. I guess the question comes down to deciding whether to expand the current two or add a third. I back up my VMs, so I’m not too concerned about losing a host; I could rebuild and keep my downtime short, as long as I catch it quickly enough.

I have the option of either building an array and hosting all of the VMs off of it, or giving each a dedicated HDD. An array would give me more protection in the case of a single or dual drive failure, plus it might speed up I/O for each VM.

To Alexey’s point, if I do go with RAID, hardware or software, I’d have to do at least RAID 6 or 10. I like RAID 10, but the loss of 50% of the storage space is hard to live with. As for RAID 6, the write penalty could be a limiting factor.
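For concreteness, with six 3 TB drives: RAID 10 yields 6 × 3 TB / 2 = 9 TB usable, while RAID 6 yields (6 − 2) × 3 TB = 12 TB usable; the price for RAID 6’s extra capacity is the read-modify-write penalty on small writes.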

Either way, I appreciate everyone’s input. I’m leaning toward an array of some kind, as I think it might be easier to expand, depending on the array and the hardware/software I use.

There are several levels to your question, but it sounds like you’re going to be running everything on the same hardware. So first off, I would say avoid making several VMs; there is no need for the overhead of running things in separate VMs. Even if you decide to run several nodes, you can easily do that within the same VM.

So now for expand vs. new node:
If you have more space on the HDD or array your node is already running on:
Expand the node.
If you have a RAID 5 or 6 (or ZFS with redundancy) that is 100% dedicated to your node and you have an additional disk:
Expand the array and then the node (see the sketch after this list).
If you have nodes running on their own disks, or other types of RAID that can’t be expanded without sacrificing more disks to redundancy:
Start a new node on a new HDD.
If you’re trying to decide how to run the entire thing, including your existing nodes:
Aim for one node per HDD, all in a single VM.
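As a sketch of the “expand the array and then the node” path with md-raid (hypothetical device names; assumes ext4 on /dev/md0):

```sh
# add the new disk and grow the array from 4 to 5 members
mdadm --add /dev/md0 /dev/sdf
mdadm --grow /dev/md0 --raid-devices=5
# the reshape takes a long time; monitor it
cat /proc/mdstat
# once it finishes, grow the filesystem to use the new space
resize2fs /dev/md0
```

Then raise the node’s allocated space (e.g. the STORAGE value in your docker run command) and restart the node.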

That would be my advice. But search around the forum for RAID vs. separate node discussions; my advice aligns with what Storj usually recommends, but there are SNOs who disagree with it.
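And if the decision ends up being a new node on a new HDD, the docker setup looks roughly like this (wallet, address, and paths are placeholders; check the current Storj docs for the exact flags):

```sh
docker run -d --restart unless-stopped --name storagenode3 \
    -p 28968:28967 \
    -e WALLET="0xYOURWALLET" \
    -e EMAIL="you@example.com" \
    -e ADDRESS="your.external.address:28968" \
    -e STORAGE="3TB" \
    --mount type=bind,source=/mnt/disk3/identity,destination=/app/identity \
    --mount type=bind,source=/mnt/disk3/data,destination=/app/config \
    storjlabs/storagenode:latest
```

Each node needs its own identity and its own external port.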


The amount of traffic you get is allocated on a /24 IP basis, so you are getting half the traffic now with two nodes compared to running a single node. You will get even less traffic if you run a third node.

Having multiple nodes is, however, useful if you have multiple physical hard drives, as you can think of that as running a “RAID”: the network will split the traffic across all 3 nodes. But with multiple nodes you need to consider reputation, and making each node fast enough to win the races.

This is just my opinion, and you are free to run as many nodes as you want. I’m currently considering moving my node to a VM, as I have a cluster of ESX hosts with automatic failover; I just haven’t figured the storage situation out yet.

You make it sound like you’d get less traffic in total with multiple nodes compared to a single node. You get half the traffic twice with two nodes, which adds up to the same total.


So basically the same total amount of traffic as one node, spread out over all of the nodes. At least that is how I understood it from earlier posts from the devs.


Logically, as the Storj team says, there is no traffic limit for nodes within a /24; the only rule is that you can get at most one piece of any given file. Today there is only a small number of testers and developers, so there are few pieces to go around. But when there are lots of clients and a large number of pieces, that limit will matter less, and speed and response time will play a bigger role.


Obviously, using md-raid together with ZFS would be redundant, so I assume you’re talking about something like ext4 on md-raid.

The ArchWiki article on RAID is a pretty good introduction. Consider reading through the mdadm manpage at least once; you don’t need all of the options, but there’s some good stuff in there to discover (--replace is pretty cool, for example).
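For example (again with placeholder device names), --replace lets you swap out a suspect disk without degrading the array first:

```sh
# add the replacement disk as a spare
mdadm /dev/md0 --add /dev/sdf
# copy data onto it while the old disk is still in the array,
# then drop the old one out automatically when done
mdadm /dev/md0 --replace /dev/sdb --with /dev/sdf
```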
