Multiple storage nodes

But with RAID10 you’ll lose half your storage space.

HDD storage is insanely cheap. Consider what happens without redundancy. Let’s say you have a 2TB disk. When that disk inevitably fails, your node is kicked out of the network. You buy a new 2TB disk and start over. Now you have to wait about a month for a satellite to vet your node, and you earn less storage revenue and (more importantly) less egress revenue until your disk starts to fill back up, which could easily take 6+ months.

If you’re willing to start over at zero each time, then sure. In the long run, I think that net revenue will be higher using RAID10 than using multiple nodes with raw disks; your revenue is not going to drop down every time a disk fails.

If you have no intent to replace failed disks (they are truly just spare disks you aren’t using) then the multi-node setup could be useful.

Redundancy makes little sense. Instead of wasting half your disks, you could run nodes on all of them and use all available space. You have the potential to make twice as much money, and if one disk fails, the worst that happens is that you fall back to your single-disk potential. With annual disk failure rates at 2% for modern HDDs, there really is no reason to obsess over redundancy when there is redundancy built into the network already.
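To put the 2% figure in perspective, here is a rough back-of-the-envelope sketch. It assumes disk failures are independent and that the 2% annual rate applies equally to every disk; both are simplifications, and the function names are made up for illustration. It also ignores the vetting and refill time mentioned earlier in the thread.

```python
def p_any_failure(n_disks: int, annual_failure_rate: float = 0.02) -> float:
    """Chance that at least one of n disks fails within a year.

    Assumes failures are independent, which real disks only approximate.
    """
    return 1.0 - (1.0 - annual_failure_rate) ** n_disks


def expected_failures(n_disks: int, annual_failure_rate: float = 0.02) -> float:
    """Expected number of disks lost per year under the same assumption."""
    return n_disks * annual_failure_rate


if __name__ == "__main__":
    for n in (1, 2, 4, 12):
        print(f"{n:2d} disks: P(at least one failure) = {p_any_failure(n):.4f}, "
              f"expected failures per year = {expected_failures(n):.2f}")
```

With one node per disk, a failure costs you that one node; the other nodes keep earning.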

Ok, I’ll grant you that explanation makes sense.

Now, how do we allocate monthly traffic capacity to each node? If we’re running 12 nodes, just divide by 12? What happens when one of the 12 nodes has used all of its monthly traffic allocation, but the others haven’t? How can we pool traffic among nodes? This is important considering that egress traffic is by far the most lucrative metric.

With one node, it should be simple to approach capacity each month with enough storage. With multiple nodes, a “hot” node for egress may go underutilized because the traffic capacity partitioning is strict. The only way to deal with this currently is either (1) overcommit traffic (lie to the network) or (2) have a daemon that monitors the remaining capacity and periodically reallocates it among all of the nodes (stopping, removing, and recreating each docker container).

We need a way to specify a monthly traffic limit for a whole pool of nodes.
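The reallocation step of option (2) could look roughly like the sketch below. This is only the arithmetic, not a working daemon: the policy (splitting the remaining pool capacity in proportion to each node's usage so far) is an assumption, the function and node names are hypothetical, and applying the result would still mean recreating each container with its new limit as described above.

```python
from typing import Dict


def reallocate(pool_limit_tb: float, used_tb: Dict[str, float]) -> Dict[str, float]:
    """Return a new monthly traffic allocation per node.

    Each node keeps what it has already used; the unused part of the pool
    is split in proportion to usage so far, so "hot" nodes get more headroom.
    This proportional policy is an illustrative assumption.
    """
    if not used_tb:
        return {}
    total_used = sum(used_tb.values())
    remaining = max(pool_limit_tb - total_used, 0.0)
    if total_used > 0:
        weights = {name: used / total_used for name, used in used_tb.items()}
    else:
        # Nothing used yet: split the pool evenly.
        weights = {name: 1.0 / len(used_tb) for name in used_tb}
    return {name: used_tb[name] + remaining * weights[name] for name in used_tb}


if __name__ == "__main__":
    # Hypothetical pool of 12 TB shared by two nodes.
    print(reallocate(12.0, {"storagenode": 3.0, "storagenode2": 1.0}))
```

A daemon would run this periodically with fresh usage numbers and only recreate containers when an allocation changes by a meaningful amount.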

I want to use all my disks as you describe. I have an older data center class (X8DTT) server. I have an extra token I have been sitting on until my current node (HDD1) fills up. HDD1 is currently at 1.4/2 TB capacity. I’m not an Ubuntu 18.x guy, but that’s what it’s running.

  1. I don’t want to F up my current node which has better than 99% uptime. What are the correct commands to start the node on HDD2?
  2. At what fill % should I move forward with starting a new node on HDD2?
  3. Will there ever be a GUI to easily start new nodes on each HDD? If so, when?
  4. I understand from your replies that there is no ingress advantage to running multiple nodes on a single server (or even multiple servers) using the same IP. HOWEVER, it seems apparent to me there would be an EGRESS advantage. More stored equates to more egress, presuming no ISP bandwidth or data caps. Is this thinking wrong? If so, why?

Thanks in advance to you or anyone else replying. :slight_smile:

Let me give it a shot.

  1. You would be creating a new identity and signing it with a new token. Just make sure you use another name like storagenode2 to create and sign the identity. You’ll also be creating a second container, which then also needs a different name, so storagenode2 could be used there as well. You need to use a different port for this node as well, so set up the port forwarding and firewall rules for another port (like 28968). In the docker run command you would then change the port in the address and the first port in the -p parameter: -p 28968:28967. The second port stays the same; that’s the port used inside the container by the storagenode.
  2. You want to give the second node some time to get vetted while the first node can still do full duty. I’d say 75% is probably a good time to add another one, so you’re pretty much at that point.
  3. I know there is work being done on a Windows-based install and on binaries for most OSes that don’t require Docker. I think there will likely be a GUI as well, but I’m not sure there are plans to have that support multiple nodes in one interface. This sounds like a good question for the upcoming town hall. https://zoom.us/webinar/register/WN_i_e4wM3JQheAuWzBw_pIVg
  4. You’re right and you’re wrong. It depends on what your question is. Between 2x2TB and 1x4TB there is no advantage. But 2x2TB would indeed have an advantage in egress over 1x2TB if there is more than 2TB of data stored on the nodes combined. The shorter version: if you store more data, you generally get more egress as well. But that can be achieved with a larger single node as well as with multiple smaller nodes. There is no advantage to choosing multiple nodes just for income. It should be about equal.
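The naming and port scheme from point 1 can be sketched as a small helper. It only builds the container name and the -p fragment; it does not produce a full docker run command. Calling the first node plain "storagenode" and counting ports up by one per node are assumptions that extend the storagenode2 / 28968 example, and the helper name is made up.

```python
CONTAINER_PORT = 28967  # port used inside the container; stays the same for every node


def node_plan(index: int) -> dict:
    """Return the container name and -p mapping for the index-th node (1-based).

    Assumption: node 1 is "storagenode" on host port 28967, node 2 is
    "storagenode2" on host port 28968, and so on.
    """
    if index < 1:
        raise ValueError("index starts at 1")
    name = "storagenode" if index == 1 else f"storagenode{index}"
    host_port = CONTAINER_PORT + (index - 1)
    return {
        "name": name,
        "host_port": host_port,
        "publish": f"-p {host_port}:{CONTAINER_PORT}",
    }


if __name__ == "__main__":
    for i in (1, 2, 3):
        print(node_plan(i))
```

Each host port in the plan also needs its own port forwarding and firewall rule, and the node's address has to use the same host port.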

Hi,

I’ve missed a bit since the migration from rocket-chat and have only just arrived here. I’ve tried to look for a clear answer but couldn’t find one with regard to running multiple nodes in different locations but with the same e-mail/invitation. Last I knew, this was something due to arrive in a future release.

Can I run multiple nodes on the same account/e-mail address? Is this option available yet, or do I need to register for a new invite with a new e-mail address?

thanks.

You can have a lot of nodes on one email address.
Just get an invitation, generate an identity, and sign the identity with the invitation code.
Start the node.
After 24h you can request a new invitation on the same email and make a new node.
I have 11 nodes with the same email address.
But each node has to have its own identity, signed with its own invitation code.

Hello @george,
Welcome to the forum!
You may take a look:

All this is confusing:

I’ve had an invite for 3 months or so.
An identity is not the same as a token, and I need a different token for each node, but I can do it on the same e-mail? Why do I need another invite on the same e-mail if I already have an invite or token (sorry, identity)? I have no idea what to do.

Someone please take a minute and make this easy for people to understand?

I’m positive there are tons of people as thick as me who would like to join but cannot because this is difficult to understand.

Every identity should be unique and signed with its own invitation code. Storj needs to control somehow how many storage nodes are created; that’s why it’s done this way.

@george

Every node needs a unique ID, just like every car needs a unique VIN.
It’s to know who is who.

Once your first node has its identity and is started, the satellite knows that this node exists.
The next day you can request a new token with the same email and repeat the procedure.
This new node will have a new ID.

Hope this makes some sense to you.

Ok, cool.

Now where do I request a new token using the same e-mail address?

thanks

At https://storj.io/sign-up-node-operator/, the same as for all tokens.

Please, use this guide:
https://documentation.storj.io/dependencies/identity

I cannot get my second node on a different computer on the same network with the same external IP to work. I read everything here over and over. The one thing I did not do was to change the name to storagenode2 since that node is on a different computer. Both ports 28967 & 28968 are open. Watchdog upgrades the node and is now on v0.29.3. No matter how long I wait my Linux dashboard says Last Contact: OFFLINE.

Do you forward port 28968 to the second PC? What is your docker command on that PC?

Make sure your second start command has -p 28968:28967.

I had to allow the port through the Linux firewall when I was starting my Linux node. Just try it.
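One quick way to narrow this down is to test whether anything accepts TCP connections on the forwarded port. The sketch below uses only the Python standard library; the function name is made up. Note that running it from inside your own network only shows that the node is listening locally. To test the port forward and firewall, it has to be run from a machine outside your network against your external address.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; use the second PC's LAN address or,
    # from outside, your external address.
    for port in (28967, 28968):
        print(port, port_is_reachable("127.0.0.1", port))
```

A False result for 28968 from outside while the node answers on the LAN would point at the router forward or the firewall rather than at the node itself.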

Is there a way to verify that more than one node is in the same /24 IP family and is in fact treated as one?
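The address part of that question is easy to check locally. The sketch below only tests whether two IPv4 addresses fall in the same /24 block; it says nothing about how the satellite actually treats the nodes. The addresses are documentation placeholders and the function name is made up.

```python
import ipaddress


def same_slash24(ip_a: str, ip_b: str) -> bool:
    """Return True if both IPv4 addresses share the same /24 network."""
    # strict=False lets us pass a host address and get its enclosing /24.
    net_a = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in net_a


if __name__ == "__main__":
    print(same_slash24("203.0.113.10", "203.0.113.200"))
    print(same_slash24("203.0.113.10", "203.0.114.10"))
```

Nodes behind the same external IP trivially share a /24; the check is mainly useful for nodes at different locations with the same ISP.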