Plans for bigger single nodes beyond 24TB?

Wondering if you have plans to allow nodes as big as 100TB+ or even 1PB.

Currently I believe 24TB is the largest node size to date.

I believe one of the town hall meetings mentioned there would be no limit.

Fantastic news, hope that comes to pass.

I can also confirm that there is no limit on the storage side.

Guess this begs the question: will there be a way to dynamically add to or subtract from the size of a node?

E.g. say you get a new drive and want to add it to the actively running node, or you want to start storing movies for a Plex server, so you downsize your node. The downsize would be a graceful exit.

At the moment I know of two solutions that can grow (but not shrink):

  1. Software RAID 5 (or RAID 6) based on md raid
  2. LVM (just add new physical volumes and extend existing logical volumes)

But both of these solutions have pros and cons.
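
For reference, here's a rough sketch of what growing each of these looks like; the device and volume names (/dev/sdX, md0, vg0/data) are placeholders for your own setup:

```sh
# 1. md raid: add the new disk to the array, then reshape RAID 5 onto it
sudo mdadm --add /dev/md0 /dev/sdX
sudo mdadm --grow /dev/md0 --raid-devices=5   # e.g. going from 4 disks to 5
# (the filesystem on /dev/md0 still needs its own online resize afterwards)

# 2. LVM: register the new disk, then extend the volume group and volume
sudo pvcreate /dev/sdX
sudo vgextend vg0 /dev/sdX
sudo lvextend -l +100%FREE /dev/vg0/data
sudo resize2fs /dev/vg0/data                  # ext4 grows while mounted
```

Growing is straightforward in both cases; shrinking is where they get painful, hence "not shrinkable" above.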

The problem is, as soon as you start hoarding a lot of data in one place, you defeat the point of a decentralized network.


You would need to hoard 51% of the total storage available for that to happen…not really worried about that.

Granted, I can see some datacenter IT guys just taking all their unused HDD space and spinning up Storj nodes.

Adding drives on a layer 1 basis isn't the question. It's how you add volume to the Docker container without taking down the node/container. I'm not a Docker expert.

I can see some IT boss getting ideas and trying to set up a few petabyte nodes. Maybe a limit is a better idea to prevent this problem, say 100 TiB? You would need upwards of $30,000 to offer that much space, and that's just for the disks.

Part of me wants a cap and part of me does not. I have access to cheap symmetrical 1Gb fiber and 50TB online now with my NAS. I want STORJ to be wildly successful, as I will use it for our clients once it's finally released, so I have a vested interest in both sides of the platform. With V2 I could simply spin up another node and run with it, but maintaining multiple nodes became a major headache, so a single V3 node is very attractive to me as well; however, it limits the storage I can put towards it, as it needs to be contiguous.

We plan to implement (in the future) an option for partial graceful exit, so that you can downsize the storage space you offer in order to free up disk space for other uses. This would involve sending any data you already stored beyond your new, lower allotment to other nodes in the network via repair, so that you can reclaim that space for other purposes. We do not want to lose any customers' important data, so it would be mandatory that you not randomly delete data from your disk to downsize your node (which would lead to database corruption), but instead request and follow the partial graceful exit process, which ensures the repair process first saves the to-be-deleted data on other nodes. More details about this will be made public when we get closer to having this feature ready for release. Needless to say, this will take some time, as it is not a trivial problem to solve.

Please read the changelog of our latest v0.14.3 release and note that IP filtering has been implemented, which means you can’t just spin up multiple nodes on the same IP or in a datacenter without running into trouble pretty soon.

What do you mean by contiguous, in this context?

Just need a public IP per node then? :slight_smile:

Meaning that the storage has to be all on one volume. With V2 you could attach another drive, add a folder, and then start storing data on it.

Yeah, but datacenters have blocks of IP addresses. We used to have a rack in a local DC with 20+ public IP addresses forwarded to servers. Will Storj know that this is possible?

Does LVM not take care of this concern about everything being on one volume? Live partition and fs resize seems to be pretty good these days.
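
To illustrate, here's a hedged sketch (the paths, volume names, and image tag are illustrative, not official setup guidance) of growing a node's storage without touching the running container, assuming the storage directory is bind-mounted from an LVM logical volume:

```sh
# Storage directory on an LVM logical volume, bind-mounted into the container
docker run -d --name storagenode \
  --mount type=bind,source=/mnt/storj,destination=/app/config \
  storjlabs/storagenode:beta

# Later, with the container still running: grow the logical volume and its
# filesystem in one step (-r assumes free extents in vg0, e.g. after a
# pvcreate/vgextend of a new disk as in the sketch earlier in the thread)
sudo lvextend -r -L +4T /dev/vg0/storj
```

The container never restarts; the ext4 filesystem simply grows underneath the bind mount. Shrinking is the hard direction, which is where the partial graceful exit described above would come in.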

If LVM isn’t good enough, help us understand why and maybe we can design a solution.

You're right, an IP-based system naturally can't detect that six different AWS IPs (for example) all go to the same server. We may eventually add smarter detection of clustered nodes (noticing that they tend to go offline and online at the same times and all have IPs in Amazon blocks, for example), but even then it would be possible to fool the system with enough work or resources. We just want to make the work and resources necessary to fool the system more costly than the benefit one would get from running multiple nodes at the same location (an unfair share of new data being stored). Data reliability suffers if nodes are not truly independent.

And part of that effort is making it as convenient as possible for SNOs to run (single) nodes and scale them up to large data sizes.


LVM should take care of those concerns; however, if you are encouraging desktop users to run a node, a lot of them are on Windows or Macs.

Storj is aware of this detail :wink:
