Plans for bigger single nodes beyond 24TB?

Wondering if you have plans to allow nodes as big as 100TB+ or even 1PB.

Currently I believe 24TB is the largest node size to date.

I believe one of the town hall meetings mentioned there would be no limit.

Fantastic news, hope that comes to pass.

I can also confirm that there is no limit on the storage side.

Guess this begs the question: will there be a way to dynamically add to or subtract from the size of a node?

E.g. say you get a new drive and want to add it to the actively running node, or you want to start storing movies for a Plex server, so you downsize your node. The downsize would be a graceful exit.

At the moment I know of two solutions that can grow (but not shrink):

  1. Software RAID 5 (or RAID 6) based on md raid
  2. LVM (just add new physical volumes and extend existing logical volumes)

But both of these solutions have pros and cons.
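
For reference, here's a rough sketch of what growing each of these looks like; the device and volume names (/dev/sdX, md0, vg0/data) are placeholders for your own setup:

```sh
# 1. md raid: add the new disk to the array, then reshape RAID 5 onto it
sudo mdadm --add /dev/md0 /dev/sdX
sudo mdadm --grow /dev/md0 --raid-devices=5   # e.g. going from 4 disks to 5
# (the filesystem on /dev/md0 still needs its own online resize afterwards)

# 2. LVM: register the new disk, then extend the volume group and volume
sudo pvcreate /dev/sdX
sudo vgextend vg0 /dev/sdX
sudo lvextend -l +100%FREE /dev/vg0/data
sudo resize2fs /dev/vg0/data                  # ext4 grows while mounted
```

Growing is straightforward in both cases; shrinking is where they get painful, hence "not shrinkable" above.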

The problem is, as soon as you start hoarding a lot of data in one place, you defeat the point of a decentralized network.


You would need to hoard 51% of the total storage available for that to happen…not really worried about that.

Granted, I can see some datacenter IT guys just taking all their unused HDD space and spinning up Storj nodes.

Adding drives on a layer 1 basis isn't the question. It's how you add volume to the Docker container without taking down the node/container. I'm not a Docker expert.

I can see some IT boss getting ideas and trying to set up a few petabyte nodes. Maybe a limit is a better idea to prevent this problem, say 100 TiB? You would need upwards of $30,000 to offer that much space, and that's just for the disks.

Part of me wants a cap and part of me does not. I have access to cheap symmetrical 1Gb fiber and 50TB online now with my NAS. I want STORJ to be wildly successful, as I will use it for our clients once it's finally released, so I have a vested interest in both sides of the platform. With V2 I could simply spin up another node and run with it, but maintaining multiple nodes became a major headache, so a single V3 node is very attractive to me as well; however, it limits the storage I can put towards it, as it needs to be contiguous.

We plan to implement (in the future) an option for partial graceful exit, so that you can downsize the storage space you offer in order to free up disk space for other uses. This would involve sending any data you already stored beyond your new, lower allotment to other nodes in the network via repair, so that you can reclaim that space for other purposes. We do not want to lose any customers' important data, so it would be mandatory that you not randomly delete data from your disk to downsize your node (which would lead to database corruption), but instead request and follow the partial graceful exit process, which ensures the repair process first saves the to-be-deleted data on other nodes. More details about this will be made public when we get closer to having this feature ready for release. Needless to say, this will take some time, as it is not a trivial problem to solve.

Please read the changelog of our latest v0.14.3 release and note that IP filtering has been implemented, which means you can’t just spin up multiple nodes on the same IP or in a datacenter without running into trouble pretty soon.

What do you mean by contiguous, in this context?

Just need a public IP per node then? :slight_smile:

Meaning that the storage has to be all on one volume. With V2 you could attach another drive, add a folder, and then start storing data on it.

Yeah, but datacenters have blocks of IP addresses. We used to have a rack in a local DC with 20+ public IP addresses forwarded to servers. Will Storj know that this is possible?

Does LVM not take care of this concern about everything being on one volume? Live partition and fs resize seems to be pretty good these days.
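
To illustrate, here's a hedged sketch (the paths, volume names, and image tag are illustrative, not official setup guidance) of growing a node's storage without touching the running container, assuming the storage directory is bind-mounted from an LVM logical volume:

```sh
# Storage directory on an LVM logical volume, bind-mounted into the container
docker run -d --name storagenode \
  --mount type=bind,source=/mnt/storj,destination=/app/config \
  storjlabs/storagenode:beta

# Later, with the container still running: grow the logical volume and its
# filesystem in one step (-r assumes free extents in vg0, e.g. after a
# pvcreate/vgextend of a new disk as in the sketch earlier in the thread)
sudo lvextend -r -L +4T /dev/vg0/storj
```

The container never restarts; the ext4 filesystem simply grows underneath the bind mount. Shrinking is the hard direction, which is where the partial graceful exit described above would come in.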

If LVM isn’t good enough, help us understand why and maybe we can design a solution.

You're right, an IP-based system naturally can't detect that six different AWS IPs (for example) all go to the same server. We may eventually add smarter detection of clustered nodes (noticing that they tend to go offline and online at the same times and all have IPs in Amazon blocks, for example), but even then it would be possible to fool the system with enough work or resources. We just want to make the work and resources necessary to fool the system more costly than the benefit one would get from running multiple nodes at the same location (an unfair share of new data being stored). Data reliability suffers if nodes are not truly independent.

And part of that effort is making it as convenient as possible for SNOs to run (single) nodes and scale them up to large data sizes.


LVM should take care of those concerns; however, if you are encouraging desktop users to run a node, a lot of them are on Windows or Macs.

Storj is aware of this detail :wink:
