One Petabyte lying around

Hi lads and lasses,

As the title suggests, I have a whole lot of storage lying around, but that’s where things get complicated…
From what I have read in this forum, I can’t fill up that Petabyte because ingress is too low, and splitting it up into multiple nodes won’t help either because they would all share the same IP address.

Now I was wondering if there is a way to do it anyways. For example if I were to use a different VPN for each node and port forward each node separately then could I fill them up quicker?

Will this result in a ban of my nodes in any shape or form?

What would be the best way to go if I wanted to split that storage up? Currently it’s around one Petabyte after RAID 6, so with the 10% headroom that leaves about 900 Terabytes that need to be turned into nodes efficiently.

One last thing, I know it’s not really optimal what I’m trying because it conflicts with the idea of decentralisation to store 1 PB in one location.

Sorry for possible typos, English isn’t my native language :slight_smile:
Kind Regards, Tyler

Just a note, 1PB in single raid 6 is not safe. It is very likely to fail on rebuild.

This is against the Node Operator Terms & Conditions.
It may also put customers’ data in danger. Lost data = lost customers = no payment for you.
The rule “one node per /24 subnet of public IPs” is here for a reason: to avoid big losses for the network if your RAID dies, your ISP has issues, there are problems with electricity, and so on. We want to be as decentralized as possible.
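To make the /24 rule concrete, here is a minimal sketch of how addresses fall into /24 subnets. It is only an illustration using Python's standard `ipaddress` module; the node names and the documentation-range IP addresses are made up, and it says nothing about how the satellites actually implement the check.

```python
import ipaddress
from collections import defaultdict


def subnet_24(ip: str) -> ipaddress.IPv4Network:
    """Return the /24 network that contains the given IPv4 address."""
    # strict=False lets us pass a host address instead of a network address.
    return ipaddress.ip_network(f"{ip}/24", strict=False)


def group_by_subnet(node_ips: dict[str, str]) -> dict[str, list[str]]:
    """Group node names by the /24 subnet of their public IP."""
    groups = defaultdict(list)
    for name, ip in node_ips.items():
        groups[str(subnet_24(ip))].append(name)
    return dict(groups)


# Hypothetical nodes; the addresses come from the reserved documentation ranges.
nodes = {
    "node-a": "203.0.113.10",
    "node-b": "203.0.113.200",
    "node-c": "198.51.100.7",
}
groups = group_by_subnet(nodes)
for subnet, names in groups.items():
    print(subnet, names)
```

Here `node-a` and `node-b` end up in the same group, which is the situation of several nodes behind one static IP: they count as one location.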

And I can confirm: a RAID that big will fail, it’s just a matter of time.

You may run nodes in separate physical locations, on different hardware and with different ISPs. That is the best case.

Okay, so what would my next step be? Split the drives up into different servers? Probably 100 TB each? And connect each of them to a UPS for power protection?

One problem remains, I don’t have different locations. I get a symmetrical Gigabit connection here, but only one, with one static IP address.

What RAID level would be best in my case? Just wondering, since RAID 6 + hot spares has worked fine in the past: what would be a better solution?

Kind Regards, Tyler

edit: I could (theoretically) use Starlink or cellular as a backup connection that kicks in if the ISP fails

For large storage in an enterprise setting: RAID 6 across each group of 8-10 drives, with spares depending on size, and the groups striped together. Those are enterprise-grade drives though.
With consumer-grade drives, if I wanted to consider RAID 6, I would not go above 6 drives per array, and even that depends on their size and rating. But I would not consider any large-capacity consumer disk safe for RAID 6.
Pretty much the only worthwhile setup for those would be striped mirrors. But that is overkill for Storj in my opinion.

I would not say you need to split the drives into different servers. Just keep one standalone drive or one array per node, so only one node is affected in case of a failure.
I know that the ToS is not very clear, but this seems to be how it is commonly handled by people here on the forums.

In your place, I would set aside the drives you intend to use for Storj and just use them standalone without redundancy, as Storj is already a redundant solution. One node per drive, all on the same server.

A backup connection is really up to you. I don’t think many people have redundant internet in a home setting. If you think it’s worth it, sure, why not.

Thanks a lot already!

The drives I have have been in use the past 2 years and have been filled with data for archiving. They’re Seagate X16 16TB drives (around 80 of them).

I assume I can split them into 10 drives/array and then run one node/array.
Totaling 8 Nodes

But you said I should not use RAID, as Storj is already redundant. I should probably still use RAID though, to protect the node from data loss and thus from not being paid.

So turning my 8 arrays into RAID 6 (with no capacity left in the servers for hot spares):
I turn 32 TB / node into redundancy (2 drives)
I keep 128 TB / node for data storage (8 drives)
And end up with about 1,024 TB of usable space (not considering real size and 10% headroom)
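The arithmetic above can be double-checked with a small sketch. The function name and its parameters are made up for illustration; it assumes 80 drives of 16 TB, 10 drives per RAID 6 array, and optionally applies the 10% headroom mentioned in the first post. It works in nominal TB and ignores formatted capacity.

```python
def raid6_layout(total_drives: int, drives_per_array: int,
                 drive_tb: float, headroom: float = 0.10) -> dict:
    """Rough capacity figures for a set of identical RAID 6 arrays."""
    arrays = total_drives // drives_per_array
    parity_tb = 2 * drive_tb                       # RAID 6 spends two drives on parity
    data_tb = (drives_per_array - 2) * drive_tb    # the rest holds data
    usable_tb = arrays * data_tb
    return {
        "arrays": arrays,
        "parity_tb_per_array": parity_tb,
        "data_tb_per_array": data_tb,
        "usable_tb": usable_tb,
        "usable_tb_after_headroom": usable_tb * (1 - headroom),
    }


layout = raid6_layout(total_drives=80, drives_per_array=10, drive_tb=16)
for key, value in layout.items():
    print(f"{key}: {value}")
```

With these inputs the sketch gives 8 arrays, 32 TB of parity and 128 TB of data per array, and 1,024 TB in total before headroom.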

@Tomaae the real problem here is what Alexey stated. I won’t get that storage filled in the next 20 years (yikes), because the nodes won’t each receive their own share of data while they are in the same /24 subnet.

So even if the data were theoretically 1000% secure, the Storj network wouldn’t compensate me for that amount of storage.

Kind regards, Tyler

edit: I have a private node running w/ 2x 4TB just for testing. Those drives and server are at my work site as I don’t need a Petabyte for my private stuff

The X16 drives are rated at one unrecoverable read error per 10^15 bits read, so an 8-drive RAID is the theoretical ceiling at that capacity. I would go for fewer though, to be more on the safe side. It really depends on whether you want more capacity or prefer data safety.
Nothing stops you from using RAID for Storj, of course. Just remember that by the ToS you can run only one node per array.
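For anyone wondering where the "8 drives" ceiling comes from: 8 x 16 TB is about 10^15 bits, which is the drive's rated interval between unrecoverable read errors (UREs). Below is a rough sketch of that estimate. It is a simplified model, not a vendor formula: it assumes the rating can be treated as an independent per-bit error probability and that a rebuild reads every surviving drive in full. The function names are made up. It only estimates the chance of encountering a URE; what the array then does depends on how the RAID is set up.

```python
import math


def p_at_least_one_ure(bytes_read: float, bits_per_error: float = 1e15) -> float:
    """Chance of at least one URE while reading `bytes_read` bytes.

    Assumes each bit fails independently with probability 1 / bits_per_error.
    """
    bits = bytes_read * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(bits * math.log1p(-1.0 / bits_per_error))


def p_ure_during_rebuild(drives_in_array: int, drive_tb: float = 16,
                         bits_per_error: float = 1e15) -> float:
    """Chance of hitting a URE while reading all surviving drives after one failure."""
    surviving = drives_in_array - 1
    return p_at_least_one_ure(surviving * drive_tb * 1e12, bits_per_error)


for n in (6, 8, 10):
    print(f"{n} drives: {p_ure_during_rebuild(n):.2f}")
```

An 8-drive array of 16 TB drives reads roughly 9 x 10^14 bits during a rebuild, so under this model a URE during rebuild is more likely than not, and the odds get worse with every drive added.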

It will take a really long time to fill, that’s for sure. It’s a lot of storage space after all.
Give it a try and you will see. I think it really depends on the customers you get via Storj. I have 8 nodes and they are not similar at all.

Yea I know that it’s probably not the best for Redundancy to split the drives as I explained but to be fair they’ve been running all in a raid 6 so far so basically no redundancy :joy::skull:

I’ll go for Raid 6 / 10 Drives > 2 Drives Redundancy

And even if one node goes bad the majority will be good

Kind regards, Tyler

It’ll depend on how the RAID is set up. These calculations are based on UREs being unacceptable, which for Storj is not the end of the world. If the RAID is set up to ignore UREs and keep functioning, this is perfectly fine for Storj and only risks extremely rare corruption of a single piece. However, many hardware RAIDs are set to fail completely at the first URE, so be careful to ensure that is not the case. That would have you lose the array at the first URE, which as you say is very possible with arrays that size.

Normally I would recommend against RAID entirely, but having more space than you know what to do with is an exception. Especially if you’re planning to run a larger operation, it may be worth it just for operational simplicity. It’s much easier to replace a drive and rebuild than to spin up new nodes. If more reasonable disk space limitations exist (say, less than 24TB total), RAID should always be avoided.

I would run it as 1 disk = 1 node.
But you should label everything very well, at the hardware level and at the logical level, so you know which node is on which physical disk.
Arrays this big are hard to maintain.
I would prefer Windows here, as Windows GUI nodes have less overhead.
But I don’t know whether Windows supports 80 HDDs at the logical level.

Maybe Windows Server, I think I have a key lying around but I’m not certain. I’ll check whether Microsoft states any advantages for Windows Server.

Greetings, Tyler

edit: I couldn’t find an exact number from Microsoft, but some users report having problems with >10 drives. I guess I’ll just have to find out

I don’t want to start an operating system war. Just want to mention that @Vadim may be an outlier in preferring windows. I’m not sure what overhead on Linux @Vadim is referring to, but possibly the use of docker. You can choose to run the node natively on Linux as well, but the overhead for docker is so minimal that I would recommend docker for multinode setups, because it makes managing different nodes a lot easier. And any additional overhead due to docker is by far outweighed by Linux itself having a lot less overhead.

That said, should you choose to use Windows, I recommend using @Vadim’s toolbox for Windows: Win GUI Storj Node Toolbox
Without it, managing multiple nodes on Windows requires quite a bit of tinkering to begin with. Keep in mind that this is not an official tool, but you can look through the thread to see how many people use it with success. I guess @Vadim was too modest to mention it himself. :wink:

I think it all just depends on what people are used to, Linux or Windows. The best solution will always be the OS that you know better.

It has the advantage of being legal for purposes like hosting a Storj node. Windows licensing requires you to have the right type of access licenses any time you offer anything that utilizes Windows (starting from Windows’ networking stack, so basically anything server-like) to more than one person concurrently.

I guess Alexey stated very clearly that what I asked in my original topic is not possible (against the ToS), so I’ll tag his answer as the solution. But thanks for your help, lads and lasses. I’ll turn that storage into something useful; if not all of it goes to Storj, then at least maybe 1/10 of the storage or so.

I wish you all a great start into the new year!
Kind regards, Tyler

I’d start at least 5 nodes ASAP and try to find them separate /24 IPs from ISPs. Go RAID 60.

So when the drive fails, you lose a node instead of running RAID and replacing a faulty disk, then wait for node to get audits before it gets full traffic again, which is like 6 months wait at least!? OMG…

You have 60 drives lying around? WOW! Sell them for 9000$, buy Storj at 0.25$, wait 2 years for it to hit 2$ again, and sell Storj for 800% profit. :grin: Then take those 70000$ and get yourself a Tesla. :wink:

Why is the traffic so low this month? My earnings are half of last month.

I recall last year’s Xmas period was like that as well. Probably just typical customer behavior.

Yea, I haven’t really looked into the market yet. But I imagine that high-capacity drives with >17k hours of operation don’t sell for that much :confused: