Interesting. Which datacenter is down?

My question is: why not run 6 real nodes and use them as backups for the real ones?

Because redundancy is already built into the storage network design. No individual node needs to be reliable.

1 Like

These SSDs are perfect for the Storj use case: the drawbacks don't matter because the storagenode writes very little and never rewrites any data. Even accounting for write amplification, 1 TBW per month is nothing. The benefits: low cost, low noise and low power consumption.
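To put a rough number on it (illustrative only; the 360 TBW endurance rating below is an assumption, substitute the actual spec-sheet value for these drives):

# Rough endurance estimate: months until the rated TBW is exhausted
awk 'BEGIN { rated_tbw = 360; tbw_per_month = 1;
             months = rated_tbw / tbw_per_month;
             printf "%.0f months (~%.0f years) of writes at 1 TBW/month\n", months, months / 12 }'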

I'm not sure why they are "bad" in your book.

I would argue booting from a 2 TB 980 Pro is obnoxious overkill, as is RAID6 (!?) of just eight drives, let alone for a storage node (I would do RAID0), but it's not my money, so why not. If done carefully, manually overriding the 512e bullshit when deploying the filesystem, it could work.

We had eight of those SSDs and the Broadcom RAID controller lying on the shelf, decommissioned from another server project.

Each backup node is allocated a very small amount of HDD space; in my setup it's ~42 GB per node.
The idea is not to get much ingress or egress, but to get past the 15-month held-back period and eventually get all nodes vetted.
So when/if any of the primary nodes gets DQ'ed due to e.g. a bad HDD, it can be replaced with one of the backup nodes and should not have to wait out the held-back or vetting periods.
Of course, I have to wait for new data to be sent to the replacement primary node, since all data from the DQ'ed node is permanently gone.
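For reference, the per-node allocation in a setup like this is just the normal storage setting in the standard docker deployment; a minimal sketch (the wallet, address, paths and container name are placeholders, and note that anything below 500 GB conflicts with the minimum requirements discussed further down):

docker run -d --restart unless-stopped --name backup-node-001 \
  -p 28967:28967/tcp -p 28967:28967/udp \
  -e WALLET="0x0000000000000000000000000000000000000000" \
  -e EMAIL="operator@example.com" \
  -e ADDRESS="node001.example.com:28967" \
  -e STORAGE="42GB" \
  --mount type=bind,source=/srv/identity/node001,destination=/app/identity \
  --mount type=bind,source=/srv/storage/node001,destination=/app/config \
  storjlabs/storagenode:latest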

Currently, the CPU is running at 5-10% while serving 501 nodes, and it has been running pretty steadily for the last 131 days.

root@backup-parking-slot001:~# lsb_release -a
Distributor ID: Ubuntu
Description:    Ubuntu 22.10
Release:        22.10
Codename:       kinetic

Th3Van.dk

3 Likes

Thanks a lot for the amazing answer. Now I understand why you prepared such a backup server. So it also means you already have around 500 nodes running? When you expand, you could always draw from this pool because, as you said, they would be vetted by then.

The whole idea (incubating nodes and then deploying them elsewhere) and the implementation (running them all on one host) appear to violate both the letter and the spirit of the Node Operator Terms & Conditions.

1 Like

Yes, I'm running 501 backup nodes.

I could, but currently I have no plans to set up more primary nodes than the 105 I already have, as long as there's free space on them.

Dev (SM01)   Available HDD space (bytes)   Used HDD space (bytes)               Free HDD space (bytes)
Totals :     1.986.332.595.376.128         1.190.175.825.043.456 (59.9183 %)    796.156.770.332.672

Th3Van.dk

This is unbelievable. When I do set up a big node, it will really be several little nodes with different ISPs. As I see it, there is no other way to guarantee 100% that the hardware works fine all the time.

I believe you can. You would have to modify a parameter in the config.
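A sketch of where to look, assuming the check lives in config.yaml under a key along the lines of storage2.monitor.minimum-disk-space (the exact key name and default shown here are assumptions; verify against your own config file):

# Locate the minimum-space check in the node's config (path is a placeholder)
grep -n "minimum-disk-space" /mnt/storagenode/config.yaml
# assumed output, with the 500 GB default:
# storage2.monitor.minimum-disk-space: 500.00 GB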

My understanding is that that's intended for debugging purposes only rather than something that should actually be lowered by an SNO. The SNO TOS and the documentation both explicitly state a 500 GB minimum.

SNO Terms of Service (edit: raert beat me to linking this while I wasn't looking):

  • 4.1.4.2. Have a minimum of 500 GB of available Space per Storage Node;

Quick Start Docs:

Minimum of 550 GB with no maximum of available space per node

The docs technically say 550 GB, since that accounts for the recommended 10% overhead (500 GB × 1.1 = 550 GB), but it's pretty clear that allocating less than 500 GB on a normal node is a TOS violation.

Also, depending on how it's interpreted, this part of the TOS could be read as "don't adjust the minimum allocation without permission", since you could argue that you're manipulating its performance.

4.1.3. You will not modify or attempt to modify the Storage Node Software for any purpose including but not limited to attempting to circumvent the audit, bypass security, manipulate the performance of, or otherwise disrupt the Storage Services for any reason, including but not limited to attempting to increase the amount of data stored or bandwidth utilized or the amount of Storage Node Fees, as defined herein, and you will not otherwise interfere with the operation of the Storage Services.

2 Likes

Eh, it can be done without modifying the Storj software while being fully compliant with the T&C. There are enough corner cases in the T&C for that. Which is why I'm very curious what the new proposed document will be.

Besides, it's not like Storj Inc. does not benefit from the work of Th3Van.

2 Likes

Could you point out some of these corner cases? I've read through it and I don't see anything that would allow it, just clauses that disallow it.

Specifically;

4.1.2. You will operate the Storage Node in strict compliance with terms of this Agreement and will not take any action not expressly authorized hereunder.

You will operate the Storage Node in strict accordance with the terms of this Agreement and in no other manner. Without limiting the generality of the foregoing, you will not: […] 5 Operate a Storage Node that does not meet all of the Minimum Requirements;

5.1.12. Manipulate or alter the default behavior of the Storage Network to artificially increase or decrease the value of any reputation factor of any Storage Node;

5.1.17. In any other way attempt to interfere, impede, alter, or otherwise interact in any manner not expressly authorized hereunder with the Storage Services or the operation of any other Storage Node(s).

These are all quite clear.

Further, I'd say it's actually a detriment to have big farms like this, rather than something Storj benefits from - it reduces the distributed nature that the marketing is built on, and, as we've seen, petabytes of capacity can be wiped out in one go.

According to this site, there are 21,685 active nodes. 501 nodes is 2.31% of that total. That is a hugely disproportionate amount for one person to be running. When the payout changes hit, the network saw a significant drop-off in the number of SNOs participating (and I believe I read that most didn't exit gracefully? I could be wrong on that). This site (which has a GitHub link to an official repo, implying it's at least somewhat official or using official data sources) shows free capacity on the network roughly in line with what the graphic in the first post shows. The graphic shows 3 PB of free space lost. That is just shy of 10% of the network's free space gone with those nodes taken offline.

Caveat: that's reported free space. 3 PB spread across 500 nodes would be 6 TB per node, so based on Th3Van's post about "sharing" the space, I'm assuming in this instance the nodes are configured to deliberately lie to the network about their capacity, in the absence of any other information.
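Those figures are easy to sanity-check from the numbers quoted here (21,685 active nodes, roughly 33 PB of free space as mentioned below, and 3 PB across ~500 nodes):

awk 'BEGIN {
  printf "share of active nodes: %.2f%%\n", 501 / 21685 * 100;   # ~2.31%
  printf "share of free space:   %.1f%%\n", 3 / 33 * 100;        # just shy of 10%
  printf "free space per node:   %.0f TB\n", 3000 / 500;         # 6 TB per node
}'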

So, let's consider knock-on effects. On one hand, we have whales dropping out of the network causing disproportionate impacts. If one leaves permanently, the stats don't look as good in marketing, and Storj becomes less appealing to customers. It may even be that over time, when free space isn't 33 PB, the eleven-nines durability guarantee can't legitimately be offered anymore. Storj has to offer the new, reduced service for cheaper, and pay SNOs less. More people leave, service quality degrades, it gets cheaper, and so on - it's a snowball effect. This can be compounded by individuals without silly money to invest seeing the whales and deciding not to join the project, further centralising the distribution of data and weakening Storj's marketing.

When there are people not playing by the rules - especially so brazenly, on such a large scale - it inherently harms the project rather than benefiting it. It's not unreasonable for SNOs - and the customers who are paying for the service as advertised and whose data is at risk - to expect enforcement of the TOS as it's written (and this is ignoring the fact that relying on edge cases to do things that clearly aren't in the spirit of a legal document is braindead at best).

If you operate even a medium-sized data center, it's easy to have thousands of drives in operation. Allocating half a terabyte on each would not be a huge deal if they're already decently sized. Besides, nothing in the T&C actually forbids using space that is allocated but not yet filled for other purposes, as long as the space is available when the node needs it. And given the nodes operate behind a single IP, they are not going to fill up quickly.

The only problem is the T&C mentioning a separate IP per node, but this has already been covered on the forums.

Setting aside the artificial requirements of the T&C and looking at the real needs of the network: 2.31% of all nodes? Does not matter. What matters is the amount of data actually stored, and that is limited by the fact that it's a single IP address. With the current operation, what counts is the number of unique /24 IP blocks a given piece of hardware operates in, and these 500 nodes just don't matter - except, probably, for slightly increasing the resources the satellites need for accounting.

These 500 nodes dropping out of the network? Not a big deal, it's just 500 × 42 GB = 21 TB of data. I host more, with likely worse uptime and latency, simply because I'm operating on a residential connection. I've likely caused more repair traffic with my single MicroServer Gen7 setup than these 500 nodes. You just won't see that in the official stats because it's a low number of nodes, so you won't cry about it.

There's one more factor that might or might not be at play here. Random people like you and me join the network with the expectation that we will follow the regular T&C. But large operators may well negotiate different setups. I wouldn't be surprised at all if Storj Inc. had some special arrangements, for example to maintain a stable supply of nodes in the presence of regular node churn. With the right type of contracts, even regular node operators would benefit from the presence of these operators. I recall someone from Storj Inc. stating here on the forum that they had already been approached by some companies.

1 Like

In a zero-trust situation, complaining about violations of the TOS is like an antelope complaining about the speed of the leopard.

3 Likes

With great respect for your work.
In my opinion: if one of the backup nodes goes live, grows and then churns, it will not have a sufficient held-back amount to cover the repair caused by that churn.
It would be perfectly fine if they grew normally to 500 GB, and grew more later or never.

Here I see the biggest problem with the held-back amount:
all of this would be perfectly fine if it were something like 24 backup nodes, which would also be sufficient as backups, but those would likewise not have enough held back in case of going live.

I can also understand the outcry of the community.
Let's wait for the new TOS, then rethink.

My guess, @Alexey: we have to work on the held amount to make it more reliable, or drop it completely.

People have always done this with held amounts; it's not something new. Having new nodes with small amounts of space waiting out the 15-month mark so there are no held amounts to worry about has been happening since I started with Storj... But the fact of the matter is that these whales having large numbers of storagenodes in a single location on different subnets isn't OK. They can control almost all of the data, and if they see fit they can shut down all their nodes and really screw Storj. If Storj doesn't get some kind of control over this, they could go bankrupt real quick and have to tell their customers that their data is gone.

One thing I know for certain is that the whales aren't doing it legitimately, and they could certainly run every single node on its own subnet inside a single datacenter.

https://etherscan.io/advanced-filter?tkn=0xb64ef51c888972c908cfacf59b47c1afbc0ab8ac&txntype=2&fadd=0x303edcd8dbe1607fe512d45cc15d3e41fa4db44b&tadd=0x303edcd8dbe1607fe512d45cc15d3e41fa4db44b&p=102

Just a small sample of the whales getting paid out. More than 50% of the network is controlled by whales.

31 wallets received over 1,000 tokens for July, with 3 addresses receiving over 6K, 12K and 13K tokens. That's a lot! I received 400 tokens for 63 TB of data. Those 3 SNOs have something like 5 PB of data, if the math is correct - approximately 23% of the entire network.
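For what it's worth, that 5 PB figure looks like a straight proportional estimate from the poster's own payout; a quick sketch of the math (it assumes payouts scale linearly with stored data, which, as the reply below points out, is not exactly true):

awk 'BEGIN {
  tb_per_token = 63 / 400;               # 63 TB earned this SNO 400 tokens
  whale_tokens = 6000 + 12000 + 13000;   # the three largest July payouts
  whale_tb = whale_tokens * tb_per_token;
  printf "estimated data behind those 3 wallets: %.0f TB (~%.1f PB)\n", whale_tb, whale_tb / 1000;
  printf "implied total network data at 23%%:     %.0f PB\n", whale_tb / 1000 / 0.23;
}'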

Your math is wrong, but maybe not by much: I have 280 TB and I got a little over 2k, but that includes the 10% bonus.
Also, it could be a contractor that gets a salary in tokens.

I hope they are something other than SNOs, but the payment date is the same as for SNOs.

Well, this is pure cheating: you bypass most of the reasoning behind the held-amount structure and completely undermine the purpose of the vetting process.

Exactly. When Joe Schmoes do that here and there on their Raspberry Pi nodes, it's probably tolerable and unenforceable. However, it seems rather big players are now openly stating that they do this, and they get no reaction from Storj.

I would like a definitive answer from Storj staff: is such blatant undermining of the vetting mechanism and circumvention of the incentive/held-back program OK to do? Yes or no? @Alexey @heunland? Because if that's OK, I have a few unsavory ideas...