Put All the Hardware to Work

Pretty impressive; last time (about a year or so ago) I got less than 30 GB in 1.5 months.

Of course. But at the very least I have to give a signal to the network that I have empty drives and am ready to accept data.
I’m not demanding payment for empty drives - that would be really stupid.
I have drives. They are ready to accept data. Just fill them up and I earn tokens.

And what Sia does, for example, is not really good: they allocate space up front, even when there is no data.

Supply for the network is tightly bound to the interest of suppliers.
Should I tell him: “You’ll earn $5 a month for a 500 GB node, just don’t turn off your PC”?
The answer would be something like “Friend, I know you can fix my PC after most disasters, but have you had a bit too much coffee?” and then “Do you know the cost of electricity in my office? It is 5 times higher than at home…”

So at home I can make up to ~$200 a month with 20 TB… with an electricity cost of only about $10.

But the business and the money are elsewhere - create software that stores data on the Storj network and get customers to pay for it. Simple? You take the money, pay SNOs a little to store these customers’ data, and keep most of the rest for yourself.
As an SNO it is difficult to even cover electricity… so don’t look for money on this side of the equation.

Well, apart from it being strange that electricity in your office is more expensive than at home, I doubt that there’ll ever be a need to “convince” people to run a node if it’s worth it. If it’s profitable, people will add more HDDs. They don’t need to tell storjlabs that they are ready to do so when the time comes. And it wouldn’t work anyway, unless you make a contract with people.

The data in the nodes is customer data. If there are not enough customers (as is the case right now) then nodes stay mostly empty. Customers pay Storj, Storj pays us.
Storj also has test data (which probably accounts for half the data in a lot of nodes).

If Storj filled all nodes completely (generating more test data since there is not enough customer data), then it would be the same as paying for empty drives. All Chia farmers would start nodes, Storj would fill them with test data and incur huge losses because there would not be enough income from customers.

The only real way to fill the nodes is for Storj to get more customers, I am sure they are putting a lot of effort into it.

Or set up a Nextcloud server and find customers who will pay you and only YOU. Simple?

There is no duplication, only erasure codes. The 80 erasure-coded pieces are unique - they are not duplicates - but any 29 of them are enough to reconstruct the file.
However, if all of them were in your single location and that location became unavailable, there would be no other place to get the files back and the customer would lose access. That is a complete disaster for the network, so we will do anything to avoid being in such a situation. Rewarding behavior which increases the risk of falling into that situation is a bad business model.
So the filtering by /24 subnet is the least we can do to protect customers.
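
Roughly, the arithmetic behind those numbers looks like this (80 and 29 are the values above; everything else is just an illustrative sketch):

```python
import random

TOTAL_PIECES = 80   # erasure-coded pieces stored for one segment
MIN_PIECES = 29     # any 29 of them are enough to reconstruct it

# Expansion factor: raw space spent per byte of customer data.
print(f"expansion factor: {TOTAL_PIECES / MIN_PIECES:.2f}x")  # ~2.76x, not 80 full copies

# Losing nodes at random: the segment survives as long as at least
# MIN_PIECES of its pieces are still reachable somewhere.
pieces = set(range(TOTAL_PIECES))
lost = set(random.sample(sorted(pieces), 40))   # even losing half the pieces is fine
print("still recoverable:", len(pieces - lost) >= MIN_PIECES)  # True
```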

The node selection is random; we do not prioritize nodes with more free space, and this makes the network self-balancing. Operators will add more space if usage grows, and they will reduce available space if usage is low.

They wouldn’t.
Even if a single location has 1000 TB, the protocol should diversify locations for each user that stores data.
Apparently each location should also maintain a diversity of customers.
If a location fails, each user loses only a small fraction of their data, which is redundant and is immediately restored from other locations and copied into the free storage available.
This way a safe level of saturation is maintained.
If other locations are just as large, they can keep more data, and network reliability is higher.
I don’t know exactly what erasure codes are, but I think it is something like data plus recovery information for other data.

So you expect the storj network to treat all your nodes separately and give you more data just because you advertise more free space?

I thought we already answered that:

Additionally it would make running small nodes irrelevant, so only people with large amounts of space would run a node (or people who know how to “fake” having large amounts of free space).

This surely doesn’t look fair to you, does it?

You’re just trying to find a way to get maximum earnings for yourself with ideas that just won’t work for a decentralized network. There’s a reason all these things are in place but it’s getting a bit tedious to explain them all to you, while you’re not even reading any information about storj.

Yes it does.

So you have 1 internet connection per motherboard? That is astounding.

There are those 4 mechanics called “vetting”, “held amount”, “audit score” and “uptime score”.

You can of course propose another method, but I would kindly ask you to read into these topics before you do so.


People with a lot of space will find a way to utilize it.
The rewards will end up in one pocket or another.
One day there will be a system that rewards effort equally, taking every aspect into account.
But for now this is a battle of decentralization vs. localization.
Out of 256 people, only about 1 is going to end up caring about all of them in his field.
Maybe when electricity is free, complete decentralization will be possible.
Users should pay with space to have space…
But no two users are equal in abilities and needs…

That doesn’t really answer anything… but it’s a nice philosophical answer…

Yes, so? Farm chia or something else on your free space.

Storj specifically wants to avoid large datacenters getting most of the data, since this reduces the decentralization.

If a node with more free space gets more data than a node with less free space, this would incentivize people to:

  1. Cheat - I create a fake 10PB node to get more data, then somebody else creates a 100PB node…
  2. Build large datacenters, the end result would be that all data is “distributed” among a few petabyte-scale datacenters and nobody else gets anything.

Storj wants to distribute the data about evenly to all nodes, based on their performance (faster nodes get a bit more data).
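
As a toy sketch of how “faster nodes get a bit more data” can fall out of a simple upload race (the 110 and 80 figures are from this thread; the latencies and jitter are made up for illustration):

```python
import random

CANDIDATES = 110   # nodes offered a piece of one segment (figure from this thread)
NEEDED = 80        # pieces actually kept; the slowest uploads get cancelled

# Hypothetical latencies in ms: one fast node among otherwise average ones.
base_latency = {f"node{i}": random.uniform(50, 200) for i in range(CANDIDATES - 1)}
base_latency["fast-node"] = 40.0

kept = {name: 0 for name in base_latency}
for _ in range(1000):                          # simulate 1000 segment uploads
    # Each upload finishes at its base latency plus jitter; only the
    # first NEEDED uploads to complete keep their piece.
    finish = {n: t * random.uniform(0.8, 1.2) for n, t in base_latency.items()}
    for name in sorted(finish, key=finish.get)[:NEEDED]:
        kept[name] += 1

print("fast node kept:", kept["fast-node"], "of 1000 pieces")
print("average node kept:", sum(kept.values()) // CANDIDATES, "of 1000 pieces")
```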

There is not enough customer data to go around. But you can think about this another way - there are more than enough nodes with the current rules, so why change them?


People are going to cheat anyway.
The only difference would be to guarantee the storage declared, with maintenance downtime no longer than declared. 10 PB? Impossible for the average user.

The question is: how large?
Large datacenters are inefficient, and small datacenters are inefficient too.
How much skill and responsibility is required to run a datacenter?
And how much of it is required to run around 30 TB of Storj nodes?
With more skill, datacenters can outperform decentralized Storj, Sia and the rest in efficiency.
Imagine a network of 20 datacenters connected by high-throughput channels.
Data is distributed across them, so even the simultaneous failure of 3 datacenters does not lead to data loss. And they would earn more money because they use hardware far more efficiently.

If you completely decentralize everything, people will have to provide quadruple the space in one location in order to get a single amount of space decentralized.
Who is going to maintain the network and develop the code?
How is their work going to be rewarded?
By selling part of the space?
So it is an exchange - space for money.

To help people decentralize their data, developers have to sell people’s space to get their reward.
So if they can sell my space - why can’t I sell my space?
Rewarding according to capabilities - connection speed, space, reliability - is fair.

Decentralizing everything is the same as achieving nothing.
Apps like NiceHash but for HDD space could even appear, automating the use of multiple HDD apps.

The average PSU works at 75% efficiency, and the CPU + motherboard + chipset + memory draw more power than the HDD. So over-decentralization comes at the cost of carbon dioxide, of burning coal and oil.
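
A rough back-of-the-envelope version of that overhead (the wattages and electricity price are assumptions for illustration, not measured values; only the 75% PSU figure comes from the point above):

```python
# Hypothetical always-on node running a single HDD.
HDD_W = 7              # assumed HDD draw in watts
SYSTEM_W = 35          # assumed CPU + motherboard + chipset + RAM draw in watts
PSU_EFFICIENCY = 0.75  # average PSU efficiency from the point above
PRICE_PER_KWH = 0.15   # assumed electricity price in $/kWh

wall_watts = (HDD_W + SYSTEM_W) / PSU_EFFICIENCY   # power drawn from the wall
kwh_per_month = wall_watts * 24 * 30 / 1000
cost = kwh_per_month * PRICE_PER_KWH

print(f"wall draw: {wall_watts:.0f} W")                                 # ~56 W
print(f"monthly cost: ${cost:.2f}")                                     # ~$6
print(f"share spent on the drive itself: {HDD_W / (HDD_W + SYSTEM_W):.0%}")  # ~17%
```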

Let’s say I live in America and my salary is $50 an hour.
Am I going to mess with my 1 TB PC for $40 a year? Do I even need that 1 TB?
If I wanted to be mobile, I’d just buy it. You provide me space, I go to work.
I go to work - you provide space.
I provide space - you go to work!
There is some reasonable specialization in everything…
The only question is how fine-grained the decentralization is - how optimal is it?
So: decentralization vs. optimization.

That’s what Amazon does. If Storj encouraged large datacenters, it would be “Amazon with extra steps”. Might as well stop bothering with node operators and just run your own servers.

For a large company that already has a datacenter - probably not much more than usual.

The current system is designed to prevent the types of cheating seen in v2 and to make cheating more difficult in general.

Example 1:

In v2, each node got about equal data, so a valid strategy was to create hundreds of nodes to get as much data as possible. Those nodes were usually slow and not very reliable, but they got (in total) more traffic than a fast and very reliable single node.

However, someone may have a legitimate reason to run multiple nodes - multiple hard drives, spreading the risk and such.

Solution: aggregate all nodes with the same IP address so that, together, they get about the same data as one (bigger) node. This allows node operators to spread the risk and use multiple devices, but it makes it impossible to get unfair amounts of data.
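
A rough sketch of what that aggregation can look like during node selection (grouping by /24 as described elsewhere in this thread; the helper names and example IPs are made up):

```python
import random
from collections import defaultdict

def subnet24(ip: str) -> str:
    """Return the /24 network of an IPv4 address, e.g. '203.0.113.0/24'."""
    return ".".join(ip.split(".")[:3]) + ".0/24"

def pick_nodes(nodes: dict[str, str], count: int) -> list[str]:
    """Pick `count` nodes, at most one per /24 subnet, so that ten nodes
    behind one connection share the traffic of a single node."""
    by_subnet = defaultdict(list)
    for node_id, ip in nodes.items():
        by_subnet[subnet24(ip)].append(node_id)
    # One random representative per subnet, then a random subset of subnets.
    reps = [random.choice(group) for group in by_subnet.values()]
    return random.sample(reps, min(count, len(reps)))

# Hypothetical example: three nodes behind one IP block get no more pieces
# in total than the single node next door.
nodes = {"a1": "203.0.113.10", "a2": "203.0.113.11", "a3": "203.0.113.12",
         "b1": "198.51.100.7"}
print(pick_nodes(nodes, 2))
```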

Example 2:

In v2, a node that declared a higher capacity got more data (there was a very weird reason for it). So everybody set their nodes to declare 8 TB (which was the maximum supported size) to get as much data as possible, then set it to the actual value when the node filled up.

Solution: distribute data equally to all nodes that have enough free space.
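
A minimal sketch of that rule (node names and piece size are illustrative): the declared capacity only acts as an on/off filter, so inflating it buys nothing.

```python
import random

# Hypothetical node records: (node_id, free_space_bytes)
nodes = [("small", 2 * 10**12), ("big", 20 * 10**12), ("liar", 10 * 10**15)]

PIECE_SIZE = 2 * 1024 * 1024  # an assumed ~2 MiB piece

def eligible(nodes, piece_size):
    """Free space is only a threshold: a node either has room or it doesn't."""
    return [node_id for node_id, free in nodes if free >= piece_size]

# Every eligible node is picked with the same probability, regardless of
# how much space it claims to have beyond the threshold.
counts = {name: 0 for name, _ in nodes}
for _ in range(30000):
    counts[random.choice(eligible(nodes, PIECE_SIZE))] += 1
print(counts)  # roughly equal counts for "small", "big" and "liar"
```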

I can just mod the software to report 10PB to the server.

Storj’s business model is to provide a service that is cheaper than the equivalent service provided by large datacenters (say, Amazon), but is as reliable (or even more reliable) than those datacenters.

The way to do that is to use home user devices. My internet connection is cheaper than a connection at a datacenter. My labor in looking after my server is cheaper than the same for a company. So, I can provide the same service cheaper. The downside is that my service is less reliable - I do not have as much redundancy as a proper datacenter. That’s where decentralization comes in. Spread the data to lots of nodes and if some of them are inaccessible at the moment - no problem.

A different setup for the storj nodes:

[photos: storj3, storj4, storj5]


That’s gotta be one of the geekiest and awesomest looking setups I’ve seen! Nice job!


I am using it for multiple purposes. Only two machines are for Storj, but as the HDDs get full I can keep adding more storage to the RAID. As for the display: it is not just a display, it comes with the cooling fan - the Raspberry Pi RGB Cooling HAT with adjustable fan and OLED display from Yahboom.


We diversify locations by selecting one node per /24 subnet of public IPs for every piece of a segment of the file.
So having 1000 TB in one physical location behind the same /24 subnet of public IPs will not give you any advantage over small operators with only 2 TB. Pieces will be distributed across them with equal probability. The fastest nodes closest to the customer (from a random set of 110 nodes) will receive more pieces.
