Network statistics

ARA · October 6, 2020, 8:32am

Thanks, I see the logic here. Just had to ask.
Is there any place where I can see aggregated data for STORJ, so I can compare network traffic in TB this month vs., for example, the same period last year (not for my node, but for the whole network)?

Alexey · October 6, 2020, 7:16pm

There is no such site. All stats are local because of GDPR and similar regulations.

yEinFallsLos · October 6, 2020, 7:49pm

I would be interested in that as well.
I thought I could write a small script which pushes the data (ingress/egress, allocated space, etc.) to a public dashboard I could host.
It would give some indication of whether your node runs correctly, how much data there is in specific regions, and what you can expect regarding growth/payout.
Of course there has to be some general interest in it so there will be sufficient data… so let me know if you’re interested, I would love to start such a project.

@Alexey Also, I don’t know how it will be seen by Storj. Of course regarding GDPR… you agree with sending that data by using the script - I think that should not be much of a problem, but I’m no lawyer…

Alexey · October 6, 2020, 8:06pm

Hello @yEinFallsLos,
Welcome to the forum!

We welcome any helpful contribution!
Since you will run such a service, you will be obligated to store and process private information according to the laws of your location and the regulations on storing information from foreign citizens.

kalloritis · October 8, 2020, 9:50pm

I respectfully disagree.

GDPR does not come into play when the stats are anonymized and aggregated. If you store information about “where” or “by whom” then it does, but that is not what was asked.

Alexey · October 10, 2020, 9:21am

I agree with you!
I just wanted to warn about such issues before they could happen.
However, how would you allow a participant to figure out where their node is in those stats?
The most requested feature is the ability to compare. I know it’s useless, because there is almost nothing you can do to increase the customers’ traffic, but it’s requested anyway.
You can take a look on

kalloritis · October 10, 2020, 2:25pm

They wouldn’t, as that is not supported/compliant with GDPR nor the use case for a “network health/status” dashboard. Your node is implicitly part of it, but not identifiable, nor needed to be.

I agree it’s an asked-for feature, so make it easy for the SNOs themselves to export the graph data. Don’t make it shareable, just exportable. What the SNO does with the data after that point is on the SNO, not Storj itself at large.

yEinFallsLos · October 9, 2020, 3:06pm

Currently there is to my knowledge no real way of knowing many statistics about the network, for example:

Capacity of the network / average used capacity per node
Average Ingress/Egress, globally and per region / satellite

I thought I could host some dashboard where you can look up these stats, so node operators (or anyone curious) can get an overview of the network, how it grows and what to expect regarding bandwidth/capacity/payout.
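As a rough illustration of the stats listed above, here is a minimal sketch of how a dashboard backend might aggregate anonymous per-node reports. The report format, field names, and region labels are all assumptions made up for this example; nothing here comes from Storj itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical anonymous reports, one per node. Field names are illustrative only.
reports = [
    {"region": "eu", "allocated_tb": 4.0, "used_tb": 1.0, "ingress_gb": 120.0, "egress_gb": 30.0},
    {"region": "eu", "allocated_tb": 8.0, "used_tb": 3.0, "ingress_gb": 200.0, "egress_gb": 50.0},
    {"region": "us", "allocated_tb": 2.0, "used_tb": 2.0, "ingress_gb": 40.0, "egress_gb": 10.0},
]


def summarize(reports):
    """Aggregate per-node reports into network-wide and per-region figures."""
    by_region = defaultdict(list)
    for report in reports:
        by_region[report["region"]].append(report)

    return {
        "node_count": len(reports),
        "total_allocated_tb": sum(r["allocated_tb"] for r in reports),
        "avg_used_tb_per_node": mean(r["used_tb"] for r in reports),
        "per_region": {
            region: {
                "avg_ingress_gb": mean(r["ingress_gb"] for r in rows),
                "avg_egress_gb": mean(r["egress_gb"] for r in rows),
            }
            for region, rows in by_region.items()
        },
    }


summary = summarize(reports)
print(summary)
```

The figures are only as representative as the set of nodes that report, which is why the level of interest matters.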

The catch is that in order to get this data I would write a small script which every node operator can deploy and which pushes all the data about their node(s) to the server.
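A minimal sketch of what the reporting side of such a script could look like, assuming the node's figures are already available as a dictionary. The collector URL, the field names, and the whitelist are hypothetical; the sketch only builds the request and does not send anything.

```python
import json
import urllib.request

# Hypothetical collector endpoint and field names; not a real Storj or dashboard API.
COLLECTOR_URL = "https://stats.example.org/report"
ALLOWED_FIELDS = ("region", "allocated_tb", "used_tb", "ingress_gb", "egress_gb")


def build_report(node_stats):
    """Keep only whitelisted fields so identifiers never leave the node."""
    return {key: node_stats[key] for key in ALLOWED_FIELDS if key in node_stats}


def build_request(report):
    """Wrap the report in a JSON POST request without sending it."""
    body = json.dumps(report).encode("utf-8")
    return urllib.request.Request(
        COLLECTOR_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up local figures, including identifying fields that must be dropped.
local_stats = {
    "node_id": "hypothetical-node-id",
    "wallet": "0xhypothetical",
    "region": "eu",
    "allocated_tb": 4.0,
    "used_tb": 1.5,
    "ingress_gb": 120.0,
    "egress_gb": 30.0,
}

report = build_report(local_stats)
request = build_request(report)
# urllib.request.urlopen(request) would actually submit the report.
print(report)
```

Using a whitelist rather than a blacklist means a new field in the node's data is not uploaded by accident.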

How many would be interested in such a project? Who would install that script on their server? I am asking because if there is not much interest, there will not be enough data to really represent anything.

Thanks, and feedback is always welcome!

TheMightyGreek · October 9, 2020, 3:27pm

I would personally be interested and I’d run the script as it shouldn’t require much processing power.
I think Storj is in the process of building something like that but it’s not very high on their priority list.

kevink · October 9, 2020, 3:34pm

The problem with projects like this is that just a single troll could submit an immense amount of faulty data and your whole project would become irrelevant…

TheMightyGreek · October 9, 2020, 3:57pm

That’s true, the script would need some way of preventing that, but I don’t know how it could be done.
The idea is to take the numbers from the dashboard API, right?

anon27637763 · October 9, 2020, 5:11pm

Any reliable analytics collection technique is going to require giving up some anonymity… unless Storj provides that data.

An independent analytics collection might be possible if node operators were willing to provide their Ethereum wallet address to your service for verification. The analytics service could check for transactions with addresses associated with STORJ pay days. If the address checks out as received payments, then the incoming analytics data could be treated as valid.
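A minimal sketch of that wallet check, assuming the service already has a list of token transfers (for example, exported from a block explorer) and a set of known payout sender addresses. The addresses and the transfer record format below are made up for illustration.

```python
# Hypothetical payout sender addresses; the real ones would have to be collected separately.
PAYOUT_SENDERS = {"0xpayout1", "0xpayout2"}

# Hypothetical transfer records in an illustrative format.
transfers = [
    {"from": "0xPayout1", "to": "0xOperatorA", "tokens": 42.0},
    {"from": "0xsomeoneelse", "to": "0xOperatorB", "tokens": 10.0},
]


def has_received_payout(wallet, transfers, payout_senders):
    """Return True if the wallet received at least one transfer from a payout address."""
    wallet = wallet.lower()
    return any(
        t["to"].lower() == wallet and t["from"].lower() in payout_senders
        for t in transfers
    )


operator_a_ok = has_received_payout("0xoperatora", transfers, PAYOUT_SENDERS)
operator_b_ok = has_received_payout("0xoperatorb", transfers, PAYOUT_SENDERS)
print(operator_a_ok, operator_b_ok)
```

This only shows that the address has been paid. It does not show that the submitter controls it, so a real service would also need some proof of ownership.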

In prior releases of Storj, it was possible to simply check for the TLS cert on the claimed port… but that doesn’t seem possible at the moment. I haven’t tried in a long while, and it doesn’t seem to be functional now.

Another possible method would be to run a Tardigrade client… upload a bunch of random data through each satellite… and collect IP addresses of Storj nodes. An analytics participant could send in a node’s IP address and the collected analytics would be validated once the corresponding IP shows up in the collection of nodes from Tardigrade.

Of course, I overthink everything… so there’s probably a much simpler method that is fairly anonymous and less complex.

yEinFallsLos · October 9, 2020, 6:49pm

Are they? Is there some kind of roadmap or something where they mentioned it?

kevink · October 9, 2020, 6:51pm

I’m not sure about that either. As I recall they don’t really want to publish many stats about the network and it’s even more unlikely they put effort into building something to show network stats while they have bigger things to do.

articrain · October 9, 2020, 6:54pm

You could reverse from the monthly payment addresses and then compile a list of the addresses that receive the Storj payments, and you could use how much they get paid to estimate how much storage is being used. Just a thought.

anon27637763 · October 9, 2020, 7:02pm

This might not work as expected, since Storj node operators are paid in USD values. So, the algorithm would need to record the value of STORJ at the time of the payout transactions and track the USD value rather than the number of tokens.

The USD value will only provide a very large picture of the network usage, which isn’t really useful for any given Storj node operator.
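To make the token-versus-USD point concrete, here is a small sketch that converts payout transactions to USD using the token price on the payout date. The dates, token amounts, and prices are invented for the example, and the price lookup is assumed to come from some external source.

```python
# Made-up payout transactions for one address.
payouts = [
    {"date": "2020-08-10", "tokens": 100.0},
    {"date": "2020-09-10", "tokens": 50.0},
]

# Made-up STORJ prices in USD on those dates.
prices_usd = {"2020-08-10": 0.25, "2020-09-10": 0.5}


def usd_values(payouts, prices_usd):
    """Return (date, USD value) pairs using the price on each payout date."""
    return [(p["date"], p["tokens"] * prices_usd[p["date"]]) for p in payouts]


values = usd_values(payouts, prices_usd)
print(values)
```

With these numbers both months come out to the same USD value even though the token count halves, which is why counting tokens alone would be misleading.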

articrain · October 9, 2020, 7:06pm

You could use something like https://nomics.com/assets/storj-storj to get the price in your program, and then you could get a rough estimate for the bandwidth for each node. However, there is probably more than one payout address, or they change, so you would need to figure out some way to track them. This would however provide a broad overview, so people could get a rough estimate of node sizes and the network scale.

yEinFallsLos · October 9, 2020, 7:16pm

Regarding misuse, yes I also thought a bit about this.
@anon27637763 actually has similar thoughts to mine. Afaik the node ID is derived from the public key, which is associated with the PoW from the identity generation. Maybe there is a way to at least verify the PoW…
Also, you can check if the node’s IP is reachable, so at least there is that.

Looking at past transactions could verify that there are nodes hosting for this ETH address, but I can’t really verify which node IDs or how many.

Alternatively, anyone who wants to publish their data has to create an account with a reputable e-mail address. Not the best solution, and not bulletproof either, but at least it’s not that easy then.

anon27637763 · October 9, 2020, 7:17pm

I usually use Etherscan and Eth Gasstation for token lookup information.

Eth Gasstation requires a DeFi Pulse API key:

https://data.defipulse.com/signup

DeFi Pulse probably has what you are describing in an easy to use API.

yEinFallsLos · October 9, 2020, 7:20pm

You could maybe derive a rough estimate of the current network size. Paid egress traffic is roughly the same across the nodes, so it could be accounted for. The problem is that you don’t know how many nodes are behind one address, and therefore some addresses could get more egress than others…