Alternative Node Software

I think you got my statement wrong. I was talking about your version. Your statement earlier was that you want to focus on performance. Go for it. In return, I will not complain about low-priority bugs, but I am sure others like yourself will complain about them. It will be interesting to see how that dynamic plays out. I think the correct term for this is eating your own dog food.


Every day the dashboard stays broken, I feel good that development time is being spent on something important and not this nonsense.

The dashboard provides a good enough ballpark. Nobody should be wasting time making it realtime and byte-perfect. Removing it altogether, while the right thing to do long term, is not worth the short-term effort.

I don’t look at the dashboard. Ever. I don’t understand why anyone would. You can’t undermine anything with something nobody cares about, so the point is moot.


I understand it’s sarcasm, but I don’t have Grafana either. Not even NetData. As a node operator I don’t monitor the node at all. I monitor the server, regardless of storj. I have checks set up to ensure the node responds and returns AllHealthy, and I occasionally remove bottlenecks, when and if any appear enough to be noticeable.
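For what it’s worth, the kind of check I mean is trivial. Here is a minimal sketch that polls the node’s local dashboard API; the URL, port, and the JSON field names (`satellites`, `disqualified`, `suspended`) are assumptions here, so adjust them to whatever your node version actually returns.

```python
import json
import urllib.request

# Hypothetical endpoint: the node's local dashboard API on its default port.
NODE_API = "http://localhost:14002/api/sno"


def is_healthy(status: dict) -> bool:
    """Decide whether the node status payload looks healthy.

    Field names are assumptions; a satellite entry flagged as
    disqualified or suspended is treated as unhealthy.
    """
    for sat in status.get("satellites", []):
        if sat.get("disqualified") or sat.get("suspended"):
            return False
    return True


def check_node(url: str = NODE_API) -> bool:
    """Fetch the status payload and run the health check."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return is_healthy(json.load(resp))
```

A monitoring system just needs to alert when `check_node()` returns False or the request itself fails; everything beyond that is optional.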

For example, when IOPS became too high, I disabled sync writes and atime updates.
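Concretely, assuming a ZFS dataset (typical on FreeBSD; the dataset name `tank/storj` is hypothetical), that looks something like this:

```shell
# sync=disabled acknowledges writes before they reach stable storage.
# This is a deliberate durability trade-off; understand the risk first.
zfs set sync=disabled tank/storj

# atime=off stops ZFS from issuing a metadata write on every file read.
zfs set atime=off tank/storj
```

On other filesystems the equivalents differ (e.g. the `noatime` mount option for ext4/XFS), but the idea is the same: stop paying IOPS for bookkeeping you never read.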

As a SNO, I honestly could not care less what the node is doing. It has access to a quota-limited dataset and bandwidth limits per the requirements in the ToS, and it pays me some money. That’s the extent of accommodation I’m willing to offer as a SNO.

I believe it’s in storj’s interest to use the resources they are paying for efficiently, and they will prioritize work accordingly for the benefit of their company. If that requires them to develop additional telemetry, great. Now, tomorrow, or never: it does not affect me. Maybe there will even be some ML involved eventually to manage the load dynamically, based on node resources, etc.

I don’t see the point in obsessing over data, logs, or graphs, for $60/month.

(Side note: before anyone brings this up, yes, I wrote extra scripts to run it on FreeBSD, a replacement updater, and VPN iptables rules, but that was fun for me. An excuse to learn something new.)


Maybe that’s the difference: for some people, Storj is the only reason to monitor anything.

Also, this is not only about the dashboard. The dashboard is just a way to display the underlying data, so it is also about the data provided by the API. If that provided correct data, you could use tools other than the SNO dashboard for visualization.

I’d argue this falls under the “use what you have” approach. If you don’t already have an always-on server with monitoring, you probably don’t have time to set all that up just for storj.

I already had a VPS with Kuma running to monitor other services. Adding storage node monitoring was literally two clicks and a few keystrokes.

The question is – why would you want to visualize anything, other than out of sheer curiosity?

Either way, any accounting on the node is irrelevant, be that via UI or API. The source of truth is the satellite, so there is no reason to duplicate effort and keep another set of statistics on the client.

An objection I hear is: how else would a SNO detect issues like trash cleanup not happening?
The answer is that it’s not the SNO’s job to be unpaid QA. If STORJ wants telemetry, they should develop telemetry. If they don’t want to debug in production, they should advance their QA efforts.
Running complicated accounting on each node is the wrong approach here. Nodes must be extremely lightweight.

The only important metric that needs tracking is remaining available space, and guess what: it’s already being communicated to the satellite with every request. Everything else is unnecessary and hence not worth the time to develop, and more importantly, to run on thousands of nodes.

STORJ claims to be green technology, but imagine how much power is wasted when every file fetch triggers a bunch of database reads and writes: wasted IOPS that should be conserved, and lower top achievable performance.

The API does provide satellite data.
And I disagree here. As terabytes of data were not being deleted, it is not sufficient to know what the satellite believes should be on the node. You should also check what the node is in fact storing locally. So I think you need both.

I even kind of agree. But again, history has shown that the node software fails for many different reasons.
For me, the most important metrics are used space, trash, and the payout estimate. The satellite average is currently only a band-aid, as the local space calculations cannot be trusted.
To verify payout, some additional satellite data can be handy, like online times.
And of course there are the online and audit metrics that I look at.

Who marked that one post as a solution? I don’t get it. Not a single line of code has been written; why give up that early?


No, I wouldn’t. I don’t like non-constructive or offensive behaviour, that’s true, but I’m glad to help resolve issues or find a workaround, while providing feedback to the team.