2 nodes on 1 core / 1 thread CPU

As explained, it will never recover this data. When the number of missing pieces in a segment reaches the repair threshold, the repair job will be triggered. It will recover the missing pieces, and the pointer to your node will eventually be removed from the satellite's database.
However, it could take a lot of time, because the network has a low node churn rate (pieces are lost slowly). This is good for the network, but not good for your node: it will always balance on the edge.

In general, in your opinion, what should I do with this node? Stop it, or run it until it gets DQed?

It's always up to you. While it's not disqualified, it will be paid for its service.

I get the diplomatic response from @Alexey, and he's right, of course. But I wouldn't feel great about keeping a node online with missing data. I don't know how much is currently stored on the node, but I would probably just start over clean and make sure not to repeat the same mistakes.

But… if all you care about is getting paid, then you could take the gamble on this node, as it doesn't seem to drop below 90 on audits; it might just survive. It would save you from having to go through vetting again and collecting new data.

Others can’t really make this decision for you. But I can say that I would start over in this situation.

I seem to remember you saying the node is 1 month old… so maybe start another node and then in a month evaluate whether this one is getting better or whether you want to GE it… or whatever it's called when you pull a node out of the network (Graceful Exit).

It will take a long time to evaluate whether it's worth keeping, and it could in theory die a month from now, just down to pure chance… or at least that's how I understand it.

I would say if it survives a month, the odds are fairly okay that it will keep surviving. As always, there isn't a good answer… personally, I wouldn't count on it. I mean, a year from now there could still be the potential of the node dying due to random chance, if a satellite were to audit enough files that don't exist. So on the off chance that happened, I'd call it a cradle death and just plan to GE it whenever possible… Of course, with current ingress and the node being 1 month old, there isn't really much node to plan to GE.

So I might just kill it. If it is basically empty, then I could put the drive back to work on a node that doesn't have some off chance of just dying randomly at any moment, even if that chance is fairly remote…
It could run for years, so that would increase the odds.

Of course, I don't know the statistics, but that's how I see it.
There would be a chance, and it would run a long time, and thus I don't want to keep it running when I can fairly inexpensively replace it.

Maybe somebody can give you some actual numbers on what the odds would look like, because I cannot do that, and my argument might not be relevant; it's just a modus operandi I have for such things.

I've decided not to kill the node, and here are some results:

The wounded node is running fine, and it has nearly restored its score.

Also, I've updated the Storj exporter and Prometheus images; they now start via docker-compose. That allowed me to save up to 75% of processor time. Current CPU usage is below 25%, which is a perfect result for an old 1 core / 1 thread CPU used by 3 storage nodes.
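For anyone who wants to try something similar, here is a minimal docker-compose sketch of that kind of monitoring stack. It is not the original poster's file: the exporter image (anclrii/storj-exporter), the environment variable names, ports, and the storagenode container name are assumptions and may need adjusting to your setup.

```yaml
# Minimal sketch of a storj-exporter + Prometheus stack (assumed names/ports).
version: "3.7"

services:
  storj-exporter:
    image: anclrii/storj-exporter:latest   # community exporter image (assumption)
    restart: unless-stopped
    environment:
      - STORJ_HOST_ADDRESS=storagenode     # name of your storagenode container (assumption)
      - STORJ_API_PORT=14002               # storagenode dashboard/API port
    ports:
      - "9651:9651"                        # exporter metrics endpoint (assumed default)

  prometheus:
    image: prom/prometheus:latest
    restart: unless-stopped
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro   # scrape config (see below)
      - prometheus-data:/prometheus                          # persist metrics history
    ports:
      - "9090:9090"

volumes:
  prometheus-data:
```

Running everything from one compose file also means a single `docker-compose up -d` restarts the whole monitoring stack together.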

I hope my experience will help someone.


Where can I find those dashboards?

Those are a combination of Grafana and Prometheus… or maybe just Prometheus, but I think it uses Grafana in some way.
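In that kind of setup, Prometheus scrapes the exporter and Grafana then uses Prometheus as a data source; there are community-published Storj-Exporter dashboards for Grafana you can import. A minimal prometheus.yml sketch, assuming the exporter is reachable as storj-exporter:9651 (both the name and the port are placeholders from the compose sketch above):

```yaml
# prometheus.yml sketch: scrape the storj exporter once a minute (assumed target).
global:
  scrape_interval: 60s

scrape_configs:
  - job_name: "storj-exporter"
    static_configs:
      - targets:
          - "storj-exporter:9651"   # exporter container name and port (assumption)
```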
