Prometheus Storj-Exporter

Well, you got it working, that’s all that matters :smiley:

I published the updated Storj-Exporter-Boom-Table.json here: GitHub - anclrii/Storj-Exporter-dashboard: Storj-exporter Grafana dashboard.

This is the one I’m currently using, and here are some recent changes I made:

  • Updated to Grafana 7+
  • Cleaner summary views for multiple nodes
  • Updated to use new metrics recently added to the exporter; payout numbers now come from the API instead of being calculated
  • Node summary table now includes Held amount, Version, Uptime, Online, UpToDate, Suspended, Disqualified status etc
  • Node details moved to a satellite summary table and more fields exposed by satellite
  • Templated datasource as a variable so it should pick up datasource automatically upon import if you have just one.
  • Unified colours a bit and removed some excessive overlapping graphs for rate/sum as they worked out to be confusing.
  • Added a shameless panel with my wallet qr code :stuck_out_tongue:


As always any feedback is very welcome.


Very nice work! :+1:

So far I’ve found two small bugs :wave:
At Disk free: You forgot the trash space.
And at DQ and SU in Null Handling (Null Value): There is a “2” instead of “No data”.


Thanks, so if I understand this correctly you would prefer Disk free to show Disk free + Disk trash?
Currently Disk total = Disk used + Disk free, where Disk trash is part of Disk used and not part of Disk free. This is how the API reports it, and I think it makes sense because trash cannot be used as free space until it’s cleared. What do you think?

I fixed the Null Value now; it will be in the next update :wink: (Comparing master...dev · anclrii/Storj-Exporter-dashboard · GitHub)

I mean the trash is not available as free Disk for some time…
Shouldn’t it then be for "Disk free" = "Disk available - Disk used - Disk trash" ?

Right, looking at sources here and here

// SpaceUsedForPiecesAndTrash returns the total space used by both active
// pieces and the trash directory.

Disk used (usedSpace) already includes the space used by trash, so "Disk free" = "Disk available - Disk used - Disk trash" would subtract trash twice. Can you share an example where the numbers don’t match? It could be an error in the API, but it looks fine on my node.
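To make the double subtraction concrete, here is a small sketch with made-up numbers. It assumes, as the source comment above suggests, that the reported used space already contains the trash; the function name and all values are hypothetical and are not taken from the exporter or the API.

```python
def disk_free(total, used_including_trash):
    """Free space when the reported 'used' value already contains trash."""
    return total - used_including_trash


# Hypothetical values in GB, for illustration only.
total = 1000
pieces = 600
trash = 50

# Assumption: the API reports used = active pieces + trash.
used = pieces + trash

free = disk_free(total, used)
# Subtracting trash again counts it twice and under-reports free space.
free_double_counted = total - used - trash

print(free, free_double_counted)
```

With these numbers the first formula gives 350 GB free, while the second gives 300 GB, which is too low by exactly the trash size.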


@greener
You are totally right, I didn’t notice that “Disk used” contains the trash.
Sorry, this was my mistake…

I have a problem getting storj-exporter to start on one of my nodes after updating the storj-exporter docker container. The difference is that this node has been disqualified on one satellite. I got this in the container log:

Traceback (most recent call last):
  File "./storj-exporter.py", line 191, in <module>
    REGISTRY.register(StorjCollector())
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/registry.py", line 26, in register
    names = self._get_names(collector)
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/registry.py", line 66, in _get_names
    for metric in desc_func():
  File "./storj-exporter.py", line 187, in collect
    yield from self.add_sat_metrics()
  File "./storj-exporter.py", line 119, in add_sat_metrics
    yield from self.add_extended_sat_metrics(sat)
  File "./storj-exporter.py", line 123, in add_extended_sat_metrics
    self.get_sat_data(sat)
  File "./storj-exporter.py", line 33, in get_sat_data
    sat['sat_data'].update(self.sum_sat_daily_keys(sat['sat_data'], 'bandwidthDaily', ['repair','audit','usage'], 'egress'))
  File "./storj-exporter.py", line 39, in sum_sat_daily_keys
    for day in daily_data_dict[daily_data_key]:
TypeError: 'NoneType' object is not iterable

Thanks, I’ll fix this, tracking in https://github.com/anclrii/Storj-Exporter/issues/41

Can you share the output of curl 'http://localhost:14007/api/sno/satellite/<sat id here>' | jq for the disqualified satellite?


I also got a storj-exporter error on one node that has performed graceful exit on 3 satellites.
The other nodes have no errors when running the new storj-exporter:

F:\prometheus> C:\Users\musah\AppData\Local\Programs\Python\Python38-32\python.exe '.\storj-exporter.py'
Traceback (most recent call last):
  File ".\storj-exporter.py", line 191, in <module>
    REGISTRY.register(StorjCollector())
  File "C:\Users\musah\AppData\Local\Programs\Python\Python38-32\lib\site-packages\prometheus_client\registry.py", line 26, in register
    names = self._get_names(collector)
  File "C:\Users\musah\AppData\Local\Programs\Python\Python38-32\lib\site-packages\prometheus_client\registry.py", line 66, in _get_names
    for metric in desc_func():
  File ".\storj-exporter.py", line 187, in collect
    yield from self.add_sat_metrics()
  File ".\storj-exporter.py", line 119, in add_sat_metrics
    yield from self.add_extended_sat_metrics(sat)
  File ".\storj-exporter.py", line 123, in add_extended_sat_metrics
    self.get_sat_data(sat)
  File ".\storj-exporter.py", line 33, in get_sat_data
    sat['sat_data'].update(self.sum_sat_daily_keys(sat['sat_data'], 'bandwidthDaily', ['repair','audit','usage'], 'egress'))
  File ".\storj-exporter.py", line 39, in sum_sat_daily_keys
    for day in daily_data_dict[daily_data_key]:
TypeError: 'NoneType' object is not iterable

Output for one of the gracefully exited satellites:
http://127.0.0.1:14002/api/sno/satellite/121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6
{"id":"121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6","storageDaily":null,"bandwidthDaily":null,"storageSummary":0,"bandwidthSummary":0,"egressSummary":0,"ingressSummary":0,"currentStorageUsed":0,"audit":{"totalCount":9532,"successCount":9532,"alpha":19.99999999999995,"beta":0,"unknownAlpha":1,"unknownBeta":0,"score":1,"unknownScore":1},"uptime":{"totalCount":19495,"successCount":19423,"alpha":0,"beta":0,"unknownAlpha":0,"unknownBeta":0,"score":0,"unknownScore":0},"onlineScore":1,"priceModel":{"EgressBandwidth":2000,"RepairBandwidth":1000,"AuditBandwidth":1000,"DiskSpace":150},"nodeJoinedAt":"2019-10-05T02:20:08.037764Z"}
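The response above has "bandwidthDaily": null, which is what the loop in the traceback trips over. Below is a minimal sketch of a null-safe version of such a summing helper. It is not the exporter's actual code: the function name, the return shape, and the assumed layout of each daily entry (a nested dict such as "egress" holding "repair", "audit" and "usage" counters) are illustrative assumptions.

```python
def sum_daily_keys(sat_data, daily_key, keys, section):
    """Sum selected per-day counters, treating a null daily list as empty.

    Hypothetical helper: names and return shape are assumptions.
    """
    # JSON null arrives as None in Python; fall back to an empty list
    # so the loop simply does nothing instead of raising TypeError.
    days = sat_data.get(daily_key) or []
    totals = {key: 0 for key in keys}
    for day in days:
        values = day.get(section) or {}
        for key in keys:
            totals[key] += values.get(key) or 0
    return totals


# A satellite after graceful exit, as in the response above.
exited = {"bandwidthDaily": None}

# Assumed shape of a normal response, with made-up numbers.
active = {
    "bandwidthDaily": [
        {"egress": {"repair": 10, "audit": 2, "usage": 30}},
        {"egress": {"repair": 5, "audit": 1, "usage": 20}},
    ]
}

keys = ["repair", "audit", "usage"]
exited_totals = sum_daily_keys(exited, "bandwidthDaily", keys, "egress")
active_totals = sum_daily_keys(active, "bandwidthDaily", keys, "egress")
print(exited_totals, active_totals)
```

The `or []` fallback covers both a missing key and an explicit null, so exited or disqualified satellites report zero totals rather than crashing the collector.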

Xxxx@Server:~$ curl http://localhost:14002/api/sno/satellite/1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2358    0  2358    0     0   143k      0 --:--:-- --:--:-- --:--:--  143k
{
  "id": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE",
  "storageDaily": [
    { "atRestTotal": 23818283351137.484, "intervalStart": "2020-11-01T00:00:00Z" },
    { "atRestTotal": 23228610790976.977, "intervalStart": "2020-11-02T00:00:00Z" },
    { "atRestTotal": 22830749835839.58, "intervalStart": "2020-11-03T00:00:00Z" },
    { "atRestTotal": 21310279948438.28, "intervalStart": "2020-11-04T00:00:00Z" },
    { "atRestTotal": 23432204660848.97, "intervalStart": "2020-11-05T00:00:00Z" },
    { "atRestTotal": 23133306287232.496, "intervalStart": "2020-11-06T00:00:00Z" },
    { "atRestTotal": 22716632642664.688, "intervalStart": "2020-11-07T00:00:00Z" },
    { "atRestTotal": 22714351856842.47, "intervalStart": "2020-11-08T00:00:00Z" },
    { "atRestTotal": 20677663283168.117, "intervalStart": "2020-11-09T00:00:00Z" },
    { "atRestTotal": 23313113034491.742, "intervalStart": "2020-11-10T00:00:00Z" },
    { "atRestTotal": 21111166099020.15, "intervalStart": "2020-11-11T00:00:00Z" },
    { "atRestTotal": 23134595568077.32, "intervalStart": "2020-11-12T00:00:00Z" },
    { "atRestTotal": 22163097618727.99, "intervalStart": "2020-11-13T00:00:00Z" },
    { "atRestTotal": 22093652658135.992, "intervalStart": "2020-11-14T00:00:00Z" },
    { "atRestTotal": 22030571399887.812, "intervalStart": "2020-11-15T00:00:00Z" },
    { "atRestTotal": 21172572356985.492, "intervalStart": "2020-11-16T00:00:00Z" },
    { "atRestTotal": 23106003702273.14, "intervalStart": "2020-11-17T00:00:00Z" },
    { "atRestTotal": 21187979391157.453, "intervalStart": "2020-11-18T00:00:00Z" },
    { "atRestTotal": 22530189184044.7, "intervalStart": "2020-11-19T00:00:00Z" },
    { "atRestTotal": 20405503146216.14, "intervalStart": "2020-11-20T00:00:00Z" },
    { "atRestTotal": 22044231957044.04, "intervalStart": "2020-11-21T00:00:00Z" },
    { "atRestTotal": 21594040167995.887, "intervalStart": "2020-11-22T00:00:00Z" },
    { "atRestTotal": 1866421189991.8809, "intervalStart": "2020-11-23T00:00:00Z" }
  ],
  "bandwidthDaily": null,
  "storageSummary": 491615220131198.75,
  "bandwidthSummary": 0,
  "egressSummary": 0,
  "ingressSummary": 0,
  "currentStorageUsed": 0,
  "audit": {
    "totalCount": 16902,
    "successCount": 16868,
    "alpha": 11.966608,
    "beta": 8.033391,
    "unknownAlpha": 20,
    "unknownBeta": 0,
    "score": 0.5983304299165214,
    "unknownScore": 1
  },
  "uptime": {
    "totalCount": 24532,
    "successCount": 24531,
    "alpha": 0,
    "beta": 0,
    "unknownAlpha": 0,
    "unknownBeta": 0,
    "score": 0,
    "unknownScore": 0
  },
  "onlineScore": 1,
  "priceModel": {
    "EgressBandwidth": 2000,
    "RepairBandwidth": 1000,
    "AuditBandwidth": 1000,
    "DiskSpace": 150
  },
  "nodeJoinedAt": "2020-02-11T17:30:52.381242Z"
}

This should work: https://github.com/anclrii/Storj-Exporter/commit/d16049b75b53f3e91d143af090b83b808ac33199
Can you test with the anclrii/storj-exporter:dev docker image?

This works on my node with 3 gracefully exited satellites.

Works for me too. Good and fast work!


Exporter v1.0.2 is now published and available in the anclrii/storj-exporter:latest image.

  • fix handling of null daily lists returned by the storj api
  • implement HTTPRequestHandler with /status endpoint

This should prevent the issues seen on the first day of the month and also on disqualified/GE nodes.
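For anyone curious what a /status endpoint of this kind can look like, here is a minimal sketch using only the Python standard library. It is an illustration, not the exporter's implementation: the handler class, the response body, and the check_status helper are all hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class StatusHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers GET /status, 404 for anything else."""

    def do_GET(self):
        if self.path == "/status":
            code, body = 200, b"alive\n"
        else:
            code, body = 404, b"not found\n"
        self.send_response(code)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Keep the example quiet; the default logs every request to stderr.
        pass


def check_status(path="/status"):
    """Start the server on a free local port, request `path`, then stop."""
    server = HTTPServer(("127.0.0.1", 0), StatusHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        port = server.server_address[1]
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(check_status())
```

An endpoint like this gives a cheap liveness check (for example for a container health check) that does not depend on the node API being reachable.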


Thanks for your amazing work.

I can’t tell you how much I learnt while setting up the full environment, but it was worth the effort :slight_smile:

Can anyone explain to me the difference between the Online score and the Audit score?

Yeah, I know. Low values. It took a few days until I tracked down what was wrong with my 2nd HDD, because it randomly sent error messages to dmesg. (Spoiler alert: low power from the RPi 4.)

Also, I wasn’t able to find out by myself what this means in the logs:

Hey.
Thanks for the dashboard.
I found an inconsistency in the Uptime column for satellites.
Screenshots attached.
2020-11-28_104644

2020-11-28_165600

I reported that issue on GitHub 6 days ago: https://github.com/anclrii/Storj-Exporter-dashboard/issues/11
(You can fix it yourself until he releases a new version.)


Thanks, corrected.