Prometheus Storj-Exporter

greener · July 8, 2020, 10:17am

@djbill, it looks like you’re missing a Prometheus server in your setup:

bovcan · July 8, 2020, 1:27pm

I tried this exporter, but it is not working correctly for me. I am using Docker and installed it this way:
docker run -d --link=storagenode --name=storj-exporter -p 9651:9651 anclrii/storj-exporter:latest
But if I go to 127.0.0.1:9651 I get:

Error response

Error code: 500

Message: error generating metric output.

Error code explanation: 500 - Server got itself in trouble.

greener · July 8, 2020, 1:36pm

If you have just started a new storagenode, it will need to run for some time before the API starts to return relevant values, maybe a few hours.

Anything in docker logs --tail 100 storj-exporter?

bovcan · July 8, 2020, 1:44pm

yup, a lot of stuff
Exception happened during processing of request from ('172.17.0.1', 51322)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/socketserver.py", line 650, in process_request_thread
    self.finish_request(request, client_address)
  File "/usr/local/lib/python3.7/socketserver.py", line 360, in finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python3.7/socketserver.py", line 720, in __init__
    self.handle()
  File "/usr/local/lib/python3.7/http/server.py", line 426, in handle
    self.handle_one_request()
  File "/usr/local/lib/python3.7/http/server.py", line 414, in handle_one_request
    method()
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/exposition.py", line 152, in do_GET
    output = encoder(registry)
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/exposition.py", line 121, in generate_latest
    output.append(sample_line(s))
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/exposition.py", line 87, in sample_line
    line.name, labelstr, floatToGoString(line.value), timestamp)
  File "/usr/local/lib/python3.7/site-packages/prometheus_client/utils.py", line 8, in floatToGoString
    d = float(d)
ValueError: ("could not convert string to float: '2020-02-11T17:33:41.320726Z'", Metric(storj_sat_summary, Storj satellite summary metrics, gauge, , [Sample(name='storj_sat_summary', labels={'type': 'storageSummary', 'satellite': '118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW', 'url': 'satellite.stefan-benten.de:7777'}, value=5807734037940.923
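
The last frames of the traceback show float() being called on an ISO timestamp string, so at least one value handed to the storj_sat_summary gauge was not numeric. A minimal sketch of one way to guard against that, assuming a payload dictionary of the rough shape below; numeric_fields and the joinedAt key are hypothetical names for illustration and are not taken from storj-exporter:

```python
def numeric_fields(payload):
    """Return only the entries of payload whose values convert cleanly to float."""
    result = {}
    for key, value in payload.items():
        if isinstance(value, bool):
            continue  # skip booleans so True does not silently become 1.0
        try:
            result[key] = float(value)
        except (TypeError, ValueError):
            # Non-numeric values (ISO timestamps, None, ...) are dropped.
            continue
    return result


# Hypothetical sample; only the two values themselves appear in the traceback.
sample = {
    "storageSummary": 5807734037940.923,
    "joinedAt": "2020-02-11T17:33:41.320726Z",  # hypothetical field name
}
print(numeric_fields(sample))
```

Dropping such fields before they reach the gauge keeps /metrics from returning a 500, at the cost of not exporting them at all.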

djbill · July 10, 2020, 11:51am

Thanks for your help. This is my first Grafana setup. I set up a Prometheus server and reconfigured the Grafana data source. Now I have all the metrics. Thanks again.

node1 · July 14, 2020, 10:45am

Hello,

I need some help setting this up.

I’ve got the exporter running at:
http://localhost:9651
and I’m getting:

# HELP python_gc_objects_collected_total Objects collected during gc
# TYPE python_gc_objects_collected_total counter
python_gc_objects_collected_total{generation="0"} 205.0
python_gc_objects_collected_total{generation="1"} 149.0
python_gc_objects_collected_total{generation="2"} 0.0
# HELP python_gc_objects_uncollectable_total Uncollectable object found during GC
# TYPE python_gc_objects_uncollectable_total counter
python_gc_objects_uncollectable_total{generation="0"} 0.0
python_gc_objects_uncollectable_total{generation="1"} 0.0
python_gc_objects_uncollectable_total{generation="2"} 0.0
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 50.0
python_gc_collections_total{generation="1"} 4.0
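
The excerpt above contains only Python runtime metrics, so it does not show whether any Storj metrics (names starting with storj_, such as storj_sat_summary from earlier in the thread) are being exported. A rough sketch of checking a saved copy of the page for them; the helper names are made up for this example:

```python
def metric_names(exposition_text):
    """Collect the distinct metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def has_storj_metrics(exposition_text, prefix="storj_"):
    """True if at least one exported metric name starts with the prefix."""
    return any(name.startswith(prefix) for name in metric_names(exposition_text))


sample = """\
# HELP python_gc_collections_total Number of times this generation was collected
# TYPE python_gc_collections_total counter
python_gc_collections_total{generation="0"} 50.0
python_gc_collections_total{generation="1"} 4.0
"""
print(metric_names(sample))
print(has_storj_metrics(sample))
```

If has_storj_metrics returns False for the full page, the exporter is up but is not getting data from the node yet.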

I’ve also installed Prometheus:

http://localhost:9090/graph

As I understand it, the next step is Grafana.

But how and where do I install it? Is it done through the CLI or a GUI?
How did you all get to the point where the nice dashboard is visible?

sudo apt-get install grafana
Reading package lists… Done
Building dependency tree
Reading state information… Done

No apt package “grafana”, but there is a snap with that name.
Try “snap install grafana”

E: Unable to locate package grafana

T.Y.

greener · July 14, 2020, 12:18pm

Depends on your platform:

There is also a Docker option.

node1 · July 14, 2020, 12:25pm

Thank you

What about multiple nodes?

Should I install only storj-exporter on each node, or Prometheus as well?

I tried adding another data source (I believe each node is a separate data source?), but somehow it did not work.

goldyka · July 25, 2020, 10:35pm

Hey,

I’ve been looking to set up monitoring for my multi-node setup: some nodes are on different IPs (locations), and some are on the same server, a DL380 Gen8.

I have been searching the forum and GitHub to put it together, but there is so much information and so many ways to do it that I don’t know where to start. I’m running a hypervisor on my server, so I’d like to keep it as separate as I can in case I screw something up and need to start again.

Could somebody write a step-by-step guide on how to do it? What should I install on a new VM, and what in the Docker that runs each node? Maybe a YouTube tutorial?

Thank you

goldyka · July 26, 2020, 9:15pm

After a day of tinkering, I got it running: I set up Prometheus and Grafana and imported the dashboard, but the dashboard shows random values that change with every refresh.

The Storj node is running on a VM with IP x.x.x.122 and has storj-exporter installed in Docker.

Prometheus and Grafana are on another VM with IP x.x.x.140 on the same subnet.

revyte · July 27, 2020, 7:16am

Do you use Prometheus as the data source, and are all your nodes listed as targets in Prometheus?
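
To illustrate what "all nodes listed as targets" could look like, here is a small sketch that renders a scrape_configs section with one job per exporter. The job names and hosts are placeholders, and the function is only an illustration, not part of Prometheus or storj-exporter; port 9651 is the exporter port used elsewhere in this thread.

```python
def render_scrape_configs(nodes, port=9651):
    """Render a scrape_configs section with one job per node.

    nodes maps a job name to the host running storj-exporter.
    """
    lines = ["scrape_configs:"]
    for job_name, host in nodes.items():
        lines.append(f"  - job_name: {job_name}")
        lines.append("    static_configs:")
        lines.append(f"      - targets: ['{host}:{port}']")
    return "\n".join(lines) + "\n"


# Placeholder job names and hosts; substitute your own.
nodes = {"storj-node-a": "node-a.example", "storj-node-b": "node-b.example"}
print(render_scrape_configs(nodes), end="")
```

The output can be pasted into prometheus.yml in place of an existing scrape_configs section; Grafana then needs only the single Prometheus data source.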

goldyka · July 27, 2020, 4:18pm

I don’t know what happened, but now it’s working… IT magic monkeys?

So I’ve got some graphs that look pretty correct, but I don’t have disk usage / free space or uptime… any ideas?

goldyka · July 27, 2020, 4:34pm

I just figured out that 2 of my nodes are shown as down in Prometheus. In the GUI it tells me it’s out of bounds.

systemctl status prometheus gives me:

Jul 27 19:33:02 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:02.690Z caller=scrape.go:1170 component="scrape manager" scrape_pool=storj-bay3 target=http://192.168.1.121:9651/metrics msg="Error on ingesting samples that are>
Jul 27 19:33:02 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:02.691Z caller=scrape.go:930 component="scrape manager" scrape_pool=storj-bay3 target=http://192.168.1.121:9651/metrics msg="append failed" err="out of bounds"
Jul 27 19:33:02 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:02.691Z caller=scrape.go:934 component="scrape manager" scrape_pool=storj-bay3 target=http://192.168.1.121:9651/metrics msg="append failed" err="out of bounds"
Jul 27 19:33:02 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:02.691Z caller=scrape.go:945 component="scrape manager" scrape_pool=storj-bay3 target=http://192.168.1.121:9651/metrics msg="appending scrape report failed" err=>
Jul 27 19:33:11 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:11.554Z caller=scrape.go:1170 component="scrape manager" scrape_pool=storj-pi4 target=http://192.168.1.120:9651/metrics msg="Error on ingesting samples that are >
Jul 27 19:33:11 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:11.554Z caller=scrape.go:945 component="scrape manager" scrape_pool=storj-pi4 target=http://192.168.1.120:9651/metrics msg="appending scrape report failed" err=">
Jul 27 19:33:12 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:12.178Z caller=scrape.go:1170 component="scrape manager" scrape_pool=storj-bay6 target=http://192.168.1.122:9651/metrics msg="Error on ingesting samples that are>
Jul 27 19:33:12 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:12.179Z caller=scrape.go:930 component="scrape manager" scrape_pool=storj-bay6 target=http://192.168.1.122:9651/metrics msg="append failed" err="out of bounds"
Jul 27 19:33:12 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:12.179Z caller=scrape.go:934 component="scrape manager" scrape_pool=storj-bay6 target=http://192.168.1.122:9651/metrics msg="append failed" err="out of bounds"
Jul 27 19:33:12 prometheus prometheus[910]: level=warn ts=2020-07-27T16:33:12.179Z caller=scrape.go:945 component="scrape manager" scrape_pool=storj-bay6 target=http://192.168.1.122:9651/metrics msg="appending scrape report failed" err=>
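
With several scrape pools logging at once, it can help to tally which ones actually report "out of bounds". A rough sketch, assuming the log lines use plain double quotes as Prometheus writes them; the function name is made up, and the sample lines are trimmed-down versions of the ones above:

```python
import re
from collections import Counter

POOL_RE = re.compile(r"scrape_pool=(\S+)")


def out_of_bounds_by_pool(log_lines):
    """Count log lines per scrape_pool that report err="out of bounds"."""
    counts = Counter()
    for line in log_lines:
        if 'err="out of bounds"' not in line:
            continue
        match = POOL_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Trimmed sample lines in the same key=value layout as the log above.
sample = [
    'level=warn caller=scrape.go:930 scrape_pool=storj-bay3 msg="append failed" err="out of bounds"',
    'level=warn caller=scrape.go:934 scrape_pool=storj-bay3 msg="append failed" err="out of bounds"',
    'level=warn caller=scrape.go:945 scrape_pool=storj-pi4 msg="appending scrape report failed" err=',
    'level=warn caller=scrape.go:930 scrape_pool=storj-bay6 msg="append failed" err="out of bounds"',
]
print(out_of_bounds_by_pool(sample))
```

Piping the output of journalctl -u prometheus --no-pager into a script like this avoids the pager truncation (the trailing > on some lines) seen with systemctl status.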

goldyka · July 27, 2020, 7:57pm

Any ideas? I’ve pretty much tried installing ntpdate on every node, syncing the time… I uninstalled and reinstalled Prometheus; it works for 5 minutes, then it gives me the same error again.

Prometheus and Grafana are running on Ubuntu, on an ESXi host.

fmoledina · July 28, 2020, 5:17pm

The Disk Usage formula needs to be updated for the changed API endpoint. You can fix this manually in your existing installation by editing that panel and changing the expression from:

storj_diskSpace_used{instance=~"$node.*"}

to:

storj_total_diskspace{type="used",instance=~"$node.*"}
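
To check which of the two metric names your exporter version exposes before editing the panel, you can run each expression against the Prometheus HTTP API. A small sketch that only builds the instant-query URLs; it assumes Prometheus is reachable at http://localhost:9090 as in the earlier post, and it leaves out the instance matcher because $node is a Grafana variable:

```python
from urllib.parse import urlencode


def instant_query_url(base_url, expr):
    """Build a Prometheus HTTP API instant-query URL for a PromQL expression."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expr})}"


old_expr = "storj_diskSpace_used"
new_expr = 'storj_total_diskspace{type="used"}'
for expr in (old_expr, new_expr):
    print(instant_query_url("http://localhost:9090", expr))
```

Opening each URL in a browser returns JSON; an empty result list for one expression suggests that metric name is not present in your setup.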

Alternatively, check out my fork with updated dashboards along with an alternate Boom Table version.

Hope this helps.

goldyka · July 28, 2020, 7:56pm

Thank you, now the disk space is correct.

goldyka · July 28, 2020, 8:13pm

But I still have the “out of bounds” error: sometimes it works, then it just stops… and I get the above error in the Prometheus status.

dragonhogan · August 22, 2020, 12:39pm

This might be a long-shot request, but I’ve looked at this thread many, many times, and I still find myself at a loss on how to implement something like this on my RPi nodes. I’m quite illiterate when it comes to these things… is there any way someone could post “step by step” instructions on how to set this up fully?

I’ve done the “start docker container” for the storj-exporter, but then I seem to get lost on what else I actually need to do to end up with the pretty graphs. What else do I need to install? What else do I need to configure, and how is that done? Then after those steps are completed, how do you actually see the graphs?

node1 · August 23, 2020, 9:08pm

Hello. Could you please tell me how to add a second node to the Prometheus server? As I understand it, the Prometheus server scrapes the data itself. On my second node (where the exporter is running), I can see the data on web port :9651.

Then I added a second job to prometheus.yml, but in Grafana I still see only one node: the one that runs on the same machine as the Prometheus server. Where is my mistake?

  - job_name: node1
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9651']
        labels:
          instance: 'node'

  - job_name: node2
    static_configs:
      - targets: ['192.168.1.13:9651']

Thank you.

Krystof · August 24, 2020, 6:24am

Hello,
Is port 9651 open on 192.168.1.13?
Is the status of node2 in Prometheus OK?
Did you restart Prometheus after changing the .yaml file?
Krystof
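
The target status can also be read from the Prometheus HTTP API at /api/v1/targets instead of the web UI. A minimal sketch that lists targets not reporting "up" from an already-fetched response; the sample response is hand-written and heavily trimmed for illustration, and its error text is made up:

```python
import json


def unhealthy_targets(targets_response):
    """Return (job, scrapeUrl, lastError) for every active target that is not 'up'."""
    active = targets_response.get("data", {}).get("activeTargets", [])
    return [
        (t.get("labels", {}).get("job"), t.get("scrapeUrl"), t.get("lastError"))
        for t in active
        if t.get("health") != "up"
    ]


# Hand-written, trimmed sample response; the lastError text is hypothetical.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "activeTargets": [
      {"labels": {"job": "node1"}, "scrapeUrl": "http://localhost:9651/metrics",
       "health": "up", "lastError": ""},
      {"labels": {"job": "node2"}, "scrapeUrl": "http://192.168.1.13:9651/metrics",
       "health": "down", "lastError": "connection refused"}
    ]
  }
}
""")
for job, url, error in unhealthy_targets(sample):
    print(job, url, error)
```

If node2 does not appear in the response at all, Prometheus has most likely not loaded the edited prometheus.yml yet.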