Prometheus Storj-Exporter

I was wondering if there are plans to update the Boom Table panels to Grafana's native table panel.
It looks like Boom Table is no longer maintained and will stop working altogether once Grafana 11 is released.
I would gladly donate to get this updated, since I rely on it heavily to check my node status.

3 Likes

Agree with this. I have seen the same issue.

This dashboard depends on Angular, which is deprecated and will stop working in future releases of Grafana.

@lyoth and @kosti11
I’ve done some updates to the dashboard, and have uploaded them here:

Also, since there are additional changes to the actual Storj-Exporter and Anclrii has gone missing, I've published those changes here: GitHub - TheChrisTech/Storj-Exporter: Prometheus exporter for monitoring Storj storage nodes
and pushed them to Docker Hub (changes include API timeouts and binding the metrics URL to something other than localhost).
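
For anyone who wants a quick way to try the fork, here is a minimal docker-compose sketch. The image name and environment variable names are assumptions based on the original Storj-Exporter README, so check the fork's README on GitHub/Docker Hub for the exact names, including the new timeout and bind-address options:

services:
  storj-exporter-01:
    # Image name is an assumption; confirm the exact repository on Docker Hub.
    image: thechristech/storj-exporter:latest
    restart: unless-stopped
    environment:
      # Variable names follow the original Storj-Exporter README; verify against the fork.
      - STORJ_HOST_ADDRESS=storj01   # host/container running the storagenode
      - STORJ_API_PORT=14002         # storagenode dashboard API port
      - STORJ_EXPORTER_PORT=9651     # port the exporter serves metrics on
    ports:
      - "9651:9651"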

Let me know if you run into any problems.

5 Likes

@Chris21788
I got to try it out; the dashboard looks good.
A few questions about the Prometheus scraper.
Currently my setup is:

scrape_configs:
  - job_name: 'storj-node'
    static_configs:
      - targets: ['storj01:9651']
      - targets: ['storj02:9652']

This seems to jumble the two nodes together as a single node.
Am I supposed to do it like this?

scrape_configs:
  - job_name: 'storj01'
    static_configs:
      - targets: ['storj01:9651']
  - job_name: 'storj02'
    static_configs:
      - targets: ['storj02:9652']

For this dashboard, yes, you’ll want to use separate job names, as the dashboard queries I’m using break things down by “job”, which helps isolate the different nodes.

I have different timeout settings for different nodes, because some are running locally and others are remote, so there are different latencies, node/database sizes, etc.
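
As a rough sketch (hostnames, ports, and timings here are placeholders, not my actual values), per-job intervals and timeouts in prometheus.yml look something like this:

scrape_configs:
  - job_name: 'storj01'        # local node, fast API responses
    scrape_interval: 30s
    scrape_timeout: 20s
    static_configs:
      - targets: ['storj01:9651']
  - job_name: 'storj02'        # remote node with larger databases, so a longer timeout
    scrape_interval: 60s
    scrape_timeout: 45s
    static_configs:
      - targets: ['storj02:9652']

Note that scrape_timeout has to be less than or equal to scrape_interval, otherwise Prometheus will refuse the config.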

@Chris21788
I just copied your panel into the old dashboard that was broken down by instance, changed the queries from job to instance, and that seems to work.
Thank you for your work on this.

1 Like

Hello Chris,

Thank you for your work.
I imported the dashboard from the Grafana ID and everything works fine, but it combines all my nodes into one thing. I tried to separate them into different jobs, but then I have no data on my Grafana dashboard. Do you have any ideas how to do it?

@kosti11 , the top portion should be a summary of all of your nodes, with a breakdown further down the page. Each section repeats for each node. Check out Storj-Exporter-dashboard/Storj-Exporter-v0.1.png at a4f8b708f62109cbd101f8090262ac785ff61ad1 · TheChrisTech/Storj-Exporter-dashboard · GitHub for an example picture. Let me know if that clears it up for you.

Hi,

Looks like v1.104 broke some metrics, storj_sat_month_ingress and storj_sat_month_egress at least.
Any idea if it is an issue with the exporter or with the APIs?

Bandwidth is added when the order is submitted, so it’s not a bug in the exporter.
I think it would make sense to rewrite the exporter to monitor the network interface instead of using the storagenode API.

Monitoring the network interface is a bad idea; other programs might be running on the same OS and consuming bandwidth (updates, metrics, etc.).

Wouldn’t it be possible to monitor the Docker interface?

How are you running Prometheus?
It consumes too much RAM for me. I added like 20 GB of RAM to the system and it still runs out of memory…

Or am I doing something wrong? I have 7 nodes running that I would like to monitor.

That would assume everyone uses Docker, and even then not 100% of the NIC traffic is used by Storj.
I’ll try to use the debug metrics exposed by the node itself to get the info when I have the time.
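
For reference, a rough sketch of what scraping the node’s own debug endpoint could look like, assuming the debug address has been pinned in the storagenode config (for example debug.addr: ":5999") and that it serves Prometheus-compatible metrics under /metrics; both of those details should be verified against the storagenode documentation before relying on this:

scrape_configs:
  - job_name: 'storagenode-debug'
    metrics_path: /metrics          # assumed path on the debug endpoint
    static_configs:
      - targets: ['storj01:5999']   # placeholder host and pinned debug port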

You are definitely doing something wrong. Prometheus itself uses about 40-80 MB, and the exporter Python instances use roughly 10-20 MB per node, depending on history.

Awesome, please post your discovery. :slight_smile:

This is version 1.104. I’m looking forward to hearing about the latest version of the exporter… :yum:

The Storj exporter is working fine, and I use the log exporter too; that is also working fine. It’s just Prometheus that keeps overloading my system. My nodes consume only about 4-6 GB out of the 20 GB of RAM, but if I start Prometheus, the RAM usage explodes until the OOM killer is triggered.

Can you please provide me with the Docker flags you use for Prometheus (if you run it in Docker) and the Prometheus config itself?

I will try to look into it as soon as I am back at home

Thanks in advance

1 Like

Would love to, but can’t: my Grafana (using 4 different dashboards), plus the exporter(s) and Prometheus services, are all running on a Windows server. :confused: Maybe you have too much history/corruption in your Prometheus DB, and it just keeps loading it all into RAM until it’s exhausted. Nuke your data directory and start over, or check that the default 15-day retention limit is in effect.
Good luck.

scrape_configs:
  - job_name: storagenode.001
    scrape_interval: 30s
    scrape_timeout: 20s
    metrics_path: /
    static_configs:
      - targets: ["localhost:9601"]
        labels:
          instance: "STJ.001-1xxxLH-GaZoo"
  - job_name: storagenode.002
    scrape_interval: 30s
    scrape_timeout: 20s
    metrics_path: /
    static_configs:
      - targets: ["localhost:9602"]
        labels:
          instance: "STJ.002-1xxxG3-GaZoo"

I am using this, and it’s been running fine for years now:

docker run -d -p 9090:9090 --restart unless-stopped --name prometheus \
  -v /mnt/user/deploy/grafana/prometheus.yml:/etc/prometheus/prometheus.yml \
  -v prometheus:/prometheus \
  prom/prometheus \
  --storage.tsdb.retention.time=1080d \
  --storage.tsdb.retention.size=70GB \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus \
  --web.enable-admin-api

1 Like