Prometheus Storj-Exporter

Awesome update! Thanks so much!

So with the new update I noticed that I can query new metrics in Prometheus that aren’t from the exporter directly but are processed through netdata:
e.g. netdata_prometheus_storj_payout_currentMonth_currentMonth_average

How can we prevent this? Firstly, it results in storing the metrics twice in different namespaces, and secondly, netdata is polling the endpoint every 5 seconds, resulting in high CPU usage spikes on the exporter and the storagenode.

Nice work! Are you also going to update the dashboard template?

This probably has to do with netdata; there’s no integration with netdata in storj-exporter, and I’m not using it, so I can’t confirm what those netdata_* metrics are.

As to the endpoint polling, I recall we discussed this in a GitHub issue; it happens if you point netdata at the storj-exporter port for health checks. Currently the exporter returns metrics on any URL, and I will limit it to /metrics to prevent CPU drain from port health checks etc. That’s up next.
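For reference, once that change lands, a Prometheus scrape job pointed at the exporter would just need an explicit metrics_path. A minimal sketch; the target name, port and labels are only examples (they match the compose/prometheus snippets further down this thread):

  - job_name: storagenode
    scrape_interval: 30s
    metrics_path: /metrics    # instead of the catch-all / the exporter serves today
    static_configs:
      - targets: ["storj-exporter1:9651"]
        labels:
          instance: "node1"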

Yep, doing this slowly. So far I have added a boom-table for individual nodes with a breakdown of traffic/audit/uptime etc. per satellite. I’m also going to replace the payout formulas with the new payout metrics, which will be more precise, and add any new metrics if I find them useful.

3 Likes

Oh, you are not using netdata…
Yes, it has nothing to do with the exporter directly; it is something netdata does by default. I don’t have anything configured for that.
I just figured that most people are using netdata, so they will have a similar experience to mine.
I’m not sure it has anything to do with health checks, since netdata is actually pulling the exporter’s data.

This is an awesome release! Thank you!

I have a pretty new node running and some information seems to be missing/not showing up… Is that because of the node’s age and the data just isn’t there yet, or did I configure something wrong?

Also, which things exactly fire off how often? I am running the Python file directly for the exporter (how often does that scrape my logs?). I understand the Prometheus server pulls this information (how often does that fire?), and lastly Grafana reads from the Prometheus server (how often does that happen?). And where can I tweak the timings of each of those?

Thanks in advance!

I got rid of netdata polling the exporter endpoint by using docker-compose to connect the exporter to Prometheus without exposing any port to the host. This significantly reduced the CPU spikes.

version: '3.7'

services:
  storagenode:
    image: storjlabs/storagenode:latest
    container_name: storagenode1
    user: "1000:1000"
    restart: unless-stopped
    ports:
      - 7777:7770
      - 14002:14002
      - 28967:28967
    environment:
      - WALLET=
      - EMAIL=
      - ADDRESS=
      - STORAGE=9TB
    volumes:
      - type: bind
        source: /media/STORJ/STORJ
        target: /app/config
      - type: bind
        source: /media/STORJ/identity
        target: /app/identity
    networks:
      - default
    stop_grace_period: 300s
    deploy:
      resources:
        limits:
          memory: 4096M

  storj-exporter:
    image: anclrii/storj-exporter:latest
    container_name: storj-exporter1
    user: "1000:1000"
    restart: unless-stopped
    environment:
      - STORJ_HOST_ADDRESS=storagenode1
      - STORJ_API_PORT=14002
      - STORJ_EXPORTER_PORT=9651
    networks:
      - default

  prometheus:
    image: prom/prometheus
    container_name: prometheus
    user: "1000:1000"
    ports:
      - 9090:9090
    volumes:
      - /sharedfolders/config/prometheus.yml:/etc/prometheus/prometheus.yml 
      - type: bind
        source: /sharedfolders/prometheus
        target: /prometheus
    restart: unless-stopped
    command: --storage.tsdb.retention.time=720d --storage.tsdb.retention.size=30GB --config.file=/etc/prometheus/prometheus.yml --storage.tsdb.path=/prometheus
    networks:
      - default

networks:
  default:

and in prometheus.yml:

  - job_name: storagenode
    scrape_interval: 30s
    scrape_timeout: 20s
    metrics_path: /
    static_configs:
      - targets: ["storj-exporter1:9651"]
        labels:
          instance: "node1"
3 Likes

For everyone waiting for a dashboard with uptime score, this is mine (hope it works for everyone this way, might need to change the datasource):

Screenshots:



Key differences from the “old” dashboard: replaced the vetting stats per node with the online score, fixed uptime in the node overview, separated net out into egress and repair egress, and fixed the calculation of monthly payout (still using the old data, not the new payout metrics yet, so this dashboard includes the held amount).

2 Likes

Thanks for this. I’m just getting started with Storj-Exporter (unfortunately running a container for each of my 3 nodes; if there’s a better way, please let me know). One comment, and this could just be me, but for the “Storage Used” and Ingress/Egress graphs… are the Storage Sum and I/O labels backwards? Shouldn’t they be on the other sides of the charts?

2 Likes

One exporter container per node is the correct way.
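For example, extending the compose file above, a second node would simply get its own exporter service alongside it. A rough sketch (storagenode2 is a hypothetical second node container; adjust names and ports to your setup):

  storj-exporter2:
    image: anclrii/storj-exporter:latest
    container_name: storj-exporter2
    restart: unless-stopped
    environment:
      - STORJ_HOST_ADDRESS=storagenode2   # container name of the hypothetical second node
      - STORJ_API_PORT=14002
      - STORJ_EXPORTER_PORT=9651
    networks:
      - default

Each exporter then gets its own entry under static_configs in prometheus.yml, e.g. with an instance label of "node2".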

oh lol, you’re right. I never realized that.

1 Like

That could be carryover from the original dashboard by @greener. Some of those graphs have their y-axis titles backwards.

What or where do I have to check for this “undefined” uptime? tnx

Not sure why it’s undefined. Metrics for onlineScore:
avg(storj_sat_summary{type="onlineScore"}) by (instance) * 100

Have you updated Storj-Exporter for all of your node instances?

3 Likes

tnx, and here are some other words

Hi everyone,
does anybody know how to give alias names to the satellites?
Now it looks like this:
image

I’d like to give them the “front” names:
image

How is this possible?

Thanks.

Yeah, that would be nice, but personally I have no idea how to do that.

It’s easy; you can see the names in the web dashboard:
image

Then edit your favourite Grafana panel and add an override with the field name:
image
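If you’d rather do this in the dashboard JSON than in the panel editor, the equivalent override looks roughly like this in Grafana 7+ (a sketch of a panel fragment; the satellite ID and display name are placeholders, not values from this thread):

    "fieldConfig": {
      "overrides": [
        {
          "matcher": { "id": "byName", "options": "<satellite ID as shown in the legend>" },
          "properties": [
            { "id": "displayName", "value": "<friendly satellite name>" }
          ]
        }
      ]
    }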

1 Like

Managed to change the original JSON file to display URL instead of Sat ID:
image

I just replaced "legendFormat": "{{satellite}}" with "legendFormat": "{{url}}".
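For context, the relevant part of a panel target in the dashboard JSON then looks roughly like this (a sketch; the expr shown is just the onlineScore query quoted earlier, each panel will have its own expression):

    {
      "expr": "avg(storj_sat_summary{type=\"onlineScore\"}) by (instance) * 100",
      "legendFormat": "{{url}}"
    }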

2 Likes