Good luck with your exams. By then I should have an auto-build working so it can be downloaded from docker hub (where currently only amd64 is available).
@kevink, I've been playing with Grafana Loki and setting log.encoding: json in the storagenode config.yml. It may be possible to grab metrics that way without a ton of regex work. I'll let you know if I come up with anything substantial. Loki is not nearly as robust/featureful as Prometheus for dashboard queries, so I'd still be interested in exporting metrics to Prometheus to feed your dashboard.
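As a rough illustration of why JSON logs cut down on regex work, the sketch below tallies log entries by level and message using nothing but a JSON parser. The sample lines and the field names "L" (level) and "M" (message) are assumptions for illustration; check them against what your own storagenode actually writes.

```python
import json
from collections import Counter

# Hypothetical sample lines; the keys "L" (level) and "M" (message) are
# assumed here and may differ from real storagenode output.
SAMPLE_LOG = """\
{"L": "INFO", "M": "uploaded", "Action": "PUT"}
{"L": "INFO", "M": "downloaded", "Action": "GET"}
{"L": "ERROR", "M": "upload failed", "Action": "PUT"}
not a json line
{"L": "INFO", "M": "uploaded", "Action": "PUT"}
"""


def count_entries(lines):
    """Count (level, message) pairs, skipping lines that are not JSON objects."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # e.g. a plain-text line mixed into the log
        if not isinstance(entry, dict):
            continue
        counts[(entry.get("L", "UNKNOWN"), entry.get("M", ""))] += 1
    return counts


counts = count_entries(SAMPLE_LOG.splitlines())
for (level, message), n in sorted(counts.items()):
    print(f"{level:5} {message:15} {n}")
```

No pattern matching is needed to pull out the fields, which is the main appeal compared with parsing the console-style output.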
I managed to get the multi-arch auto-build working. The image can now be downloaded from Docker Hub as kevinkk525/storj-log-exporter:latest and doesn't need to be built locally. I updated the How-To in the first post.
I got the dashboard running again, but I had to build it locally. I tried to use the updated command to download it from Docker Hub, but it gave the following:
root@odroid:~# sudo docker run -d --restart unless-stopped --user "1000:1000" \
> -p 9144:9144 \
> --mount type=bind,source="/mnt/StorjHDD",destination=/app/logs \
> --name storj-log-exporter \
> kevink525/storj-log-exporter:latest -config /app/config.yml
Unable to find image 'kevink525/storj-log-exporter:latest' locally
docker: Error response from daemon: pull access denied for kevink525/storj-log-exporter, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
Huh, that is weird... What platform are you running on? Do other downloads work? I tested it even on a remote RPi and it downloaded just fine.
My node is on an Odroid HC2. The node was running, so I suppose that downloads work.
Just retried now and it seems to work, no idea why it failed earlier...
Time for an update:
You need to update the log-exporter container!
This is because, with the last update, storjlabs added a "size" attribute to all upload messages and I had only added it to upload_successful... So you are missing information at the moment.
New Dashboard
I created a new dashboard that also needs the storj-exporter from @greener (Prometheus Storj-Exporter). It can be found in his dashboard repository as well as mine. You can download it from here: https://raw.githubusercontent.com/kevinkk525/storj-log-exporter/main/dashboard_log_exporter.json
The new dashboard has many metrics and many options. You can (un)hide the sections you want to see. Coloring has been standardized: ingress/upload is green, egress/download is blue, storage (IO) is purple-ish. Also, egress is always shown as negative in graphs and ingress as positive (netdata does it the same way; the old exporter from @greener does it the other way around, so don't be surprised).
Sections:
- Combined Summary with all important information. If you only look at this section, you won't miss anything important. It will warn you about (among other things) new error messages or the minimum uptime/audit/suspension score of any node on any satellite, so you can quickly see if one node is having a problem.
- Node overview showing all nodes in a boomtable with the most important information. Get a quick overview of all nodes and possible problems (audit score dropping etc).
- Different forms of NetIO graphs: Simple (maybe you like it because it has fewer colors), by node (multiple colors, very distinct node overview), by satellite (so you can finally see where all that traffic comes from or which satellite deletes lots of data)
- Successrates
- Piece information (see what average piece size you get/send, repair and usage difference)
- Detailed stats from each node
See pictures in the first post, or better: Just try it
Hey @kevink, wouldn't this be the correct link?
https://raw.githubusercontent.com/kevinkk525/storj-log-exporter/main/dashboard_exporter_combined.json
Anyway, it looks awesome!
Yes that is the correct link to the new dashboard.
Great! Runs wonderfully and immediately saves a lot of time. Thank you for the work
I don't get any data for some fields and get the error
"1:141: parse error: missing unit character in duration"
Is there a solution?
So you're getting data for the successrates but not for the NetIO? What do the log entries panel, the debug panel, and the successrates graph show?
Did it ever work, or did this just start happening now?
It has been like this right from the start. I was hoping it just needed more data to be available, but the log exporter has been running for a few hours now.
Strange...
Why is it not showing labels in the last picture (e.g. for Earnings month)? Did you change that?
Honestly, I'm baffled... I mean, all those broken graphs actually use the same data as the successrates in the last picture, so I'm not sure why it doesn't work.
I have no idea what your error message means: "1:139: parse error: missing unit character in duration"
I imported the dashboard and didn't change anything on it :D
The error also occurs with the old "simple" dashboard. Totally strange. I can also change the time range in Grafana, but the error remains.
Wait... Which version of Grafana do you have?
EDIT: got it. My Grafana version was too old xD
Glad we solved that one.
Can you send me the errors from your logs (only the errors, not the whole log)? The debug value indicates that my log-exporter misses some error categorizations. It's nothing serious, as the error count up top does not rely on it, but I'd still like to get it right.
Sure, should I send you all errors per node or combined?
Combined is fine, I just have to check why my rules don't match all your errors. So all unique errors are interesting, but not per node.
@kevink, I've taken a spin on your Grok Prometheus exporter and implemented it in Grafana Loki and Promtail, ultimately resulting in the same dashboard as this, but populated from Promtail metrics instead of storj-log-exporter.
From the repo, here's the motivation:
I've been interested in exploring Grafana Loki with Promtail for log ingestion and metrics for a number of different services on my home server. Testing it out for Storj nodes seemed like a great way to get an understanding of how it works.
For Storj nodes, the main benefit is that a single Promtail listener can ingest logs from multiple nodes and produce metrics that Prometheus can then scrape. Individual storj-log-exporter instances are not required.
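To make the "one listener, many nodes" idea concrete, here is a small Python sketch (not Promtail configuration) that tallies log levels per node and renders them in the Prometheus text exposition format, which is what a scrape endpoint would serve. The metric name storj_log_messages_total and the node names are hypothetical, chosen only for illustration.

```python
from collections import Counter


def render_metrics(events, metric="storj_log_messages_total"):
    """Render (node, level) events as Prometheus text-format counter lines.

    The metric name is a made-up example, not one used by any real exporter.
    """
    counts = Counter(events)
    lines = [f"# TYPE {metric} counter"]
    for (node, level), n in sorted(counts.items()):
        lines.append(f'{metric}{{node="{node}",level="{level}"}} {n}')
    return "\n".join(lines) + "\n"


# Hypothetical events collected from two nodes by a single listener.
events = [
    ("node1", "INFO"),
    ("node1", "INFO"),
    ("node2", "ERROR"),
    ("node2", "INFO"),
]
print(render_metrics(events), end="")
```

Each node becomes a label value on the same metric, so one scrape target can cover every node instead of one exporter per node.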
Furthermore, once the logs are ultimately shipped to Loki, one can do LogQL queries against them in Grafana or using LogCLI. For example, to search for all ERROR log level entries:
The repo includes instructions for installing the Loki Docker driver and a docker-compose stack for Storj, Storj-Exporter, Loki, Promtail, Prometheus, and Grafana and builds heavily on the existing work that has been done by you and @greener.
One of the main caveats is that, to reduce regex work, I opted to use the JSON formatted log output from the storagenode service. This makes the logs look different from the familiar console output, but it's much easier to parse for generating metrics and, from my cursory checks, doesn't appear to trip up apps like successrate.sh.
Just throwing this out there in the wild in case anyone else is interested in a Loki/Promtail proof-of-concept. Thanks for all the hard work on this!