Since Storj already has direct support for Prometheus, and there are a few community Grafana dashboards, I'd look at those dashboards first for ideas.
Grafana can already send alerts as well, so perhaps your solution can primarily be Grafana reports?
I won't list every metric I care about, but I can say payout and bandwidth data are pretty low priority. I mainly care about cumulative used space, to see how much things have grown in the last 2 days, 7 days, or month. You can also easily see when bloom filters have been sent out, and data dropouts that indicate a node isn't running properly.
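To make the "how much has it grown" idea concrete, here's a rough Python sketch. It assumes you've already pulled used-space samples out of Prometheus into a list of (timestamp, bytes) pairs. The `growth_over` helper and the sample numbers are made up for illustration and aren't part of Storj or Grafana.

```python
from datetime import datetime, timedelta


def growth_over(samples, window):
    """Return used-space growth in bytes over `window`, measured back from
    the newest sample. Returns None if no sample is old enough."""
    if not samples:
        return None
    ordered = sorted(samples)
    latest_ts, latest_used = ordered[-1]
    cutoff = latest_ts - window

    # Baseline is the newest sample taken at or before the cutoff.
    baseline = None
    for ts, used in ordered:
        if ts <= cutoff:
            baseline = used
        else:
            break
    if baseline is None:
        return None
    return latest_used - baseline


# Hypothetical data: one sample per day, growing 100 bytes per day.
start = datetime(2024, 1, 1)
samples = [(start + timedelta(days=d), 100 * d) for d in range(11)]

for label, window in [("2 days", timedelta(days=2)),
                      ("7 days", timedelta(days=7)),
                      ("30 days", timedelta(days=30))]:
    print(label, growth_over(samples, window))
```

A `None` result just means there isn't enough history for that window yet, which is different from zero growth.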