Success rate script - Now updated for new delete terminology since v1.29.3

I’ve updated the success rate script to include both canceled and failed as separate values. There are now percentages for both as well. I also fixed an issue where critical audit failures weren’t counted as such due to a change in terminology.

The updated script can be found here (for Linux; the Windows version is linked below)

The new output has a lot more info and looks like this


Please note this screenshot was made from a log that mostly covers the version prior to the update, which is why it shows a high number of failed transfers. After the update to 0.34.6 or later, failed transfers only occur if there was an actual error, not when a transfer was interrupted because enough pieces had already been transferred.

The Windows version was also updated by @Alexey and can be found here

The terminology used in these scripts is in line with the logs and is always from a customer's perspective. This means that uploads are uploads to the network and ingress for your node. Similarly, downloads are egress for your node.
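To illustrate the separate canceled and failed percentages described above, here is a minimal sketch of the counting idea. It is not the actual script. It assumes each upload log line contains `"PUT"` plus one of the phrases `uploaded`, `upload canceled` or `upload failed`; the function name `upload_stats` is made up for this example.

```shell
#!/bin/sh
# Sketch only: count upload outcomes from log lines on stdin and print percentages.
# Assumption: a line is an upload entry if it contains "PUT", and its outcome is
# one of the phrases "uploaded", "upload canceled" or "upload failed".

upload_stats() {
    awk '
        # Check canceled and failed first so each line is counted exactly once.
        /"PUT"/ && /upload canceled/ { canceled++; next }
        /"PUT"/ && /upload failed/   { failed++;   next }
        /"PUT"/ && /uploaded/        { ok++ }
        END {
            total = ok + canceled + failed
            if (total == 0) { print "no uploads found"; exit }
            printf "Successful: %d (%.2f%%)\n", ok, 100 * ok / total
            printf "Canceled:   %d (%.2f%%)\n", canceled, 100 * canceled / total
            printf "Failed:     %d (%.2f%%)\n", failed, 100 * failed / total
        }
    '
}
```

With a docker node this could be fed as `docker logs storagenode 2>&1 | upload_stats`. The same pattern would apply to downloads with the matching phrases from your own log.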


Any chance you might add the delete operation to your script? Not necessarily to know the percentage of successful deletes, but just for stats.

I could add something, but the logging is less detailed for deletes. What would you like to see exactly? From what I can tell only “deleted” and “delete failed” can happen. I could add percentages as well.

The number of deletes would already be sufficient. Basically just as a nice statistic next to all the other operations.
The “delete failed” should hopefully not occur often in the future.

Sure, might as well include it though.


Orange since any failed delete would be caught in garbage collection anyway, so it’s not a big deal if they don’t go through.

The updated version is on GitHub. :slight_smile:


Awesome! Thanks a lot!

Great job. My node updated a few hours ago, so I updated the script and also started logging to a new file (I kept the old ones). It works on both nodes. I will update the comparison thread once I have at least a day of data in. @BrightSilence - thanks as always! :+1: :pray:

Yours perform a bit better (old data from before the update). I think it's my bad ADSL connection, as always :slight_smile:


Love it. Thanks so much!

Any way you can incorporate a specific breakdown of which satellite is canceling uploads? Something like this raw output, translated to satellite names, with prettier output.

docker logs storagenode 2>&1 | grep '"PUT"' | grep 'upload canceled' | awk -F" " '{ print $11 }' | sort | uniq -c

 1 "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW",
224 "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6",
250 "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S",
452 "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs",
 15 "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE",

If I have time I may work on it and submit a pull request. My upload success rate is in the 25% range, which is awful, so I'm trying to figure out whether I'm having issues with every satellite or only some of them.

Thanks so much :slight_smile: BrightSilence
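The per-satellite breakdown requested above could be sketched roughly as follows. This is only an illustration, not part of the success rate script. It assumes the log lines carry a field of the form `"Satellite ID": "<id>"` and pulls the ID out by name instead of by column position, since `$11` depends on the exact log layout. The function name and the optional names file (one `<id> <name>` pair per line) are hypothetical; you would fill in the names yourself.

```shell
#!/bin/sh
# Sketch only: count canceled uploads per satellite from log lines on stdin.
# Assumptions: lines contain "Satellite ID": "<id>", and the optional first
# argument is a hypothetical file with "<id> <name>" pairs, one per line.

canceled_by_satellite() {
    names_file=${1:-}
    grep '"PUT"' | grep 'upload canceled' |
    # Extract the satellite ID by field name rather than by position.
    sed -n 's/.*"Satellite ID": *"\([^"]*\)".*/\1/p' |
    sort | uniq -c |
    awk -v names="$names_file" '
        BEGIN {
            # Load the optional ID-to-name table.
            while (names != "" && (getline line < names) > 0) {
                split(line, parts, " ")
                label[parts[1]] = parts[2]
            }
        }
        {
            # Fall back to the raw ID when no name is known.
            name = ($2 in label) ? label[$2] : $2
            printf "%6d  %s\n", $1, name
        }
    '
}
```

Usage would look like `docker logs storagenode 2>&1 | canceled_by_satellite names.txt`, where `names.txt` is whatever mapping file you maintain. Without an argument the raw IDs are printed.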

I have this script I use for my monitoring system:

You run it every once in a while using cron or whatever. It counts up the various log entries and produces a JSON file with the results (successes, failures, etc., per satellite and in total). It only uses the log entries that were created since the last time it was run. It runs on PHP 7 and would probably run on other versions as well, but I did not test that.

I am using it to have graphs like this:


Seems that Stefan’s satellite does a lot of repairs, but not much else.
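For anyone who wants to try the "only count entries since the last run" idea described above without PHP, here is a rough shell sketch. It is not the script from this post. It assumes the log is a plain file that only grows, remembers the byte offset it has already processed in a state file, and prints totals as JSON. The function name, the file arguments and the JSON keys are all made up for the example, and it only covers uploads in total rather than per satellite.

```shell
#!/bin/sh
# Sketch only: count upload entries added to a log file since the previous run.
# Assumptions: the log is a regular file that only grows; the state file
# (hypothetical) stores the number of bytes already processed.

count_new_entries() {
    log_file=$1
    state_file=$2
    offset=0
    if [ -f "$state_file" ]; then
        offset=$(cat "$state_file")
    fi
    size=$(wc -c < "$log_file" | tr -d ' ')
    # If the log shrank (for example after rotation), start from the beginning.
    if [ "$size" -lt "$offset" ]; then
        offset=0
    fi
    new_bytes=$((size - offset))
    {
        if [ "$new_bytes" -gt 0 ]; then
            # Read exactly the bytes between the old offset and the measured size,
            # so lines written while we run are left for the next run.
            tail -c "+$((offset + 1))" "$log_file" | head -c "$new_bytes"
        fi
    } | awk '
        /"PUT"/ && /upload canceled/ { canceled++; next }
        /"PUT"/ && /upload failed/   { failed++;   next }
        /"PUT"/ && /uploaded/        { ok++ }
        END {
            printf "{\"uploaded\": %d, \"upload_canceled\": %d, \"upload_failed\": %d}\n", ok, canceled, failed
        }
    '
    echo "$size" > "$state_file"
}
```

A cron job could call something like `count_new_entries node.log node.state > stats.json`. A line that is only half written at the moment of the run may be missed, so treat the numbers as approximate.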


Is it possible to update the Windows script as well?
Would be great.
Thanks.

Windows version found here

It’s managed by @Alexey, perhaps he could look into that, but he may have other priorities. Anyone could update it though and send a PR which I’m sure he would merge.

Updated. It should work now.


That’s awesome! Thanks, I’ve added the link to the top post.

Why not include this data somewhere in the Storj node web dashboard?

I don’t think this data would have priority. More important right now would be to add info on payouts and held back amount. After that, I would love it if they added it somewhere. But they would have to figure out a better way to store this information. Currently we crawl through logs to find it, which is far from ideal, but it’s all we have.

Success rates are interesting to know, but in many cases there is not much you can do about it and most SNOs would be fine never knowing them. This is more for us tinkerers who like to keep a closer eye on things.

I agree. This data is accessible in the logs, while the actual held amount etc. is not, so I would prefer more information becoming accessible at all over making the same information a bit more accessible. Parsing the logs is annoying, but it still works.

So isn't it better to do both? Everything is accessible in the logs, but… The more friendly and informative the Storj software is, the better the future of the whole Storj network will be. Don't you think so?

However, there is limited time to do stuff, so, if there is not enough time to do everything, then it would be better for me if they made more information accessible at all.
