Recommended node log level

Pac · August 20, 2020, 1:00pm

Hello everyone

Lately, (as suggested by @BrightSilence here: Node Suspension - #203 by BrightSilence) I re-enabled logging and redirected all my logs to the only CMR drive I have available on my “Storj device” (other drives are SMR).

It works perfectly and I now have a nice history of what’s going on with my nodes, which is pretty handy whenever something goes wrong. That happened a few times in the past, but as I had no logs, debugging was quite horrendous…

Now, I was wondering if it was really useful to log everything down to the “info” level, as it seems to be the case by default.

I shall configure a logrotate eventually so logs don’t grow indefinitely, but more globally wouldn’t it be enough to only log warnings and above?

I can’t find any clear documentation on that, is it just a matter of putting the following configuration in config.yaml for doing so?

# the minimum log level to log
log.level: warning

In short:

What are all the available log levels? My guess would be: debug, info, warning & error?
Is it really useful to log all “info” entries (as in, is it going to help in any way in case something goes wrong)?

Thanks

andrew2.hart · August 20, 2020, 2:04pm

We are all on “info” so your logs will make more sense to us here, if you ask for help. (Well I’m on debug)

nerdatwork · August 20, 2020, 2:14pm

You can’t go back and get debug-level logs after your node has already hit an issue. If you set it to debug beforehand, you can find out what’s going on when an issue occurs. So I would recommend debug, since you had an issue in the previous thread.

BrightSilence · August 20, 2020, 2:30pm

I would recommend debug only if you’re trying to debug something. It can get a bit noisy (latest update tripled the number of lines for deletes with debug enabled for example). Otherwise I’d stick to info. Warnings aren’t really used by Storj from what I can tell. And error lines alone often miss context. I would suggest sticking with info and dropping to debug if you’re trying to find a specific issue. And then just setting up logrotate.

Pac · August 20, 2020, 3:08pm

Alright thanks for your insights.

I’ll think about it, once logrotate is configured.

And, am I right in saying all available log levels are: debug, info, warning & error?

BrightSilence · August 20, 2020, 3:27pm

Warning seems to not be used, so it’s irrelevant. There is also FATAL for anything that stops the node process. I’m not entirely sure about what you can use for this parameter though. You shouldn’t restrict it more than info and higher anyway.
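To illustrate what "info and higher" means in practice, here is a minimal Python sketch of a minimum-level filter. The level names and their ordering are an assumption based on the levels mentioned in this thread; this is not taken from the storagenode source, and it says nothing about which values the `log.level` parameter actually accepts.

```python
# Assumed severity order, lowest to highest. Illustrative only.
LEVELS = ["DEBUG", "INFO", "WARN", "ERROR", "FATAL"]


def is_logged(line_level: str, minimum: str) -> bool:
    """Return True if a line at line_level passes the configured minimum level."""
    return LEVELS.index(line_level.upper()) >= LEVELS.index(minimum.upper())


def visible_levels(minimum: str) -> list:
    """List every level that would still be written for a given minimum."""
    return [level for level in LEVELS if is_logged(level, minimum)]


if __name__ == "__main__":
    for minimum in ("debug", "info", "warn"):
        print(minimum, "->", visible_levels(minimum))
```

With a minimum of "warn", INFO lines are dropped entirely, which is why restricting the level beyond info removes the context around each error.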

SGC · August 20, 2020, 7:23pm

why the log level isn’t debug by default is a mystery to me…
i mean sure if the system never had errors or the users were completely ignorant… then sure info…

but i’m sure everybody basically agrees on this… and first time you got a big problem thats hard to track down… people say… put your log on debug…

which then just means all the saved logs are then basically useless because they weren’t on debug to begin with… meaning much less data to look at to figure out what is wrong and thus taking longer to troubleshoot issues… which basically means the default log level being info is only causing more wait during troubleshooting…

@BrightSilence i would suppose the reason the higher levels like fatal or warning looks basically empty is because we don’t have many of those… if any…

ofc by that i assume it would log only warning messages and above…

Pac · August 20, 2020, 9:01pm

Well, putting everyone in “debug” level by default could also cause a new problem: filling up everyone’s space pretty quickly, and having Nodes getting disqualified or suspended because they won’t be able to store data sent to them, in case logs are stored on the same disk.

I think having the logs set at “info” level creates well enough logs already, and still doesn’t prevent one from having to set up logrotate eventually at some point.

Whenever a problem arises, info level already gives a lot of context about the possible causes of the issue. Then, if it’s not enough, one can always switch to “debug” temporarily if necessary, and get more details when the problem happens again.
If it doesn’t happen again, well it’s up to the SNO to keep the logs in “debug” mode or not for a few days/weeks; but as long as the problem does not happen again, although it might not feel “perfect” because we’re not sure what happened, at least there’s no problem anymore, so… good enough.

In my opinion setting logs in “debug” mode by default would be a viable option only if logs were automatically truncated after a certain TTL (after 1 month for instance). But it’s currently not the case as my understanding is that right now by default logs are simply outputted and logged by the docker container, so it’s the SNO’s responsibility to configure something to make sure logs don’t fill up the whole disk where docker writes logs.
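As a rough illustration of the "truncate after a certain TTL" idea, here is a small Python sketch that deletes log files older than a given number of days. It assumes logs are already redirected to a directory and split into separate files (for example by logrotate); the function name, the `*.log*` pattern and the 30-day default are all hypothetical, and this is not a replacement for a proper logrotate setup.

```python
import time
from pathlib import Path


def prune_old_logs(log_dir, max_age_days=30, now=None):
    """Delete files matching *.log* in log_dir that were last modified
    more than max_age_days ago. Returns the names of the removed files."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for path in sorted(Path(log_dir).glob("*.log*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed
```

Calling `prune_old_logs("/path/to/logs", max_age_days=30)` from a daily scheduled job would keep roughly one month of history. The path is a placeholder, and a file that is still being written to keeps a recent modification time, so it is not touched.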

SGC · August 20, 2020, 10:18pm

most of the time if stuff is working as it should you won’t really see many debug messages…

just did a couple of grep -c runs for INFO and DEBUG
for the 19th i got 434244 lines with info
and 2237 with debug… so really they won’t take up any space at all… not compared to the logs anyways
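For anyone who wants the same per-level counts in one pass, here is a minimal Python sketch. It assumes each log line starts with a timestamp followed by the level as the second whitespace-separated field; that layout is an assumption, and the sample lines below are invented for illustration, not real node output.

```python
from collections import Counter

# Level names assumed from this thread; adjust to whatever your logs contain.
KNOWN_LEVELS = {"DEBUG", "INFO", "WARN", "ERROR", "FATAL"}


def count_levels(lines):
    """Count lines per log level, reading the level from the second field."""
    counts = Counter()
    for line in lines:
        fields = line.split(None, 2)
        if len(fields) >= 2 and fields[1] in KNOWN_LEVELS:
            counts[fields[1]] += 1
    return counts


if __name__ == "__main__":
    # Invented sample lines, only to show the expected shape of the input.
    sample = [
        "2020-08-19T00:00:01Z INFO example message one",
        "2020-08-19T00:00:02Z DEBUG example message two",
        "2020-08-19T00:00:03Z INFO example message three",
    ]
    print(count_levels(sample))
```

To run it on a real file, pass an open file object instead of the list, since iterating a file yields one line at a time.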

O.o
what…
okay checked my logs, for my currently 14tb node i got a max log of 64mb and some down in the few mb range… but lets do a 50mb avg just for ease of math… so 20 days of logs are 1gb, so 2000 days would be 100gb… 1/5th of the minimum required space to be shared… and in theory if it was a 500gb node, then it would most likely get a log file 1/28th of mine… so that puts about 6 years of logs at a max capacity lost at 3-4gb… but lets just call it 1% … and thats without compression… i don’t think my log size counts compression and such… but i do let my system compress them to conserve space further… thought they would take up more space really… xD

and debug seems atleast in my case to be 1/200th of 1% then…
rough numbers… but still… its over all a completely insignificant amount, less if there are issues… then logs might be slightly larger i guess… but still then you can compress them in atleast 10x if not more… so it becomes 1% for 60 years of logs…
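The back-of-the-envelope numbers above can be re-run with a short Python sketch. The inputs are the rounded figures from this post (50 MB per day, 2000 days, a 14 TB node being about 28 times a 500 GB node); the assumption that log volume scales linearly with node size is the post's own, and the function names are just for illustration.

```python
def estimate_log_footprint(avg_mb_per_day=50, days=2000,
                           size_ratio=28, small_node_gb=500):
    """Return (log GB on the large node, log GB on the small node,
    fraction of the small node's space used by logs)."""
    large_node_log_gb = avg_mb_per_day * days / 1000  # decimal GB
    small_node_log_gb = large_node_log_gb / size_ratio
    share = small_node_log_gb / small_node_gb
    return large_node_log_gb, small_node_log_gb, share


def debug_share(debug_lines=2237, info_lines=434244):
    """Ratio of debug lines to info lines for the day counted above."""
    return debug_lines / info_lines


if __name__ == "__main__":
    large, small, share = estimate_log_footprint()
    print(f"large node logs: {large:.0f} GB")
    print(f"small node logs: {small:.2f} GB ({share:.2%} of 500 GB)")
    print(f"debug vs info lines: {debug_share():.2%}")
```

This counts lines rather than bytes for the debug share, so it only matches the size estimate if debug and info lines are of similar length.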

Text in modern storage is just always so insignificant its ridiculous numbers required to make a dent.

so yeah… log on debug without a doubt…
i don’t see why not…

ofc these are my numbers… and i’m just one data point… so not easy to say how much space other storagenodes use on logs… but i don’t plan on deleting mine any time soon… nor do i plan to bother doing more compression than the limited bit my filesystem will do on its own… which is maybe x3

ofc after like a year or something the logs become kinda useless almost… at least for anything other than comparing the progression of node activity, and the program will change so they might not even be good for that.

andrew2.hart · August 21, 2020, 5:37am

I agree, normally you would expect Debug to spew tons of messages but on storagenode it is hardly any more than info.
I think it is being used to target current issues rather than the traditional debug all calls

BrightSilence · August 21, 2020, 7:03am

There are basic conventions around log levels that go way beyond Storj. By definition DEBUG exists as something that you can turn on when normal log levels don’t provide enough information. Depending on the software debug can increase logging a lot. This isn’t so much the case for Storj where debug lines are relatively rare. INFO is almost always the default log level. This is often because WARN or ERROR alone may show the specific error but doesn’t show you what the software was doing at the time. I don’t really understand why Storj skips warnings altogether. Most of the error lines we see should in my opinion be warnings. Like the “use of closed connection” errors that pop up from time to time. It’s true that some people would see more errors than others, but the argument is not that we don’t have enough errors for it to be useful, it’s that even if you have more errors, those lines alone don’t tell enough of the story to get useful information.

Debug messages are not dependent on things working as they should. It’s just granular logging of normal operation which could provide more context in case of errors, but is always there even if there are no errors.
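The point about error lines needing the surrounding lines can be sketched in Python: the helper below returns each error line together with the few lines logged just before it, similar in spirit to `grep -B`. The function name and the plain substring match on "ERROR" are assumptions for illustration.

```python
from collections import deque


def errors_with_context(lines, before=3, marker="ERROR"):
    """Yield (context, error_line) pairs, where context is a list of up to
    `before` lines that immediately preceded the error line."""
    recent = deque(maxlen=before)  # rolling window of the latest lines
    for line in lines:
        if marker in line:
            yield list(recent), line
        recent.append(line)
```

If the log only contains ERROR lines, the context window holds nothing but other errors, which is the practical cost of logging at error level alone.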

SGC · August 21, 2020, 7:12am

i more or less fully agree on that…

debug in this case is just like… its basically free to turn it on… sure if it was so that it cost a lot in space or whatever then i could agree… but debug is 1/200th of log data and log data is 1/100 of customer data stored…

i am not averse to turning off stuff for more sleek performance or less complexity, but for the case of storj and debug… using info as default is just … well maybe to make people that don’t know much not ask questions… but there is often so many questions anyways… so

i duno… if i was to decide i would make it on by default unless if i find a good reason not to do that… but i doubt there is enough of viable reasons

BrightSilence · August 21, 2020, 7:15am

Yeah, I just think your argument is not with the default setting, but with how the log levels are used in the Storj software. I prefer to have debug level on always as well for Storj specifically. But that’s because some of the log lines (like garbage collection) that are logged as debug, should be in info in my opinion and there is not a lot of actual debug logging going on.

SGC · August 21, 2020, 7:30am

to be fair i don’t really care much… i mean my log is on debug and ill keep setting it on debug… basically do that for everything i log…

information is critical for understanding errors after they happened… else it can be very difficult to prevent them from happening again… or even go unnoticed…

but yeah you are totally right… debug should be debug… not features that are basic operations
just like many of the errors should be warning… a dropped connection is hardly anything more than a lost connection or a lost race for in or egress.

maybe you should make a list … i don’t think many others see it from the same perspective… i mean the programmers don’t really deal with the logs in the same way as the users…

just like people that build stuff shouldn’t test / review it themselves… or like a washing machine engineer / designer shouldn’t write the manual, it should be written by somebody that has no idea about what a washing machine is when they start using it / starts writing the manual…

xD

BrightSilence · August 21, 2020, 7:34am

That’s generally not a good idea. Some software increases the amount of logging tenfold or even more. There can be significant impact on performance as well, as developers don’t care about performance when debug logging is turned on, since it shouldn’t normally be. As for the rest of your post, I agree with you about manuals and reviews, but the job of logging is for a large part to help developers find issues and improve the product. It’s more important for them to know what it means than for the end users. This goes doubly for debug level log lines.

SGC · August 21, 2020, 7:42am

yeah debug in storj seems kinda tame…

one day i’m sure they will make good use of it… i suppose its still early on… from a program age / corp age… i’m sure as storjs standard / special operations manuals / guidelines evolve there will be a lot more structure in these kinds of things.

ofc when corporations and programs get too old they end up filled with junk… i suppose that goes for everything… just look at the universe lol

Pac · August 21, 2020, 7:53am

Alright, I did not know debug logs weren’t numerous at all.

That’s unexpected. Okay then.

Totally agree.

Basically to sum up, it seems we all agree that Storj nodes don’t really use log levels the right way
There’s some room for improvement here ^^