Node is 1.110.3, startup scan disabled, lazy filewalker enabled.
When I grep for “trash-cleanup” on other nodes, I get results.
Not on this node.
Logs go back to the 17th. When I grep for trash, trash-cleanup, retain, bloom, gc, or lazy, nothing comes up.
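For reference, this is roughly the kind of check I ran (the container name storagenode is just an example, adjust it to your setup):

```sh
# Grep the docker logs for any trash / garbage collection related entries.
docker logs storagenode 2>&1 | grep -iE "trash|retain|bloom|gc|lazy"
```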
However, there are date folders old enough to be deleted:
Could it be that the tasks run for so long that the starting log entries have been rotated away in the meantime, and that there are no log lines indicating progress while the tasks are still running?
Yes, it's possible. The default log driver for docker uses 5 log files of 20MB each, so 100MB in total. This is why our instruction for Raspberry Pi suggests having 10 log files of 50MB each; however, I believe even that is not enough for a month (the log file on my biggest node is already 3.6GB).
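As a sketch, the json-file driver limits can be raised per container; the sizes below only mirror the Raspberry Pi suggestion and are not a recommendation:

```sh
# Keep 10 log files of 50MB each instead of the default 5 x 20MB.
# The usual mounts, ports and environment variables are omitted here.
docker run -d --name storagenode \
  --log-driver json-file \
  --log-opt max-size=50m \
  --log-opt max-file=10 \
  storjlabs/storagenode:latest
```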
However, I do not see a trash cleanup filewalker in the /mon/ps output, though this method shows only the currently running processes.
You may also check trash in the /mon/funcs output to see whether it ran at all.
Did you check the trash folders? It's possible that the trash filewalker is still running and deleting pieces from the trash.
It works through the satellites' folders one by one.
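For example, a quick way to look at them from the host (a sketch; the /app/config path assumes the standard container mount):

```sh
# List the per-satellite trash date folders and their sizes.
docker exec storagenode sh -c 'du -sh /app/config/storage/trash/*/*'
```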
Also, /mon/funcs should mention trash and show some stats. But if it's still running, it should appear in the /mon/ps output too.
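A sketch of how to query both, assuming the debug address was pinned, for example with --debug.addr=127.0.0.1:5999 (by default the node picks a random port):

```sh
# Show the currently running processes.
curl -s http://127.0.0.1:5999/mon/ps
# Check whether the trash functions ran at all and how often.
curl -s http://127.0.0.1:5999/mon/funcs | grep -i trash
```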
2024-08-21T12:13:05Z INFO lazyfilewalker.trash-cleanup-filewalker subprocess exited with status
This suggests that it was indeed running. Subsequently, the filewalker restarted with the expected log message.
So it appears it had been running the whole time while the older logs got rotated away.
This brings back one of my suggestions:
It would be good if such long-running tasks wrote a log message while they are running, indicating what they are currently doing (like which date folder is currently being worked on).
This would make tracking progress much easier.
The INFO level captures events in the system that are significant to the application’s business purpose. Such events are logged to show that the system is operating normally.
Unfortunately, I think housekeeping is never considered part of the business purpose of the node, which is to handle uploads and downloads and to keep the customers' data unmodified and not lost.
If you want details, you can always increase the log level, and unlike many applications, you can do it on the fly. You may also use the debug port to track all housekeeping processes without increasing the log level. Of course, you wouldn't see a gauge or an ETA, but I guess that's hard to estimate anyway.
It's also possible to use a custom log level per service; in that case you only need to increase it for the process in question.
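A sketch, assuming your node version supports the log.custom-level option; the service name used below is only a guess and may need adjusting:

```sh
# Raise the log level only for one subsystem, keep INFO for the rest.
# The usual mounts, ports and environment variables are omitted here.
docker run -d --name storagenode \
  storjlabs/storagenode:latest \
  --log.level=info \
  --log.custom-level=lazyfilewalker=debug
```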
I don't agree with that. I think it is part of the overall purpose of the node. And why is there a log entry for the start of the trash cleanup at INFO level then? I see the start, the progress, and the stop all at INFO level. It is not debugging to check whether such a process is currently running; it is verifying that the software is doing what it should do.
I strongly disagree with printing progress at the INFO log level. Printing progress at INFO would mean that we also need to do the same for all operations, like GET* and PUT*. This doesn't make sense. There should only be start, finish, or failure.
Start and finish messages sound reasonable for the INFO log level. The progress of said filewalker does not. It should be at debug or even trace.
By the way, for tracing you may also use /debug/run/trace/db; it waits for the process to start, collects the info, and prints it. There are also /mon/trace/json and /mon/trace/svg.
The /mon/ps method shows how long the process has been running right now, while /mon/funcs or /mon/stats provide the statistics.
You may check the article above for how to provide parameters.
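A sketch of those calls, reusing the pinned debug address from the example above; the regex parameter selects the function to trace, and the pattern used here is only a guess:

```sh
# Collect a trace of the trash filewalker as SVG and as JSON.
curl -s "http://127.0.0.1:5999/mon/trace/svg?regex=trash" -o trash-trace.svg
curl -s "http://127.0.0.1:5999/mon/trace/json?regex=trash"
# Statistics for everything trash related.
curl -s http://127.0.0.1:5999/mon/stats | grep -i trash
```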
Progress should not be printed at INFO; it's too verbose. If you would like to have it, you have plenty of options: you may enable the debug or trace log level (including a custom log level for a specific process), or you may use the debug port (which is designed exactly for your request). Why should other nodes suffer?
No, because those are not the long-running processes I am talking about.
As stated, my logs covered about 3 days. Given that, I would be happy with a log entry every 6 or 12 hours, maybe even only once a day, stating that it is still running.
I don’t see that this would be too verbose.
You may submit an idea here: Storage Node feature requests - voting, and let the community decide.
I still believe that if one wants to see progress, they can always use the debug port to switch to the debug level, or use the implemented debug methods.
I may not have been clear enough. While using the debug port can be useful for closely following progress, it requires additional setup. However, that's not my main point. I'm specifically thinking of situations where the logs don't indicate that a long-running process is still running, which is what led to this thread. The INFO-level logging I'm referring to is not about logging the process's activity completely in real time, but rather about providing a periodic "sign of life" to indicate that it's still running, and nothing more.
Then again, you can solve it without a single line of code: redirect the logs to a file and keep it for at least a week.
Or configure the docker log driver to keep more logs than the default 100MiB (5 x 20MiB).
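For example (a sketch; the /app/config path assumes the standard container mount, so the file ends up on your storage drive alongside the config):

```sh
# Write the node log into the config mount instead of stderr,
# so it is no longer subject to the docker log driver limits.
# The usual mounts, ports and environment variables are omitted here.
docker run -d --name storagenode \
  storjlabs/storagenode:latest \
  --log.output=/app/config/node.log
```

You would then rotate that file yourself, for example with logrotate.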
The problem is solved.
You didn't convince me, sorry. We must not spend developers' time on something that can be changed in the local setup and is already supported.