If you hit this error, please enable the debug level for logs (it’s the log.level: option in "C:\Program Files\Storj\Storage Node\config.yaml") and post an excerpt of the log (30-40 lines from around the time the node crashed) between three backticks: ``` on the line before the log and three backticks on the line after it.
Do not forget to save the config and restart the storagenode service.
log.level: debug was set at the time of the original crash. I have set the other two but another crash is yet to happen. It is quite a drawback not being able to change this retroactively in some fashion.
Ours too. You enabled extended logging and now we can see more information.
I am not aware of any service that can automagically add missing information to a text log after the fact.
It is the same as asking to see a video from a Polaroid photo.
You can do that only if you started recording the video before the Polaroid photo was taken.
The same goes for logging: you get an extended log only if you enable extended logging well before it is needed.
Can you also check if your node has the file blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/s5/idqsy3x4etubzbxox5ovj253zqswlitqgvz5gycomvdj5xxysq.sj1 in the data directory?
I’ve seen some where you can select the level at which you want to view the log; I assume they keep the whole log and filter it later. Is there any benefit to logging only higher-priority events?
When I first installed 1.6.3 it manifested 3 times in about a day, then it didn’t happen again until you asked me to change the logging parameters almost a week later. Once I restarted, it crashed again within hours. It doesn’t seem to be time-related, but I don’t know.
It is not there.
By the way, I occasionally see "file not found" / "doesn’t exist" errors in the log. It has been happening for a long time, and I do not think there is any data loss: all the audits are at 100% and it’s running on ZFS. Not sure if this is some artefact of the way the system works or a bug somewhere.
There is always a performance impact. The more you log, the more computation the machine needs. Logging the stack and the caller also requires more resources.
It may not have much impact on your machine, but it has more impact on low-power machines.
Thanks for your replies.
I’ve found a place where we should skip that error and not interrupt the execution because it could happen that the file gets deleted before we call the Lstat function.
The changes are in review. Since today is an official holiday, I expect they will be reviewed next week, perhaps with some back and forth.
Let’s see whether my teammates approve of the change and it gets merged and rolled out next week.
On the other hand, other errors reported in this post point to another problem that yours doesn’t show; both are related and cause the same crash, but I couldn’t find the root cause of that one because your logs didn’t show it.
The other one is the createFile error? I will keep an eye out for that one and report it if it’s different from the lstat one already reported here.
I will leave the extra logging on; it doesn’t bother me, and if I run into any bottlenecks I can always turn it off. The only other problem at the moment is a big overstep of the allotted space, mentioned in another thread. Tell me if there’s anything I can do to help debug that.
Any proposed fix? I’m crashing on Windows GUI Server 2019 every 12 hours or so now. Per the logs, it looks like Piece Deleter removed the blob a few hours before the error.
2020-06-22T00:06:55.779-0400 FATAL Unrecoverable error {"error": "CreateFile G:\Storj Node\blobs\qstuylguhrn2ozjv4h2c6xpxykd622gtgurhql2k7k75wqaaaaaa\6g/itvzk3s6o5csohh5shpp3hcvsi6dstfhbdeqmyv5yllnaesfua.sj1: The system cannot find the file specified."}
2020-06-21T22:49:59.016-0400 INFO piecedeleter deleted {"Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Piece ID": "6GITVZK3S6O5CSOHH5SHPP3HCVSI6DSTFHBDEQMYV5YLLNAESFUA"}
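For anyone who wants to match a "Piece ID" from a piecedeleter line against the path in a FATAL line, here is a small sketch of the mapping as it appears in the two log lines above: the ID is lowercased, the first two characters become the subdirectory, and the rest plus .sj1 becomes the file name. This is inferred from these two lines only, not from the storagenode source, and blobRelPath is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"strings"
)

// blobRelPath converts a Piece ID as printed in the log into the
// "<2 chars>/<rest>.sj1" form seen in the FATAL line.
// Assumption: the layout is inferred from the log lines in this thread.
func blobRelPath(pieceID string) (string, error) {
	id := strings.ToLower(pieceID)
	if len(id) < 3 {
		return "", fmt.Errorf("piece ID %q is too short", pieceID)
	}
	return id[:2] + "/" + id[2:] + ".sj1", nil
}

func main() {
	p, err := blobRelPath("6GITVZK3S6O5CSOHH5SHPP3HCVSI6DSTFHBDEQMYV5YLLNAESFUA")
	if err != nil {
		panic(err)
	}
	fmt.Println(p)
}
```

If the result matches the tail of the path in the FATAL line, the crash concerns a piece that the deleter had already removed.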
@llamamanc We were promised a fix is coming soon! Oh, it looks like you have the second type of error. Please turn on all logging on your SN and post the fatal log message when it happens again.
Is there any ETA for the release to continue (of the next version that fixes the issues, I suppose), or any place where we can see the status? I could not find any relevant PR/issue on GitHub.
I suppose it is not easy to give a date; I just wanted to know whether you think it would be a matter of days or weeks. I have a Windows node waiting for the update so that it is on the same version as Linux before I migrate it.
Storj core devs are doing an amazing job building this platform. My intention is not to put any pressure on this at all <3