Hey everyone! I am one of the engineers who developed the pre-flight check, which was introduced on January 23rd in release v0.30.5. That release coincides with the ongoing unusual spikes in repair. So if your node is not starting (or failed to start at some point since January 23rd, even if you have since resolved the issue), could you please post the error you are getting in this thread?
I want to compile a list of current errors, especially those related to the pre-flight check, to determine whether there is anything we can change on our end to be more lenient about allowing nodes to start.
This is not a debug thread. It might be possible for me to provide assistance with some of the issues people are having, but the primary purpose of this thread is to have a central place to report errors where nodes are failing to start.
I gotta give props to @Alexey who has been helping a lot of people fix their issues with either indexes or bad table definitions. It might be a good idea to align with him about this issue as well (if you haven’t already).
I am not sure if this issue is causing more repair, but it very likely could be. As a start, I am modifying the preflight database check so that it allows additions to the schema, but not deletions. For example, if there is an additional index on one of the storagenode’s tables (which seems to be the case for many of these preflight check failures), the storagenode will be allowed to start, but a warning will be logged. However, if anything has been removed from the expected schema, the storagenode will fail to start. My change is here: https://review.dev.storj.io/c/storj/storj/+/1072
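To make the "additions yes, deletions no" rule concrete, here is a minimal Python sketch of the idea. It is not the code from the change linked above: the function names (`compare_schema`, `preflight_check`) and the schema object names are made up for illustration, and a schema is simplified to a set of strings.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaDiff:
    """Difference between the expected schema and the one found on disk."""
    missing: set = field(default_factory=set)  # expected but not found
    extra: set = field(default_factory=set)    # found but not expected


def compare_schema(expected: set, actual: set) -> SchemaDiff:
    # Hypothetical helper: each schema is a set of object names
    # such as "table:pieces" or "index:idx_pieces_expiration".
    return SchemaDiff(missing=expected - actual, extra=actual - expected)


def preflight_check(expected: set, actual: set, warn=print) -> None:
    """Fail on anything missing, only warn on anything extra."""
    diff = compare_schema(expected, actual)
    if diff.missing:
        # A deletion from the expected schema blocks startup.
        raise RuntimeError(
            f"preflight failed, missing from schema: {sorted(diff.missing)}"
        )
    for name in sorted(diff.extra):
        # An addition (e.g. an extra index) is tolerated but reported.
        warn(f"preflight warning, unexpected schema object: {name}")
```

With this shape, a node that has one extra index would start and log one warning, while a node that is missing an expected index would raise and refuse to start.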
This is really good to know! I didn’t realize that was causing an issue. I updated my change to remove test_table from the schema comparison. If preflight fails partway through and the underlying issue is later fixed, test_table can still be left behind, because the failed check never reached the final step where it deletes the table.
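As a rough illustration of leaving test_table out of the comparison, here is a small Python sketch. It is only an assumption about the general shape, not the actual change: `IGNORED_TABLES`, `strip_ignored`, and the `pieces` table with its columns are hypothetical, and a schema is simplified to a dict mapping table names to column lists.

```python
# Tables the comparison should not look at. test_table is the scratch
# table that a failed preflight run can leave behind.
IGNORED_TABLES = {"test_table"}


def strip_ignored(tables: dict, ignored=IGNORED_TABLES) -> dict:
    """Return a copy of the schema without the ignored tables."""
    return {name: cols for name, cols in tables.items() if name not in ignored}


# Hypothetical schemas: the on-disk one still has a leftover test_table.
expected = {"pieces": ["id", "size"]}
actual = {"pieces": ["id", "size"], "test_table": ["id"]}

# The leftover table no longer makes the two schemas differ.
schemas_match = strip_ignored(expected) == strip_ignored(actual)
```

Filtering both sides before comparing keeps the rule in one place, so a leftover test_table cannot cause a mismatch by itself.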