Node went offline 14002 error

sjb · May 18, 2023, 1:52pm

I had a power failure when my inverter tripped. When the PC restarted, the node was suddenly offline.
Help please!

Knowledge · May 18, 2023, 2:08pm

Have you checked the logs? Can you provide us with any errors it is showing you? Did your IP change?

sjb · May 18, 2023, 2:09pm

Hi, the IP did change. I entered the new dynamic IP.

sjb · May 18, 2023, 2:10pm

How can I check the logs?

mars_9t · May 18, 2023, 3:30pm

Do you mean it works now or what? Inserted new IP where?

Would be easier to help if you provided any info about your setup and what you are actually doing.

sjb · May 18, 2023, 4:07pm

2023-05-18T07:00:31.932-0700 INFO Configuration loaded {"Location": "C:\Program Files\Storj\Storage Node\config.yaml"}
2023-05-18T07:00:31.961-0700 INFO Anonymized tracing enabled
2023-05-18T07:00:31.971-0700 INFO Operator email {"Address": "sjcrypto3006@gmail.com"}
2023-05-18T07:00:31.971-0700 INFO Operator wallet {"Address": "0x08c4Fd3F2327590710AF5697992A01dcec9Ec554"}
2023-05-18T07:00:32.352-0700 INFO db database does not exist {"database": "secret"}
2023-05-18T07:00:34.730-0700 INFO Telemetry enabled {"instance ID": "1PBbNb8zmYD8RNVkEcZMd1bAxT6sesWXYEYYyuDsR6KfwLQSj6"}
2023-05-18T07:00:34.730-0700 INFO Event collection enabled {"instance ID": "1PBbNb8zmYD8RNVkEcZMd1bAxT6sesWXYEYYyuDsR6KfwLQSj6"}
2023-05-18T07:00:35.154-0700 INFO db.migration.29 Migrate piece_space_used to add total column
2023-05-18T07:00:35.766-0700 FATAL Unrecoverable error {"error": "Error creating tables for master database on storagenode: migrate: v29: no such table: piece_space_used\n\tstorj.io/storj/private/migrate.(*Migration).Run:209\n\tstorj.io/storj/storagenode/storagenodedb.(*DB).MigrateToLatest:354\n\tmain.cmdRun:95\n\tmain.newRunCmd.func1:32\n\tstorj.io/private/process.cleanup.func1.4:399\n\tstorj.io/private/process.cleanup.func1:417\n\tgithub.com/spf13/cobra.(*Command).execute:852\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:960\n\tgithub.com/spf13/cobra.(*Command).Execute:897\n\tstorj.io/private/process.ExecWithCustomOptions:113\n\tstorj.io/private/process.ExecWithCustomConfigAndLogger:79\n\tstorj.io/private/process.ExecWithCustomConfig:74\n\tstorj.io/private/process.Exec:64\n\tmain.(*service).Execute.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
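
When a log is long, the lines that matter for a crash like this are the ERROR and FATAL ones. A minimal sketch for pulling them out, assuming each entry has the shape seen above (timestamp, level, then the message, separated by whitespace). The function name is made up for illustration, and no log file path is assumed; pass in lines read from wherever your node writes its log.

```python
def find_problem_lines(log_lines, levels=("ERROR", "FATAL")):
    """Return (timestamp, level, message) for each entry at one of the given levels."""
    hits = []
    for line in log_lines:
        # Split into at most three fields: timestamp, level, rest of the line.
        parts = line.strip().split(None, 2)
        if len(parts) < 3:
            continue  # blank line or something that is not a regular entry
        timestamp, level, message = parts
        if level in levels:
            hits.append((timestamp, level, message))
    return hits


if __name__ == "__main__":
    sample = [
        "2023-05-18T07:00:31.961-0700 INFO Anonymized tracing enabled",
        "2023-05-18T07:00:35.766-0700 FATAL Unrecoverable error",
    ]
    for timestamp, level, message in find_problem_lines(sample):
        print(timestamp, level, message)
```

To use it on a real log, open the file with open(path, encoding="utf-8", errors="replace") and pass the file object as log_lines.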

JWvdV · May 18, 2023, 7:56pm

Seems like the database files have been corrupted. That’s a shame. As far as I know, only the bandwidth and payment history are tracked locally; everything else is kept centrally or is not that relevant. First, check whether the location of the database directory in the config file is still correct. Second, check which files are in that directory. Remove the problematic databases if they are not that important; otherwise you can try to restore them, which is often a waste of your time.

See: https://support.storj.io/hc/en-us/sections/360004515252-Databases-Issues
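
The "check what files are in the directory" step can be sketched as a small script. This is only an illustration, assuming the node's *.db files are ordinary SQLite databases and that the node is stopped while it runs. The function name is hypothetical, and the directory to pass in is whatever your own config file points at.

```python
import sqlite3
from pathlib import Path


def check_databases(db_dir):
    """Run SQLite's integrity check on every *.db file directly inside db_dir.

    Returns a dict mapping file name to "ok", "empty file", the problems
    reported by the check, or "error: ..." if the file cannot be read as SQLite.
    """
    results = {}
    for db_path in sorted(Path(db_dir).glob("*.db")):
        if db_path.stat().st_size == 0:
            # A zero-byte file has no tables at all, so report it separately.
            results[db_path.name] = "empty file"
            continue
        try:
            conn = sqlite3.connect(str(db_path))
            try:
                rows = conn.execute("PRAGMA integrity_check;").fetchall()
            finally:
                conn.close()
            # A healthy database yields a single row containing "ok".
            results[db_path.name] = "; ".join(row[0] for row in rows)
        except sqlite3.DatabaseError as exc:
            results[db_path.name] = f"error: {exc}"
    return results
```

Calling check_databases on the database directory and printing the result gives one line per file to look at. Note that "ok" only means the file is structurally sound; it would not by itself explain a missing table like the one in the log above.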

sjb · May 18, 2023, 8:34pm

I set up the node on a new hard drive and now it’s working fine.

I still have all the data on the old drive.
What should I do with all the data on the old drive? Should I copy it over?

mars_9t · May 18, 2023, 9:15pm

Stop the node immediately!

(A bit dramatic, but I wanted to warn ASAP :V)

mars_9t · May 18, 2023, 9:16pm

You cannot start a node with the same identity on different data (or with no data). To the satellites it will look like a node that lost its data, i.e. not trustworthy, and it will be disqualified. You need to run it with the same data that it had previously.

RJGSpace · May 18, 2023, 9:32pm

Try running chkdsk on that storagenode disk…

Alexey · May 19, 2023, 4:04am

If you did not follow the suggestion from @mars_9t to stop your node immediately, I think your node is disqualified already because of lost data.

What you actually needed to do was stop your node, check the disk for errors and fix them, then check your databases, fix the corrupted ones, and recreate the ones that cannot be fixed.

If your node is disqualified on all satellites, this is permanent and you will be forced to remove this identity and its data, then start over with a newly generated identity (you must generate a new identity, not clone or restore one from a backup), sign it with a new authorization token, and start with clean storage.

If it’s not disqualified on all satellites and you were able to recover its data, you may continue to run it; each satellite will keep paying until your node is disqualified on it.

sjb · May 22, 2023, 9:29am

I tried multiple steps to get the corrupted DBs removed. No luck, still the 14002 error.

I did stop the node after it ran for about an hour, but I think I may already be disqualified.
I am happy to share the data I have so that it’s not a loss to Storj.

It will probably be best to start a new node with a new identity. Your input will be appreciated.

mars_9t · May 22, 2023, 3:20pm

Now it’s been days and time is not your friend here.

For the most straightforward resolution:

0. Shut down the node if it is running.
1. Remove the databases. Just don’t bother recovering them. Maybe keep revocations.db (I’ll be honest, I have no idea what its purpose is :P).
2. Open an elevated (as administrator) PowerShell. Run chkdsk X: /F where X is the letter of the drive that holds the node’s data.
3. After it finishes (it takes a while), run it a second time if there were errors. Basically repeat until there are no errors (if it’s the 3rd or 4th pass, the damage is probably too severe and I wouldn’t bother running it more).
4. If there are no errors left, just make sure everything else is configured as before and run the node. The databases will be recreated and the node could “just” keep working.
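
The database removal step above can be sketched as a script. Two things here are assumptions and not part of the advice above: the files are moved to a backup folder instead of being deleted, and SQLite's -wal and -shm companion files are moved together with each database. The function name and both directory arguments are hypothetical; only run something like this with the node stopped.

```python
import shutil
from pathlib import Path


def set_databases_aside(storage_dir, backup_dir, keep=("revocations.db",)):
    """Move *.db files (plus -wal/-shm companions) from storage_dir to backup_dir.

    Databases named in `keep` stay where they are. Returns the moved file names.
    """
    storage = Path(storage_dir)
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(storage.iterdir()):
        name = path.name
        if not path.is_file() or not name.endswith((".db", ".db-wal", ".db-shm")):
            continue  # leave folders, config files and piece data untouched
        # Map "bandwidth.db-wal" back to "bandwidth.db" for the keep check.
        base = name.split(".db")[0] + ".db"
        if base in keep:
            continue
        shutil.move(str(path), str(backup / name))
        moved.append(name)
    return moved
```

Moving instead of deleting keeps the option of trying a repair later; once the node has been running well on the recreated databases, the backup folder can be removed.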

That being said, after those few days it may already be over for it. Satellites now consider many files to be unhealthy and lots of them will be removed from your node anyway. The suspension score might drop enough to get you suspended, which blocks ingress for some time. Do the math on whether it’s worth keeping it running. If it’s a bigger node, I would at least try to resurrect it. A small one would be better to just kill and start anew.

sjb · May 22, 2023, 7:21pm

Many thanks for your help. I will try the above as a learning exercise; if I see it’s unsalvageable, I will start fresh.
Again, many thanks.