10% of my Windows nodes don't start after update 1.97.2

How do I revert to 1.96.6?

The nodes don't produce any logs, since they do not start.

The Windows event log just gives me this:

The Storj V3 Storage Node service terminated unexpectedly. It has done this 39 times (times). The following correction action will be taken in 60000 milliseconds: Restart the service.

Event ID: 7031
Level: Error
Source: Service Control Manager

“Event ID 7031 showing up in Event Viewer”

Thank you for posting your query at Microsoft community. If you are still facing the issue, I would like to inform you that Event ID 7031 usually pops up if we have any issues with any drivers.

https://answers.microsoft.com/en-us/windows/forum/all/event-id-7031-showing-up-in-event-viewer/189caf02-9d00-42e5-a0d9-108c8edb77b6

Does running storagenode help or storagenode info from the command line work?

To revert to an older release, download:

https://github.com/storj/storj/releases/download/v1.96.6/storagenode_windows_amd64.zip

and replace the storagenode.exe binary.
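The swap itself can be sketched in Python. This is only a rough illustration, not an official procedure: the helper name `replace_binary` is made up, it assumes the zip has already been downloaded from the URL above, that the archive contains a file named storagenode.exe, and that the service has been stopped first.

```python
import shutil
import zipfile
from pathlib import Path


def replace_binary(zip_path, install_dir, name="storagenode.exe"):
    """Swap the installed binary for the one inside the release zip.

    Hypothetical helper. Keeps the current binary as '<name>.bak' so the
    change can be undone. Assumes the service is stopped before the call.
    """
    install_dir = Path(install_dir)
    target = install_dir / name
    backup = install_dir / (name + ".bak")

    with zipfile.ZipFile(zip_path) as archive:
        # Find the binary wherever it sits inside the archive.
        members = [m for m in archive.namelist() if Path(m).name == name]
        if not members:
            raise FileNotFoundError(f"{name} not found in {zip_path}")
        if target.exists():
            shutil.copy2(target, backup)  # keep a copy of the current version
        with archive.open(members[0]) as src, open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return target
```

For example, `replace_binary(r"C:\Temp\storagenode_windows_amd64.zip", r"C:\Program Files\Storj\Storage Node")` would do the swap, where the download location is an assumption. If the automatic updater is still enabled, it may presumably upgrade the binary again.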

Also, which exact Windows version and release are you running on?

Thanks for your reply, Egon. Sorry, I just reverted to 1.96.6 and it works. I did not run the commands you asked for; I don't really have the time right now, as I'm going away for some days, so I just wanted to get my nodes up. But it seems I'm not the only one having this issue, so you should look deeper into what is causing it. Sorry for not being able to help any further.

Best regards

Yes, we are digging into it. Unfortunately, we don’t have any solid leads on what is happening yet.

Seeing whether storagenode help or storagenode info runs would help us eliminate some possible issues: primarily, whether some DLL is causing problems, or whether some init or flag-parsing code stopped working.

Hi Egon,

I understand that it is a nightmare to troubleshoot when you don't have any leads. I reinstalled 1.92.2 and got the following when running storagenode.exe info:

2024-02-28T19:28:24+01:00       INFO    Identity loaded.        {"process": "storagenode", "Node ID": "124SoUiQsaLjNR9kj8ZgePLw1bAAAMEhWWxNwB9RYiLcrqS19Dg"}
Error: error starting master database on storage node: CreateFile C:\Users\Administrator\AppData\Roaming\Storj\Storagenode\storage\blobs: The system cannot find the path specified.; CreateFile C:\Users\Administrator\AppData\Roaming\Storj\Storagenode\storage\temp: The system cannot find the path specified.; CreateFile C:\Users\Administrator\AppData\Roaming\Storj\Storagenode\storage\garbage: The system cannot find the path specified.; CreateFile C:\Users\Administrator\AppData\Roaming\Storj\Storagenode\storage\trash: The system cannot find the path specified.

It seems to look for files in Roaming, which seems a bit odd to me.

Hope it leads you in the right direction.

Best regards

Thanks, that shows the problem doesn’t seem to be an init or DLL issue.

For context, the “Error” here isn’t important, because the configuration flags weren’t present at the moment. I just wanted to know whether it gets to handling command-line flags.

Great,

Shouldn't it look in the storage folder instead of AppData\Roaming?

C:\Users\Administrator\AppData\Roaming\Storj\Storagenode\storage\blobs <-- is not my blobs folder, so of course it fails if it is looking at the wrong folder.

Ah, I see. Good luck with your troubleshooting, and please ping me when you think you have a fix, so we can re-enable the automatic updater and update to the fixed version.

Thanks in advance.

Btw, which Windows version are you running?

21H2 OS Build 20348.2322 Windows Server 2022 (Not working)
21H2 OS Build 20348.1850 Windows Server 2022 (Working)

Recently I had really bad network problems; it turned out the network/mainboard driver had not been updated after a Windows update.

There are literally different drivers for different Windows versions…

Do you happen to have any custom scripting or monitoring that calls the Storage Node APIs directly? If yes, what are you calling, and how?

PS: the only thing I have found so far is that calling the API “/payout-history/[date]” with an invalid [date] could cause a failure.

We finally managed to build version v1.97.3, which should be able to capture the startup failure.

If the failure happens before the service is fully operational, it will try to log to the Event Viewer. If that also fails, it writes the error to stdout, which will only be visible when starting the service from the command line as the administrator, with the same arguments as the service.
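The fallback order described here can be illustrated with a small Python sketch. It is not the actual storagenode code; `report_startup_failure` and the sink names are hypothetical and only show the "try the preferred log target, fall back to stdout" idea.

```python
import sys


def report_startup_failure(message, sinks, stream=sys.stdout):
    """Try each sink in order; fall back to `stream` if all of them fail.

    `sinks` is a list of (name, callable) pairs. Each callable takes the
    message and raises on failure. Returns the name of the sink that
    accepted the message, or "stdout" if the fallback stream was used.
    """
    for name, write in sinks:
        try:
            write(message)
            return name
        except Exception:
            continue  # this sink is unavailable, try the next one
    print(message, file=stream)
    return "stdout"
```

The point of such a chain is that a failure early in startup still ends up somewhere an operator can read it, even when the regular log file is not set up yet.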

Hopefully this will help us figure out what’s causing the issue.

@flwstern have you had a chance to test it, or does v1.97.3 happen to no longer cause the issue?

@Egon Sorry for the late reply. I have not tested 1.97.3 yet; I will try to do that this week. All my nodes are on 1.96.6.

Hello, on one node which did update I'm experiencing the same issue.

Which command do you want me to run from the prompt?

@Egon

C:\Program Files\Storj\Storage Node>storagenode run
2024-03-27T21:34:18+01:00       INFO    Anonymized tracing enabled      {"process": "storagenode"}
2024-03-27T21:34:18+01:00       WARN    Operator email address isn't specified. {"process": "storagenode"}
2024-03-27T21:34:18+01:00       ERROR   failure during run      {"process": "storagenode", "error": "Invalid configuration: operator wallet address isn't specified", "errorVerbose": "Invalid configuration: operator wallet address isn't specified\n\tmain.cmdRun:62\n\tmain.newRunCmd.func1:33\n\tstorj.io/common/process.cleanup.func1.4:393\n\tstorj.io/common/process.cleanup.func1:411\n\tgithub.com/spf13/cobra.(*Command).execute:983\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1115\n\tgithub.com/spf13/cobra.(*Command).Execute:1039\n\tstorj.io/common/process.ExecWithCustomOptions:112\n\tmain.main:31\n\truntime.main:267"}
Error: Invalid configuration: operator wallet address isn't specified

My bad, of course I get that error when I don't specify the config... :)

Issue resolved @Egon

It seems that in version 1.97.3, if the same config line is specified multiple times (by accident), the YAML config can't be parsed.

In my case it was:

C:\Users\Administrator>"C:\Program Files\Storj\Storage Node\storagenode.exe" run --config-dir "C:\Program Files\Storj\Storage Node\\"
Error: While parsing config: yaml: unmarshal errors:
  line 339: mapping key "storage2.monitor.verify-dir-writable-timeout" already defined at line 338

That was causing the issue. Thanks for the added logging output, well played @Egon.
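For anyone who wants to check a config for this before starting the node, here is a minimal Python sketch. It assumes the config is a flat list of top-level `key: value` lines, as in the error above, and it does not handle quoted keys or nested mappings. The function name `find_duplicate_keys` is made up for illustration.

```python
import re
from collections import defaultdict

# A top-level "key: value" line: no leading whitespace, not a comment.
KEY_LINE = re.compile(r"^([^\s#][^:]*?)\s*:(\s|$)")


def find_duplicate_keys(text):
    """Return {key: [line numbers]} for top-level keys defined more than once."""
    seen = defaultdict(list)
    for number, line in enumerate(text.splitlines(), start=1):
        match = KEY_LINE.match(line)
        if match:
            seen[match.group(1)].append(number)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}
```

Reading the config file from the config directory (assumed here to be config.yaml) and passing its contents to `find_duplicate_keys` returns every repeated key together with the line numbers where it appears; an empty result means no duplicates were found.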

let’s keep moving…
Amen!

This will not work. You need to either start the service from an elevated PowerShell:

Start-Service storagenode

or provide all mandatory options for the run command, e.g. --config-dir and --identity-dir, and also redirect the log output to stderr, e.g. --log.output=stderr.
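As a rough sketch of the second option, the argument list could be assembled like this in Python. Only the flags named above are used; `build_run_command` is a hypothetical helper, and the paths in the note below are placeholders that must match the service's real settings.

```python
import subprocess


def build_run_command(exe, config_dir, identity_dir, log_output="stderr"):
    """Build the argument list for running the node in the foreground."""
    return [
        exe,
        "run",
        "--config-dir", config_dir,
        "--identity-dir", identity_dir,
        f"--log.output={log_output}",
    ]


def run_in_foreground(exe, config_dir, identity_dir):
    """Run the node and return its exit code; log output goes to stderr."""
    command = build_run_command(exe, config_dir, identity_dir)
    return subprocess.run(command).returncode
```

Calling `run_in_foreground` with the full path to storagenode.exe, the config directory, and the identity directory (the identity location is an assumption and differs per installation) starts the node in the foreground from an elevated prompt. Passing the arguments as a list also sidesteps the quoting problem with a trailing backslash before a closing quote.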

It has always been like that. The YAML format requires that a mapping has no duplicate keys.