Windows node cannot start, stop, restart or reinstall. I will be disqualified :(

Hello StorJ

I’ve been running a Windows node for 18 months now without any issues, and it holds around 6TB.

Suddenly I can no longer start, restart or stop the storagenode process, or uninstall anything.

When I noticed this, I investigated and found a single issue: the log file had grown to 33GB. I could not open it in any of my installed programs, so I deleted it, hoping that would fix things.

As you can see, whenever I try to restart the program, it just stalls. I cannot access the webpage, and I cannot do anything with the Windows installation.

storagenode-updater.log reports “invalid configuration file” for every field I’ve entered, even though the values are fine and have been for a long time.

Running v1.69.2.

Autoupdater does not work.

As the Windows image is running in a VM, I took a snapshot of the VM, deleted as much of the old StorJ install as I could, and then tried installing again. The new install failed: the installer hung on “Calculating size requirements”. I’ve since reverted the VM.

I am without a doubt past 24 hours of downtime at this point, and I don’t know what to do :frowning:

This is fine. Don’t rush. Take your time. You have days to get your node back online.

  1. Restart the computer.
  2. Check the storagenode.log file. Was it actually deleted?
  3. Post the last 20 or so lines of the storagenode.log file.

Edit - just read that you tried to delete and reinstall. Don’t do that until you know it’s totally dead; the majority of issues can be fixed.
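
If the log is too big to open in an editor, its last lines can still be read without loading the whole file. Below is a minimal Python sketch, assuming the log is plain text; `tail_lines` is a name made up for this example, not part of any Storj tooling.

```python
import os

def tail_lines(path, count=20, chunk_size=64 * 1024):
    """Return the last `count` lines of a text file.

    The file is read backwards in chunks, so a multi-gigabyte log
    never has to be held in memory.
    """
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        data = b""
        # Keep prepending chunks until more than `count` newlines are
        # buffered, so the first line we keep is guaranteed complete.
        while position > 0 and data.count(b"\n") <= count:
            step = min(chunk_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    lines = data.splitlines()[-count:]
    return [line.decode("utf-8", errors="replace") for line in lines]
```

Calling `tail_lines(r"C:\Program Files\Storj\Storage Node\storagenode.log")` would return the last 20 lines as a list of strings; the path here is only a guess based on the config.yaml location mentioned in this thread.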

Go to Services, find the Storj V3 Storage Node service, change its startup type from Automatic to Disabled, and restart your PC.
After restarting, navigate to the Storj node folder and delete the log file.
Set the service back to Automatic and start it (the log file will be recreated automatically), then check the log for errors and post it here if needed.
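
Renaming an oversized log is a reversible alternative to deleting it. Below is a minimal Python sketch, assuming the service is already stopped (Windows may refuse to rename a file that another process still has open). `rotate_log` and the 1 GiB threshold are choices made for this example.

```python
import os
import time

def rotate_log(log_path, max_bytes=1024**3):
    """Rename the log out of the way if it is larger than `max_bytes`.

    Returns the new path, or None if the file is missing or small enough.
    """
    if not os.path.exists(log_path) or os.path.getsize(log_path) <= max_bytes:
        return None
    stamp = time.strftime("%Y%m%d-%H%M%S")
    rotated = f"{log_path}.{stamp}"
    # os.replace renames the file; the node creates a fresh log on start.
    os.replace(log_path, rotated)
    return rotated
```

The renamed file can be compressed or deleted later, once it is clear that nothing in it is still needed for troubleshooting.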

Hello @Stob, @vladro

Many thanks for your responses.

I’ve disabled the two services (Storj V3 Storage Node and Storj V3 Storage Node Updater) and rebooted. Below is an anonymized version of my entire log output. It is not very long.

It looks like I can only have 5 hours of downtime per month, so how come I have days to get the node up and running? Once again, thank you for helping me with troubleshooting.

2023-01-17T22:35:48.949+0100 INFO Configuration loaded {“Location”: “C:\Program Files\Storj\Storage Node\config.yaml”}
2023-01-17T22:35:48.954+0100 INFO Anonymized tracing enabled
2023-01-17T22:35:48.970+0100 INFO Operator email {“Address”: “[[EMAIL]]”}
2023-01-17T22:35:48.970+0100 INFO Operator wallet {“Address”: “[[PAYOUTADDR]]”}
2023-01-17T22:40:18.128+0100 INFO Stop/Shutdown request received.
2023-01-18T00:19:38.932+0100 INFO Configuration loaded {“Location”: “C:\Program Files\Storj\Storage Node\config.yaml”}
2023-01-18T00:19:38.948+0100 INFO Anonymized tracing enabled
2023-01-18T00:19:39.349+0100 INFO Operator email {“Address”: “[[EMAIL]]”}
2023-01-18T00:19:39.349+0100 INFO Operator wallet {“Address”: “[[PAYOUTADDR]]”}
2023-01-18T00:27:16.004+0100 INFO Got a signal from the OS: “terminated”
2023-01-18T00:27:26.214+0100 INFO Stop/Shutdown request received.
2023-01-18T00:32:13.207+0100 INFO Configuration loaded {“Location”: “C:\Program Files\Storj\Storage Node\config.yaml”}
2023-01-18T00:32:13.207+0100 INFO Anonymized tracing enabled
2023-01-18T00:32:13.218+0100 INFO Operator email {“Address”: “[[EMAIL]]”}
2023-01-18T00:32:13.218+0100 INFO Operator wallet {“Address”: “[[PAYOUTADDR]]”}
2023-01-18T21:13:36.618+0100 INFO Got a signal from the OS: “terminated”
2023-01-18T21:13:41.762+0100 INFO Stop/Shutdown request received.
2023-01-18T21:28:55.529+0100 INFO Configuration loaded {“Location”: “C:\Program Files\Storj\Storage Node\config.yaml”}
2023-01-18T21:28:55.539+0100 INFO Anonymized tracing enabled
2023-01-18T21:28:55.560+0100 INFO Operator email {“Address”: “[[EMAIL]]”}
2023-01-18T21:28:55.560+0100 INFO Operator wallet {“Address”: “[[PAYOUTADDR]]”}
2023-01-18T21:29:04.210+0100 INFO Stop/Shutdown request received.

“INFO Stop/Shutdown request received.” If it wasn’t you who stopped the Storj node, check the Windows event log for errors (maybe you don’t have enough disk space or memory, or a previous update was unsuccessful).
Right-click Start, open the event log and check for errors.
Check this time in particular: 2023-01-18T21:29:04.210+0100. Something is sending a stop request to the Storj service.
Also check that you have 500GB+ on the data disk, or make sure you have disabled that check in the configuration file.
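
The disk-space check can be scripted. Below is a minimal Python sketch that reports total and free space for a path and compares both with a 500 GB threshold. The threshold comes from the post above; whether the node compares it against total, free or allocated space is not stated here, so the sketch reports both. `disk_report` is a made-up name.

```python
import shutil

def disk_report(path, minimum_bytes=500 * 10**9):
    """Compare total and free space of the volume holding `path`
    against a minimum (500 GB by default, taken from the thread)."""
    usage = shutil.disk_usage(path)
    return {
        "total_gb": usage.total / 10**9,
        "free_gb": usage.free / 10**9,
        "total_ok": usage.total >= minimum_bytes,
        "free_ok": usage.free >= minimum_bytes,
    }
```

For the node in this thread, the call would be `disk_report("E:\\")`, since E: is the data drive.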

This is in the SNO terms and conditions and is a theoretical goal. Currently, enforcement only kicks in after 288 hours (12 days) of downtime in a sliding 30-day window.
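
The arithmetic behind those numbers, as a small Python sketch. The 12-day allowance and the 30-day window are taken from the post above; `hours_remaining` is a helper invented for this example.

```python
WINDOW_DAYS = 30            # sliding window, from the post above
ALLOWED_DOWNTIME_DAYS = 12  # enforcement threshold, from the post above

window_hours = WINDOW_DAYS * 24             # 30 * 24 = 720
allowed_hours = ALLOWED_DOWNTIME_DAYS * 24  # 12 * 24 = 288
allowed_share = allowed_hours / window_hours  # 288 / 720 = 0.4

def hours_remaining(hours_offline_in_window):
    """Hours of downtime left in the window before enforcement starts."""
    return max(0, allowed_hours - hours_offline_in_window)
```

With roughly 24 hours already lost, `hours_remaining(24)` gives 264 hours, which is 11 days of margin.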

@vladro,

The shutdown timestamps roughly match the times I tried running

Get-service Storagenode | Stop-service -force
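
To cross-check shutdown times against manual stop attempts, the log can be split into runs. Below is a minimal Python sketch, assuming each run starts with a "Configuration loaded" line and ends with a "Stop/Shutdown request received" line, as in the log posted above. `summarize_runs` is a made-up name.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"
START_MARKER = "Configuration loaded"
STOP_MARKER = "Stop/Shutdown request received"

def summarize_runs(lines):
    """Return a list of (started, stopped, seconds) tuples, one per run.

    `stopped` and `seconds` are None when the log shows no stop for a run.
    """
    runs = []
    started = None
    for line in lines:
        parts = line.split(maxsplit=1)
        if not parts:
            continue
        try:
            stamp = datetime.strptime(parts[0], TIME_FORMAT)
        except ValueError:
            continue  # not a timestamped log line
        if START_MARKER in line:
            if started is not None:
                runs.append((started, None, None))
            started = stamp
        elif STOP_MARKER in line and started is not None:
            runs.append((started, stamp, (stamp - started).total_seconds()))
            started = None
    if started is not None:
        runs.append((started, None, None))
    return runs
```

Applied to the log above, the last run (started 21:28:55.529, stop request at 21:29:04.210) comes out at about 8.7 seconds.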

I’ve found a few instances of interest.

"A corruption was discovered in the file system structure on volume E:"

Not good. It is an iSCSI Btrfs volume, so it should not be the data itself. I’ll try scanning in Windows.

I am getting some errors from yesterday:
“A timeout (30000 milliseconds) was reached while waiting for a transaction response from the storagenode service.”

“The storagenode-updater service depends on the Dnscache service which failed to start because of the following error:
The dependency service or group failed to start.”

And this one, all the way back from the 9th:
Fault bucket 1395407283512529538, type 5
Event Name: RADAR_PRE_LEAK_64
Response: Not available
Cab Id: 0

Problem signature:
P1: storagenode.exe
P2: 1.69.2.0
P3: 10.0.19044.2.0.0
P4:
P5:
P6:
P7:
P8:
P9:
P10:

Attached files:
\\?\C:\Users\StorJ\AppData\Local\Temp\RDR8AFC.tmp\empty.txt
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER8B1D.tmp.WERInternalMetadata.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER8B3D.tmp.xml
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER8B89.tmp.csv
\\?\C:\ProgramData\Microsoft\Windows\WER\Temp\WER8BC8.tmp.txt

These files may be available here:

Analysis symbol:
Rechecking for solution: 0
Report Id: 9318ae2b-a2aa-4731-9a08-8ae55bda83e9
Report Status: 268435456
Hashed bucket: 980166429c50584c735d7b87fac1d282
Cab Guid: 0

That is wonderful to hear - thank you very much. There might be hope after all :slight_smile:

Are you sure it’s a Btrfs volume attached directly to Windows, or is it a Btrfs volume exposed over iSCSI and formatted as NTFS? (It seems you are running the virtual machine in Synology VMM.)

Good point, you are right.

The storage volume behind the iSCSI target on my Synology box is formatted with Btrfs.

The Windows box has it formatted as plain old NTFS.

If you are running NTFS, enable “show hidden folders” under Folder Options. If you see found.xxx folders on your data disk, then data has been corrupted by a file system failure or a hardware issue.

Also, you shouldn’t worry much if you lose a bit of user data: you just take a small audit penalty. But if it happens again, you have to suspect a hardware issue with your HDD.
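
The found.xxx check can also be done without changing the Explorer settings. Below is a minimal Python sketch, assuming the folders follow the usual chkdsk naming (found.000, found.001, ...) and sit in the root of the volume. `find_chkdsk_folders` is a made-up name.

```python
import os
import re

# chkdsk names its recovery folders found.000, found.001, and so on.
FOUND_PATTERN = re.compile(r"^found\.\d{3}$", re.IGNORECASE)

def find_chkdsk_folders(volume_root):
    """Return the paths of found.NNN folders directly under `volume_root`.

    os.scandir lists hidden entries too, so no Explorer setting is needed.
    """
    matches = []
    with os.scandir(volume_root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False) and FOUND_PATTERN.match(entry.name):
                matches.append(entry.path)
    return sorted(matches)
```

For the data drive in this thread the call would be `find_chkdsk_folders("E:\\")`. An empty list means chkdsk left no recovered fragments in the root of that volume.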

I’m running an error scan as we speak, and will look for found.xxx folders when it has finished.

Let’s say there is some disk issue, Windows does its thing, and that’s all. Do you reckon the node will just start up again?

  1. How much physical memory do you have on your Synology?
  2. Is it the only virtual machine?
  3. How much memory is given to the Storj virtual machine?
  4. Is this virtual machine running only the Storj node, or something else too?

Anyway, check the SMART status of your HDDs on the Synology if you are NOT running RAID with redundancy (if it’s SHR or RAID 0).

Also check the Storj service in the Services tab: most likely you haven’t set the restart-after-failure options. Set them as shown in the attached screenshot.
[image: screenshot of the service’s restart-after-failure settings]

You were right, the restart options were not set as shown. I’ve corrected that.

Synology is at 2 GB Memory. These are the stats

It’s running alright. The array is SHR1 of four 8TB Ironwolf Pros.

I’m running two other nodes that I set up for testing (also Windows), both of which have filled up their 500 GB LUNs. All LUNs are thick, eagerly formatted. All VMs are running ONLY StorJ, and have had their Windows installation cut down by this script → GitHub - n1snt/Windows-Decrapifier: A script to debloat Windows.

All VMs run on local VM host SSD storage, and only their E: (StorJ) drives go over the network.

Latency does shoot up to 50ish ms at times, but I am not terribly worried about LUN performance.
[Here should have been a screenshot of LUN performance, with an average of 16 ms. I cannot post it, as I am a new user.]

The VMs are on two different hosts: the two test VMs each have 2 cores and 6GB vRAM on one host, while the primary, large node (the one we are troubleshooting) has 4 cores and 9GB vRAM on another VM host.

I am not seeing any errors on the E: drive, where StorJ data is. (installation is on C: drive)

I’ll try to re-enable the service and check if it comes up.

Additional info: here is a snippet of my storagenode-updater.log file:

It complains a tonne about “Invalid configuration file key”. I have never touched the configuration apart from updating the size of the node, so I doubt the configuration file is wrong.

2023-01-18T21:13:31.626+0100	INFO	Got a signal from the OS: "terminated"
2023-01-18T21:14:24.326+0100	INFO	Configuration loaded	{"Location": "C:\\Program Files\\Storj\\Storage Node\\config.yaml"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "storage.allocated-disk-space"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "nodestats.storage-sync"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "nodestats.reputation-sync"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "server.private-address"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "operator.email"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "operator.wallet"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "storage.allocated-bandwidth"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "contact.external-address"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "storage.path"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file key	{"Key": "server.address"}
2023-01-18T21:14:24.347+0100	INFO	Invalid configuration file value for key	{"Key": "log.stack"}
2023-01-18T21:14:24.473+0100	INFO	Anonymized tracing enabled
2023-01-18T21:14:24.559+0100	INFO	Running on version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T21:14:24.560+0100	INFO	Downloading versions.	{"Server Address": "https://version.storj.io"}
2023-01-18T21:14:28.122+0100	INFO	Current binary version	{"Service": "storagenode", "Version": "v1.69.2"}
2023-01-18T21:14:28.122+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode"}
2023-01-18T21:14:29.713+0100	INFO	Current binary version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T21:14:29.713+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode-updater"}
2023-01-18T21:29:24.572+0100	INFO	Downloading versions.	{"Server Address": "https://version.storj.io"}
2023-01-18T21:29:25.050+0100	INFO	Current binary version	{"Service": "storagenode", "Version": "v1.69.2"}
2023-01-18T21:29:25.051+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode"}
2023-01-18T21:29:26.067+0100	INFO	Current binary version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T21:29:26.067+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode-updater"}
2023-01-18T21:44:24.561+0100	INFO	Downloading versions.	{"Server Address": "https://version.storj.io"}
2023-01-18T21:44:25.283+0100	INFO	Current binary version	{"Service": "storagenode", "Version": "v1.69.2"}
2023-01-18T21:44:25.283+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode"}
2023-01-18T21:44:25.541+0100	INFO	Current binary version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T21:44:25.541+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode-updater"}
2023-01-18T21:58:25.480+0100	INFO	Got a signal from the OS: "terminated"
2023-01-18T23:05:36.319+0100	INFO	Configuration loaded	{"Location": "C:\\Program Files\\Storj\\Storage Node\\config.yaml"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "storage.path"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "nodestats.storage-sync"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "storage.allocated-disk-space"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "server.private-address"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "contact.external-address"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "nodestats.reputation-sync"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "storage.allocated-bandwidth"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "operator.email"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "server.address"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file key	{"Key": "operator.wallet"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file value for key	{"Key": "log.output"}
2023-01-18T23:05:36.321+0100	INFO	Invalid configuration file value for key	{"Key": "log.stack"}
2023-01-18T23:05:36.332+0100	INFO	Anonymized tracing enabled
2023-01-18T23:05:36.341+0100	INFO	Running on version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T23:05:36.341+0100	INFO	Downloading versions.	{"Server Address": "https://version.storj.io"}
2023-01-18T23:05:37.157+0100	INFO	Current binary version	{"Service": "storagenode", "Version": "v1.69.2"}
2023-01-18T23:05:37.157+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode"}
2023-01-18T23:05:37.251+0100	INFO	Current binary version	{"Service": "storagenode-updater", "Version": "v1.69.2"}
2023-01-18T23:05:37.251+0100	INFO	New version is being rolled out but hasn't made it to this node yet	{"Service": "storagenode-updater"}
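
Every line in the snippet above is logged at INFO level; none is WARN or ERROR. A plausible reading, offered as an assumption rather than a confirmed fact, is that the updater reads the same config.yaml as the storagenode and simply does not recognise the storagenode's keys. Below is a minimal Python sketch for checking the levels and listing the flagged keys, assuming the tab-separated layout shown above. `summarize_updater_log` is a made-up name.

```python
import json
from collections import Counter

def summarize_updater_log(lines):
    """Count log levels and collect the distinct keys flagged as invalid.

    Expects tab-separated fields: timestamp, level, message, JSON payload.
    """
    levels = Counter()
    flagged_keys = set()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # not a structured log line
        level, message = fields[1], fields[2]
        levels[level] += 1
        if message.startswith("Invalid configuration file") and len(fields) > 3:
            try:
                flagged_keys.add(json.loads(fields[3])["Key"])
            except (ValueError, KeyError, TypeError):
                pass  # payload missing or not in the expected shape
    return levels, sorted(flagged_keys)
```

If the returned counter contains only `INFO`, the updater log on its own gives no error to chase, and the storagenode log and the Windows event log are the better places to look.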

I am back where I started.

Here is a screenshot of the Powershell output of the situation

Did you rename or delete the log file before starting the storagenode service?
Could your Synology run Docker? If so, I would recommend running the node there instead of using storage over iSCSI from a different host.

Usually, being unable to stop the service means that the host (VM?) has hardware issues. In this case it could be storage that is too slow, or problems with RAM.
If you checked the filesystem from Windows and it fixed an issue, the service should start normally and stop without any problems.
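
A rough way to test the "too slow storage" theory is to time small synchronous writes on the data drive. Below is a minimal Python sketch, not a proper benchmark: it writes a 4 KiB file, forces it to disk and repeats. `fsync_latency_ms`, the probe file name and the sample count are all choices made for this example.

```python
import os
import statistics
import time

def fsync_latency_ms(directory, samples=20, size=4096):
    """Return (median, worst) latency in milliseconds for writing
    `size` bytes into `directory` and flushing them to disk."""
    payload = os.urandom(size)
    probe = os.path.join(directory, "latency-probe.tmp")
    timings = []
    try:
        for _ in range(samples):
            start = time.perf_counter()
            with open(probe, "wb") as handle:
                handle.write(payload)
                handle.flush()
                os.fsync(handle.fileno())  # force the write through the OS cache
            timings.append((time.perf_counter() - start) * 1000)
    finally:
        if os.path.exists(probe):
            os.remove(probe)
    return statistics.median(timings), max(timings)
```

Running it against the iSCSI-backed drive and against the local system drive gives two numbers to compare. A large gap between them would point towards the storage path rather than the node software.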