Can't start up again after update (Windows GUI)

Hey there

I restarted the box again, Storj did an update, and now I get this error:

2020-12-16T22:27:10.449+0100 INFO Configuration loaded {“Location”: “C:\Program Files\Storj\Storage Node\config.yaml”}
2020-12-16T22:27:10.470+0100 INFO Operator email {“Address”: “my.mail”}
2020-12-16T22:27:10.470+0100 INFO Operator wallet {“Address”: “myadress”}
2020-12-16T22:27:10.991+0100 INFO Telemetry enabled {“instance ID”: “my instance”}
2020-12-16T22:27:11.034+0100 INFO db.migration Database Version {“version”: 46}
2020-12-16T22:27:11.535+0100 INFO preflight:localtime start checking local system clock with trusted satellites’ system clock.
2020-12-16T22:27:12.345+0100 INFO preflight:localtime local system clock is in sync with trusted satellites’ system clock.
2020-12-16T22:27:12.345+0100 INFO bandwidth Performing bandwidth usage rollups
2020-12-16T22:27:12.345+0100 INFO Node nodeid started
2020-12-16T22:27:12.345+0100 INFO Public server started on [::]:28967
2020-12-16T22:27:12.345+0100 INFO Private server started on 127.0.0.1:7778
2020-12-16T22:27:12.346+0100 INFO trust Scheduling next refresh {“after”: “6h4m53.66353063s”}
2020-12-16T22:27:12.354+0100 ERROR services unexpected shutdown of a runner {“name”: “piecestore:monitor”, “error”: “piecestore monitor: error verifying location and/or readability of storage directory: open D:\storage-dir-verification: Das System kann die angegebene Datei nicht finden.”, “errorVerbose”: “piecestore monitor: error verifying location and/or readability of storage directory: open D:\storage-dir-verification: Das System kann die angegebene Datei nicht finden.\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1.1:121\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1:118\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.355+0100 ERROR contact:service ping satellite failed {“Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “attempts”: 1, “error”: “ping satellite error: rpc: dial tcp: operation was canceled”, “errorVerbose”: “ping satellite error: rpc: dial tcp: operation was canceled\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:108\n\tstorj.io/common/rpc.TCPConnector.DialContext:72\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:175\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:96\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:87\n\tstorj.io/common/rpc.Dialer.dialPool:141\n\tstorj.io/common/rpc.Dialer.DialNodeURL:95\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.355+0100 ERROR contact:service ping satellite failed {“Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “attempts”: 1, “error”: “ping satellite error: rpc: dial tcp: operation was canceled”, “errorVerbose”: “ping satellite error: rpc: dial tcp: operation was canceled\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:108\n\tstorj.io/common/rpc.TCPConnector.DialContext:72\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:175\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:96\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:87\n\tstorj.io/common/rpc.Dialer.dialPool:141\n\tstorj.io/common/rpc.Dialer.DialNodeURL:95\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.355+0100 INFO contact:service context cancelled {“Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”}
2020-12-16T22:27:12.355+0100 ERROR contact:service ping satellite failed {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “attempts”: 1, “error”: “ping satellite error: rpc: dial tcp: operation was canceled”, “errorVerbose”: “ping satellite error: rpc: dial tcp: operation was canceled\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:108\n\tstorj.io/common/rpc.TCPConnector.DialContext:72\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:175\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:96\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:87\n\tstorj.io/common/rpc.Dialer.dialPool:141\n\tstorj.io/common/rpc.Dialer.DialNodeURL:95\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.356+0100 ERROR contact:service ping satellite failed {“Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “attempts”: 1, “error”: “ping satellite error: rpc: dial tcp: operation was canceled”, “errorVerbose”: “ping satellite error: rpc: dial tcp: operation was canceled\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:108\n\tstorj.io/common/rpc.TCPConnector.DialContext:72\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:175\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:96\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:87\n\tstorj.io/common/rpc.Dialer.dialPool:141\n\tstorj.io/common/rpc.Dialer.DialNodeURL:95\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.356+0100 INFO contact:service context cancelled {“Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”}
2020-12-16T22:27:12.356+0100 ERROR nodestats:cache Get pricing-model/join date failed {“error”: “context canceled”}
2020-12-16T22:27:12.356+0100 INFO contact:service context cancelled {“Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”}
2020-12-16T22:27:12.355+0100 INFO contact:service context cancelled {“Satellite ID”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”}
2020-12-16T22:27:12.355+0100 ERROR contact:service ping satellite failed {“Satellite ID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”, “attempts”: 1, “error”: “ping satellite error: rpc: dial tcp: operation was canceled”, “errorVerbose”: “ping satellite error: rpc: dial tcp: operation was canceled\n\tstorj.io/common/rpc.TCPConnector.DialContextUnencrypted:108\n\tstorj.io/common/rpc.TCPConnector.DialContext:72\n\tstorj.io/common/rpc.Dialer.dialEncryptedConn:175\n\tstorj.io/common/rpc.Dialer.DialNodeURL.func1:96\n\tstorj.io/common/rpc/rpcpool.(*Pool).Get:87\n\tstorj.io/common/rpc.Dialer.dialPool:141\n\tstorj.io/common/rpc.Dialer.DialNodeURL:95\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:124\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.357+0100 INFO contact:service context cancelled {“Satellite ID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2020-12-16T22:27:12.357+0100 ERROR pieces:trash emptying trash failed {“error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:150\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:309\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:359\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.358+0100 ERROR pieces:trash emptying trash failed {“error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:150\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:309\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:359\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.360+0100 ERROR pieces:trash emptying trash failed {“error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:150\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:309\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:359\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.361+0100 ERROR pieces:trash emptying trash failed {“error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:150\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:309\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:359\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.362+0100 ERROR pieces:trash emptying trash failed {“error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:150\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:309\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:359\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:12.367+0100 ERROR piecestore:cache error getting current used space: {“error”: “context canceled; context canceled; context canceled; context canceled; context canceled”, “errorVerbose”: “group:\n— context canceled\n— context canceled\n— context canceled\n— context canceled\n— context canceled”}
2020-12-16T22:27:12.373+0100 ERROR bandwidth Could not rollup bandwidth usage {“error”: “sql: transaction has already been committed or rolled back”}
2020-12-16T22:27:12.858+0100 ERROR servers unexpected shutdown of a runner {“name”: “debug”, “error”: “debug: http: Server closed”, “errorVerbose”: “debug: http: Server closed\n\tstorj.io/private/debug.(*Server).Run.func2:108\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2020-12-16T22:27:14.565+0100 FATAL Unrecoverable error {“error”: “piecestore monitor: error verifying location and/or readability of storage directory: open D:\storage-dir-verification: Das System kann die angegebene Datei nicht finden.”, “errorVerbose”: “piecestore monitor: error verifying location and/or readability of storage directory: open D:\storage-dir-verification: Das System kann die angegebene Datei nicht finden.\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1.1:121\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/storj/storagenode/monitor.(*Service).Run.func1:118\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}

No idea what it is this time. I don't want to lose the 6-month-old node with 2 TB on it. Does anyone know how to fix it? (The German part of the error means "The system cannot find the file specified.")

The node can't find the verification file that is in place to prevent the node from starting without access to the data files. Either your drive letter has changed, the drive is not available, or the drive has become read-only, so I would start there and see what's going on with the disk.
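If it helps, here is a rough way to check those three possibilities from an elevated PowerShell prompt (just a sketch, assuming the data drive is D: as in your log):

# Is the volume mounted and healthy?
Get-Volume -DriveLetter D

# Is the drive writable? (this fails if the disk has become read-only or permissions are broken)
New-Item -Path "D:\write-test.tmp" -ItemType File
Remove-Item -Path "D:\write-test.tmp"

# Does the file the error complains about actually exist?
Test-Path "D:\storage-dir-verification"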

The disk is there and I can see all the files, except that one.

I already had the statistics SQL database damaged a month ago…

Can I rebuild this?

You can see here that everything is there :frowning:

It is strange that the file is gone. I would wait for a Storjling to jump in and see if they can guide you on how to rebuild this file without damaging the node. I believe it contains unique info, so I am unsure how to create it manually.

I hope it won't be disqualified tomorrow morning :frowning: It would make me sad to lose the work of 6 months :frowning:
Last time the bandwidth.db was suddenly gone as well, also after an update :frowning:

You can be offline for a couple of days; it shouldn't disqualify your node. It's going to decrease your online score, which should recover 30 days later.

Why is the content of the storage directory directly on your disk root? :thinking:
My tree does start one level above that (db-shm & db-wal files omitted):

Storage directory targeted by my Node (could be the disk root; it is a subfolder in my case)
├── config.yaml
├── orders/
│   ├──...
├── revocations.db
├── storage/
│   ├── bandwidth.db
│   ├── blobs/
│   ├── garbage/
│   ├── heldamount.db
│   ├── info.db
│   ├── notifications.db
│   ├── orders.db
│   ├── piece_expiration.db
│   ├── pieceinfo.db
│   ├── piece_spaced_used.db
│   ├── pricing.db
│   ├── reputation.db
│   ├── satellites.db
│   ├── secret.db
│   ├── storage-dir-verification
│   ├── storage_usage.db
│   ├── temp/
│   ├── trash/
│   ├── used_serial.db
└── trust-cache.json

Are things different between Linux and Windows? :thinking:

In Windows you can put Storj directly in a drive root. D:\ isn't the main system drive (that would be C:\); it's most likely a separate disk assigned only to Storj.

I get that (besides, we can on Linux too, it just wasn't good practice in the past), but then where are config.yaml, orders/, revocations.db…?

Those files and folders would be in the C:\Program Files\Storj\Storage Node\ directory if it’s a standard install for a single node.
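For example, listing that folder should show them (assuming the default install path from the log above):

Get-ChildItem "C:\Program Files\Storj\Storage Node\"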

There's no disqualification for downtime at present, so you shouldn't worry about being offline.

There should be a process to recreate the storage-dir-verification file; however, that is beyond my knowledge. This thread references a similar problem - Error starting 2nd node on windows - but it's still in progress.

It should start right back up if you point it to the right location; I tested it myself to make sure.

It is its own disk, just for Storj. What do you mean by the right location?
In the config I have:
# path to store data in
storage.path: D:\

I did not change the location - it has worked there for almost 6 months.
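For reference, the configured path can be double-checked from PowerShell (a sketch, assuming the default install location):

Select-String -Path "C:\Program Files\Storj\Storage Node\config.yaml" -Pattern "storage.path"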

I don't see the file. But a weird thing I do see is the following:

This file has exactly the reboot time and is on C:\ (the system root).
If I open it, this is what's in there:
(screenshot of the file contents)

Storj must have created it, don't ask me why or how…

As far as I can tell, somehow that storage-dir-verification file is missing…
and it looks like the new update messed something up with it.

Help from a mod or someone would be appreciated.

Well, I tested uninstalling the GUI storagenode and reinstalling it, and deleting the storage verification file, and the storagenode was still able to start; it just created the file again.

You shouldn't have any files in C:\, especially in the root of the drive. Unless you run everything as administrator, Windows won't let you create files there.

Are you running 2 nodes on the same machine?

I know that there shouldn't be files in there. That is exactly why I wondered, since it was there! It was Storj that created it. Something must be wrong with spaces in directory paths.

Even stranger is the following: in AppData I found a Storj folder where the storage-dir-verification file was present. I copied it over to my D:\ drive and guess what: it starts again.
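Roughly what that amounted to, in case someone hits the same thing (a sketch; the exact AppData subfolder is a guess, adjust it to wherever you actually find the file, and the service name assumes the default GUI install):

# stop the node before touching the data drive
Stop-Service storagenode
# NOTE: the AppData subfolder below is a placeholder; use the folder where you actually found the file
Copy-Item "$env:APPDATA\Storj\storage-dir-verification" -Destination "D:\storage-dir-verification"
# start the node again
Start-Service storagenode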

Clearly Storj is messing something up. It creates files where there shouldn't be any, and it uses an AppData dir that is not set anywhere. My log file was 3 GB, by the way…

It runs again now without any changes, except for copy-pasting that file. And thanks to that, I now have this on one satellite:
(screenshot of the online score on one satellite)

I was at 99% or better on all satellites for the online score. I guess the others will drop too now :frowning:

By the way, to answer your question: no, only 1 node.

This is very interesting; it shouldn't create a random file in a location where it shouldn't be… It would only write a file there if that is set in the config. Are you running it through the storagenode.exe file or through the setup file? Because if you don't correct this, the next update could very well do the same thing.

I am running it as it was installed. I start it with the service in Task Manager: I right-click and choose "Start service".

It tries to start, fails, and stays stopped, so I start it manually. That should run it exactly as it was installed by the GUI.
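The same thing can also be done from PowerShell, for reference (assuming the default service name and log location of the GUI install):

Get-Service storagenode      # check whether the service is running or stopped
Start-Service storagenode    # same as "Start service" in Task Manager
Get-Content "C:\Program Files\Storj\Storage Node\storagenode.log" -Tail 20    # see the last log lines if it fails again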

Oh OK, I was trying to look through the config but I couldn't find anything. But like I said, I was able to uninstall the storagenode GUI and reinstall it. I also deleted the storage-dir-verification file just to see if the node would start, and it did.

So something is weird. I hope an official will read through this.
It is annoying to have missing-file problems every couple of months. The HDD is more than fine; there are other programs running on that machine that work flawlessly. Nothing just disappears suddenly.

Maybe I will have to do a clean install, but for a tool that asks me for 97% uptime, there shouldn't be errors like that out of nowhere when I haven't changed anything on my side.

The longer a node runs, the more it hurts.