One of my Docker nodes has gone offline for a strange reason

Aka985 · April 19, 2023, 4:43pm

Hi guys! One of my old nodes, running under Docker on Windows, has gone offline and I can’t start it. Another 3 nodes running under Docker on the same machine and 1 node running under Windows are working fine. The last 20 lines of the log are below. What could be the reason?

C:\Users\Storj D1>docker logs --tail 20 storagenodeD1.1
github.com/spf13/cobra.(*Command).ExecuteC:960
github.com/spf13/cobra.(*Command).Execute:897
storj.io/private/process.ExecWithCustomConfigAndLogger:92
main.main:478
runtime.main:250
2023-04-19 09:28:50,065 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-19 09:28:50,065 INFO success: storagenode entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-19 09:28:50,066 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-19 09:28:50,071 INFO exited: storagenode (exit status 1; not expected)
2023-04-19T09:28:50.392Z INFO Current binary version {"Process": "storagenode-updater", "Service": "storagenode", "Version": "v1.76.2"}
2023-04-19T09:28:50.392Z INFO Version is up to date {"Process": "storagenode-updater", "Service": "storagenode"}
2023-04-19 09:28:50,398 INFO spawned: 'storagenode' with pid 36
2023-04-19 09:28:50,399 WARN received SIGQUIT indicating exit request
2023-04-19 09:28:50,400 INFO waiting for storagenode, processes-exit-eventlistener, storagenode-updater to die
2023-04-19T09:28:50.400Z INFO Got a signal from the OS: "terminated" {"Process": "storagenode-updater"}
2023-04-19T09:28:50.452Z INFO Current binary version {"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.76.2"}
2023-04-19T09:28:50.452Z INFO Version is up to date {"Process": "storagenode-updater", "Service": "storagenode-updater"}
2023-04-19 09:28:50,467 INFO stopped: storagenode-updater (exit status 0)
2023-04-19 09:28:50,472 INFO stopped: storagenode (terminated by SIGTERM)
2023-04-19 09:28:50,474 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

Regards,
Alexander

moby · April 19, 2023, 9:08pm

Hey @Aka985!

I’m not 100% sure what the issue could be based on the error you shared. My first thought was that maybe your db or disk is corrupted, but this seems less likely if you have other nodes set up on the same machine in the same manner that are working fine.

One thing I did notice is that this part of your logs:

github.com/spf13/cobra.(*Command).ExecuteC:960
github.com/spf13/cobra.(*Command).Execute:897
storj.io/private/process.ExecWithCustomConfigAndLogger:92
main.main:478
runtime.main:250

Looks like it is the tail-end of an error, which is not fully visible if you only include 20 lines. Would you be willing to share more of your logs? If possible, it would be nice to see the logs since startup, but even seeing a little further up from what you shared might be helpful.

If it’s too much to share in a comment on this thread (or you don’t want to for any reason), feel free to send it to me at moby@storj.io - hopefully I will be able to help more then.

Also, if you are having trouble starting the node anyway, you may as well enable debug logs with the flag --log.level=debug, just in case that provides more information.
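
Since the stack trace quoted above is only the tail of a longer error, one way to find the full message in a larger log window is to filter for the error line plus some surrounding context. This is a minimal sketch, assuming a POSIX shell whose grep supports -B and -A (in PowerShell, Select-String -Context plays a similar role). The function name show_errors and the window sizes are made up for illustration and are not part of any Storj tooling:

```shell
# show_errors: read log text on stdin and print every line containing
# "Error" or "ERROR", with 2 lines of context before and 15 after it.
show_errors() {
    grep -B 2 -A 15 -e 'Error' -e 'ERROR'
}

# Usage (container name taken from the first post; docker logs writes
# to both stdout and stderr, hence 2>&1):
#   docker logs --tail 200 storagenodeD1.1 2>&1 | show_errors
```

Raising the --tail value widens the window if the start of the error is still cut off.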

Aka985 · April 21, 2023, 8:37am

Hello, I have sent data to your mail. Please help!

Regards,
Alexander

Alexey · April 21, 2023, 4:09pm

storagenodeD1.1

Error: Error starting master database on storagenode: database: bandwidth opening file "config/storage/bandwidth.db" failed: file is not a database

You need to recreate this database, using this guide:

storagenodeD1.2:
no problems in logs
storagenodeD1.3

"error": "bandwidthdb: database disk image is malformed",

or re-create it using the first guide, but the history of bandwidth usage will be lost

Aka985 · April 23, 2023, 2:14pm

Hello, after step 5.2.5.2 I see this message:

Are there any errors?

Should I go on to step 7 in the manual?

Regards,
Alexander

Aka985 · April 23, 2023, 2:34pm

Also, this is not clear to me:

After I run the command
docker run --rm -it --mount type=bind,source=F:\StorjD1.3\storage,destination=/storage sstc/sqlite3 sh
I see this picture and nothing happens:

Is it fine? Should I go on to step 9?

Alexey · April 24, 2023, 1:39am

Yes, you need to fix the bandwidth.db database using this instruction. Alternatively, you may decide to re-create it instead (you will lose the bandwidth usage history though, so the dashboard will show wrong figures).

Yes, it’s fine. The purpose of this command is to give you a shell with access to the database files.

Aka985 · April 24, 2023, 11:13am

Hi,
I am stuck with this:

Or in PowerShell:

Could you please help?

Alexey · April 24, 2023, 6:39pm

If you want to use a native sqlite3.exe, make sure that you downloaded the correct version for your Windows (if you have 64-bit Windows, make sure to download it from the SQLite Download Page).
As far as I know, they distribute this utility as an archive. Did you unpack it?

In cmd.exe you need to run it without a leading ./ or .\. In PowerShell you need to put ./ or .\ before the command (if you are running it from the location where you unpacked the binary).

Aka985 · April 25, 2023, 9:37am

Thanks Alexey. The version was correct. I replaced the previous sqlite files with new ones and it started working.

Please check this screenshot. It looks like something is not OK. Do I need to exit sqlite3.exe before running this command?

Get-Content dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 dump_all_notrans.sql

Regards,
Alexander

Alexey · April 26, 2023, 3:00am

You need to run the commands outside of the sqlite3.exe session, in your PowerShell.
The sqlite3.exe binary is used to manipulate databases, not files.
I thought the instructions were clear, but it seems not.

Exit from the empty sqlite3.exe session (you usually should not open it without any arguments):

;
.exit

In the PowerShell session:

cp F:\StorjD1.3\bandwidth.db F:\StorjD1.3\bandwidth.db.bak

Now run sqlite3.exe with the path to the database; it will open an sqlite3 session:

sqlite3.exe F:\StorjD1.3\bandwidth.db

Dump all uncorrupted data from the opened database to a file (from the sqlite3 session):

.mode insert
.output F:\StorjD1.3\dump_all.sql
.dump
.exit

Now, in your PowerShell, prepare the data for loading:

Get-Content F:\StorjD1.3\dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 F:\StorjD1.3\dump_all_notrans.sql

In PowerShell, remove the corrupted database:

rm F:\StorjD1.3\bandwidth.db

Create a new bandwidth.db database with sqlite3 from PowerShell and load the prepared data:

sqlite3.exe F:\StorjD1.3\bandwidth.db ".read F:\StorjD1.3\dump_all_notrans.sql"

Check that the size of the database is not zero (PowerShell):

ls F:\StorjD1.3\bandwidth.db

If the size is not zero, you may try to start your node again from an elevated (as an Administrator) PowerShell:

Start-Service storagenode

Then check your logs. If the node is running, check your dashboard.

If the database has zero size, then you need to re-create it:

Aka985 · April 26, 2023, 1:52pm

Hello Alexey, I am slowly making progress :) Node StorjD1.3 is now generating a new bandwidth.db, but it goes very slowly, about 2 MB/hour, so it will take about 20 hours. The volume of Storj data is about 200 GB. Is that normal? When I did this last November, it took 2-3 hours with 900 GB of Storj data on a very old laptop.

Meanwhile, I started to recover another node on another machine and ran into a problem with step 5.1.

I tried to run the command from 3 locations, but the result is not OK.

The command I typed is:
Get-ChildItem F:\StorjD4.3\storage*.db -File | %{$.Name + " " + $(C:\sqlite\sqlite3.exe $.FullName “PRAGMA integrity_check;”)}

The location of the file is:

I also have a problem with step 5.2.4.

The file bandwidth.db has been deleted for an unknown reason. Is this critical? Can I skip this step? If I understood correctly, this file will be re-created by the later commands.

Regards,
Alexander

Alexey · April 27, 2023, 4:42am

On an HDD, yes. If you could use docker with a temporary RAM disk, it would be faster, but you need twice as much RAM as the size of the database being restored.
See

but you use

instead of

Get-ChildItem F:\Storj4.3\storage\

The same problem here:

the path F:\StorjD4.3\bandwidth.db doesn’t exist, because the actual path is F:\Storj4.3\storage\bandwidth.db

You may use the TAB key to complete a path, i.e. start with f:\Sto and press TAB on your keyboard; it should either complete the path or give you a suggestion. Or you may just copy the path from Explorer and paste it into the terminal.

Aka985 · April 27, 2023, 4:48pm

Hello Alexey, it’s hard, but I will keep going.

I wrote a short cheat sheet for myself for the restore procedure. Could you check whether the sequence is correct:

1. Stop the container
2. Back up the database file F:\Storj4.3\storage\bandwidth.db
3. Install sqlite3 v3.25.2 or later
Create the folder C:\sqlite
How To Download & Install SQLite Tools
Download Precompiled Binaries for Windows sqlite-dll-win64-x64-3410200.zip and sqlite-tools-win32-x86-3410200.zip and copy them into it
4. Go to C:\sqlite
cd c:\sqlite
.\sqlite3.exe
5. Check the database for errors
.\sqlite3.exe F:\Storj4.3\bandwidth.db “PRAGMA integrity_check;”
Response: Ok
6. Open PowerShell and run
Get-ChildItem F:\Storj4.3\storage*.db -File | %{$.Name + " " + $(C:\sqlite\sqlite3.exe $.FullName “PRAGMA integrity_check;”)}
Result: a diagonal matrix and, below it, all .db files ok
7. In PowerShell at C:\sqlite> run
docker run --rm -it --mount type=bind,source=F:\Storj4.3\storage,destination=/storage sstc/sqlite3 sh
The result is the prompt /data #. Leave it with exit
8. In PowerShell at C:\sqlite> run (creating a new database file)
cp F:/Storj4.3/bandwidth.db F:/Storj4.3/bandwidth.db.bak
Then run ./sqlite3.exe F:/Storj4.3/bandwidth.db
Result: SQLite version…
Enter “.help”…
sqlite>

9. In SQL run the following commands
sqlite>
.mode insert
.output F:\Storj4.3\dump_all.sql
.dump
.exit
Nothing is printed on screen, but a file Storj4.3dump_all.sql appeared in the root of the drive

10. Get-Content F:\Storj4.3dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 F:\dump_all_notrans.sql
Nothing is printed on screen, but a file dump_all_notrans.sql appeared in the root of the drive

11. Remove the corrupted database: rm F:\Storj4.3/bandwidth.db
12. Run .\sqlite3.exe F:\Storj4.3\bandwidth.db “.read F:\dump_all_notrans.sql”

And now I have a problem with step 12:
The file is here:

The database created in step 8 is here:

I do not understand why it has to be removed in step 11.

When I try to do step 12, I do not see a blinking cursor and nothing happens.

Could you kindly reply to me at least twice a day? E.g. in the morning at 7:30 (Moscow time), as you normally do, and again around 14:00 (I will try to move forward in between and ask if something comes up).

heunland · April 27, 2023, 5:10pm

We are sorry but Alexey can only respond during his normal working hours. Thanks for your understanding.

Aka985 · April 27, 2023, 7:22pm

Ok, what time is suitable for second session?
Regards,
Alexander

Alexey · April 28, 2023, 4:01am

This is an unnecessary step and you should not do it: you will enter an sqlite3 session without any database opened and will have to quit anyway, because you should perform the backup procedure first, this one:

if this command returns OK,

then you do not need to do anything with that database. It also contradicts the next step, where you check all databases, not only this one.

and again, if everything is OK you should not proceed further, because there is nothing to repair. By the way, I fixed your command, it should be:

Get-ChildItem F:\Storj4.3\storage\*.db -File | %{$_.Name + " " + $(C:\sqlite\sqlite3.exe $_.FullName "PRAGMA integrity_check;")}

You do not need the docker version of sqlite3 if you installed it locally, and vice versa. It is one way or the other, not both.

so the correct sequence using the local installation of sqlite3.exe is

Install sqlite3.exe

Stop the node.
For the Windows GUI node (elevated PowerShell):

Stop-Service storagenode

If you use docker (regular PowerShell):

docker stop storagenode

Check all databases

Get-ChildItem F:\Storj4.3\storage\*.db -File | %{$_.Name + " " + $(C:\sqlite\sqlite3.exe $_.FullName "PRAGMA integrity_check;")}

For a database that is not OK:

Alexey:

In the PowerShell session:
cp F:\StorjD1.3\bandwidth.db F:\StorjD1.3\bandwidth.db.bak
Now run sqlite3.exe with the path to the database; it will open an sqlite3 session:
sqlite3.exe F:\StorjD1.3\bandwidth.db
Dump all uncorrupted data from the opened database to a file (from the sqlite3 session):
.mode insert
.output F:\StorjD1.3\dump_all.sql
.dump
.exit
Now, in your PowerShell, prepare the data for loading:
Get-Content F:\StorjD1.3\dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 F:\StorjD1.3\dump_all_notrans.sql
In PowerShell, remove the corrupted database:
rm F:\StorjD1.3\bandwidth.db
Create a new bandwidth.db database with sqlite3 from PowerShell and load the prepared data:
sqlite3.exe F:\StorjD1.3\bandwidth.db ".read F:\StorjD1.3\dump_all_notrans.sql"
Check that the size of the database is not zero (PowerShell):
ls F:\StorjD1.3\bandwidth.db
If the size is not zero, you may try to start your node again from an elevated (as an Administrator) PowerShell:
Start-Service storagenode
or, for docker (regular PowerShell), use your full docker run command with all your parameters.

Then check your logs. If the node is running, check your dashboard.

If the database has zero size, then you need to re-create it:

https://support.storj.io/hc/en-us/articles/4403032417044-How-to-fix-database-file-is-not-a-database-error

If you do not want to install sqlite3.exe but you have docker, then you can use

and follow the instructions as for Linux

Backup

cp /storage/bandwidth.db /storage/bandwidth.db.bak

open corrupted database

sqlite3 /storage/bandwidth.db

unload data

.mode insert
.output /storage/dump_all.sql
.dump
.exit

Prepare data for load into a new database

cat /storage/dump_all.sql | grep -v TRANSACTION | grep -v ROLLBACK | grep -v COMMIT >/storage/dump_all_notrans.sql

Remove the corrupted database (make sure that you have a backup!)

rm /storage/bandwidth.db

Now we will load the unloaded data into the new database

sqlite3 /storage/bandwidth.db ".read /storage/dump_all_notrans.sql"

Check that the new database (bandwidth.db in our example) has a size larger than 0:

ls -l /storage/bandwidth.db
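
The filtering step and the size checks in the Linux variant can be sketched as two small helper functions for the shell inside the docker sqlite3 container. The function names, and the idea of refusing to delete the original database when the filtered dump is empty, are assumptions added for illustration; they are not part of the official guide:

```shell
# strip_transactions IN OUT: copy IN to OUT, dropping every line that
# mentions TRANSACTION, ROLLBACK or COMMIT (the same filter as the
# three chained "grep -v" calls in the steps above).
strip_transactions() {
    grep -v -e TRANSACTION -e ROLLBACK -e COMMIT "$1" > "$2"
}

# non_empty FILE: succeed only if FILE exists and is larger than 0 bytes.
non_empty() {
    [ -s "$1" ]
}

# Usage, with the paths from the steps above:
#   strip_transactions /storage/dump_all.sql /storage/dump_all_notrans.sql
#   non_empty /storage/dump_all_notrans.sql && rm /storage/bandwidth.db
#   sqlite3 /storage/bandwidth.db ".read /storage/dump_all_notrans.sql"
#   non_empty /storage/bandwidth.db || echo "bandwidth.db is empty"
```

Checking the dump before the rm step means a mistyped path leaves the original file in place instead of producing a zero-size database.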

Alexey · April 28, 2023, 4:13am

Seems it’s a different node, so are you sure that bandwidth.db is corrupted?

Please check first:

Get-ChildItem F:\Storj4.3\storage\*.db -File | %{$_.Name + " " + $(C:\sqlite\sqlite3.exe $_.FullName "PRAGMA integrity_check;")}

if all databases are OK, you do not need to do anything.

But if bandwidth.db is not OK, then you may proceed.
If a database other than bandwidth.db is corrupted, substitute its name for bandwidth.db in the instructions below.
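
For the docker sqlite3 shell, the same check of every database can be sketched as a loop. It assumes the databases are mounted at /storage, as in the docker run command earlier in the thread, and that sqlite3 is on the PATH; the function name check_all is made up for this example:

```shell
# check_all DIR: print "<file name> <integrity_check result>" for every
# *.db file in DIR, mirroring the PowerShell one-liner above.
check_all() {
    for db in "$1"/*.db; do
        [ -e "$db" ] || continue   # skip if no *.db files matched
        printf '%s %s\n' "$(basename "$db")" \
            "$(sqlite3 "$db" 'PRAGMA integrity_check;')"
    done
}

# Usage inside the container:
#   check_all /storage
```

Any database whose line shows something other than ok is a candidate for the repair steps.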

The restored database has zero size, just like the backup. That is because you specified a wrong path to it.
Your databases are not in F:\Storj4.3 as you specified, but in F:\Storj4.3\storage,
so

Backup

cp F:\Storj4.3\storage\bandwidth.db F:\Storj4.3\storage\bandwidth.db.bak

Open database

c:\sqlite\sqlite3.exe F:\Storj4.3\storage\bandwidth.db

Unload data

.mode insert
.output F:\Storj4.3\dump_all.sql
.dump
.exit

Modify data for load

Get-Content F:\Storj4.3\dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 F:\Storj4.3\dump_all_notrans.sql

remove the corrupted database

rm F:\Storj4.3\storage\bandwidth.db

Create a new one and load data

c:\sqlite\sqlite3.exe F:\Storj4.3\storage\bandwidth.db ".read F:\Storj4.3\dump_all_notrans.sql"

Check the size of the resulting database

ls F:\Storj4.3\storage\bandwidth.db

Aka985 · April 28, 2023, 4:08pm

Alexey, thank you! I will switch to Russian, since nobody else is replying anyway :)

I ran the commands from your instructions and the node seems to have come up, but there are errors in the logs. Please take a look at the end.

I am posting here the procedure that let me restore bandwidth.db

Storj4.3
Node log before the restore:
PS C:\Windows\system32> docker logs --tail 20 storagenode3
“12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Piece ID”: “ATPGM3DWUR3EQQMJ7HGUNKXXUFSZ7QI2NGOM2P6KRDAYN6QHEUPA”}
2023-04-28T08:51:54.103Z INFO collector deleted expired piece {“Process”: “storagenode”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Piece ID”: “MSH5WGPV4T4DO4QJMKBN4ZVT4ILWGZSVN5DOWFLB6WZQS7WPJJPQ”}
2023-04-28T08:51:54.525Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “FSJGVMJK5TQKMPSIZ7BHLLAQYAXR7MUNQAAFISE2U3TB676DYZ4A”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET_REPAIR”, “Offset”: 0, “Size”: 2319360, “Remote Address”: “172.17.0.1:53882”}
2023-04-28T08:51:54.537Z INFO collector deleted expired piece {“Process”: “storagenode”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Piece ID”: “VY3Y5UTSUHQCZ4CQQAVJMBANOQ7C7Z5WLHA4SYWCJEQWPPMHDQ3A”}
2023-04-28T08:51:54.542Z INFO collector collect {“Process”: “storagenode”, “count”: 101}
2023-04-28T08:51:57.555Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “SGPHLLR5PZ5M6BLCISHYLEKI5INUJVS5FNI242B3YW4MU77GWQQA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET_REPAIR”, “Offset”: 0, “Size”: 181504, “Remote Address”: “172.17.0.1:53886”}
2023-04-28T08:51:58.520Z ERROR piecestore failed to add bandwidth usage {“Process”: “storagenode”, “error”: “bandwidthdb: database disk image is malformed”, “errorVerbose”: “bandwidthdb: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:852\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func7:763\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:780\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35”}
2023-04-28T08:51:58.525Z INFO piecestore downloaded {“Process”: “storagenode”, “Piece ID”: “SGPHLLR5PZ5M6BLCISHYLEKI5INUJVS5FNI242B3YW4MU77GWQQA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET_REPAIR”, “Offset”: 0, “Size”: 181504, “Remote Address”: “172.17.0.1:53886”}
2023-04-28T08:51:59.018Z ERROR piecestore failed to add bandwidth usage {“Process”: “storagenode”, “error”: “bandwidthdb: database disk image is malformed”, “errorVerbose”: “bandwidthdb: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:852\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func7:763\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:780\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35”}
2023-04-28T08:51:59.022Z INFO piecestore downloaded {“Process”: “storagenode”, “Piece ID”: “FSJGVMJK5TQKMPSIZ7BHLLAQYAXR7MUNQAAFISE2U3TB676DYZ4A”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET_REPAIR”, “Offset”: 0, “Size”: 2319360, “Remote Address”: “172.17.0.1:53882”}
2023-04-28T08:52:08.479Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “6U6AN7CYAY6EINLXN7L6OMUP7GN36PKUEVYT2XGOQI56AFM7IE5A”, “Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “Action”: “GET_AUDIT”, “Offset”: 2224128, “Size”: 256, “Remote Address”: “172.17.0.1:53888”}
2023-04-28T08:52:09.722Z ERROR piecestore failed to add bandwidth usage {“Process”: “storagenode”, “error”: “bandwidthdb: database disk image is malformed”, “errorVerbose”: “bandwidthdb: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:60\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).beginSaveOrder.func1:852\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func7:763\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:780\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35”}
2023-04-28T08:52:09.725Z INFO piecestore downloaded {“Process”: “storagenode”, “Piece ID”: “6U6AN7CYAY6EINLXN7L6OMUP7GN36PKUEVYT2XGOQI56AFM7IE5A”, “Satellite ID”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “Action”: “GET_AUDIT”, “Offset”: 2224128, “Size”: 256, “Remote Address”: “172.17.0.1:53888”}

Install sqlite3 v3.25.2 or later
Create the folder C:\sqlite

Created and installed.

Stop the node in docker
I stopped it earlier; here is confirmation that the container is gone

PS C:\Windows\system32> Get-ChildItem F:\Storj4.3\storage*.db -File | %{$.Name + " " + $(C:\sqlite\sqlite3.exe $.FullName “PRAGMA integrity_check;”)}
Output:
bandwidth.db *** in database main *** On tree page 7348 cell 1: Child page depth differs Page 7454 is never used Page 7461 is never used Page 7463 is never used Page 7468 is never used Page 7898 is never used Page 7924 is never used Page 7954 is never used Page 7961 is never used Page 7971 is never used Page 7990 is never used Page 8028 is never used Page 8141 is never used Page 8144 is never used Page 8156 is never used Page 8160 is never used Page 8167 is never used Page 8172 is never used Page 8173 is never used Page 8182 is never used Page 8201 is never used Page 8223 is never used Page 8250 is never used Page 8260 is never used Page 10537 is never used Page 10554 is never used Page 10566 is never used Page 10568 is never used Page 10570 is never used Page 10577 is never used Page 10582 is never used Page 10616 is never used Page 10621 is never used Page 10624 is never used Page 10638 is never used Page 13561 is never used Page 13579 is never used Page 13609 is never used Page 13612 is never used Page 13641 is never used Page 13682 is never used Page 13697 is never used Page 13730 is never used Page 13748 is never used Page 13854 is never used Page 13859 is never used Page 13889 is never used Page 13923 is never used Page 13925 is never used Page 13935 is never used Page 13940 is never used Page 13944 is never used Page 13948 is never used Page 13951 is never used Page 13953 is never used Page 13969 is never used Page 13973 is never used Page 13980 is never used Page 14004 is never used Page 14040 is never used Page 14156 is never used Page 14162 is never used Page 14191 is never used Page 14199 is never used Page 14200 is never used Page 14201 is never used Page 14202 is never used Page 14203 is never used Page 14204 is never used row 92 missing from index idx_bandwidth_usage_satellite row 93 missing from index idx_bandwidth_usage_satellite row 95 missing from index idx_bandwidth_usage_satellite row 96 missing from index idx_bandwidth_usage_satellite row 97 
missing from index idx_bandwidth_usage_satellite row 98 missing from index idx_bandwidth_usage_satellite row 100 missing from index idx_bandwidth_usage_satellite row 101 missing from index idx_bandwidth_usage_satellite row 102 missing from index idx_bandwidth_usage_satellite row 103 missing from index idx_bandwidth_usage_satellite row 104 missing from index idx_bandwidth_usage_satellite row 105 missing from index idx_bandwidth_usage_satellite row 106 missing from index idx_bandwidth_usage_satellite row 109 missing from index idx_bandwidth_usage_satellite row 111 missing from index idx_bandwidth_usage_satellite row 114 missing from index idx_bandwidth_usage_satellite row 115 missing from index idx_bandwidth_usage_satellite row 116 missing from index idx_bandwidth_usage_satellite row 117 missing from index idx_bandwidth_usage_satellite row 118 missing from index idx_bandwidth_usage_satellite row 119 missing from index idx_bandwidth_usage_satellite row 120 missing from index idx_bandwidth_usage_satellite row 121 missing from index idx_bandwidth_usage_satellite row 122 missing from index idx_bandwidth_usage_satellite row 123 missing from index idx_bandwidth_usage_satellite row 124 missing from index idx_bandwidth_usage_satellite row 125 missing from index idx_bandwidth_usage_satellite row 126 missing from index idx_bandwidth_usage_satellite row 127 missing from index idx_bandwidth_usage_satellite row 128 missing from index idx_bandwidth_usage_satellite row 129 missing from index idx_bandwidth_usage_satellite
heldamount.db ok
info.db ok
notifications.db ok
orders.db ok
pieceinfo.db ok
piece_expiration.db ok
piece_spaced_used.db ok
pricing.db ok
reputation.db ok
satellites.db ok
secret.db ok
storage_usage.db ok
used_serial.db ok
So I understand that bandwidth.db is corrupted

Opening the database:
cp F:\Storj4.3\storage\bandwidth.db F:\Storj4.3\storage\bandwidth.db.bak

Then, inside sqlite, I dump the data from it:
.mode insert
.output F:\Storj4.3\dump_all.sql
.dump
.exit

I do something with this command:
Get-Content F:\Storj4.3dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 F:\Storj4.3\dump_all_notrans.sql
For some reason the file is named Storj4.3dump_all.sql and sits in the root of the drive. Probably a \ was not read somewhere.

Removing the database:
rm F:\Storj4.3\storage\bandwidth.db

c:\sqlite\sqlite3.exe F:\Storj4.3\storage\bandwidth.db “.read F:\Storj4.3\dump_all_notrans.sql”
For some reason the system does not read the backslash between Storj4.3 and dump_all_notrans.sql. I replaced it with a forward slash (/) and it worked.

ls F:\Storj4.3/storage/bandwidth.db

The node started and data exchange resumed. However, there are errors in the log. How serious is this?

Just in case, I stopped the node until I get Alexey’s reply

PS C:\Windows\system32> docker logs --tail 20 storagenode3
2023-04-28T15:57:35.193Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “GKJEDHKVU2UIHAP3CAS5RVNUAB3LYCBYX2HT5LQQCOT7QC3N23AA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 877056, “Size”: 4864, “Remote Address”: “172.17.0.1:35810”}
2023-04-28T15:57:35.362Z INFO piecestore downloaded {“Process”: “storagenode”, “Piece ID”: “GKJEDHKVU2UIHAP3CAS5RVNUAB3LYCBYX2HT5LQQCOT7QC3N23AA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 877056, “Size”: 4864, “Remote Address”: “172.17.0.1:35810”}
2023-04-28T15:57:36.510Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “E2LJOHLIBHG76LLNSGG3WFPNXTZEHLCFOERYNIBWWDTXHXXXQ5TA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 0, “Size”: 10496, “Remote Address”: “172.17.0.1:35814”}
2023-04-28T15:57:36.859Z INFO piecestore downloaded {“Process”: “storagenode”, “Piece ID”: “E2LJOHLIBHG76LLNSGG3WFPNXTZEHLCFOERYNIBWWDTXHXXXQ5TA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 0, “Size”: 10496, “Remote Address”: “172.17.0.1:35814”}
2023-04-28T15:57:37.383Z INFO piecestore download started {“Process”: “storagenode”, “Piece ID”: “JYS46HWGDLXF3MXFCUMFVLKHC2UKAMEPBG7KJ3PL3KOH27V6SYFQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 0, “Size”: 8960, “Remote Address”: “172.17.0.1:35816”}
2023-04-28T15:57:37.729Z INFO piecestore download canceled {“Process”: “storagenode”, “Piece ID”: “JYS46HWGDLXF3MXFCUMFVLKHC2UKAMEPBG7KJ3PL3KOH27V6SYFQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “GET”, “Offset”: 0, “Size”: 0, “Remote Address”: “172.17.0.1:35816”}
2023-04-28 15:57:37,860 WARN received SIGTERM indicating exit request
2023-04-28 15:57:37,861 INFO waiting for storagenode, processes-exit-eventlistener, storagenode-updater to die
2023-04-28T15:57:37.862Z INFO Got a signal from the OS: “terminated” {“Process”: “storagenode-updater”}
2023-04-28 15:57:37,868 INFO stopped: storagenode-updater (exit status 0)
2023-04-28T15:57:37.878Z INFO Got a signal from the OS: “terminated” {“Process”: “storagenode”}
2023-04-28T15:57:37.888Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).EmptyTrash:158\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:401\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1.1:83\n\tstorj.io/common/sync2.(*Workplace).Start.func1:89”}
2023-04-28T15:57:37.969Z ERROR piecestore:cache error getting current used space: {“Process”: “storagenode”, “error”: “context canceled; context canceled; context canceled”, “errorVerbose”: “group:\n— context canceled\n— context canceled\n— context canceled”}
2023-04-28T15:57:37.990Z INFO piecestore download canceled {“Process”: “storagenode”, “Piece ID”: “Y4BY4JDAHMJ6O7RZV33S62XXIWV75P34I6MPPQGK6K7MDQNNWLWA”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “GET”, “Offset”: 0, “Size”: 1056768, “Remote Address”: “172.17.0.1:35702”}
2023-04-28T15:57:38.120Z INFO piecestore upload canceled {“Process”: “storagenode”, “Piece ID”: “PHED3OFJRPABCM62OBU2JOXHUATVASVXPCIUEBIUXZ6LDMCAF3LQ”, “Satellite ID”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “Action”: “PUT_REPAIR”, “Size”: 2097152, “Remote Address”: “172.17.0.1:35738”}
2023-04-28T15:57:38.204Z INFO piecestore upload canceled {“Process”: “storagenode”, “Piece ID”: “F5ZBLY2TG44QA5DC3DI627NBPBBCYMEE2EUPUQDOPKNSGFMREVOQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 393216, “Remote Address”: “172.17.0.1:35806”}
2023-04-28T15:57:38.286Z INFO piecestore upload canceled {“Process”: “storagenode”, “Piece ID”: “YKWYG7IQLLDD35JBC2DNTV3E7A5SPPNYUQFJ6YYYY7RH53MIVOBQ”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “Size”: 655360, “Remote Address”: “172.17.0.1:35282”}
2023-04-28 15:57:41,292 INFO waiting for storagenode, processes-exit-eventlistener to die
2023-04-28 15:57:42,543 INFO stopped: storagenode (exit status 0)
2023-04-28 15:57:42,546 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

Aka985 · April 28, 2023, 4:11pm

In this thread I will ask questions about recovering the StorjD1.1 node.

This is an old node with 3.5 TB of data.
I did not back up bandwidth.db. I followed the same procedure that worked on Storj4.3, but at the end I got a bandwidth.db of zero size.
I deleted the old database file from the command line, so it is not in the Recycle Bin. Is this the end? Can it not be restored? Maybe I mixed up a path somewhere again, and a new database can still be rebuilt from the dumps I created?

The .bak file has a nonzero size:

dump_all has zero size.

I will try something like Recuva, but I do not have much hope for it; it will take a long time and the 12 days will run out.

Please at least tell me what causes zero-size databases.

The commands I entered:
StorjD1.1

Node log before the recovery:
PS C:\Users\Storj D1> docker logs --tail 50 storagenodeD1.1
2023-04-28T16:43:20.525Z INFO Operator email {“Process”: “storagenode”, “Address”: “7437493@gmail.com”}
2023-04-28T16:43:20.526Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “0x8675290882f594227d9b69d1fc434bf54b2b5e6f”}
Error: Error starting master database on storagenode: group:
— stat config/storage/blobs: no such file or directory
— stat config/storage/temp: no such file or directory
— stat config/storage/garbage: no such file or directory
— stat config/storage/trash: no such file or directory
2023-04-28 16:43:20,542 INFO exited: storagenode (exit status 1; not expected)
2023-04-28T16:43:21.371Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode”, “Version”: “v1.76.2”}
2023-04-28T16:43:21.372Z INFO New version is being rolled out but hasn’t made it to this node yet {“Process”: “storagenode-updater”, “Service”: “storagenode”}
2023-04-28 16:43:21,373 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-28 16:43:21,373 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2023-04-28T16:43:21.485Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”, “Version”: “v1.76.2”}
2023-04-28T16:43:21.486Z INFO New version is being rolled out but hasn’t made it to this node yet {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”}
2023-04-28 16:43:22,496 INFO spawned: ‘storagenode’ with pid 44
2023-04-28T16:43:23.008Z INFO Anonymized tracing enabled {“Process”: “storagenode”}
2023-04-28T16:43:23.048Z INFO Operator email {“Process”: “storagenode”, “Address”: “7437493@gmail.com”}
2023-04-28T16:43:23.049Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “0x8675290882f594227d9b69d1fc434bf54b2b5e6f”}
Error: Error starting master database on storagenode: group:
— stat config/storage/blobs: no such file or directory
— stat config/storage/temp: no such file or directory
— stat config/storage/garbage: no such file or directory
— stat config/storage/trash: no such file or directory
2023-04-28 16:43:23,063 INFO exited: storagenode (exit status 1; not expected)
2023-04-28 16:43:25,071 INFO spawned: ‘storagenode’ with pid 52
2023-04-28T16:43:25.502Z INFO Anonymized tracing enabled {“Process”: “storagenode”}
2023-04-28T16:43:25.527Z INFO Operator email {“Process”: “storagenode”, “Address”: “7437493@gmail.com”}
2023-04-28T16:43:25.528Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “0x8675290882f594227d9b69d1fc434bf54b2b5e6f”}
Error: Error starting master database on storagenode: group:
— stat config/storage/blobs: no such file or directory
— stat config/storage/temp: no such file or directory
— stat config/storage/garbage: no such file or directory
— stat config/storage/trash: no such file or directory
2023-04-28 16:43:25,544 INFO exited: storagenode (exit status 1; not expected)
2023-04-28 16:43:28,553 INFO spawned: ‘storagenode’ with pid 60
2023-04-28T16:43:28.925Z INFO Anonymized tracing enabled {“Process”: “storagenode”}
2023-04-28T16:43:28.970Z INFO Operator email {“Process”: “storagenode”, “Address”: “7437493@gmail.com”}
2023-04-28T16:43:28.970Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “0x8675290882f594227d9b69d1fc434bf54b2b5e6f”}
Error: Error starting master database on storagenode: group:
— stat config/storage/blobs: no such file or directory
— stat config/storage/temp: no such file or directory
— stat config/storage/garbage: no such file or directory
— stat config/storage/trash: no such file or directory
2023-04-28 16:43:28,987 INFO exited: storagenode (exit status 1; not expected)
2023-04-28 16:43:29,990 INFO gave up: storagenode entered FATAL state, too many start retries too quickly
2023-04-28 16:43:30,993 WARN received SIGQUIT indicating exit request
2023-04-28 16:43:30,993 INFO waiting for processes-exit-eventlistener, storagenode-updater to die
2023-04-28T16:43:30.994Z INFO Got a signal from the OS: “terminated” {“Process”: “storagenode-updater”}
2023-04-28 16:43:30,999 INFO stopped: storagenode-updater (exit status 0)
2023-04-28 16:43:32,002 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)
PS C:\Users\Storj D1>
Install sqlite3 v3.25.2 or later
Create the folder C:\sqlite

Created and installed.

Stop the node in Docker
Stopped.

PS C:\Windows\system32>
Get-ChildItem D:\StorjD1.1\storage\*.db -File | %{$_.Name + " " + $(C:\sqlite\sqlite3.exe $_.FullName "PRAGMA integrity_check;")}
Output:
Error: in prepare, file is not a database (26)
bandwidth.db
heldamount.db ok
info.db ok
notifications.db ok
orders.db ok
pieceinfo.db ok
piece_expiration.db row 3012 missing from index sqlite_autoindex_piece_expirations_1 row 3013 missing from index sqlite_autoindex_piece_expirations_1 row 3014 missing from index sqlite_autoindex_piece_expirations_1 wrong # of entries in index sqlite_autoindex_piece_expirations_1
piece_spaced_used.db ok
pricing.db ok
reputation.db ok
satellites.db ok
secret.db ok
storage_usage.db ok
used_serial.db ok
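
For readers who prefer a script to the PowerShell one-liner, the same per-file check can be sketched in Python with the standard `sqlite3` module. This is only an illustration: the function name `check_databases` is made up for this example, and it opens each file read-only so that a mistyped path cannot create a new empty database.

```python
import sqlite3
from pathlib import Path


def check_databases(storage_dir):
    """Run PRAGMA integrity_check on every *.db file in storage_dir.

    Returns a dict mapping each file name to the list of result lines,
    or to a single "error: ..." entry if SQLite cannot read the file.
    (Hypothetical helper, not part of Storj or SQLite.)
    """
    results = {}
    for db_path in sorted(Path(storage_dir).glob("*.db")):
        # mode=ro: never create or modify the file being checked.
        uri = db_path.resolve().as_uri() + "?mode=ro"
        try:
            conn = sqlite3.connect(uri, uri=True)
            try:
                rows = conn.execute("PRAGMA integrity_check;").fetchall()
            finally:
                conn.close()
            results[db_path.name] = [row[0] for row in rows]
        except sqlite3.DatabaseError as exc:
            # Covers "file is not a database" and similar failures.
            results[db_path.name] = [f"error: {exc}"]
    return results
```

A call such as `check_databases(r"D:\StorjD1.1\storage")` would return one entry per database, with `["ok"]` for healthy files.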

///////Create a backup
cp D:\StorjD1.1\storage\bandwidth.db D:\StorjD1.1\storage\bandwidth.db.bak

///////Open the database
c:\sqlite\sqlite3.exe D:\StorjD1.1\storage\bandwidth.db

Then, inside sqlite3, I dump the data out of it:
.mode insert
.output D:\StorjD1.1\dump_all.sql
.dump
.exit
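
A rough Python analogue of the dump step above, assuming Python's `sqlite3.Connection.iterdump()` is an acceptable stand-in for the shell's `.dump`. It is not the same tool: on a malformed database it may raise an error instead of salvaging rows. The helper name `dump_database` is invented for this sketch. It returns the number of statements written, so an empty dump is noticed immediately.

```python
import sqlite3
from pathlib import Path


def dump_database(db_path, dump_path):
    """Write a SQL text dump of db_path to dump_path.

    Returns the number of SQL statements written.
    (Hypothetical helper; uses iterdump(), not the sqlite3 shell's .dump.)
    """
    # mode=ro fails loudly if db_path does not exist, instead of
    # silently creating a new empty database at a mistyped path.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    count = 0
    try:
        with open(dump_path, "w", encoding="utf-8") as out:
            for statement in conn.iterdump():
                out.write(statement + "\n")
                count += 1
    finally:
        conn.close()
    return count
```

Because the paths are ordinary Python strings, a raw string such as `r"D:\StorjD1.1\dump_all.sql"` keeps its backslashes intact.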

Then I do something with this command:
Get-Content D:\StorjD1.1dump_all.sql | Select-String -NotMatch TRANSACTION | Select-String -NotMatch ROLLBACK | Select-String -NotMatch COMMIT | Set-Content -Encoding utf8 D:\StorjD1.1\dump_all_notrans.sql
For some reason the file is named StorjD1.1dump_all.sql and sits in the root of the drive. Probably a \ was not read somewhere.
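
What the PowerShell pipeline above does can be sketched in a few lines of Python: it drops every line that mentions TRANSACTION, ROLLBACK, or COMMIT and keeps the rest. The function name is invented for this example. The comparison ignores case, on the assumption that `Select-String` is being used with its default case-insensitive matching.

```python
def strip_transaction_lines(lines):
    """Return the lines that do not mention TRANSACTION, ROLLBACK, or COMMIT.

    Like the PowerShell filter, this is a plain substring test, so a data
    row that happens to contain one of these words is dropped as well.
    (Hypothetical helper for illustration.)
    """
    markers = ("TRANSACTION", "ROLLBACK", "COMMIT")
    return [
        line for line in lines
        if not any(marker in line.upper() for marker in markers)
    ]
```

If the input file is empty, the output is empty too, so a zero-size dump_all.sql leads to a zero-size dump_all_notrans.sql.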

Delete the database:
rm D:\StorjD1.1\storage\bandwidth.db
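
Since the post reports a zero-size dump and a dump file that landed under an unexpected name, a small guard before a delete like the one above may be worth having. The sketch below is one possible check, with an invented function name: it lists which of the given files are missing or empty, so the original database is removed only when that list comes back empty.

```python
from pathlib import Path


def missing_or_empty(paths):
    """Return the paths (as strings) that do not exist or have zero size.

    (Hypothetical helper; an empty result means every file is present
    and non-empty.)
    """
    problems = []
    for path in map(Path, paths):
        if not path.is_file() or path.stat().st_size == 0:
            problems.append(str(path))
    return problems
```

For example, calling it with the .bak copy and dump_all_notrans.sql, and deleting bandwidth.db only if the result is `[]`, would have flagged the empty dump before the original was gone.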

c:\sqlite\sqlite3.exe D:\StorjD1.1\storage\bandwidth.db ".read D:/StorjD1.1/dump_all_notrans.sql"
For some reason the system does not read the backslash in StorjD1.1\dump_all_notrans.sql. I replaced it with a forward slash / and it worked.

ls D:\StorjD1.1/storage/bandwidth.db