Docker node - restarting - master database on storagenode: database: database is locked

Hi Everyone,

I’m building my first node on Debian with Docker.
Storage is a device mounted via fstab, and I’m able to run sudo touch /mountpoint/storj/whatever on it.

My Docker container keeps restarting.
In the Docker log (sudo docker logs --tail 40 storagenode), I get the following DB error:

Error: Error starting master database on storagenode: database: database is locked

    storj.io/storj/storagenode/storagenodedb.(*DB).openDatabase:323
    storj.io/storj/storagenode/storagenodedb.(*DB).openExistingDatabase:305
    storj.io/storj/storagenode/storagenodedb.(*DB).openDatabases:281
    storj.io/storj/storagenode/storagenodedb.OpenExisting:248
    main.cmdRun:160
    storj.io/private/process.cleanup.func1.4:363
    storj.io/private/process.cleanup.func1:381
    github.com/spf13/cobra.(*Command).execute:852
    github.com/spf13/cobra.(*Command).ExecuteC:960
    github.com/spf13/cobra.(*Command).Execute:897
    storj.io/private/process.ExecWithCustomConfig:88
    storj.io/private/process.ExecCustomDebug:70
    main.main:385
    runtime.main:204

I have no idea how to troubleshoot this…
What and where is this master database? Are there any specific permissions to set on it?

Any hints are welcome …

thanks

By default, this database and the others are located in the storj folder you use in your docker run command…

Has this node worked in the past, and have you done anything to it…?
Like a migration…
Because this might be related to the permissions of your storj folder.

Database locked errors are often seen with high disk latency, but that shouldn’t keep the node from starting…

So all the DBs are stored on my mount point.
It’s a brand new node…

To create a file on the mount point, or to launch Docker, I must use sudo.
Maybe the Docker container is launched as a regular user, and that user cannot write to the mount point?

If that were the case, I should not have any files in there, and yet I do:

ls /media/StorJ/
config.yaml orders revocations.db storage test test.fds

How can disk latency be benchmarked/tested?
Latency would be weird too; everything is connected over gigabit and working fine.
I’m using the same storage to read big 3D flat Blu-ray files without issues ^^

On a new node, I bet it’s a permissions issue…
You can do a sudo chown -R nobody:nogroup /media/StorJ
I think that would solve it.
You can of course use whatever users and groups you like… that blanket one basically just gives everybody access because the files don’t belong to anyone. It might not be optimal from a security perspective… but it sure won’t cause any permission problems.

and then when you get the node working you can go back and tinker with what folder / file permissions you need for storj.
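
Before changing ownership, it can help to see who owns the folder and whether the current user can write to it. The following is only a minimal sketch: the function name check_storage_dir is made up for illustration, and it assumes GNU stat as shipped with Debian. Run it as the same user the node process runs as, since a check done with sudo only proves that root can write.

```shell
# Report owner, group, mode and write access for a storage directory.
# check_storage_dir is a hypothetical helper name, not a Storj tool.
check_storage_dir() {
    dir="$1"
    if [ ! -d "$dir" ]; then
        echo "missing: $dir"
        return 2
    fi
    # GNU stat: %U owner, %G group, %a octal mode
    stat -c 'owner=%U group=%G mode=%a' "$dir"
    # Try to create and delete a probe file as the current user
    probe="$dir/.write-probe.$$"
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"
        echo "writable: yes"
    else
        echo "writable: no"
        return 1
    fi
}
```

For example, check_storage_dir /media/StorJ prints the ownership line followed by "writable: yes" or "writable: no".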

any particular reason you like to suffer the pains of using semi random capital letters in the folder names?

Any regular tool for measuring disk performance will do, but I don’t think it’s latency…
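
On the benchmarking question, one rough way to get a feel for write latency is to time a batch of small synchronous writes with dd. This is a sketch under assumptions: sync_write_test is a made-up helper name, the default of 200 writes of 4 KiB is arbitrary, and oflag=dsync requires GNU dd. It is a coarse indicator, not a proper benchmark.

```shell
# Time a batch of small synchronous writes in the given directory.
# sync_write_test is a hypothetical helper; count defaults to 200 (arbitrary).
sync_write_test() {
    dir="$1"
    count="${2:-200}"
    file="$dir/.latency-test.$$"
    # oflag=dsync makes GNU dd wait for each block to reach the device;
    # the last line of dd's report holds the elapsed time and throughput
    dd if=/dev/zero of="$file" bs=4k count="$count" oflag=dsync 2>&1 | tail -n 1
    # Remove the test file whether or not dd succeeded
    rm -f "$file"
}
```

Running sync_write_test /media/StorJ on the storage and again on a local disk gives two summary lines to compare; a much longer elapsed time on the storage mount points towards slow synchronous writes.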

I dug a bit and saw a lot of complaints about mapping CIFS.
So I have:

  • Created a new iSCSI LUN
  • mounted it as the new /dev/sda
  • formatted it as ext4
  • stopped the Docker node and moved all files from the old mount to the new one
  • started the node

It fixed my issue… But now the node is offline… grrrr

you will need to do

docker stop storagenode
docker rm storagenode

“storagenode” is of course the name you gave your storagenode; storagenode is the default.

Then do your run command again, with the updated storage and identity locations.
It would look something like this:

docker run -d --restart unless-stopped --stop-timeout 300 -p 192.168.1.100:28967:28967/tcp -p 192.168.1.100:28967:28967/udp \
-p 192.168.1.100:14002:14002 -e WALLET="0x111111111111111111111" \
-e EMAIL="your@email.com" -e ADDRESS="global.ip.inet:28967" \
-e STORAGE="4TB" --mount type=bind,source="/sn3/id-sn3",destination=/app/identity \
--mount type=bind,source="/sn3/storj",destination=/app/config --name sn3 storjlabs/storagenode:latest

I already did:

  • sudo docker stop storagenode
  • sudo docker rm storagenode
  • sudo docker run -d --restart unless-stopped --stop-timeout 300
    -p 28967:28967/tcp
    -p 28967:28967/udp
    -p 14002:14002
    -e WALLET="0x************************"
    -e EMAIL="my@mail.com"
    -e ADDRESS="MyPublicFixIp:28967"
    -e STORAGE="5TB"
    --mount type=bind,source="/home/freebox/Library/Application Support/StorJ/identity/storagenode/",destination=/app/identity
    --mount type=bind,source="/media/iSCSI/",destination=/app/config
    --name storagenode storjlabs/storagenode:latest
  • telnet MyPublicFixIP 28967 is also OK
  • grep -c BEGIN ~/.local/share/storj/identity/storagenode/ca.cert → returns 2
  • grep -c BEGIN ~/.local/share/storj/identity/storagenode/identity.cert → returns 3
  • port redirection also tested with Open Port Check Tool - Test Port Forwarding on Your Router

Hello @stygre ,
Welcome to the forum!

Yes, network filesystems are not supported; the only working network protocol for storage is iSCSI.
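
To see what kind of filesystem actually backs a storage path, something like the sketch below can be used. Both function names are made up for illustration, the list of type names treated as "network" is an assumption rather than an official Storj list, and stat -f -c %T is the GNU stat form.

```shell
# Print the filesystem type backing a path (GNU stat, -f = filesystem info).
fs_type_of() {
    stat -f -c %T "$1"
}

# Classify a filesystem type name as network-based or not.
# The patterns below are an assumed, non-exhaustive list.
is_network_fs() {
    case "$1" in
        cifs|smb*|nfs*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example, is_network_fs "$(fs_type_of /media/iSCSI)" && echo "network filesystem" prints nothing for an ext4 volume on an iSCSI LUN, because the system sees it as a local block device.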

When you checked on yougetsignal, was your port open?

Yes, the port is open; I also tested it with a basic telnet to my fixed public IP…

Please try stopping the storagenode. Is the port still open?

The port is closed when the Docker container is stopped.
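
The stop-and-retest check can be scripted. If the port still answers while the node is stopped, something other than the node is listening on it. A minimal sketch, assuming bash and the coreutils timeout command are available; port_open is a made-up helper name and the 3-second limit is arbitrary. It only tests TCP, and a test run from inside the LAN does not prove that the port forward works from the internet.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# port_open is a hypothetical helper; it relies on bash's /dev/tcp feature.
port_open() {
    # Inside bash -c, $0 is the host and $1 is the port
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Usage: port_open MyPublicFixIP 28967 && echo open || echo closed, run once with the container started and once with it stopped.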

Please start the node again:

docker start storagenode

And check logs:

docker logs --tail 20 storagenode

If the node is still running, try the local dashboard: Dashboard - Node Operator
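
When the log is long, it can be easier to look only at the error lines. A small sketch: only_errors is a made-up helper name, and it assumes the log level appears as a separate ERROR or FATAL word surrounded by whitespace.

```shell
# Keep only ERROR and FATAL lines from log text read on stdin.
# only_errors is a hypothetical helper; "|| true" avoids a failure
# status when no line matches.
only_errors() {
    grep -E '[[:space:]](ERROR|FATAL)[[:space:]]' || true
}
```

Usage: docker logs --tail 200 storagenode 2>&1 | only_errors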

Good hint… I forgot to check the logs again after my DB issue ^^

Now I have the following in the logs:

2021-07-28T18:56:14.899Z ERROR contact:service ping satellite failed {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "attempts": 7, "error": "ping satellite: check-in ratelimit: node rate limited by id", "errorVerbose": "ping satellite: check-in ratelimit: node rate limited by id\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:138\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

I have no idea what that means…

Looks like a cert issue, no?

ERROR contact:service ping satellite failed {"Satellite ID": "121RTU84Ticf2L1ntiuUuvgk3vzoA6", "attempts": 8, "error": "ping satellite: failed to dial storage node (ID: 1aWS6BZNdydfeCSWsmeTsr) at address MyPublicFixedIP:28967: rpc: tls peer certificate verification: not signed by any CA in the whitelist: CA cert", "errorVerbose": "ping satellite: failed to dial storage node (ID: 1aW6BZNdydfekdEY5CSWsmeTsr) at address MyPublicFixedIp:28967: rpc: tls peer certificate verification: not signed by any CA in the whitelist: CA cert\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatelliteOnce:141\n\tstorj.io/storj/storagenode/contact.(*Service).pingSatellite:95\n\tstorj.io/storj/storagenode/contact.(*Chore).updateCycles.func1:87\n\tstorj.io/common/sync2.(*Cycle).Run:92\n\tstorj.io/common/sync2.(*Cycle).Start.func1:71\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

Your identity is not signed.

docker exec -it storagenode ls -l /app/identity

What does this command do?

freebox@Debian:~$ sudo docker exec -it storagenode ls -l /app/identity
total 8
-rwxr-xr-x 1 1000 1000 1100 Jul 28 11:22 identity.cert
-rwxr-xr-x 1 1000 1000 241 Jul 28 11:22 identity.key

Just to confirm that it’s not signed: a signed identity should contain 6 files. But it’s better to check like this:

grep -c BEGIN "/home/freebox/Library/Application Support/StorJ/identity/storagenode/ca.cert"
grep -c BEGIN "/home/freebox/Library/Application Support/StorJ/identity/storagenode/identity.cert"

not the same as

The first one is for Linux, but on macOS the path is different.
See the MacOS tab of Identity - Node Operator.

grep -c BEGIN ~/Library/Application\ Support/Storj/identity/storagenode/ca.cert
grep -c BEGIN ~/Library/Application\ Support/Storj/identity/storagenode/identity.cert

ouuups …#newbies

freebox@Debian:~$ grep -c BEGIN "/home/freebox/Library/Application Support/StorJ/identity/storagenode/ca.cert"
1
freebox@Debian:~$ grep -c BEGIN "/home/freebox/Library/Application Support/StorJ/identity/storagenode/identity.cert"
2
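
The two grep checks can be combined into one helper that reports both counts and says whether they match what a signed identity is expected to show (2 for ca.cert and 3 for identity.cert, versus the 1 and 2 seen here). This is only a sketch: identity_signed is a made-up name, and the path in the usage line is the one from this thread.

```shell
# Count PEM blocks in the two certificate files of an identity directory.
# identity_signed is a hypothetical helper, not part of the Storj tooling.
# Returns 0 only if ca.cert has 2 blocks and identity.cert has 3.
identity_signed() {
    dir="$1"
    # "|| true" keeps a missing file or zero matches from aborting the script
    ca=$(grep -c BEGIN "$dir/ca.cert" 2>/dev/null || true)
    id=$(grep -c BEGIN "$dir/identity.cert" 2>/dev/null || true)
    echo "ca.cert=${ca:-0} identity.cert=${id:-0}"
    [ "${ca:-0}" -eq 2 ] && [ "${id:-0}" -eq 3 ]
}
```

Usage: identity_signed "/home/freebox/Library/Application Support/StorJ/identity/storagenode" && echo signed || echo "not signed"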

I expected this right away when I saw 2 files instead of 6.
It’s not a problem; just request a new auth token and sign the identity: Identity - Node Operator

I’ve copied over the two missing ones, generated on my Windows computer to get an identity quickly:

freebox@Debian:~$ ls "/home/freebox/Library/Application Support/StorJ/identity/storagenode"/
ca.cert ca.key identity.cert identity.key

Which files are still missing? oO