Unrecoverable error {"error": "transport error: connection error:

This morning I turned the computer on again and tried to start the storagenode. Here is what happened:

fmas@delta:~$ sudo docker start storagenode
[sudo] password for fmas: 
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker stop -t 300 storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is not running
fmas@delta:~$ sudo docker start storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
OCI runtime exec failed: exec failed: cannot exec a container that has stopped: unknown
fmas@delta:~$ sudo docker start storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ 
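
The "is restarting" replies above suggest that Docker keeps relaunching the container after it exits, so `docker exec` has nothing stable to attach to. A minimal sketch for checking the state before calling the dashboard, assuming the container is named `storagenode` as in this session (the helper names `container_status` and `wait_until_running`, and the `POLL_INTERVAL` variable, are hypothetical):

```shell
# Print the container state, e.g. "running", "restarting" or "exited".
container_status() {
    sudo docker inspect --format '{{.State.Status}}' "$1"
}

# Poll until the container reports "running"; give up after $2 attempts
# (default 10), sleeping POLL_INTERVAL seconds (default 2) between checks.
wait_until_running() {
    name=$1
    attempts=${2:-10}
    i=0
    while [ "$i" -lt "$attempts" ]; do
        if [ "$(container_status "$name")" = "running" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "${POLL_INTERVAL:-2}"
    done
    return 1
}

# Usage:
#   wait_until_running storagenode 10 && \
#       sudo docker exec -it storagenode /app/dashboard.sh
```

Note that a crash-looping container can briefly report "running" between restarts, so a successful check here does not prove the node is healthy; the logs are still the place to look.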

@Alexey, here are the logs:

 fmas@delta:~$ sudo docker logs --tail 100 storagenode

goroutine 11 [select]:
database/sql.(*DB).connectionResetter(0xc0001c83c0, 0x104a820, 0xc00014bd00)
	/usr/local/go/src/database/sql/sql.go:1065 +0xfb
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:723 +0x193
2019-10-01T05:52:27.735Z	INFO	Configuration loaded from: /app/config/config.yaml
2019-10-01T05:52:27.771Z	INFO	Operator email: aseegy@gmail.com
2019-10-01T05:52:27.771Z	INFO	operator wallet: 0xFd0E17617947f96fd955BF89E2B006d28FE37d00
unexpected fault address 0x7f405bd67000
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f405bd67000 pc=0xb174f1]

goroutine 1 [running]:
runtime.throw(0xef76d1, 0x5)
	/usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc00018c998 sp=0xc00018c968 pc=0x4335e2
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:391 +0x455 fp=0xc00018c9c8 sp=0xc00018c998 pc=0x448e35
github.com/boltdb/bolt.(*DB).page(...)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:796
github.com/boltdb/bolt.(*DB).mmap(0xc000106780, 0x0, 0x0, 0x0)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:282 +0x251 fp=0xc00018ca88 sp=0xc00018c9c8 pc=0xb174f1
github.com/boltdb/bolt.Open(0xc000038527, 0x15, 0x180, 0xc00018cb98, 0xc00038c000, 0xc00007b6c0, 0xc00018cbb0)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:230 +0x2ae fp=0xc00018cb50 sp=0xc00018ca88 pc=0xb16ebe
storj.io/storj/storage/boltdb.New(0xc000038527, 0x15, 0xefd33b, 0xb, 0x2, 0x2, 0xc000362f00)
	/go/src/storj.io/storj/storage/boltdb/client.go:41 +0x7f fp=0xc00018cc30 sp=0xc00018cb50 pc=0xb2732f
storj.io/storj/pkg/revocation.newDBBolt(0xc000038527, 0x15, 0xc000038520, 0x4, 0xc000038527)
	/go/src/storj.io/storj/pkg/revocation/common.go:52 +0x4e fp=0xc00018cc80 sp=0xc00018cc30 pc=0xb73dbe
storj.io/storj/pkg/revocation.NewDB(0xc000038520, 0x1c, 0xe, 0xc0000d3020, 0x1c)
	/go/src/storj.io/storj/pkg/revocation/common.go:34 +0x1bf fp=0xc00018cce0 sp=0xc00018cc80 pc=0xb73caf
storj.io/storj/pkg/revocation.NewDBFromCfg(...)
	/go/src/storj.io/storj/pkg/revocation/common.go:21
main.cmdRun(0x1888ae0, 0xc0001f00d0, 0x0, 0xd, 0x0, 0x0)
	/go/src/storj.io/storj/cmd/storagenode/main.go:143 +0x521 fp=0xc00018d280 sp=0xc00018cce0 pc=0xc29711
storj.io/storj/pkg/process.cleanup.func1.2(0x104aae0, 0xc0001ee280)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:264 +0x13b fp=0xc00018d318 sp=0xc00018d280 pc=0xaec53b
storj.io/storj/pkg/process.cleanup.func1(0x1888ae0, 0xc0001f00d0, 0x0, 0xd, 0x0, 0x0)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:282 +0x17df fp=0xc00018dd50 sp=0xc00018d318 pc=0xaeddcf
github.com/spf13/cobra.(*Command).execute(0x1888ae0, 0xc0001bbee0, 0xd, 0xd, 0x1888ae0, 0xc0001bbee0)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:762 +0x460 fp=0xc00018de28 sp=0xc00018dd50 pc=0x62cbb0
github.com/spf13/cobra.(*Command).ExecuteC(0x1888880, 0xc0000b2120, 0x1, 0x1)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x2ea fp=0xc00018def8 sp=0xc00018de28 pc=0x62d5ea
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800
storj.io/storj/pkg/process.Exec(0x1888880)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:73 +0x17f fp=0xc00018df48 sp=0xc00018def8 pc=0xae8f0f
main.main()
	/go/src/storj.io/storj/cmd/storagenode/main.go:296 +0x2d fp=0xc00018df60 sp=0xc00018df48 pc=0xc2af8d
runtime.main()
	/usr/local/go/src/runtime/proc.go:203 +0x21e fp=0xc00018dfe0 sp=0xc00018df60 pc=0x434f7e
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00018dfe8 sp=0xc00018dfe0 pc=0x463541

goroutine 22 [select]:
database/sql.(*DB).connectionResetter(0xc00038c000, 0x104a820, 0xc0000d5f80)
	/usr/local/go/src/database/sql/sql.go:1065 +0xfb
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:723 +0x193

goroutine 21 [select]:
database/sql.(*DB).connectionOpener(0xc00038c000, 0x104a820, 0xc0000d5f80)
	/usr/local/go/src/database/sql/sql.go:1052 +0xe8
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:722 +0x15d

goroutine 8 [syscall]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:29 +0x41

goroutine 20 [chan receive]:
storj.io/storj/pkg/process.Ctx.func1(0xc0002b9e00, 0xc00033f2c0)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:89 +0x41
created by storj.io/storj/pkg/process.Ctx
	/go/src/storj.io/storj/pkg/process/exec_conf.go:88 +0x1b6

goroutine 13 [IO wait]:
internal/poll.runtime_pollWait(0x7f405befeea8, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:184 +0x55
internal/poll.(*pollDesc).wait(0xc000159898, 0x72, 0x0, 0x0, 0xef9734)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc000159880, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:384 +0x1f8
net.(*netFD).accept(0xc000159880, 0xc000060cc0, 0xc000056a80, 0x7f405e25a6d0)
	/usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc00030c960, 0xc000060cf0, 0x4110a8, 0x30)
	/usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc00030c960, 0xe60d20, 0xc00010b9e0, 0xdb22e0, 0x187c0a0)
	/usr/local/go/src/net/tcpsock.go:261 +0x47
net/http.(*Server).Serve(0xc0001102a0, 0x1044fa0, 0xc00030c960, 0x0, 0x0)
	/usr/local/go/src/net/http/server.go:2896 +0x286
storj.io/storj/pkg/process.initDebug.func2(0xc0002b9bc0, 0x1044fa0, 0xc00030c960, 0xc00007be40)
	/go/src/storj.io/storj/pkg/process/debug.go:52 +0x15d
created by storj.io/storj/pkg/process.initDebug
	/go/src/storj.io/storj/pkg/process/debug.go:50 +0x38f
fmas@delta:~$ 

Thank you all for your support.
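
In the trace above, the goroutine marked `[running]` is the one that crashed. Its frames show the fault happening while `boltdb` was memory-mapping the database opened by `revocation.NewDB`. A SIGBUS on a memory-mapped file usually points at the file itself or the storage underneath it, though that is only a guess here. A minimal sketch for pulling just those frames out of a long dump, assuming the log is piped in on stdin (the helper name `running_frames` is hypothetical):

```shell
# Print the function names from the "[running]" goroutine of a Go crash dump.
# Reads the dump on stdin; stops at the blank line that ends that goroutine.
running_frames() {
    awk '
        /^goroutine [0-9]+ \[running\]:/ { in_trace = 1; next }
        in_trace && /^$/                  { exit }
        # Function lines are not indented; file/line lines start with a tab.
        in_trace && !/^[ \t]/             { sub(/\([^()]*\)$/, ""); print }
    '
}

# Usage:
#   sudo docker logs --tail 100 storagenode 2>&1 | running_frames
```

The `2>&1` matters because `docker logs` replays the container's stderr on stderr, and that is where the Go runtime writes the crash dump.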

Dear all,

do you have any suggestions on how I should proceed?

Thanks once more.

Unfortunately, I have no idea what is wrong with your setup.
I have forwarded this issue to the team.

I ran into the same issue today with my setup, which had been running fine since May 1st this year.
I even reinstalled the Ubuntu 18.04 VM from scratch, but I still get the same error every time:

docker exec -it storagenode /app/dashboard.sh
2019-10-03T19:45:23.594Z INFO Configuration loaded from: /app/config/config.yaml
2019-10-03T19:45:23.602Z INFO Node ID: 12p9V4idSd1zNt1k9gC6oWswrEnupc756vnyL4rMGBY6qMFirtp
2019-10-03T19:45:43.602Z FATAL Unrecoverable error {"error": "transport error: context deadline exceeded", "errorVerbose": "transport error: context deadline exceeded\n\tstorj.io/storj/pkg/transport.DialAddressInsecure:31\n\tmain.dialDashboardClient:37\n\tmain.cmdDashboard:66\n\tstorj.io/storj/pkg/process.cleanup.func1.2:264\n\tstorj.io/storj/pkg/process.cleanup.func1:282\n\tgithub.com/spf13/cobra.(*Command).execute:762\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:852\n\tgithub.com/spf13/cobra.(*Command).Execute:800\n\tstorj.io/storj/pkg/process.Exec:73\n\tmain.main:296\n\truntime.main:203"}

My data is stored on a FreeNAS server and is accessed via iSCSI. The data is spread across 4 drives in RAID-Z, and I have never had any issues.

I tried renaming the kademlia file, but that did not help.
Please help me figure this out!!!

Never mind, I got it working.
I have no idea what the problem was, but after reinstalling the VM hosting Docker it started to work.

This error is a consequence of the earlier, main error. The main error can be seen only in the container log, via docker logs --tail 20 storagenode (in the current case the stack trace is longer, so you should use 100 lines instead of only 20).
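
A minimal sketch of that log check, narrowed to the lines that usually mark the root cause. The filter patterns are taken from the logs posted in this thread and are an assumption, not a complete list; the helper name `find_root_cause` is hypothetical:

```shell
# Keep only the lines that typically announce the original failure:
# zap "FATAL" entries and the Go runtime's own crash banner.
find_root_cause() {
    grep -E 'FATAL|fatal error|\[signal |unexpected fault address'
}

# Usage (container name as used in this thread):
#   sudo docker logs --tail 100 storagenode 2>&1 | find_root_cause
```

If the filter prints nothing, raising the `--tail` value is worth a try before concluding that the log is clean.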

I’m glad that you do not have the current issue and got it working!

@syncro happy for you too!

@Alexey, will the team write to you, and will you then add their suggestions to this topic? Or how is it supposed to work?

I have a second computer, which is running Windows 10. On that computer (computer name: echo) I also have a one-terabyte WD Red drive. I could move the data over to it, but before doing this I would need to install Linux on it. I have already tried Ubuntu and Debian in the past, and with both I was not able to get my network card to work properly. I believe the driver is not correct and needs to be installed manually. To do that I will need help, since without Ethernet/internet I will not be able to use the usual commands to download the correct drivers. Would you be available to help me out with this?

A friend of mine suggested I should install SUSE Linux instead, as it should be better in terms of drivers.

P.S.
If we decide to follow this path we may go off topic. Please let me know if I have to open a new topic, focused on installing Linux and adding drivers manually.

Thanks to all of you for your support. It is just sad that my node is ruining its whole career, having been offline for a week now.

I created an issue in the internal project. I will keep this thread updated, or the engineers can update it too.

With the latest Docker update, it seems the network problem is gone. You can try Docker Desktop again.

This is definitely a different topic. If you plan to start it here, I will split it anyway :slight_smile:

Did I understand correctly that I can now use Windows 10 as the OS, since Docker seems to have solved its past issues?

I have not seen problems with Docker Desktop for a while (at least a month), so either every SNO is starting with Linux, or Docker has fixed this issue.

I am running a Windows node as well, but mine is not representative, because my setup has not had this issue since day zero (back in February).

Ok, I will then proceed with moving the data from the Linux computer to the Windows computer (although I liked the Linux version better).

How should I do that to ensure a correct transfer of the data, considering that the file systems are different (ext4 vs NTFS)?

BTW, since the node has been offline for two weeks, please let me know whether it is possible to recover it or whether it has been killed.

Thanks.

I suggest using rsync as described in the

Disqualification for downtime is currently disabled, so your reputation is probably damaged, but it will recover once you bring your node back online.
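
A rough sketch of what an rsync-based move could look like, assuming both disks are reachable from a machine that has rsync. Both paths in the usage are hypothetical, and so is the helper name `sync_node_data`. This is not a copy of the official procedure, only an illustration of the general idea:

```shell
# Mirror SRC into DST.
#   -a        archive mode (recursive, keeps times and permissions where possible)
#   -P        show progress and keep partially transferred files
#   --delete  remove files from DST that no longer exist in SRC
sync_node_data() {
    src=$1
    dst=$2
    rsync -aP --delete "$src/" "$dst/"
}

# Usage (hypothetical paths):
#   sync_node_data /mnt/storagenode /mnt/newdisk/storagenode   # first pass
#   sudo docker stop -t 300 storagenode
#   sync_node_data /mnt/storagenode /mnt/newdisk/storagenode   # final pass, node stopped
```

Running the final pass only after the node is stopped means nothing changes underneath the copy. Depending on how the NTFS volume is mounted, ownership and permission bits may not carry over, so it is worth checking the result before starting the node on the new disk.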
