Unexpected fault address

This morning I turned the computer on again and tried to start the storagenode. Here is what happened:

fmas@delta:~$ sudo docker start storagenode
[sudo] password for fmas: 
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker stop -t 300 storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is not running
fmas@delta:~$ sudo docker start storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
OCI runtime exec failed: exec failed: cannot exec a container that has stopped: unknown
fmas@delta:~$ sudo docker start storagenode
storagenode
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ sudo docker exec -it storagenode /app/dashboard.sh
Error response from daemon: Container b80d027df5a06f6b74e67915e0d21edf245a1f23642345aa83d5bdc86cde0834 is restarting, wait until the container is running
fmas@delta:~$ 

@Alexey Here are the logs:

 fmas@delta:~$ sudo docker logs --tail 100 storagenode

goroutine 11 [select]:
database/sql.(*DB).connectionResetter(0xc0001c83c0, 0x104a820, 0xc00014bd00)
	/usr/local/go/src/database/sql/sql.go:1065 +0xfb
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:723 +0x193
2019-10-01T05:52:27.735Z	INFO	Configuration loaded from: /app/config/config.yaml
2019-10-01T05:52:27.771Z	INFO	Operator email: aseegy@gmail.com
2019-10-01T05:52:27.771Z	INFO	operator wallet: 0xFd0E17617947f96fd955BF89E2B006d28FE37d00
unexpected fault address 0x7f405bd67000
fatal error: fault
[signal SIGBUS: bus error code=0x2 addr=0x7f405bd67000 pc=0xb174f1]

goroutine 1 [running]:
runtime.throw(0xef76d1, 0x5)
	/usr/local/go/src/runtime/panic.go:774 +0x72 fp=0xc00018c998 sp=0xc00018c968 pc=0x4335e2
runtime.sigpanic()
	/usr/local/go/src/runtime/signal_unix.go:391 +0x455 fp=0xc00018c9c8 sp=0xc00018c998 pc=0x448e35
github.com/boltdb/bolt.(*DB).page(...)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:796
github.com/boltdb/bolt.(*DB).mmap(0xc000106780, 0x0, 0x0, 0x0)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:282 +0x251 fp=0xc00018ca88 sp=0xc00018c9c8 pc=0xb174f1
github.com/boltdb/bolt.Open(0xc000038527, 0x15, 0x180, 0xc00018cb98, 0xc00038c000, 0xc00007b6c0, 0xc00018cbb0)
	/go/pkg/mod/github.com/boltdb/bolt@v1.3.1/db.go:230 +0x2ae fp=0xc00018cb50 sp=0xc00018ca88 pc=0xb16ebe
storj.io/storj/storage/boltdb.New(0xc000038527, 0x15, 0xefd33b, 0xb, 0x2, 0x2, 0xc000362f00)
	/go/src/storj.io/storj/storage/boltdb/client.go:41 +0x7f fp=0xc00018cc30 sp=0xc00018cb50 pc=0xb2732f
storj.io/storj/pkg/revocation.newDBBolt(0xc000038527, 0x15, 0xc000038520, 0x4, 0xc000038527)
	/go/src/storj.io/storj/pkg/revocation/common.go:52 +0x4e fp=0xc00018cc80 sp=0xc00018cc30 pc=0xb73dbe
storj.io/storj/pkg/revocation.NewDB(0xc000038520, 0x1c, 0xe, 0xc0000d3020, 0x1c)
	/go/src/storj.io/storj/pkg/revocation/common.go:34 +0x1bf fp=0xc00018cce0 sp=0xc00018cc80 pc=0xb73caf
storj.io/storj/pkg/revocation.NewDBFromCfg(...)
	/go/src/storj.io/storj/pkg/revocation/common.go:21
main.cmdRun(0x1888ae0, 0xc0001f00d0, 0x0, 0xd, 0x0, 0x0)
	/go/src/storj.io/storj/cmd/storagenode/main.go:143 +0x521 fp=0xc00018d280 sp=0xc00018cce0 pc=0xc29711
storj.io/storj/pkg/process.cleanup.func1.2(0x104aae0, 0xc0001ee280)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:264 +0x13b fp=0xc00018d318 sp=0xc00018d280 pc=0xaec53b
storj.io/storj/pkg/process.cleanup.func1(0x1888ae0, 0xc0001f00d0, 0x0, 0xd, 0x0, 0x0)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:282 +0x17df fp=0xc00018dd50 sp=0xc00018d318 pc=0xaeddcf
github.com/spf13/cobra.(*Command).execute(0x1888ae0, 0xc0001bbee0, 0xd, 0xd, 0x1888ae0, 0xc0001bbee0)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:762 +0x460 fp=0xc00018de28 sp=0xc00018dd50 pc=0x62cbb0
github.com/spf13/cobra.(*Command).ExecuteC(0x1888880, 0xc0000b2120, 0x1, 0x1)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:852 +0x2ea fp=0xc00018def8 sp=0xc00018de28 pc=0x62d5ea
github.com/spf13/cobra.(*Command).Execute(...)
	/go/pkg/mod/github.com/spf13/cobra@v0.0.3/command.go:800
storj.io/storj/pkg/process.Exec(0x1888880)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:73 +0x17f fp=0xc00018df48 sp=0xc00018def8 pc=0xae8f0f
main.main()
	/go/src/storj.io/storj/cmd/storagenode/main.go:296 +0x2d fp=0xc00018df60 sp=0xc00018df48 pc=0xc2af8d
runtime.main()
	/usr/local/go/src/runtime/proc.go:203 +0x21e fp=0xc00018dfe0 sp=0xc00018df60 pc=0x434f7e
runtime.goexit()
	/usr/local/go/src/runtime/asm_amd64.s:1357 +0x1 fp=0xc00018dfe8 sp=0xc00018dfe0 pc=0x463541

goroutine 22 [select]:
database/sql.(*DB).connectionResetter(0xc00038c000, 0x104a820, 0xc0000d5f80)
	/usr/local/go/src/database/sql/sql.go:1065 +0xfb
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:723 +0x193

goroutine 21 [select]:
database/sql.(*DB).connectionOpener(0xc00038c000, 0x104a820, 0xc0000d5f80)
	/usr/local/go/src/database/sql/sql.go:1052 +0xe8
created by database/sql.OpenDB
	/usr/local/go/src/database/sql/sql.go:722 +0x15d

goroutine 8 [syscall]:
os/signal.signal_recv(0x0)
	/usr/local/go/src/runtime/sigqueue.go:147 +0x9c
os/signal.loop()
	/usr/local/go/src/os/signal/signal_unix.go:23 +0x22
created by os/signal.init.0
	/usr/local/go/src/os/signal/signal_unix.go:29 +0x41

goroutine 20 [chan receive]:
storj.io/storj/pkg/process.Ctx.func1(0xc0002b9e00, 0xc00033f2c0)
	/go/src/storj.io/storj/pkg/process/exec_conf.go:89 +0x41
created by storj.io/storj/pkg/process.Ctx
	/go/src/storj.io/storj/pkg/process/exec_conf.go:88 +0x1b6

goroutine 13 [IO wait]:
internal/poll.runtime_pollWait(0x7f405befeea8, 0x72, 0x0)
	/usr/local/go/src/runtime/netpoll.go:184 +0x55
internal/poll.(*pollDesc).wait(0xc000159898, 0x72, 0x0, 0x0, 0xef9734)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitRead(...)
	/usr/local/go/src/internal/poll/fd_poll_runtime.go:92
internal/poll.(*FD).Accept(0xc000159880, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
	/usr/local/go/src/internal/poll/fd_unix.go:384 +0x1f8
net.(*netFD).accept(0xc000159880, 0xc000060cc0, 0xc000056a80, 0x7f405e25a6d0)
	/usr/local/go/src/net/fd_unix.go:238 +0x42
net.(*TCPListener).accept(0xc00030c960, 0xc000060cf0, 0x4110a8, 0x30)
	/usr/local/go/src/net/tcpsock_posix.go:139 +0x32
net.(*TCPListener).Accept(0xc00030c960, 0xe60d20, 0xc00010b9e0, 0xdb22e0, 0x187c0a0)
	/usr/local/go/src/net/tcpsock.go:261 +0x47
net/http.(*Server).Serve(0xc0001102a0, 0x1044fa0, 0xc00030c960, 0x0, 0x0)
	/usr/local/go/src/net/http/server.go:2896 +0x286
storj.io/storj/pkg/process.initDebug.func2(0xc0002b9bc0, 0x1044fa0, 0xc00030c960, 0xc00007be40)
	/go/src/storj.io/storj/pkg/process/debug.go:52 +0x15d
created by storj.io/storj/pkg/process.initDebug
	/go/src/storj.io/storj/pkg/process/debug.go:50 +0x38f
fmas@delta:~$ 

Thank you all for your support.

Dear all,

do you have any suggestions on how I should proceed?

Thanks once more.

Unfortunately, I have no idea what is wrong with your setup.
I have forwarded this issue to the team.

I got the same issue today with my setup; before that, it had been running fine since May 1st this year.
I even reinstalled the Ubuntu 18.04 VM from scratch, and I still get the same error every time:

docker exec -it storagenode /app/dashboard.sh
2019-10-03T19:45:23.594Z INFO Configuration loaded from: /app/config/config.yaml
2019-10-03T19:45:23.602Z INFO Node ID: 12p9V4idSd1zNt1k9gC6oWswrEnupc756vnyL4rMGBY6qMFirtp
2019-10-03T19:45:43.602Z FATAL Unrecoverable error {"error": "transport error: context deadline exceeded", "errorVerbose": "transport error: context deadline exceeded\n\tstorj.io/storj/pkg/transport.DialAddressInsecure:31\n\tmain.dialDashboardClient:37\n\tmain.cmdDashboard:66\n\tstorj.io/storj/pkg/process.cleanup.func1.2:264\n\tstorj.io/storj/pkg/process.cleanup.func1:282\n\tgithub.com/spf13/cobra.(*Command).execute:762\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:852\n\tgithub.com/spf13/cobra.(*Command).Execute:800\n\tstorj.io/storj/pkg/process.Exec:73\n\tmain.main:296\n\truntime.main:203"}

My data is stored on a FreeNAS server and is accessed via iSCSI. The data is spread across 4 drives in RAID-Z, and I have never had any issues.

I tried renaming the kademlia file, but it did not help.
Please help me figure this out!!!

Never mind, I got it working.
No idea what that was, but after reinstalling the VM hosting Docker it started to work.

This error is a consequence of the previous main error. The main error is visible only in the container log, via docker logs --tail 20 storagenode (in the current case the stack trace is longer, so you should use 100 lines instead of only 20).
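
As a rough sketch of how to pick the main error out of a long stack trace, the helper below prints the first "fatal error" line from a log, together with the line before it and the three lines after it. The function name first_fatal and the amount of context are my own choices for illustration, not part of the storagenode tooling; only docker logs --tail and the container name storagenode come from this thread.

```shell
#!/bin/sh
# Hypothetical helper (name and context sizes are assumptions):
# read a log on stdin and print the first line containing "fatal error",
# preceded by the previous line and followed by up to three more lines.
first_fatal() {
    awk '
        found && after > 0 { print; after--; next }  # trailing context
        found              { exit }                  # context done, stop
        /fatal error/      { if (NR > 1) print prev  # leading context
                             print
                             found = 1; after = 3; next }
                           { prev = $0 }             # remember last line
    '
}

# Usage, assuming the container is named "storagenode" as in this thread.
# The container's stderr arrives on stderr, so merge it into the pipe:
#   sudo docker logs --tail 100 storagenode 2>&1 | first_fatal
```

On the log posted above, this should show the "unexpected fault address" line, the "fatal error: fault" line, and the SIGBUS signal line without scrolling through every goroutine.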

I’m glad that you do not have the current issue and got it working!

@syncro happy for you too!

@Alexey will the team write to you, and will you then add their suggestions to this topic? Or how is it supposed to work?

I have a second computer which is running Windows 10. On that computer (computer name: echo) I also have a 1 TB WD Red drive. I could move the data over to it, but before doing so I would need to install Linux on it. I have already tried both Ubuntu and Debian in the past, and with neither was I able to get my network card to work properly. I believe the driver is not correct and needs to be installed manually. To do that I will need help, as of course without Ethernet/internet I will not be able to use the usual commands to download the correct drivers. Would you be available to help me out with this?

A friend of mine suggested I should install SUSE Linux instead, as it should be better in terms of drivers.

P.S.
If we decide to follow this path we may go off topic. Please let me know if I have to open a new topic, focused on installing Linux and adding drivers manually.

Thanks to all of you for your support. It is just sad that my node is ruining its whole career, having been offline for a week now.

I created an issue in the internal project. I will keep this thread updated, or the engineers can update it too.

With the latest Docker update, the network problem seems to be gone. You can try Docker Desktop again.

This is definitely a different topic. If you plan to start it here, I will split it anyway :slight_smile:

Did I understand correctly that I can now use Windows 10 as the OS, since Docker seems to have solved its past issues?

I haven’t seen problems with Docker Desktop for a while (at least a month), so either every SNO is starting with Linux, or Docker has fixed this issue.

I am running a Windows node as well, but my setup is not representative, because it has not had this issue since day zero (back in February).

Ok, I will then proceed with moving the data from the Linux computer to the Windows computer (although I liked the Linux version better).

How should I do that to ensure the data is transferred correctly, considering that the file systems are different (ext4 vs. NTFS)?

BTW, since the node has been offline for two weeks, please let me know whether it can be recovered or whether it has been killed.

Thanks.

I suggest using rsync as described in the

Disqualification for downtime is currently disabled, so your reputation is probably damaged, but it will recover as soon as you bring your node back online.
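
For the rsync copy, a minimal sketch could look like the one below. The paths /mnt/old/storagenode and /mnt/new/storagenode are placeholders I made up, and copy_node_data, checksums and verify_copy are hypothetical helper names. It also assumes a Linux system with rsync and sha256sum installed. The idea is to stop the node, copy the data, and then compare checksums of both trees to confirm the transfer was correct.

```shell
#!/bin/sh
# Hypothetical helper: copy everything under $1 into $2 in archive mode (-a),
# which recurses and preserves timestamps and permissions where the target
# file system supports them. The trailing slashes make rsync copy the
# directory contents rather than the directory itself.
copy_node_data() {
    rsync -a "$1"/ "$2"/
}

# Hypothetical helper: print "checksum  ./relative/path" for every file
# under $1, sorted by path so two trees can be compared line by line.
checksums() {
    (cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2)
}

# Succeeds only if both trees contain the same files with the same contents.
verify_copy() {
    a=$(mktemp) && b=$(mktemp) || return 1
    checksums "$1" > "$a" && checksums "$2" > "$b" && cmp -s "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}

# Usage (placeholder paths; the node must be stopped during the copy):
#   sudo docker stop -t 300 storagenode
#   copy_node_data /mnt/old/storagenode /mnt/new/storagenode
#   verify_copy /mnt/old/storagenode /mnt/new/storagenode && echo "copy verified"
```

Checksumming several hundred gigabytes takes a while, but it only reads the data and can be repeated safely.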


@Alexey

Tomorrow morning I will have, finally, time to dedicate to the node.

I have decided that I will try to install Ubuntu once more on the second computer I have (PC name: delta), which is currently running Windows 10. Hopefully I will manage to install the correct Ethernet driver, which was the main issue I encountered the first time I tried to install Ubuntu. Apparently the Intel I217-V controller on my motherboard is not supported “out of the box” by Ubuntu.

If I get Ubuntu running properly, I will then approach the second challenge: transferring the Storj data from the 2TB HDD to the 1TB HDD using the rsync command as you suggested. The target HDD is smaller, but as I only have 800GB of Storj data on the 2TB drive, it should not be an issue. Of course it is a pity… but by transferring the data to another drive I will exclude the risk of the 2TB drive being somehow on its death path (as @beast suggested in the past).

I will keep you updated on my progress. :slight_smile:

Thank you.

Also, you can use the Windows Installer instead of Docker.
Specify the storage folder for the data during installation.
The Docker container must be stopped and removed as well.

I found an issue on GitHub


The solution is

Recompilation with -tags disableunsafe fixed the issue.

@Alexey since I am not capable of using rsync, I have decided to copy the data on the Windows computer itself. I have connected the 2TB HDD (ext4) to my Windows 10 computer and I am now copying the data with the following utility: ext2read.

For the time being everything seems to be proceeding just fine. I believe it will take a couple of hours, so I will leave it running.

Once the copying process is finished I will use the Windows Installer as per your suggestion.

Given the short amount of time at my disposal, I have to postpone my Linux learning project. :smiley:

@Alexey I have completed the copy and used the Windows Installer. The node shows as online.
Still, the dashboard says that the remaining disk space is 800GB, which cannot be right, as I have successfully transferred the data over.

Is there anything I can do?

Thanks.

Yes, make sure that you specified the storage folder for data.
If not, you should:

  1. Install Notepad++.
  2. Open the config file "%ProgramFiles%\Storj\Storage Node\config.yaml" with Notepad++.
  3. Change the path in the storage.path: option to the path of your storage folder.
  4. Save the configuration file.
  5. Move the blobs folder from your current path to the storage folder, without replacing existing files.
  6. Restart the “Storj V3 Storage Node” service from the Services applet or from an elevated PowerShell:
Restart-Service storagenode

@Alexey It works now, node is back online! :smiley:

Thanks a lot for your time and patience.

