Seg fault on storagenode-updater service?

This node has been running for a few years, but today it's down with these messages in the logs:

2025-08-27 21:56:11,518 INFO spawned: 'storagenode-updater' with pid 51
2025-08-27T21:56:11Z INFO Configuration loaded {"Process": "storagenode-updater", "Location": "/app/config/config.yaml"}
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "kademlia.operator.wallet"}
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "kademlia.external-address"}
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "kademlia.operator.email"}
2025-08-27 21:56:11,635 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "server.address"}
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "storage.allocated-disk-space"}
2025-08-27T21:56:11Z INFO Invalid configuration file key {"Process": "storagenode-updater", "Key": "storage.allocated-bandwidth"}
2025-08-27T21:56:11Z INFO Anonymized tracing enabled {"Process": "storagenode-updater"}
2025-08-27T21:56:11Z INFO Running on version {"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.135.5"}
2025-08-27T21:56:11Z INFO Downloading versions. {"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2025-08-27T21:56:11Z INFO Command output. {"Process": "storagenode-updater", "Output": ""}
2025-08-27T21:56:11Z ERROR Error updating service. {"Process": "storagenode-updater", "Service": "storagenode", "error": "signal: segmentation fault (core dumped)", "errorVerbose": "signal: segmentation fault (core dumped)\n\tmain.update:24\n\tmain.loopFunc:26\n\tstorj.io/common/sync2.(*Cycle).Run:102\n\tmain.cmdRun:139\n\tstorj.io/common/process.cleanup.func1.2:388\n\tstorj.io/common/process.cleanup.func1:406\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tstorj.io/common/process.ExecWithCustomConfigAndLogger:80\n\tmain.main:22\n\truntime.main:283"}

The node keeps restarting itself and is down. On startup, docker looks to be pulling the most recent binary v1.135.5. I don’t see any errors at the OS level.

Ideas?

Looks like it's the storagenode that segfaults, not the updater.

Delete the storagenode binary, maybe it’s corrupted, and restart the container?

What platform and OS is this? Please post the output of uname -a.

Also, please place logs between triple backticks; they are impossible to read in a variable-width font. See: Formatting posts using markdown, BBCode, and HTML - Using Discourse - Discourse Meta

Image has been deleted and redownloaded.

System is:

Linux storj1 5.4.275-434 #1 SMP PREEMPT Tue May 7 15:05:41 UTC 2024 armv7l armv7l armv7l GNU/Linux

Result is still:

$ sudo docker ps
CONTAINER ID   IMAGE                          COMMAND                  CREATED          STATUS                                  PORTS     NAMES
ca97d7aaece8   storjlabs/storagenode:latest   "/entrypoint --stora…"   29 seconds ago   Restarting (0) Less than a second ago             storagenode

Log:

2025-08-28 02:05:19,966 INFO Set uid to user 0 succeeded
2025-08-28 02:05:19,975 INFO RPC interface 'supervisor' initialized
2025-08-28 02:05:19,976 INFO supervisord started with pid 1
2025-08-28 02:05:20,980 INFO spawned: 'processes-exit-eventlistener' with pid 48
2025-08-28 02:05:20,988 INFO spawned: 'storagenode' with pid 49
2025-08-28 02:05:20,995 INFO spawned: 'storagenode-updater' with pid 50
2025-08-28T02:05:21Z	INFO	Configuration loaded	{"Process": "storagenode-updater", "Location": "/app/config/config.yaml"}
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "kademlia.operator.email"}
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage.allocated-disk-space"}
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage.allocated-bandwidth"}
2025-08-28 02:05:21,111 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "kademlia.external-address"}
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "kademlia.operator.wallet"}
2025-08-28T02:05:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "server.address"}
2025-08-28T02:05:21Z	INFO	Anonymized tracing enabled	{"Process": "storagenode-updater"}
2025-08-28T02:05:21Z	INFO	Running on version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.135.5"}
2025-08-28T02:05:21Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
2025-08-28T02:05:21Z	INFO	Command output.	{"Process": "storagenode-updater", "Output": ""}
2025-08-28T02:05:21Z	ERROR	Error updating service.	{"Process": "storagenode-updater", "Service": "storagenode", "error": "signal: segmentation fault (core dumped)", "errorVerbose": "signal: segmentation fault (core dumped)\n\tmain.update:24\n\tmain.loopFunc:26\n\tstorj.io/common/sync2.(*Cycle).Run:102\n\tmain.cmdRun:139\n\tstorj.io/common/process.cleanup.func1.2:388\n\tstorj.io/common/process.cleanup.func1:406\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tstorj.io/common/process.ExecWithCustomConfigAndLogger:80\n\tmain.main:22\n\truntime.main:283"}
2025-08-28T02:05:21Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.135.5"}
2025-08-28T02:05:21Z	INFO	Version is up to date	{"Process": "storagenode-updater", "Service": "storagenode-updater"}
2025-08-28 02:05:22,412 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-08-28 02:05:22,416 INFO spawned: 'storagenode' with pid 73
2025-08-28 02:05:22,418 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-08-28 02:05:23,457 INFO success: storagenode entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-08-28 02:05:23,458 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
2025-08-28 02:05:24,464 INFO spawned: 'storagenode' with pid 74
2025-08-28 02:05:24,465 WARN received SIGQUIT indicating exit request
2025-08-28 02:05:24,467 INFO waiting for processes-exit-eventlistener, storagenode, storagenode-updater to die
2025-08-28T02:05:24Z	INFO	Got a signal from the OS: "terminated"	{"Process": "storagenode-updater"}
2025-08-28 02:05:24,478 INFO stopped: storagenode-updater (exit status 0)
2025-08-28 02:05:25,505 WARN stopped: storagenode (terminated by SIGSEGV (core dumped))
2025-08-28 02:05:25,507 WARN stopped: processes-exit-eventlistener (terminated by SIGTERM)

Note: using triple backticks on the log lines turned them into a single line when posting.

Triple backticks, not single :). Like so:

```
text goes here
another line
```

will show as

text goes here
another line

Single backticks – ` – indeed create a single line.

The armv7l in your uname output is a smoking gun: it's a 32-bit architecture. There were issues in the past with alignment and structure padding (example: Uplink crash: "unaligned 64-bit atomic operation" on 32-bit systems).
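
For anyone wanting to check quickly whether their own host falls into the same 32-bit category, here is a minimal sketch that maps the `uname -m` machine string to a word size. The function name `arch_bits` and the list of machine names are my own and only illustrative, not exhaustive.

```shell
#!/bin/sh
# Map a `uname -m` machine string to its word size.
# The names below are common examples only; anything else reports "unknown".
arch_bits() {
    case "$1" in
        armv5*|armv6*|armv7*|i386|i486|i586|i686) echo 32 ;;
        aarch64|arm64|x86_64|amd64)               echo 64 ;;
        *)                                        echo unknown ;;
    esac
}

machine="$(uname -m)"
echo "machine=$machine bits=$(arch_bits "$machine")"
```

On the system in this thread, `uname -m` reports armv7l, which this sketch classifies as 32-bit.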

Let's wait for someone from Storj to respond; perhaps they have a gap in test coverage.

The backticks should be on their own line. See the example above.

I tried to run it in the emulator for linux/arm/v7 and it works.

$ docker run -it --rm -e WALLET=0x..... -v storagenode:/app/config -v identity:/app/identity --platform "linux/arm/v7" --name storagenode storjlabs/storagenode:latest
$ docker exec -it storagenode uname -a
Linux dde4cfad3c8c 5.15.0-122-generic #132-Ubuntu SMP Thu Aug 29 13:45:52 UTC 2024 armv7l GNU/Linux

But I have a newer kernel. Is it possible to update your OS?
My Docker is not particularly new either:

$ docker version
Client:
 Version:           24.0.7
 API version:       1.43
 Go version:        go1.21.1
 Git commit:        24.0.7-0ubuntu2~22.04.1
 Built:             Wed Mar 13 20:23:54 2024
 OS/Arch:           linux/amd64
 Context:           default

Server:
 Engine:
  Version:          24.0.7
  API version:      1.43 (minimum version 1.12)
  Go version:       go1.21.1
  Git commit:       24.0.7-0ubuntu2~22.04.1
  Built:            Wed Mar 13 20:23:54 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.7.12
  GitCommit:
 runc:
  Version:          1.1.12-0ubuntu2~22.04.1
  GitCommit:
 docker-init:
  Version:          0.19.0
  GitCommit:
$ docker exec -it storagenode /app/bin/storagenode version
2025-08-29T03:42:56Z    INFO    Anonymized tracing enabled      {"Process": "storagenode"}
Release build
Version: v1.135.5
Build timestamp: 12 Aug 25 17:16 UTC
Git commit: 131aa8e0aaefe4c82bf583c562b040fdfceebb35

Updated to kernel 6.6.13-9 #1 SMP PREEMPT Wed Mar 6 22:28:58 UTC 2024 armv7l armv7l armv7l GNU/Linux

Same result.

This system has been mostly untouched for years. Any other avenues to check?

Please stop and remove the container, delete the bin subfolder from the data location, and delete the image:

docker rmi storjlabs/storagenode:latest

Then please try to run with the platform given explicitly in your docker run command, before the image name (the --platform option, as in my docker run command above).

Getting some error on node startup.

storj  | 2025-09-04 21:32:20,827 INFO Set uid to user 0 succeeded
storj  | 2025-09-04 21:32:20,829 INFO RPC interface 'supervisor' initialized
storj  | 2025-09-04 21:32:20,829 INFO supervisord started with pid 1
storj  | 2025-09-04 21:32:21,831 INFO spawned: 'processes-exit-eventlistener' with pid 43
storj  | 2025-09-04 21:32:21,831 INFO spawned: 'storagenode' with pid 44
storj  | 2025-09-04 21:32:21,832 INFO spawned: 'storagenode-updater' with pid 45
storj  | 2025-09-04T21:32:21Z	INFO	Configuration loaded	{"Process": "storagenode-updater", "Location": "/app/config/config.yaml"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "collector.interval"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage2.piece-scan-on-startup"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "server.private-address"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage2.max-concurrent-requests"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "storage2.database-dir"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "server.address"}
storj  | 2025-09-04T21:32:21Z	INFO	Invalid configuration file key	{"Process": "storagenode-updater", "Key": "filestore.write-buffer-size"}
storj  | 2025-09-04T21:32:21Z	INFO	Anonymized tracing enabled	{"Process": "storagenode-updater"}
storj  | 2025-09-04T21:32:21Z	INFO	Running on version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.135.5"}
storj  | 2025-09-04T21:32:21Z	INFO	Downloading versions.	{"Process": "storagenode-updater", "Server Address": "https://version.storj.io"}
storj  | 2025-09-04T21:32:22Z	INFO	Command output.	{"Process": "storagenode-updater", "Output": ""}
storj  | 2025-09-04 21:32:22,340 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
storj  | 2025-09-04T21:32:22Z	ERROR	Error updating service.	{"Process": "storagenode-updater", "Service": "storagenode", "error": "signal: segmentation fault (core dumped)", "errorVerbose": "signal: segmentation fault (core dumped)\n\tmain.update:24\n\tmain.loopFunc:26\n\tstorj.io/common/sync2.(*Cycle).Run:102\n\tmain.cmdRun:139\n\tstorj.io/common/process.cleanup.func1.2:388\n\tstorj.io/common/process.cleanup.func1:406\n\tgithub.com/spf13/cobra.(*Command).execute:985\n\tgithub.com/spf13/cobra.(*Command).ExecuteC:1117\n\tgithub.com/spf13/cobra.(*Command).Execute:1041\n\tstorj.io/common/process.ExecWithCustomOptions:115\n\tstorj.io/common/process.ExecWithCustomConfigAndLogger:80\n\tmain.main:22\n\truntime.main:283"}
storj  | 2025-09-04T21:32:22Z	INFO	Current binary version	{"Process": "storagenode-updater", "Service": "storagenode-updater", "Version": "v1.135.5"}
storj  | 2025-09-04T21:32:22Z	INFO	New version is being rolled out but hasn't made it to this node yet	{"Process": "storagenode-updater", "Service": "storagenode-updater"}
storj  | 2025-09-04 21:32:23,347 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
storj  | 2025-09-04 21:32:23,348 INFO spawned: 'storagenode' with pid 63
storj  | 2025-09-04 21:32:23,348 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
storj  | 2025-09-04 21:32:23,404 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
storj  | 2025-09-04 21:32:25,407 INFO spawned: 'storagenode' with pid 64
storj  | 2025-09-04 21:32:25,461 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
storj  | 2025-09-04 21:32:28,465 INFO spawned: 'storagenode' with pid 65
storj  | 2025-09-04 21:32:28,520 WARN exited: storagenode (terminated by SIGSEGV (core dumped); not expected)
storj  | 2025-09-04 21:32:29,521 INFO gave up: storagenode entered FATAL state, too many start retries too quickly

Fixed by clearing /bin.
Node redownloaded binaries and started normally.
Maybe this could be handled more gracefully. I didn't investigate it in depth, but I assume the binaries were either corrupted by the filesystem (I ran fsck recently and it found some errors) or only partially downloaded. Maybe add a checksum check for the binaries at startup.
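
To make the checksum idea concrete, here is a minimal sketch of what such a startup check could look like. It assumes `sha256sum` from GNU coreutils is available and that a trusted expected digest exists somewhere; nothing in this thread says the updater publishes one, so `verify_sha256` and the demo file are purely hypothetical.

```shell
#!/bin/sh
# verify_sha256 FILE EXPECTED_HEX
# Returns 0 when FILE's SHA-256 digest equals EXPECTED_HEX,
# 2 when FILE does not exist, and 1 on a digest mismatch.
verify_sha256() {
    file="$1"
    expected="$2"
    [ -f "$file" ] || return 2
    actual="$(sha256sum "$file" | cut -d ' ' -f 1)"
    [ "$actual" = "$expected" ]
}

# Demo on a throwaway file standing in for a downloaded binary.
tmp="$(mktemp)"
printf 'pretend this is a binary\n' > "$tmp"
good="$(sha256sum "$tmp" | cut -d ' ' -f 1)"
verify_sha256 "$tmp" "$good" && echo "binary OK"

# Simulate a truncated download or filesystem damage by emptying the file.
: > "$tmp"
verify_sha256 "$tmp" "$good" || echo "binary corrupted, re-download"
rm -f "$tmp"
```

A check like this would catch both suspected causes (filesystem corruption and a partial download) before the binary is ever executed, instead of letting it segfault in a restart loop.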

Same result. Node is now suspended. Not sure what else to do.

When you deleted the bin folder, was the container only stopped, or also removed?
I have seen some weird cases where Docker rolls back changes made to the bind-mounted local filesystem while the container is only stopped, not removed.
Could you please stop and remove the container, then delete the bin folder from the data location and start the container again?

If that doesn't help either, please try running with --platform linux/arm/v5 before the image name.
If that doesn't help either, please run your docker run command with --entrypoint /bin/bash added before the image name, e.g.:

docker run ...\
...
  --entrypoint /bin/bash \
storjlabs/storagenode:latest

It should open a shell.
There, run this command:

/config/bin/storagenode setup --help

Does it run?
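
If it helps to interpret the result of that manual run, here is a small sketch that turns the shell's exit status into something readable. It relies on the usual shell convention that a status above 128 means "killed by signal (status - 128)", so a segfault (SIGSEGV, signal 11 on Linux) shows up as 139. The helper name `describe_status` is my own, and the default path is simply the one from the post above.

```shell
#!/bin/sh
# Translate an exit status into a short description, following shell conventions:
#   0    success
#   126  file found but could not be executed
#   127  command not found
#   >128 terminated by signal (status - 128)
describe_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "ok"
    elif [ "$status" -eq 126 ]; then
        echo "cannot execute"
    elif [ "$status" -eq 127 ]; then
        echo "command not found"
    elif [ "$status" -gt 128 ]; then
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Path as given in the post above; override with BIN=... if yours differs.
BIN="${BIN:-/config/bin/storagenode}"
if [ -x "$BIN" ]; then
    "$BIN" setup --help >/dev/null 2>&1
    describe_status "$?"
else
    echo "no executable at $BIN"
fi
```

A result of "killed by signal 11" would mean the binary itself segfaults even outside supervisord, which points at the file rather than at the container setup.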

Thanks!

Deleting the bin folder after the container was removed worked.

Just stopping the container and removing the bin folder was not enough.
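
For anyone landing here later, the order that worked can be sketched as a small script. This is only an illustration: `DATA_DIR` is a hypothetical host path that must be replaced with your own data location, the 300-second stop timeout is my own choice, and the script defaults to a dry run that only prints the commands.

```shell
#!/bin/sh
# Recovery order reported to work in this thread:
# remove the container BEFORE deleting the bin folder.
CONTAINER="${CONTAINER:-storagenode}"           # container name used in this thread
DATA_DIR="${DATA_DIR:-/mnt/storj/storagenode}"  # hypothetical host path of the data location
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print only, 0 = really execute

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_binaries() {
    run docker stop -t 300 "$CONTAINER"  # give the node up to 300 s to shut down
    run docker rm "$CONTAINER"           # stopping alone was not enough here
    run rm -rf "$DATA_DIR/bin"           # the binaries are downloaded again on next start
}

reset_binaries
echo "Now start the node again with your usual docker run command."
```

Run it once as-is to review the printed commands, then again with `DRY_RUN=0` once the paths are right.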

I don't understand why this came up now after 5 years of running this node without issue, but at least it started up and is (sadly) cleaning out TBs of old data now.

Likely because of the data corruption. And this time it was a binary file in the storage location.

I suspected that. Glad that it finally worked.