New Storj node crashes frequently, stats console loses track of stats

Ok, here’s a fun one. (not).

New node operator here. I have the node running in a (pre-existing) Docker instance. Storage lives on a TrueNAS CORE server, mounted over NFS 4.1; the Storj node container runs on an Ubuntu Linux host.

Three problems:

  1. The stats console occasionally crashes, complaining about some DB lock. Restarting the Storj instance clears the issue; nothing else does.

  2. The entire Storj instance ends up taking a dirt nap, with a massive stack trace (the chunk below is a partial snippet of the output). After this happens, I can’t even stop or kill the container – I have to force-shutdown the VM the container host is running in and bring it all back up. This seems to happen after about 6 hours of run time, although I haven’t timed it yet.

  3. After about 3 rounds of the above, log output (docker logs) no longer works. I can see stuff getting written to the JSON spool, but that’s it.

I want to be a good participant in the network here, but if the platform is going to be this unstable, I’m thinking I should bow out before I start getting too much data on my storage. Any help to avoid this would be appreciated.

agenode/blobstore/filestore/dir_unix.go:24 +0x5d\\nstorj.io/storj/storagenode/blobstore/filestore.(*Dir).Info(0xc005a21be0?, {0xc0401a51d8?, 0xbba6c0?})\\n\\t/go/src/storj.io/storj/storagenode/blobstore/filestore/dir.go:888 +0x3c\\nstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).FreeSpace(0x43a6fb?, {0x14a9f60?, 0xc00fd20640?})\\n\\t/go/src/storj.io/storj/storagenode/blobstore/filestore/store.go:277 +0x25\\nstorj.io/storj/storagenode/pieces.(*Store).StorageStatus(0xc000170150, {0x14a9f60, 0xc00fd20640})\\n\\t/go/src/storj.io/storj/storagenode/pieces/store.go:759 +0x18e\\nstorj.io/storj/storagenode/monitor.(*Service).AvailableSpace(0xc0005345a0, {0x14a9f60, 0xc011f9bb80})\\n\\t/go/src/storj.io/storj/storagenode/monitor/monitor.go:250 +0x1f5\\nstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload(0xc0002781e0, {0x14ad7b0, 0xc03e0f28c0})\\n\\t/go/src/storj.io/storj/storagenode/piecestore/endpoint.go:315 +0x825\\nstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1({0x10ea7a0?, 0xc0002781e0}, {0xc03b3fc3c0?, 0x1d?}, {0x1045200?, 0xc018255740}, {0x14994c0?, 0x0?})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/pb/piecestore2_drpc.pb.go:243 +0xab\\nstorj.io/drpc/drpcmux.(*Mux).HandleRPC(0xc0003b9500?, {0x14aa980, 0xc018255740}, {0xc03b3fc3c0, 0x1d})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmux/handle_rpc.go:33 +0x20d\\nstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC(0xc000012648, {0x14aad00, 0xc0182556a0}, {0xc03b3fc3c0, 0x1d})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/rpc/rpctracing/handler.go:61 +0x2e3\\nstorj.io/common/experiment.(*Handler).HandleRPC(0xc00051a4b0, {0x14aae00, 0xc03b4626c0}, {0xc03b3fc3c0, 0x1d})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/experiment/import.go:42 +0x167\\nstorj.io/drpc/drpcserver.(*Server).handleRPC(0xc03c359880?, 0x14a9d28?, {0xc03b3fc3c0?, 0x1c7d900?})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:124 
+0x36\\nstorj.io/drpc/drpcserver.(*Server).ServeOne(0xc00018c480, {0x14aa318, 0xc0006691d0}, {0x14a4020?, 0xc026bafc80?})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:66 +0x1d2\\nstorj.io/drpc/drpcserver.(*Server).Serve.func2({0x14aa318, 0xc0006691d0})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:114 +0x59\\nstorj.io/drpc/drpcctx.(*Tracker).track(0xc0006691d0, 0xc03c3596c0?)\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcctx/tracker.go:35 +0x2e\\ncreated by storj.io/drpc/drpcctx.(*Tracker).Run in goroutine 1036\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcctx/tracker.go:30 +0x79\\n\\ngoroutine 72724 [syscall]:\\nsyscall.Syscall(0x13?, 0xc0000a1800?, 0xc03eea62a0?, 0x14?)\\n\\t/usr/local/go/src/syscall/syscall_linux.go:69 +0x25\\ngolang.org/x/sys/unix.Statfs({0xc03eea62a0?, 0x13?}, 0xc040591040)\\n\\t/go/pkg/mod/golang.org/x/sys@v0.15.0/unix/zsyscall_linux_amd64.go:363 +0x8c\\nstorj.io/storj/storagenode/blobstore/filestore.diskInfoFromPath({0xc03eea62a0?, 0x0?})\\n\\t/go/src/storj.io/storj/storagenode/blobstore/filestore/dir_unix.go:24 +0x5d\\nstorj.io/storj/storagenode/blobstore/filestore.(*Dir).Info(0xc03ee7ca80?, {0xc0405911d8?, 0xbba6c0?})\\n\\t/go/src/storj.io/storj/storagenode/blobstore/filestore/dir.go:888 +0x3c\\nstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).FreeSpace(0x43a6fb?, {0x14a9f60?, 0xc0378290e0?})\\n\\t/go/src/storj.io/storj/storagenode/blobstore/filestore/store.go:277 +0x25\\nstorj.io/storj/storagenode/pieces.(*Store).StorageStatus(0xc000170150, {0x14a9f60, 0xc0378290e0})\\n\\t/go/src/storj.io/storj/storagenode/pieces/store.go:759 +0x18e\\nstorj.io/storj/storagenode/monitor.(*Service).AvailableSpace(0xc0005345a0, {0x14a9f60, 0xc037828be0})\\n\\t/go/src/storj.io/storj/storagenode/monitor/monitor.go:250 +0x1f5\\nstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload(0xc0002781e0, {0x14ad7b0, 0xc03ea7ee20})\\n\\t/go/src/storj.io/storj/storagenode/piecestore/endpoint.go:315 
+0x825\\nstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1({0x10ea7a0?, 0xc0002781e0}, {0xc02fa47320?, 0x1d?}, {0x1045200?, 0xc0400424e0}, {0x14994c0?, 0x0?})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/pb/piecestore2_drpc.pb.go:243 +0xab\\nstorj.io/drpc/drpcmux.(*Mux).HandleRPC(0xc0003b9500?, {0x14aa980, 0xc0400424e0}, {0xc02fa47320, 0x1d})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmux/handle_rpc.go:33 +0x20d\\nstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC(0xc000012648, {0x14aad00, 0xc0400424a0}, {0xc02fa47320, 0x1d})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/rpc/rpctracing/handler.go:61 +0x2e3\\nstorj.io/common/experiment.(*Handler).HandleRPC(0xc00051a4b0, {0x14aae00, 0xc02dd31440}, {0xc02fa47320, 0x1d})\\n\\t/go/pkg/mod/storj.io/common@v0.0.0-20231213124955-23aba17361c7/experiment/import.go:42 +0x167\\nstorj.io/drpc/drpcserver.(*Server).handleRPC(0xc03c359c00?, 0x14a9d28?, {0xc02fa47320?, 0x1c7d900?})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:124 +0x36\\nstorj.io/drpc/drpcserver.(*Server).ServeOne(0xc00018c480, {0x14aa318, 0xc0006691d0}, {0x14a4020?, 0xc026bafe00?})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:66 +0x1d2\\nstorj.io/drpc/drpcserver.(*Server).Serve.func2({0x14aa318, 0xc0006691d0})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcserver/server.go:114 +0x59\\nstorj.io/drpc/drpcctx.(*Tracker).track(0xc0006691d0, 0x0?)\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcctx/tracker.go:35 +0x2e\\ncreated by storj.io/drpc/drpcctx.(*Tracker).Run in goroutine 1036\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcctx/tracker.go:30 +0x79\\n\\ngoroutine 72725 [IO wait]:\\ninternal/poll.runtime_pollWait(0x7f380b5f58c8, 0x72)\\n\\t/usr/local/go/src/runtime/netpoll.go:343 +0x85\\ninternal/poll.(*pollDesc).wait(0xc03ed7b300?, 0xc005a21efc?, 0x0)\\n\\t/usr/local/go/src/internal/poll/fd_poll_runtime.go:84 
+0x27\\ninternal/poll.(*pollDesc).waitRead(...)\\n\\t/usr/local/go/src/internal/poll/fd_poll_runtime.go:89\\ninternal/poll.(*FD).Read(0xc03ed7b300, {0xc005a21efc, 0x4, 0x4})\\n\\t/usr/local/go/src/internal/poll/fd_unix.go:164 +0x27a\\nnet.(*netFD).Read(0xc03ed7b300, {0xc005a21efc?, 0xc040538000?, 0x14?})\\n\\t/usr/local/go/src/net/fd_posix.go:55 +0x25\\nnet.(*conn).Read(0xc02ab1d510, {0xc005a21efc?, 0x4111a9?, 0xc005a21ef0?})\\n\\t/usr/local/go/src/net/net.go:179 +0x45\\nio.ReadAtLeast({0x7f380cc692e0, 0xc023dfc210}, {0xc005a21efc, 0x4, 0x4}, 0x4)\\n\\t/usr/local/go/src/io/io.go:335 +0x90\\nio.ReadFull(...)\\n\\t/usr/local/go/src/io/io.go:354\\ngithub.com/jtolio/noiseconn.(*Conn).readMsg(0xc026bafe00, {0xc040538000, 0x0, 0x12000})\\n\\t/go/pkg/mod/github.com/jtolio/noiseconn@v0.0.0-20230301220541-88105e6c8ac6/conn.go:209 +0x7f\\ngithub.com/jtolio/noiseconn.(*Conn).Read(0xc026bafe00, {0xc003099000, 0x3000, 0x3000})\\n\\t/go/pkg/mod/github.com/jtolio/noiseconn@v0.0.0-20230301220541-88105e6c8ac6/conn.go:171 +0x67a\\nstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing(0xc03b5cf0e0, {0xc04064e000?, 0x0?, 0x16000?})\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcwire/reader.go:96 +0x4eb\\nstorj.io/drpc/drpcmanager.(*Manager).manageReader(0xc03c359c00)\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmanager/manager.go:226 +0xfb\\ncreated by storj.io/drpc/drpcmanager.NewWithOptions in goroutine 72724\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmanager/manager.go:118 +0x416\\n\\ngoroutine 72726 [select]:\\nstorj.io/drpc/drpcmanager.(*Manager).manageStreams(0xc03c359c00)\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmanager/manager.go:313 +0x11d\\ncreated by storj.io/drpc/drpcmanager.NewWithOptions in goroutine 72724\\n\\t/go/pkg/mod/storj.io/drpc@v0.0.33/drpcmanager/manager.go:119 +0x456\\n\
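
Several goroutines in the trace above are in state [syscall] inside unix.Statfs. One possible explanation (an assumption, not something confirmed in this thread) is that the NFS mount stopped responding, which on Linux can leave processes in uninterruptible sleep (state D) where they ignore kill signals. A minimal sketch for checking that on the Docker host; the helper name filter_dstate is hypothetical:

```shell
#!/bin/sh
# Print only the rows of `ps -eo pid,stat,comm` output whose STAT column
# starts with "D" (uninterruptible sleep). Reads from stdin so it can be
# tried on saved output as well as on live data.
filter_dstate() {
    # NR > 1 skips the header row; $2 is the STAT column.
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Live usage on the Docker host (not executed here):
#   ps -eo pid,stat,comm | filter_dstate
```

If storagenode shows up in that list while the container refuses to stop, the storage mount is a more likely culprit than the node software itself.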

NFS is not going to work; as I understand it, only iSCSI does. NFS is too slow.

Hello @bluknight ,
Welcome to the forum!

Network filesystems are not supported. Please try running the node directly on your TrueNAS, using docker/jail: CLI Install - Storj Docs

Please also post the full error, not just part of the stack trace.
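
To check whether a given storage path sits on a network filesystem, something like the following may help. It is a rough sketch: the function names is_network_fs and check_storage_path are hypothetical, the list of filesystem types is illustrative rather than exhaustive, and it assumes findmnt (from util-linux) is available on the Linux host.

```shell
#!/bin/sh
# Succeeds (exit 0) if the given filesystem type name is a common
# network filesystem. The list is illustrative, not exhaustive.
is_network_fs() {
    case "$1" in
        nfs|nfs4|cifs|smb3|smbfs|sshfs|fuse.sshfs|glusterfs|fuse.glusterfs|ceph)
            return 0 ;;
        *)
            return 1 ;;
    esac
}

# Report the filesystem type backing a path.
# findmnt -n drops the header, -o FSTYPE selects the type column,
# --target resolves the mount that contains the path.
check_storage_path() {
    fstype="$(findmnt -n -o FSTYPE --target "$1")" || return 2
    if is_network_fs "$fstype"; then
        echo "$1: $fstype (network filesystem)"
    else
        echo "$1: $fstype (local filesystem)"
    fi
}
```

Calling check_storage_path with the host directory that is bind-mounted into the container should report nfs4 for a setup like the one described in the first post.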

I’d love to post the entire error/stack trace – unfortunately, this looks like multiples all rammed together, probably about 30k worth of wall-of-text. Trying not to spam the forum with crap here.

By the way, I apparently missed any citation in the documentation (probably pretty easily, too) saying that network filesystems aren’t supported. Mind pointing out the page so I can see where I went wrong?

I’ll work on moving this to a jail if at all possible.

Here's something to save you some time: GitHub - arrogantrabbit/freebsd_storj_installer: Installer script for Storj on FreeBSD. Create a jail, mount the dataset, and run the script.

Thanks. That does look like it’ll save time.

docker

docker logs storagenode 2>&1 | grep FATAL | tail

PowerShell (if you used the Windows GUI)

sls fatal "c:\Program Files\Storj\Storage Node\storagenode.log" | select -last 10
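
For the docker case, when the output is too large to paste, one option is to save the full log to a file first and filter that. A small sketch, assuming the container is named storagenode as in the docker command above; the helper name last_fatal and the file name storagenode.log are hypothetical:

```shell
#!/bin/sh
# Print the last N lines containing "FATAL" from a saved log file.
# $1 = path to the log file, $2 = number of lines (defaults to 10).
last_fatal() {
    grep -F 'FATAL' "$1" | tail -n "${2:-10}"
}

# Saving the full log first (not executed here):
#   docker logs storagenode > storagenode.log 2>&1
#   last_fatal storagenode.log
```

The saved file can then be attached or uploaded whole, with only the filtered FATAL lines pasted into the post.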