Reboot loop after upgrade

After upgrading to v1.15.3, the storage node runs for a while (about 10 minutes), then logs the following:

2020-10-27T18:06:57.360696435Z 2020-10-27T18:06:57.360Z INFO    piecestore      upload started  {"Piece ID": "HL5CDHVUZ2SS7WFA36QFTMHVS7BQOQK56SWKO6OY4GGOQYZT5VGQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "Available Space": 983028596352}
2020-10-27T18:06:57.377417673Z 2020-10-27T18:06:57.377Z INFO    piecestore      uploaded        {"Piece ID": "HL5CDHVUZ2SS7WFA36QFTMHVS7BQOQK56SWKO6OY4GGOQYZT5VGQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT"}
2020-10-27T18:06:57.434510983Z 2020-10-27T18:06:57.433Z ERROR   piecestore:cache        error getting current used space:       {"error": "lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/gt/367w5ejuvljgu22lfiiuebswq2spdlkcrs3wthmtpjcdnf57wa.sj1: bad message; lstat config/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/bh/mq7gmkve5yx24emqrj5xmn4mf3apdm5w2xduhaifhd22fpybra.sj1: bad message", "errorVerbose": "group:\n--- lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/gt/367w5ejuvljgu22lfiiuebswq2spdlkcrs3wthmtpjcdnf57wa.sj1: bad message\n\tstorj.io/storj/storage/filestore.walkNamespaceWithPrefix:787\n\tstorj.io/storj/storage/filestore.(*Dir).walkNamespaceInPath:725\n\tstorj.io/storj/storage/filestore.(*Dir).WalkNamespace:685\n\tstorj.io/storj/storage/filestore.(*blobStore).WalkNamespace:280\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:489\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:654\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:54\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func1:57\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57\n--- lstat config/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/bh/mq7gmkve5yx24emqrj5xmn4mf3apdm5w2xduhaifhd22fpybra.sj1: bad message\n\tstorj.io/storj/storage/filestore.walkNamespaceWithPrefix:787\n\tstorj.io/storj/storage/filestore.(*Dir).walkNamespaceInPath:725\n\tstorj.io/storj/storage/filestore.(*Dir).WalkNamespace:685\n\tstorj.io/storj/storage/filestore.(*blobStore).WalkNamespace:280\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:489\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:654\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:54\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func1:57\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-10-27T18:06:57.436759259Z 2020-10-27T18:06:57.436Z ERROR   services        unexpected shutdown of a runner {"name": "piecestore:cache", "error": "lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/gt/367w5ejuvljgu22lfiiuebswq2spdlkcrs3wthmtpjcdnf57wa.sj1: bad message; lstat config/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/bh/mq7gmkve5yx24emqrj5xmn4mf3apdm5w2xduhaifhd22fpybra.sj1: bad message", "errorVerbose": "group:\n--- lstat config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/gt/367w5ejuvljgu22lfiiuebswq2spdlkcrs3wthmtpjcdnf57wa.sj1: bad message\n\tstorj.io/storj/storage/filestore.walkNamespaceWithPrefix:787\n\tstorj.io/storj/storage/filestore.(*Dir).walkNamespaceInPath:725\n\tstorj.io/storj/storage/filestore.(*Dir).WalkNamespace:685\n\tstorj.io/storj/storage/filestore.(*blobStore).WalkNamespace:280\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:489\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:654\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:54\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func1:57\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57\n--- lstat config/storage/blobs/v4weeab67sbgvnbwd5z7tweqsqqun7qox2agpbxy44mqqaaaaaaa/bh/mq7gmkve5yx24emqrj5xmn4mf3apdm5w2xduhaifhd22fpybra.sj1: bad message\n\tstorj.io/storj/storage/filestore.walkNamespaceWithPrefix:787\n\tstorj.io/storj/storage/filestore.(*Dir).walkNamespaceInPath:725\n\tstorj.io/storj/storage/filestore.(*Dir).WalkNamespace:685\n\tstorj.io/storj/storage/filestore.(*blobStore).WalkNamespace:280\n\tstorj.io/storj/storagenode/pieces.(*Store).WalkSatellitePieces:489\n\tstorj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite:654\n\tstorj.io/storj/storagenode/pieces.(*CacheService).Run:54\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func1:57\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}

The node then crashes and the container restarts.
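The error entries above repeat the same two file paths several times. As a rough sketch (not part of the storagenode tooling; the function name `failing_paths` is made up for illustration), the affected files can be pulled out of a saved log with a regular expression, assuming the errors keep the `lstat <path>: bad message` shape shown above:

```python
import re

# Matches the "lstat <path>: bad message" fragments seen in the log above.
# Assumption: paths contain no whitespace, as in the pasted entries.
LSTAT_ERROR = re.compile(r"lstat (\S+?): bad message")


def failing_paths(log_text):
    """Return the unique paths reported with 'bad message', in first-seen order."""
    seen = {}
    for match in LSTAT_ERROR.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)
```

Running this over the container log text would give a short list of files to look at once the filesystem has been checked.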

I can't confirm yet whether this is an issue, since my nodes haven't updated yet. Did this happen on all of your Linux nodes or just one?

Not that this helps you in any way, but one of my 3 Linux nodes upgraded a couple of hours ago. It seemed to start right back up without issue.

Check your disk for errors. Free space could be marked as allocated, which fsck should be able to detect. You are a pro, so you know the best methods already :slight_smile:
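Before or after running fsck, it can help to know how many files are affected. The sketch below is only an illustration, not a replacement for a filesystem check: it walks a directory tree and records every entry whose `lstat` call fails, which is the same call that fails in the log. The function name `find_unreadable` is hypothetical, and the blobs path in the usage note is taken from the log and may differ on your setup. On Linux the `bad message` text normally corresponds to errno `EBADMSG`.

```python
import errno
import os


def find_unreadable(root):
    """Walk root and return (path, errno name) pairs for entries that cannot be stat'ed."""
    bad = []

    def record(err):
        # Translate the numeric errno into its symbolic name, e.g. "EBADMSG".
        name = errno.errorcode.get(err.errno, str(err.errno))
        bad.append((err.filename, name))

    # onerror catches directories that cannot be listed at all.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=record):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                os.lstat(path)
            except OSError as err:
                record(err)
    return bad
```

For example, `find_unreadable("config/storage/blobs")` run with the node stopped would list the damaged entries. An empty list does not prove the filesystem is healthy, so fsck is still the real check.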

Thanks! :slightly_smiling_face:
I've already fixed this issue. Yes, it was a filesystem error.
But I wonder… yesterday I had database corruption in another location, and today corruption of the main filesystem… I will definitely investigate the root cause.

Please also share the details of that setup.

Yes, sure, I will create a new thread in “Production Enthusiasts” and post all detailed information.