There are already many posts on this subject, but they’re pretty old, so I thought the issue had since been resolved.
My nodes updated to v1.18.1 15h ago, and some of them (not all of them) started receiving more data even though they were full already (with ~480MB free, so they had stopped receiving data many days ago).
They stopped receiving data again eventually, but now they have 241MB, 50MB and -194MB of free space.
The only thing these 3 nodes have in common is that they are small (500GB) and sit on the same 2.5" SMR disk, which took roughly 10 hours or more (!) to browse all files after the update.
So I was wondering whether nodes start receiving data again after an update, until they “realize” the disk is full once the filewalker has finished browsing all files?
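To make the hypothesis concrete, here is a toy model of it. This is only a sketch of my guess, not how the Storj node software actually works: the function name, the "cached usage" figure and all the numbers (in MB) are made up for illustration.

```python
def free_space_after_scan(allocated, true_used, cached_used, uploads):
    """Toy model (hypothetical, not Storj code): accept uploads based on a
    possibly stale cached usage figure, then return the real free space
    that a full scan of the files would reveal afterwards."""
    for size in uploads:
        # The node only sees its cached figure while the scan is running.
        if allocated - cached_used < size:
            break  # the node believes it is full and stops accepting data
        cached_used += size
        true_used += size
    return allocated - true_used


# A 500GB node with 480MB really free, but whose cached figure says 1000MB free.
print(free_space_after_scan(500_000, 499_520, 499_000, [100] * 20))
```

If the cached figure underestimates the real usage, the node keeps accepting pieces past its allocation and ends up with negative free space, which would match what I’m seeing on the third node.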
The only things that come up from the past 24h of logs when running grep error are these:
2020-12-09T14:44:27.986Z ERROR piecestore download failed {"Piece ID": "RO36YWHA7X25G64SR7DJBCYNXQVNSBEKATDY5EB6OQDVR5HKZV3A", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "error": "write tcp 172.17.0.3:28967->176.9.121.114:51308: use of closed network connection", "errorVerbose": "write tcp 172.17.0.3:28967->176.9.121.114:51308: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:228\n\tstorj.io/drpc/drpcwire.SplitN:29\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:276\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:322\n\tstorj.io/common/pb.(*drpcPiecestoreDownloadStream).Send:1089\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:580\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
2020-12-09T18:04:51.471Z ERROR servers unexpected shutdown of a runner {"name": "debug", "error": "debug: http: Server closed", "errorVerbose": "debug: http: Server closed\n\tstorj.io/private/debug.(*Server).Run.func2:108\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-12-09T18:04:55.624Z FATAL Unrecoverable error {"error": "debug: http: Server closed", "errorVerbose": "debug: http: Server closed\n\tstorj.io/private/debug.(*Server).Run.func2:108\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2020-12-09T20:38:05.492Z ERROR piecestore download failed {"Piece ID": "SDYFTABGT4VKUNOPYPCXO2IDI6XFSGOK27N6EJ5B25SLMIGXM2GA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "write tcp 172.17.0.8:28967->46.4.33.240:45008: use of closed network connection", "errorVerbose": "write tcp 172.17.0.8:28967->46.4.33.240:45008: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:228\n\tstorj.io/drpc/drpcwire.SplitN:29\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:276\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:322\n\tstorj.io/common/pb.(*drpcPiecestoreDownloadStream).Send:1089\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:580\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
2020-12-09T23:55:58.137Z ERROR piecestore download failed {"Piece ID": "LOH25XEPCPC6PSZYWNPUECDBGLTEEED665L432WJUTDC635CDW5A", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET", "error": "write tcp 172.17.0.8:28967->176.9.121.114:52082: use of closed network connection", "errorVerbose": "write tcp 172.17.0.8:28967->176.9.121.114:52082: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:228\n\tstorj.io/drpc/drpcwire.SplitN:29\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:276\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:322\n\tstorj.io/common/pb.(*drpcPiecestoreDownloadStream).Send:1089\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:580\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
2020-12-10T05:38:56.633Z ERROR piecestore download failed {"Piece ID": "B7K5HQP45EYNASYPNPCKZOPYV22YFQXMJTYDWBBFPXCEO2GVI4HA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "write tcp 172.17.0.8:28967->46.4.33.240:48330: use of closed network connection", "errorVerbose": "write tcp 172.17.0.8:28967->46.4.33.240:48330: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:228\n\tstorj.io/drpc/drpcwire.SplitN:29\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:276\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:322\n\tstorj.io/common/pb.(*drpcPiecestoreDownloadStream).Send:1089\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:580\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
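For anyone who wants to summarize this kind of output rather than read it line by line, here is a small sketch that groups entries like the ones above. It assumes each line looks like "timestamp LEVEL message {json}", as in my logs; the function name summarize is mine, and the IP:port masking is just so identical failures from different peers are counted together.

```python
import json
import re
from collections import Counter


def summarize(lines):
    """Count log entries by (level, message, error with addresses masked)."""
    counts = Counter()
    for line in lines:
        # Everything from the first "{" onwards is the JSON payload.
        head, brace, tail = line.partition("{")
        parts = head.split(None, 2)
        if len(parts) < 3 or not brace:
            continue  # not a structured log line
        _timestamp, level, message = parts
        try:
            fields = json.loads(brace + tail)
        except json.JSONDecodeError:
            continue  # payload is truncated or not valid JSON
        error = fields.get("error", "")
        # Mask IPv4 address:port pairs so the same failure groups together.
        error = re.sub(r"\d+\.\d+\.\d+\.\d+:\d+", "<addr>", error)
        counts[(level, message.strip(), error)] += 1
    return counts
```

It can be fed the grep output directly, for example with summarize(open("errors.log")), where errors.log is a hypothetical file holding the lines above. Note that the message part includes the component name when there is one (e.g. "piecestore download failed").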
I’m wondering if there is something that should be fixed in the Storj node software… ?
Anyway, running multiple nodes (even small ones) on a weak SMR drive is obviously a very bad idea, so I’ll be migrating some nodes elsewhere, to ease the load on the disk a bit ^^’