Stop spamming the logs with "New version is being rolled out but hasn't made it to this node yet"

So I’m trying out this storage node operator thing and monitoring my logs as a sysadmin should, but since yesterday the storagenode-updater spams my logs with:

Feb  9 08:10:23 localhost storagenode-updater[3444]: 2023-02-09T08:10:23.869+0100#011INFO#011New version is being rolled out but hasn't made it to this node yet#011{"Process": "storagenode-updater", "Service": "storagenode"}
Feb  9 08:10:23 localhost storagenode-updater[3444]: 2023-02-09T08:10:23.877+0100#011INFO#011New version is being rolled out but hasn't made it to this node yet#011{"Process": "storagenode-updater", "Service": "storagenode-updater"}

every 15 minutes.
Is there a way to suppress these messages while still being able to see in the log when the node actually updates?

And while I’m at it, with the storagenode service itself I already had to change the log level to ERROR since it was spamming way too much useless information at INFO level.
Still… even at ERROR level I occasionally - 1 or 2 times a day - get a random log entry like one of these:

Feb  7 22:32:07 localhost storagenode[3442]: 2023-02-07T22:32:07.917+0100#011ERROR#011piecestore#011download failed#011{"Process": "storagenode", "Piece ID": "*", "Satellite ID": "*", "Action": "GET", "error": "context deadline exceeded", "errorVerbose": "context deadline exceeded\n\tstorj.io/common/rpc/rpcstatus.Wrap:73\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func6:745\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:763\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:122\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:112\n\tstorj.io/drpc/drpcctx.(*Tracker).track:52"}
Feb  7 21:50:49 localhost storagenode[3442]: 2023-02-07T21:50:49.002+0100#011ERROR#011piecestore#011download failed#011{"Process": "storagenode", "Piece ID": "*", "Satellite ID": "*", "Action": "GET", "error": "write tcp *:28967->*:17130: use of closed network connection", "errorVerbose": "write tcp *:28967->*:17130: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).rawWriteLocked:326\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:404\n\tstorj.io/common/pb.(*drpcPiecestore_DownloadStream).Send:341\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:697\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
Feb  7 10:55:31 localhost storagenode[3442]: 2023-02-07T10:55:31.219+0100#011ERROR#011piecedeleter#011could not send delete piece to trash#011{"Process": "storagenode", "Satellite ID": "*", "Piece ID": "*", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storage/filestore.(*blobStore).Stat:102\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:369\n\tstorj.io/storj/storagenode/pieces.(*Deleter).deleteOrTrash:185\n\tstorj.io/storj/storagenode/pieces.(*Deleter).work:135\n\tstorj.io/storj/storagenode/pieces.(*Deleter).Run.func1:72\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

(IDK if satellite/piece IDs are sensitive info, but just in case I’ve replaced them with *.)

Why log these when the node operator can do nothing about them? It just makes it harder to monitor the real logs that need actual intervention from a sysadmin.


From the sounds of it you don’t want any logs but still want to see logs :man_shrugging:

The storagenode-updater logs are INFO-level logs, so they are telling you that the update check occurred on its scheduled interval and your node hasn't been selected for the upgrade yet. To me that's useful to know on an informational level.
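That said, if you want to silence just that one message without losing other INFO lines, a drop rule in rsyslog should do it. This is only a rough sketch - it assumes your logs flow through rsyslog (which the syslog-style entries above suggest) on version 7 or newer, and the file name is just an example:

# /etc/rsyslog.d/30-storagenode-updater.conf  (example file name)
# Discard only the rollout-status line; everything else still gets logged.
:msg, contains, "New version is being rolled out but hasn't made it to this node yet" stop

Restart rsyslog afterwards (systemctl restart rsyslog) and that line disappears, while the eventual update messages keep getting logged.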

The next ERROR logs you show are issues with your node, and if they occur repeatedly they could be a sign of a problem which does need resolving. It could be an internet problem or slow storage, or, for the final one, even a lost file.
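If you want a feel for how often they recur, something like this gives a per-day count (assuming your syslog lands in /var/log/messages - adjust the path, e.g. /var/log/syslog on Debian/Ubuntu):

# Count piecestore download failures per day in the syslog-style log
grep 'piecestore.*download failed' /var/log/messages | awk '{print $1, $2}' | sort | uniq -c

A couple of hits a day, as you describe, is probably nothing dramatic; a sudden jump in that count is what would be worth investigating.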


Could you please share here how you did both - it could be useful for other Community Members.

In the Docker manual I found the “--log-level” parameter. If you put it in front of the storagenode container’s name in “docker run…”, does it work the same as “--log.level” put after the container’s name? Does it also work for the watchtower run command?

Parameters placed before the image name will go to the docker engine, parameters placed after the image name will go to the storagenode (actually to the entrypoint script).
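As a quick illustration (a stripped-down sketch, not a complete run command - mounts and environment variables are left out):

# Before the image name: options for the Docker engine (here: json-file log rotation).
# After the image name: arguments passed to the entrypoint, i.e. the storagenode binary.
docker run -d \
	--log-opt max-size=10m \
	--log-opt max-file=5 \
	storjlabs/storagenode:latest \
	--log.level=error

That's why, in the full commands below, --log-opt sits before the image name while --log.level=error comes after it.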

I found the correct flag for Docker watchtower: --notifications-level error
You can use this watchtower command for maximum comfort :slight_smile:

docker run -d --restart=always --log-opt max-size=10m --log-opt max-file=5 --name watchtower -v /var/run/docker.sock:/var/run/docker.sock storjlabs/watchtower --stop-timeout 300s --notifications-level error

It will update all running containers, keep only the last 5 log files of at most 10MiB each, and record only errors, just like the storagenode run command:

docker run -d --restart unless-stopped --stop-timeout 300 \
	-p 28967:28967/tcp \
	-p 28967:28967/udp \
	-p 14002:14002 \
	-p 5999:5999 \
	-e WALLET="xxx" \
	-e EMAIL="xxx" \
	-e ADDRESS="xxx:28967" \
	-e STORAGE="xxTB" \
	--mount type=bind,source="/volume1/Storj/Identity/storagenode/",destination=/app/identity \
	--mount type=bind,source="/volume1/Storj/",destination=/app/config \
	--log-opt max-size=10m \
	--log-opt max-file=5 \
	--name storagenode storjlabs/storagenode:latest \
	--server.address=":28967" \
	--console.address=":14002" \
	--debug.addr=":5999" \
	--log.level=error \
	--filestore.write-buffer-size 4MiB \
	--pieces.write-prealloc-size 4MiB \
	--storage2.piece-scan-on-startup=true \
	--operator.wallet-features=zksync
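After recreating the container you can confirm the quieter level took effect by tailing its output (using the container name from the command above):

# Follow the container log; with --log.level=error only ERROR entries should show up.
docker logs --tail 50 -f storagenode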