Errors after offline period

After some downtime caused by a power outage, my script reports a lot of:

  ERROR    piecestore    download failed    

(around 10% of the log lines contain this error)

I checked the piece IDs and they are not the same every time.
Is this a problem with my node or with the network?

***Running on a Raspberry Pi 4 with Ubuntu; the connection is pretty solid and it doesn’t have any downtime.

Update:

2023-03-21T10:04:06.801Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "AYRGVQH7I7QSFTEVNHDX2Z27XKRLERLZRVNEITVWYCK4WVTLYDQA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 52224, "Size": 163840, "Remote Address": "184.104.224.98:28974", "error": "write tcp 172.17.0.3:28967->184.104.224.98:28974: write: connection reset by peer", "errorVerbose": "write tcp 172.17.0.3:28967->184.104.224.98:28974: write: connection reset by peer\n\tstorj.io/drpc/drpcstream.(*Stream).rawFlushLocked:401\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:462\n\tstorj.io/common/pb.(*drpcPiecestore_DownloadStream).Send:349\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func6.2:729\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
2023-03-21T10:04:09.941Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "7ABHFAXB2YZ26GSJQIWQWQMUGJEKOFDPJIEDEKAOL4OKEUDM5LWA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 1327104, "Size": 311296, "Remote Address": "184.104.224.99:42846", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:42846: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:42846: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:12.990Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "6PRENXMBGNH7FX4H22IDMPX5PECMGSCD5RDAKGICV2MT3AHEV3HA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 135168, "Size": 263680, "Remote Address": "184.104.224.99:42852", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:42852: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:42852: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:13.672Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "PE7XCG5M2HSL35UYOREKWJDSXTD73ROBRGIKOGKLRNH3TFZLPOSA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 240384, "Size": 158464, "Remote Address": "184.104.224.99:52014", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:52014: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:52014: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:16.091Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "JISJXN5XIIZY7WJCLTUVHTJ7OQKMURV34P2QGR5GG3MLYWDJKXYQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 573696, "Size": 163840, "Remote Address": "184.104.224.98:29186", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.98:29186: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.98:29186: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:19.879Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "XYMVK6TFBVTPO7BYWQL4Z66I3TFYJOGKBBICYKR4G45GT4JQUU6Q", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 193792, "Size": 205056, "Remote Address": "72.52.83.202:36124", "error": "manager closed: read tcp 172.17.0.3:28967->72.52.83.202:36124: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->72.52.83.202:36124: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:19.913Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "6JYFQFDIHKJXKRIDUYSGGY5F27FIW3JX6DMJTCJ25QWF7PYFIVNA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 1876224, "Size": 163840, "Remote Address": "184.104.224.98:29282", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.98:29282: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.98:29282: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:28.006Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "RVIYCDC6QC6QBHNQTFYNNZSUQQY5Y3HWY7TRTNYMQDANCTMGNBJQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 27392, "Size": 311296, "Remote Address": "72.52.83.202:59930", "error": "manager closed: read tcp 172.17.0.3:28967->72.52.83.202:59930: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->72.52.83.202:59930: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:28.249Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "WMNKNCZOW5SRMRZV3YE3X3Z3MST7EOT2FBQLSXVAC74EBAJBRDLQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 1230080, "Size": 163840, "Remote Address": "184.104.224.99:34626", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:34626: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:34626: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:52.073Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "SEDGJWTWEJOXSUP5DZ4PLLNZ5K7F7K3ARTPAK6LR2F5VFTKWZDCA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 1003520, "Size": 311296, "Remote Address": "72.52.83.203:53818", "error": "manager closed: read tcp 172.17.0.3:28967->72.52.83.203:53818: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->72.52.83.203:53818: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}
2023-03-21T10:04:55.193Z    ERROR    piecestore    download failed    {"Process": "storagenode", "Piece ID": "OLQ62KX54HJQTFMFMEWPDHD4OIOEGTQMQKDFDIAGHFN4KZWQRLKQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 328448, "Size": 70400, "Remote Address": "184.104.224.99:60182", "error": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:60182: read: connection reset by peer", "errorVerbose": "manager closed: read tcp 172.17.0.3:28967->184.104.224.99:60182: read: connection reset by peer\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}

These are some of the errors, but they all have similar output.
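
To check whether the errors really are all alike, a rough sketch like the one below could group them by cause. It assumes every log line ends in a JSON object with an "error" field, as in the lines posted here; `error_signature` and `count_signatures` are names made up for this illustration and are not part of the storagenode software.

```python
import json
import re
from collections import Counter

# Matches "ip:port" endpoints such as 184.104.224.98:28974.
ENDPOINT = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}:\d+\b")


def error_signature(line):
    """Return the "error" text with endpoints masked, or None if unavailable."""
    start = line.find("{")
    if start == -1:
        return None  # no JSON payload on this line
    try:
        fields = json.loads(line[start:])
    except json.JSONDecodeError:
        return None  # payload is not valid JSON (assumption: skip such lines)
    message = fields.get("error")
    if message is None:
        return None
    # Replace every ip:port so that only the cause of the error remains.
    return ENDPOINT.sub("<addr>", message)


def count_signatures(lines):
    """Count how often each masked error message occurs."""
    return Counter(sig for sig in map(error_signature, lines) if sig is not None)
```

Calling `count_signatures` on the lines of a saved log and looking at `most_common()` shows at a glance whether one cause (here, "connection reset by peer") dominates.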

Please show the whole error.
Please also note that GET/PUT requests can be canceled on the client side due to long tail cancellation (i.e. your node was slower than others). Those errors are not avoidable - your node cannot be close to every customer across the globe.
If you have failed GET_AUDIT or GET_REPAIR, it could affect suspension and audit scores.
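
A quick way to check this in your own log is to count failed transfers per Action. This is only a minimal sketch, assuming the log format matches the lines posted in this thread (an ERROR line ending in a JSON object with an "Action" field); the function names and the `SCORE_RELEVANT` set are my own, chosen for illustration.

```python
import json
from collections import Counter

# Actions that, per the reply above, can affect audit and suspension scores.
SCORE_RELEVANT = {"GET_AUDIT", "GET_REPAIR"}


def failed_actions(lines):
    """Count ERROR lines that report a failed transfer, grouped by Action."""
    counts = Counter()
    for line in lines:
        if "ERROR" not in line or "failed" not in line:
            continue
        start = line.find("{")
        if start == -1:
            continue  # no JSON payload on this line
        try:
            fields = json.loads(line[start:])
        except json.JSONDecodeError:
            continue  # assumption: silently skip unparsable payloads
        counts[fields.get("Action", "UNKNOWN")] += 1
    return counts


def summarize(lines):
    """Return total failures, score-relevant failures, and the failure share."""
    lines = list(lines)
    counts = failed_actions(lines)
    failed = sum(counts.values())
    relevant = sum(n for action, n in counts.items() if action in SCORE_RELEVANT)
    share = failed / len(lines) if lines else 0.0
    return {"failed": failed, "score_relevant": relevant, "share": share}
```

Pass it the lines of a saved log file. If `score_relevant` stays at 0, the failures are ordinary customer GET/PUT traffic, and `share` gives the fraction of log lines that are failures.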

OK, I just updated the post with some of the errors.

I don’t see any GET_AUDIT or GET_REPAIR errors, though.

The problem is that my logs were always very clean of errors, and now, after some offline period, I see all those errors.

Is this expected to happen?

I don’t think this is related to the offline event. These errors are more likely related to the Noise protocol implementation;

see Connection reset by peer errors

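12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs also shows up a lot in my log, for example (the German part is the Windows message "An existing connection was forcibly closed by the remote host"):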

2023-03-26T15:12:57.891+0200 ERROR piecestore download failed {"Piece ID": "FQHG6TPT25Z73SMTBVBMSBW3OZ2PEXY67UGL2TZZWYR2EO6TGV4A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "Offset": 328448, "Size": 70400, "Remote Address": "149.6.140.106:12642", "error": "manager closed: read tcp 192.168.2.252:28967->149.6.140.106:12642: wsarecv: Eine vorhandene Verbindung wurde vom Remotehost geschlossen.", "errorVerbose": "manager closed: read tcp 192.168.2.252:28967->149.6.140.106:12642: wsarecv: Eine vorhandene Verbindung wurde vom Remotehost geschlossen.\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:183\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:143\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:223"}

This is the usual case of long tail cancellation (your node is slower than others).
But if you see something similar for GET_AUDIT or GET_REPAIR - that’s a problem.
Do you?

Audit score is 100% and there are no GET_AUDIT or GET_REPAIR errors in the log.
I also ran a drive scan; it took about 45 minutes for 6 TB of data, and no errors were found. I changed the USB port and restarted Windows. I started the node software and it runs as usual, with no sign of hardware problems so far.

Make sure that your external disk has an external power supply - external USB disks are not designed to work 24/7.

It has. There is no sign that the disk is broken or unresponsive; so far, a restart of the node service has always worked for some hours.
Can I set the log level further down, to severe or something?
Since logging shortens the lifespan of my main SSD, I want as little logging as necessary. I don’t care about aborted downloads or uploads.

Yes, of course. You may change the log level in your config.yaml:

log.level: warn

or, if you use Docker, pass the argument --log.level warn after the image name in your docker run command.