Hey all! I wanted to take the opportunity to compile a list of error codes, what they mean, and whether you need to worry about them. I will be updating this as I am able to get information, but wanted to create a central place that people can check. Find the error you are looking for in the list and click the Summary section below it. This will open up with a description of what the error means and its severity.
11/30/2020 - A quick note, since I have found threads that reference errors in this post but with updated error text compared to when the errors were added here: look for the common terms. Even if the message isn't exactly the same, it will likely contain the same information as an error here. For example, the Context Canceled error is now reported as trust:rpc:Context Canceled rather than any of the previous Context Canceled errors listed in this post. I will try to take time over this upcoming weekend to look at all the different troubleshooting posts and update the post with the newer text for the errors.
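If you would rather not eyeball the log, the "look for the common terms" advice can be sketched roughly as below. This is only an illustration: the term list is pulled from the errors in this post and is not exhaustive, the `classify` helper and its notes are my own naming, and the sample log line is made up to show the newer trust:rpc:Context Canceled wording.

```python
# Common terms taken from the errors in this post, each paired with a short note.
# Illustrative only: the list is not exhaustive and the notes are paraphrased.
COMMON_TERMS = [
    ("database disk image is malformed", "critical: the database needs to be repaired"),
    ("unique constraint failed", "normal: no action needed"),
    ("context canceled", "normal: other nodes were faster"),
    ("unexpected eof", "informational: no action needed"),
    ("too many requests", "informational: tunable in config.yaml"),
    ("no recent network activity", "check your UDP setup"),
]


def classify(log_line: str) -> str:
    """Return the note for the first common term found in the log line."""
    lowered = log_line.lower()  # match regardless of capitalization
    for term, note in COMMON_TERMS:
        if term in lowered:
            return note
    return "unknown: not covered in this post"


# Hypothetical line using the newer wording mentioned above.
print(classify("ERROR piecestore upload failed trust:rpc:Context Canceled"))
```

Matching on lowercase substrings is deliberate: it keeps working when the prefix in front of the familiar term changes between releases.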
UNIQUE Constraint failed:
2019-07-14T17:31:24.969Z ERROR piecestore internal: infodb: UNIQUE constraint failed: pieceinfo.satellite_id, pieceinfo.piece_id
Summary
You will only see this error when a satellite attempts to upload a piece to your node as part of a repair job, but your node already has the piece so it errors out as piece IDs are required to be unique. This is a normal error and requires no action from the SNO.
2019-08-31T23:35:57.036Z ERROR server piecestore protocol: rpc error: code = Internal desc = transport: transport: the stream is done or WriteHeader was already called
Summary
This error is much like Context Canceled except that it is almost guaranteed that other nodes were faster in receiving their pieces than yours. It is an informational error and should not be concerning for an SNO.
Upload Rejected:
2019-07-14T00:51:47.842Z ERROR piecestore upload rejected, too many requests {"live requests": 7}
Summary
This error results from a setting implemented in v0.14.9 that the SNO can modify. It does not impact your reputation, so seeing it is not indicative of a problem on the SNO side. Two notes: 1) ARM-based nodes should probably not go above 20 or 30 with the setting detailed below, as higher values can overload them, and 2) Windows users need to use Notepad++ to modify the required file, as Notepad does not respect formatting. To modify this setting, stop your node and open config.yaml, which is in the storage directory you specified in your run command.
Add this line to the bottom of the file (the number should be tweaked to your node's performance):
storage2.max-concurrent-requests: 50
Save the file then start the node again to correct this informational error.
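For anyone who prefers to script the edit instead of using a text editor, here is a minimal sketch of the same change. The function name is mine, and it treats config.yaml as plain "key: value" lines rather than parsing it as YAML, which is an assumption that matches the one-line setting shown above. Stop the node before running anything like this.

```python
def set_max_concurrent_requests(config_text: str, value: int) -> str:
    """Return the config text with storage2.max-concurrent-requests set to value."""
    key = "storage2.max-concurrent-requests"
    new_line = f"{key}: {value}"
    lines = config_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        # Only touch an active setting; commented-out lines are left alone.
        if line.strip().startswith(key + ":"):
            lines[i] = new_line
            replaced = True
    if not replaced:
        # Same as the manual step: add the line to the bottom of the file.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


# Example usage (the path is a placeholder for your own storage directory):
# from pathlib import Path
# path = Path("/path/to/storage/config.yaml")
# path.write_text(set_max_concurrent_requests(path.read_text(), 20))
```

Replacing an existing active line instead of always appending avoids ending up with the same key twice in the file.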
Context Canceled:
2019-07-14T19:54:19.570Z ERROR piecestore protocol: rpc error: code = Canceled desc = context canceled
2019-07-16T12:31:06.716Z ERROR piecestore internal: infodb: context canceled
2019-07-14T10:21:47.244Z ERROR piecestore internal: infodb: interrupted
Summary
Example: Upload failed in log files
Context Canceled indicates that your node was too slow in the transfer and the required number of nodes completed their transfers before yours. This is a normal error to encounter and should only be concerning if you're getting only these errors, along with errors about uploads/downloads starting and always failing. At that point, you will want to investigate your internal network for issues, then check that your speeds meet the project requirements.
The interrupted error only shows up on uploads and occurs when a message that would trigger a context canceled error is received while the Storage Node is writing to the infodb and the connection to the db is immediately cut as a result.
Unexpected EOF:
2019-07-14T19:55:10.668Z ERROR piecestore protocol: unexpected EOF
Summary
This error is typically only encountered on uploads to the storagenode and is similar to the Context Canceled error in that it typically indicates a slower node but can also mean the uplink or satellite cancelled the request before the upload completed rather than other nodes being faster. This is an informational error and should not require any action from SNOs.
infodb: database disk image is malformed:
{"error": "infodb: database disk image is malformed", "errorVerbose": "infodb: database disk image is malformed\n\tstorj.io/storj/storagenode/storagenodedb.(*ordersdb).ListUnsentBySatellite:151\n\tstorj.io/storj/storagenode/orders.(*Sender).runOnce:112\n\tstorj.io/storj/internal/sync2.(*Cycle).Run:87\n\tstorj.io/storj/storagenode/orders.(*Sender).Run:105\n\tstorj.io/storj/storagenode.(*Peer).Run.func5:336\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
Summary
This error requires the SNO to perform the steps included in the link in order to recover the infodb database and is a critical error that requires immediate attention.
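"database disk image is malformed" is the message SQLite itself gives for a damaged database file, so before (or after) going through the recovery steps you can ask SQLite whether it considers a file healthy. The sketch below uses Python's built-in sqlite3 module and SQLite's PRAGMA integrity_check. It is only a check, not a repair and not a replacement for the recovery steps; the function name and the database path are assumptions on my part, and the node should be stopped first.

```python
import sqlite3


def integrity_check(db_path: str) -> list:
    """Run SQLite's PRAGMA integrity_check and return the reported messages."""
    # Open read-only so the check itself cannot modify the file.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        rows = conn.execute("PRAGMA integrity_check;").fetchall()
    except sqlite3.DatabaseError as exc:
        # A badly damaged file can fail before the pragma returns any rows.
        return [str(exc)]
    finally:
        conn.close()
    return [row[0] for row in rows]


# Example usage (placeholder path; point it at your own database file):
# print(integrity_check("/path/to/storage/your-database.db"))
```

A healthy database reports a single "ok" row; anything else means the file still needs attention.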
Voucher Errors:
ERROR vouchers Error requesting voucher{"satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "error": "voucher: unable to find satellite on the network: node not found", "errorVerbose": "voucher: unable to find satellite on the network: node not found\n\tstorj.io/storj/storagenode/vouchers.(*Service).request:127\n\tstorj.io/storj/storagenode/vouchers.(*Service).Request:116\n\tstorj.io/storj/storagenode/vouchers.(*Service).RunOnce.func1:103\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
Summary
NOTE: As of v0.21.1, this message should no longer appear, as kademlia was removed in that release.
If your node is new, you may encounter this error for any of the four satellites, and that is normal. If your node has been up for a while (7d+), however, it is likely indicative of DNS or other networking issues. If you cannot find issues with your DNS, you may try stopping the node, renaming kademlia to kademlia.bak, and then starting the node again. You will get this message upon start, but that is due to your node now having an empty kademlia routing table. It may or may not go away.
ERROR server rpc error: code = PermissionDenied desc = info requested from untrusted peer
Summary
With the latest update, getting node info is restricted to trusted nodes only. When someone else tries to retrieve this information, you'll see this in your log.
Nodestats:cache messages:
ERROR nodestats:cache Get disk space usage query failed {"error": "node stats service error: rpc error: code = PermissionDenied desc = node not found"}
ERROR nodestats:cache Get stats query failed {"error": "node stats service error: unable to connect to the satellite"}
Summary
If the node ID that was not found matches your node ID, this is a temporary message caused by the fact that your node is new enough that not all of the satellites know of your node yet. It should go away with time.
The second error appears if the satellite is down or otherwise unavailable.
Download Errors:
2019-08-29T15:54:15.647Z INFO piecestore download failed {"Piece ID": "AXNYNZLQSU6FTH55AJPWK34BQCFDWG5EWFBPTNOVLOHA2KVXUT4Q", "SatelliteID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Action": "GET", "error": "piecestore: piecestore protocol: rpc error: code = Unavailable desc = transport is closing", "errorVerbose": "piecestore: piecestore protocol: rpc error: code = Unavailable desc = transport is closing\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func3:504\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
2019-12-15T19:56:00.530Z INFO piecestore download failed {"Piece ID": "25JGFFHGSZBEHEQMTYZ5QUWIAVCN5CCHDGX5O2VGJ7BXFQAK66IQ", "Satellite ID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Action": "GET", "error": "piecestore: piecestore protocol: write tcp 172.17.0.2:28967->[redacted]:36716: use of closed network connection", "errorVerbose": "piecestore: piecestore protocol: write tcp 172.17.0.2:28967->[redacted]:36716: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:189\n\tstorj.io/drpc/drpcwire.SplitN:25\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:233\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:266\n\tstorj.io/storj/pkg/pb.(*drpcPiecestoreDownloadStream).Send:1078\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doDownload.func3:598\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
Summary
When either of these info messages appears, either the uplink canceled the download request or 29 other storagenodes completed their transfers before yours could. Functionally, this is identical behavior to the Upload failed Context Canceled errors. No action is required from the SNO as these messages are normal to see.
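These lines are long mostly because of the errorVerbose stack trace. The part worth comparing against this post is the short "error" field in the JSON at the end of the line. Here is a rough sketch for pulling it out, assuming the line ends with a JSON object the way the examples above do; the `short_error` name is mine.

```python
import json


def short_error(log_line: str):
    """Return the short "error" string from a log line's JSON payload, or None."""
    start = log_line.find("{")  # the payload begins at the first brace
    if start == -1:
        return None
    try:
        payload = json.loads(log_line[start:])
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict):
        return None
    return payload.get("error")


# Shortened version of the first download example above (errorVerbose left out).
line = (
    '2019-08-29T15:54:15.647Z INFO piecestore download failed '
    '{"Action": "GET", "error": "piecestore: piecestore protocol: '
    'rpc error: code = Unavailable desc = transport is closing"}'
)
print(short_error(line))
```

Note that it only looks for a lowercase "error" key, so a line that uses a differently cased key returns None.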
DRPC errors:
2019-11-02T02:56:42.455Z INFO piecestore download failed {"Piece ID": "5STYD4QTAXWG7VFYV5Y2DSQE4E7F4D22H5IVFGYDE46NZ6IBNWKQ", "SatelliteID": "118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW", "Action": "GET", "error": "piecestore: piecestore protocol: drpc: stream terminated by sending error", "errorVerbose": "piecestore: piecestore protocol: drpc: stream terminated by sending error\n\tstorj.io/drpc/drpcstream.(*Stream).SendError:261\n\tstorj.io/drpc/drpcmanager.(*Manager).manageStream:224"}
Summary
This info message will appear whenever either the Storage Node or the Uplink unexpectedly terminates the connection on a download. This is a generic error message and does not inherently mean that there is a problem that requires SNO intervention.
2021-11-03T21:15:01.649-0400 WARN contact:service Your node is still considered to be online but encountered an error. {"Satellite ID": "12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo", "Error": "contact: failed to dial storage node (ID: ***) at address ***:28967 using QUIC: rpc: quic: timeout: no recent network activity"}
Summary
This means that you did not set up UDP in your docker run command and/or did not forward UDP.
I know I've probably missed a few errors, but if you'd like to post them, I will find out and update my list here.