Poor online score on single satellite

We recently had some power outages, combined with me shuffling my hard drives, so I wasn’t too concerned with my lower online scores as they have been recovering, except for europe-north-1 on one of my nodes.

A small excerpt from the log files (the node was running with log level error at the time):

2021-06-15T22:29:28.254Z        ERROR   piecestore      download failed {"Piece ID": "QKZ5MG7LZ2OKAEHVA7DG6UIVQ4HNZ3P7PRLZ3TUA5UVI5TBNHUZQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "GET", "error": "context deadline exceeded", "errorVerbose": "context deadline exceeded\n\tstorj.io/common/rpc/rpcstatus.Wrap:74\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func6:630\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:648\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:217\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2021-06-15T23:30:31.992Z        ERROR   piecestore      upload failed   {"Piece ID": "SGVGGGPIDEZESRVZIS424Q6B6BPNXKA6YY3SIO42NBTQ4P3WIECA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF", "errorVerbose": "unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:327\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 163840}
2021-06-15T23:41:07.372Z        ERROR   piecestore      download failed {"Piece ID": "P4YN33JPJ4KILSEQW5JCCU5AYLXZSZ3S6A2L3VY3JDXV2DUTSG3Q", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET", "error": "write tcp 172.17.0.4:28967->135.181.29.51:60962: use of closed network connection", "errorVerbose": "write tcp 172.17.0.4:28967->135.181.29.51:60962: use of closed network connection\n\tstorj.io/drpc/drpcstream.(*Stream).pollWrite:217\n\tstorj.io/drpc/drpcwire.SplitN:31\n\tstorj.io/drpc/drpcstream.(*Stream).RawWrite:265\n\tstorj.io/drpc/drpcstream.(*Stream).MsgSend:312\n\tstorj.io/common/pb.(*drpcPiecestore_DownloadStream).Send:302\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download.func5.1:582\n\tstorj.io/common/rpc/rpctimeout.Run.func1:22"}
2021-06-16T00:30:15.013Z        ERROR   piecestore      upload failed   {"Piece ID": "72XA46PDK6QGCRSCDBIC2H3PKJIC4ZCTPGNK2CNMIF4RVG2A5MIA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF", "errorVerbose": "unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:327\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 2105344}
2021-06-16T01:39:19.430Z        ERROR   piecestore      upload failed   {"Piece ID": "FMXYBWKP2LUFREGPIVS5EWURBQGHQQLEZCEOUUO6JYSRSE34RWOQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "context deadline exceeded", "errorVerbose": "context deadline exceeded\n\tstorj.io/common/rpc/rpcstatus.Wrap:74\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:329\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 1056768}
2021-06-16T03:01:01.722Z        ERROR   piecestore      upload failed   {"Piece ID": "NLBQ64EJVJ4LKNGKWLJI7GECGLFLN2ULX55S4OWAZQFQDARTM33A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF", "errorVerbose": "unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:327\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 2105344}
2021-06-16T03:23:57.327Z        ERROR   piecestore      upload failed   {"Piece ID": "GKXADB6BBAM4B6XCTCDH6BXIDNK3BHSSIECZIADY7TO4VKFWKD3A", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF", "errorVerbose": "unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:327\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 2105344}
2021-06-16T03:54:35.228Z        ERROR   piecestore      upload failed   {"Piece ID": "IPSR4DHVN7BMCCUMIKOETDZNKYLKKTZNMHKC5EJDIIJ45HMCOWBA", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "context deadline exceeded", "errorVerbose": "context deadline exceeded\n\tstorj.io/common/rpc/rpcstatus.Wrap:74\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:329\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 311296}
2021-06-16T04:45:28.643Z        ERROR   piecestore      upload failed   {"Piece ID": "U6MBX7R4J4S5OPKXKVWJAZCQSVJGVICWPRJM6SAWPLOZSZVXT5PQ", "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "Action": "PUT", "error": "unexpected EOF", "errorVerbose": "unexpected EOF\n\tstorj.io/common/rpc/rpcstatus.Error:82\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:327\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:209\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:58\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:111\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:62\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:99\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51", "Size": 163840}

Any ideas as to what might have caused this, and whether the node has already been suspended at this score?

After restarting the node with log level info I’m seeing some lines like the following (along with successful uploads):

2021-06-16T05:19:20.552Z        WARN    contact:service Your node is still considered to be online but encountered an error.      {"Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Error": "contact: failed to dial storage node (ID: <node_id>) at address <my_address>:28000 using QUIC: rpc: quic: Timeout: No recent network activity"}

I don’t see anything that should cause it…

You should enable UDP / QUIC, but currently there is no penalty for not doing so.

The other errors in the logs are normal, random customer activity… they show up when customers lose their connection for whatever reason.

My guess (and I have the same issue for one of my nodes with eu-north): like me, you don’t have a lot of data on eu-north, and thus you got hit with only a couple of audits during your downtime…

I was only down for about 2 days, nearly 3, and should have had 12 days before reaching the suspension threshold of a 60% online score.
But one of my nodes was unlucky and ended up at 50%.
Now, 14 days later, it’s at 75%.

Long story short, it doesn’t matter much, because it’s a satellite you most likely don’t have much data on.

It happens because, with a low amount of data for a particular satellite, your node might only be audited, say, twice a week, and even that is somewhat random. So if it just happens to be audited 4 times while it’s down, that counts the same as two weeks of downtime…
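The arithmetic behind this is just the ratio of passed audits to total audits in the measurement window. A minimal sketch of the effect (all counts are invented for illustration; this is not Storj’s exact satellite-side formula):

```python
# Assumed model: online score = audits the node answered / total audits
# in the window. Numbers below are illustrative only.

def online_score(audits_online: int, audits_total: int) -> float:
    """Fraction of audits the node was reachable for."""
    return audits_online / audits_total if audits_total else 1.0

# A big, frequently audited node that misses 3 audits during an outage:
print(online_score(1997, 2000))  # 0.9985 -- barely a dent

# A small node audited only 9 times in the window, missing the same 3:
print(online_score(6, 9))        # ~0.667 -- reads as a third of the window offline
```

The same real-world outage costs the sparsely audited node a huge chunk of its score, which is exactly the granularity problem described above.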

It’s a known issue with very limited real-world implications: as a satellite or node accumulates more data, the number of audits goes up, and the granularity of the online score becomes fine enough that this can’t happen.

My 15.6 TB node is audited multiple times a minute, so it’s remarkably good at detecting my downtime, lol… I can’t be gone for even a minute before the satellites notice.

But keep an eye on it… it should return to 100% after 30 days of uptime.
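The 30-day recovery follows from the score being computed over a rolling window: once the missed audits age out of the window, the score returns to 100%. A minimal sketch of that mechanism (the 30-day window matches the thread; the per-day audit counts are invented for illustration):

```python
from collections import deque

WINDOW_DAYS = 30  # assumed window length, per the thread

def rolling_score(daily_results, window=WINDOW_DAYS):
    """daily_results: list of (passed, total) audit counts per day.
    Returns the online score after each day, over the trailing window."""
    window_days = deque(maxlen=window)
    scores = []
    for day in daily_results:
        window_days.append(day)
        passed = sum(p for p, _ in window_days)
        total = sum(t for _, t in window_days)
        scores.append(passed / total if total else 1.0)
    return scores

# 3 days offline (one missed audit each day), then 30 days fully online:
history = [(0, 1)] * 3 + [(1, 1)] * 30
scores = rolling_score(history)
print(scores[2])   # 0.0 right after the outage (all audits so far missed)
print(scores[-1])  # 1.0 once the outage has aged out of the window
```

With only a handful of audits in the window, the score swings violently at first and then climbs back as clean days displace the outage.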


That makes sense; since posting this, the node has leaped forward from 33.x% to 66.7%. This is a tiny, newer node which has very little data from that satellite.


It’s not working exactly as @SGC described (you can check your audit history to see how often your node is audited: I would love a way to see when downtime events are recorded - #2 by BrightSilence). The main reason for the quick drop in online score is that your node is new: it has received too few audits and has been running for less than a month, while the online score is calculated over the last 30 days.
So missing one audit out of three would mean a 33% loss of online score.
See How is the online score calculated? - Node Operator
