Node Suspension

Restarted the node with the above command; the dashboard still shows more free space than df -h does.

I have received an email that my node is suspended.
But when I use the dashboard it says it is up and running.
So was there a mistake?


I've pinged the on-call engineer and asked them to hop in with a reply @brandon


Do they need my node’s ID or any other info from me?
Thanks!

I got a similar, if not the same, e-mail saying that my node is suspended, referring to the “us-central-1” satellite. I have checked the uptime checks, and that satellite has the lowest score of all the satellites: 83.2%. The audit check is 98.5%. What am I supposed to do with that? Thank you.

You may need to do it a few times to soft restart your node.

What should I do now?

The email does not really make clear what I should do to get back on track: I have no idea how to “resolve the issue causing audit failures on your node”, because I have no idea what the issue is…

Can anybody please help?

Thanks


Hi @itnok, have you seen this thread?

Suspension mode and disqualification emails

Blueprint: Downtime Disqualification

I also asked our on-call engineer to hop in and clarify further when they are online, but those two posts are a great place to start.

It's the same for me… so much for transparency :slight_smile:


Please search for failed audits in your logs.

I posted them above, but I have no idea how to fix it. Help :woozy_face:


I do not see any audit errors in your screenshots.
Regarding “ordersdb is locked”, see: Ordersdb error: database is locked
Please post the logs as text; a picture is hard to analyze.
Please restart your node and check whether the boltdb error is gone.
How is your HDD connected?


2020-04-22T23:32:50.275Z ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb error: database is locked", "errorVerbose": "bandwidthdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:59\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:721\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doUpload:443\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Upload:215\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:987\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:105\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:56\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:93\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2020-04-22T23:32:50.276Z INFO piecestore uploaded {"Piece ID": "VGJMND5WD4AGFSXUGGKUAASSOZM43BIDMDEASOH7ILIUXYKVUTGQ", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "PUT"}
2020-04-22T23:32:50.290Z ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb error: database is locked", "errorVerbose": "bandwidthdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:59\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:721\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doUpload:443\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Upload:215\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:987\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:105\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:56\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:93\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2020-04-22T23:32:50.290Z INFO piecestore uploaded {"Piece ID": "33WBH4GHDZQTDNLKJVXUWKA5AVRPMZ36I33M7SB7WUMAXMQSH5CQ", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "PUT"}
2020-04-22T23:32:50.857Z INFO piecestore download started {"Piece ID": "F7YHOQQCC3EVZYYWKFLHARRNQVK5S2CPGNLQ3VD2XXGJ62M2SRQA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET"}
2020-04-22T23:32:53.458Z ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb error: database is locked", "errorVerbose": "bandwidthdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:59\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:721\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doUpload:443\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Upload:215\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:987\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:105\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:56\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:93\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2020-04-22T23:32:53.459Z INFO piecestore upload canceled {"Piece ID": "J4R2R4NE5Y7P7YV2SLSKWYLDMQ3LA6Y6ADTUJPWJD74WGM2IN3UA", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "PUT", "error": "context canceled", "errorVerbose": "context canceled\n\tstorj.io/common/pb/pbgrpc.init.0.func3:70\n\tstorj.io/common/rpc/rpcstatus.Wrap:77\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doUpload:452\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Upload:215\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:987\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:105\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:56\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:93\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2020-04-22T23:32:56.541Z ERROR piecestore failed to add bandwidth usage {"error": "bandwidthdb error: database is locked", "errorVerbose": "bandwidthdb error: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*bandwidthDB).Add:59\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).saveOrder:721\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doDownload.func6:674\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doDownload:695\n\tstorj.io/storj/storagenode/piecestore.(*drpcEndpoint).Download:466\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:995\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:107\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:105\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:56\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:93\n\tstorj.io/drpc/drpcctx.(*Tracker).track:51"}
2020-04-22T23:32:56.542Z INFO piecestore download canceled {"Piece ID": "SJXFBZSYJ434HQH5MXEO5XDZ6ZG7I7ZY64DXMLNTUVJHVH2YEBNQ", "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Action": "GET", "error": "context canceled", "errorVerbose": "context canceled\n\tstorj.io/common/pb/pbgrpc.init.0.func3:70\n\tstorj.io/common/rpc/rpcstatus.Wrap:77\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).doDownload.func5:646\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57"}
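
To get a quick picture of which databases report "database is locked" in a log excerpt like the one above, something along these lines might help. It is only a sketch: the function name `count_locked_databases` is made up for illustration, and it assumes each log entry sits on one line and carries the `<name>db error: database is locked` text seen in the excerpt.

```python
import re
import sys
from collections import Counter

# Matches e.g. "bandwidthdb error: database is locked" and captures the database name.
LOCKED_RE = re.compile(r"(\w+db) error: database is locked")
ERROR_RE = re.compile(r"\bERROR\b")


def count_locked_databases(lines):
    """Count ERROR log lines per database that report 'database is locked'."""
    counts = Counter()
    for line in lines:
        if not ERROR_RE.search(line):
            continue
        # The message is repeated inside "errorVerbose"; search() only takes
        # the first hit, so each log line is counted once.
        match = LOCKED_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # Reads log lines from standard input and prints one total per database.
    for name, total in count_locked_databases(sys.stdin).most_common():
        print(f"{name}: {total}")
```

If the node runs in Docker, the logs could be piped in with something like `docker logs storagenode 2>&1 | python3 locked_dbs.py`, where both the container name `storagenode` and the file name `locked_dbs.py` are assumptions.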

My HDD is connected via USB 3.

Please try to restart the storagenode.

I have restarted it multiple times; my current uptime on the node is 17 minutes.
I also mounted the HDD in fstab.

What does a failed audit look like in the logs?

If I find failed audits (likely because of this… NTP Blocked by Provider... storagenode does not start), what should I do then?

Thanks

Your node can fail an audit only if the data is inaccessible while the node is online.
Downtime is a separate metric, and your node will not be suspended for it until that check is enabled again.
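
As a rough way to look for failed audits in the logs, the sketch below counts audit transfers. It rests on assumptions rather than anything confirmed in this thread: that audit requests are logged with the action `GET_AUDIT`, that a failed one contains the word `failed`, and that a successful one is logged as `downloaded` (by analogy with the `uploaded` lines in the excerpt above). The function name `summarize_audits` is made up for illustration.

```python
import sys


def summarize_audits(lines):
    """Split GET_AUDIT log lines into failed and successful counts.

    Assumption: audit transfers carry the action "GET_AUDIT"; a failure
    contains "failed" and a success is logged as "downloaded".
    """
    summary = {"failed": 0, "successful": 0}
    for line in lines:
        if "GET_AUDIT" not in line:
            continue  # not an audit transfer
        if "failed" in line:
            summary["failed"] += 1
        elif "downloaded" in line:
            summary["successful"] += 1
        # "download started" lines match neither branch and are ignored
    return summary


if __name__ == "__main__":
    result = summarize_audits(sys.stdin)
    print(f"Failed audits:     {result['failed']}")
    print(f"Successful audits: {result['successful']}")
```

Any line counted as failed is worth reading in full, since the error text should say why the piece could not be served.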

pi@storjpi:~ $ ./audits_satellites.sh
Fetching satellite audits stat information. Please wait…
Error: Error creating revocation database: revocation database error: boltdb error: timeout
storj.io/storj/storage/boltdb.New:44
storj.io/storj/pkg/revocation.newDBBolt:50
storj.io/storj/pkg/revocation.NewDB:33
storj.io/storj/pkg/revocation.NewDBFromCfg:21
main.cmdRun:165
storj.io/private/process.cleanup.func1.2:312
storj.io/private/process.cleanup.func1:330
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:84
storj.io/private/process.ExecCustomDebug:66
main.main:329
runtime.main:203
[the same error and stack trace repeat nine more times]
Error: Error during preflight check for storagenode databases: storage node preflight database error: context canceled
storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:327
main.cmdRun:199
storj.io/private/process.cleanup.func1.2:312
storj.io/private/process.cleanup.func1:330
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:84
storj.io/private/process.ExecCustomDebug:66
main.main:329
runtime.main:203
Error: Error during preflight check for storagenode databases: storage node preflight database error: context canceled
storj.io/storj/storagenode/storagenodedb.(*DB).Preflight:327
main.cmdRun:199
storj.io/private/process.cleanup.func1.2:312
storj.io/private/process.cleanup.func1:330
github.com/spf13/cobra.(*Command).execute:840
github.com/spf13/cobra.(*Command).ExecuteC:945
github.com/spf13/cobra.(*Command).Execute:885
storj.io/private/process.ExecWithCustomConfig:84
storj.io/private/process.ExecCustomDebug:66
main.main:329
runtime.main:203
Sat ID: :
Unrecoverable Failed Audits: 0
Recoverable Failed Audits: 0
Successful Audits: 5832

This is the output of (your?) script for node stats…

========== AUDIT ==============
Password:
Critically failed:     0
Critical Fail Rate:    0.000%
Recoverable failed:    0
Recoverable Fail Rate: 0.000%
Successful:            15
Success Rate:          100.000%
========== DOWNLOAD ===========
Failed:                0
Fail Rate:             0.000%
Canceled:              10
Cancel Rate:           3.165%
Successful:            306
Success Rate:          96.835%
========== UPLOAD =============
Rejected:              4168
Acceptance Rate:       40.711%
---------- accepted -----------
Failed:                0
Fail Rate:             0.000%
Canceled:              1246
Cancel Rate:           43.536%
Successful:            1616
Success Rate:          56.464%
========== REPAIR DOWNLOAD ====
Failed:                0
Fail Rate:             0.000%
Canceled:              0
Cancel Rate:           0.000%
Successful:            16
Success Rate:          100.000%
========== REPAIR UPLOAD ======
Failed:                0
Fail Rate:             0.000%
Canceled:              32
Cancel Rate:           37.647%
Successful:            53
Success Rate:          62.353%
========== DELETE =============
Failed:                0
Fail Rate:             0.000%
Successful:            321
Success Rate:          100.000%

So you have not had any failed audits since the container was recreated.
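
For anyone wondering how to read the percentages in the statistics output above, they appear to be simple ratios of the counts. The sketch below reproduces the UPLOAD figures under that assumption; the helper `rate` is made up for illustration and is not taken from the script itself.

```python
def rate(part, total):
    """Return part/total as a percentage rounded to three decimals (0.0 if total is 0)."""
    return round(100.0 * part / total, 3) if total else 0.0


# Counts copied from the UPLOAD section of the output above.
rejected, failed, canceled, successful = 4168, 0, 1246, 1616

# Assumption: "accepted" uploads are all those that were not rejected.
accepted = failed + canceled + successful

print("Acceptance Rate:", rate(accepted, accepted + rejected))  # 40.711
print("Cancel Rate:    ", rate(canceled, accepted))             # 43.536
print("Success Rate:   ", rate(successful, accepted))           # 56.464
```

The same ratio gives the AUDIT section: 15 successful audits out of 15 is 100%, which matches the statement that no audits have failed since the container was recreated.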