My node is offline

My node has been running for more than 2 years on a Pi 4 with an up-to-date OS, on Docker, version v1.61.1.
Since this morning it has been offline. I have not received any ban message. I have not changed the launch script, and my network configuration has not changed. I cannot find the cause of the failure.
Any idea of the cause?

docker run -d --restart always --stop-timeout 300 \
-p 28967:28967/tcp \
-p 28967:28967/udp \
-p 14002:14002 \
-e WALLET="XXXXXXXXXXXXXXXXX" \
-e EMAIL="XXXXXXX@XXXx" \
-e ADDRESS="XX.XX.XX.XX:28967" \
-e STORAGE="3TB" \
--memory=800m \
--log-opt max-size=50m \
--log-opt max-file=10 \
--mount type=bind,source=/home/pi/.local/share/storj/identity/storagenode,destination=/app/identity \
--mount type=bind,source=/media/stocke,destination=/app/config \
--name storagenode storjlabs/storagenode:latest

What do the storagenode logs say?

2022-08-27T22:47:23.982Z INFO db.migration.54 Add interval_end_time field to storage_usage db, backfill interval_end_time with interval_start, rename interval_start to timestamp {“Process”: “storagenode”}
Error: Error creating tables for master database on storagenode: migrate: database disk image is malformed
storj.io/storj/storagenode/storagenodedb.(*DB).Migration.func21:2028
storj.io/storj/private/migrate.Func.Run:307
storj.io/storj/private/migrate.(*Migration).Run.func1:197
storj.io/private/dbutil/txutil.withTxOnce:75
storj.io/private/dbutil/txutil.WithTx:36
storj.io/storj/private/migrate.(*Migration).Run:196
storj.io/storj/storagenode/storagenodedb.(*DB).MigrateToLatest:347
main.cmdRun:226
storj.io/private/process.cleanup.func1.4:378
storj.io/private/process.cleanup.func1:396
github.com/spf13/cobra.(*Command).execute:852
github.com/spf13/cobra.(*Command).ExecuteC:960
github.com/spf13/cobra.(*Command).Execute:897
storj.io/private/process.ExecWithCustomConfigAndLogger:93
main.main:479
runtime.main:255
2022-08-27 22:47:24,050 INFO stopped: storagenode (exit status 1)
2022-08-27 22:47:24,052 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

I think the solution is here: https://support.storj.io/hc/en-us/articles/360029309111-How-to-fix-a-database-disk-image-is-malformed- ?

I think so too, although I never had to do such a repair

The new logs after applying the procedure:
2022-08-28T00:17:09.405Z INFO Telemetry enabled {“Process”: “storagenode”, “instance ID”: “1AGh7wuwcYghY1dwvXnqp2V4PziCPcjZbJuniCJWrs1PakK4m8”}
2022-08-28T00:17:09.459Z INFO db.migration.26 Add Trash column to pieceExpirationDB {“Process”: “storagenode”}
Error: Error creating tables for master database on storagenode: migrate: no such table: piece_expirations
storj.io/storj/private/migrate.SQL.Run:296
storj.io/storj/private/migrate.(*Migration).Run.func1:197
storj.io/private/dbutil/txutil.withTxOnce:75
storj.io/private/dbutil/txutil.WithTx:36
storj.io/storj/private/migrate.(*Migration).Run:196
storj.io/storj/storagenode/storagenodedb.(*DB).MigrateToLatest:347
main.cmdRun:226
storj.io/private/process.cleanup.func1.4:378
storj.io/private/process.cleanup.func1:396
github.com/spf13/cobra.(*Command).execute:852
github.com/spf13/cobra.(*Command).ExecuteC:960
github.com/spf13/cobra.(*Command).Execute:897
storj.io/private/process.ExecWithCustomConfigAndLogger:93
main.main:479
runtime.main:255
2022-08-28 00:17:09,477 INFO stopped: storagenode (exit status 1)
2022-08-28 00:17:09,480 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

sqlite3 piece_expirations.db "PRAGMA integrity_check;"
ok
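
The `ok` result, together with the earlier `no such table: piece_expirations` error, suggests a file that is a valid SQLite database but does not contain the expected table. Here is a minimal Python sketch that checks both things at once, using only the standard `sqlite3` module. The function name `inspect_db` is made up for illustration, and the sketch assumes the node is stopped while its databases are inspected.

```python
import sqlite3
import sys
from pathlib import Path


def inspect_db(path):
    """Return (integrity_check messages, table names) for a SQLite file."""
    # Open read-only so a mistyped path does not silently create an empty file.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # Each row holds one message; a healthy file yields a single "ok".
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check;")]
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name;"
            )
        ]
    finally:
        conn.close()
    return integrity, tables


if __name__ == "__main__":
    # Usage: pass one or more database files on the command line.
    for db_file in sys.argv[1:]:
        integrity, tables = inspect_db(db_file)
        print(db_file, integrity, tables)
```

If the table list for `piece_expirations.db` does not include `piece_expirations`, the integrity check alone is not enough to call the file usable.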

It looks like the corruption went a bit further; you need to recreate this database file following this guide: https://support.storj.io/hc/en-us/articles/4403032417044-How-to-fix-database-file-is-not-a-database-error

2022-08-28T07:45:28.111Z INFO failed to sufficiently increase receive buffer size (was: 176 kiB, wanted: 2048 kiB, got: 352 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details. {“Process”: “storagenode”}
2022-08-28T07:45:28.865Z INFO trust Scheduling next refresh {“Process”: “storagenode”, “after”: “6h33m56.034753615s”}
2022-08-28 07:45:28,867 INFO waiting for storagenode, processes-exit-eventlistener to die
2022-08-28T07:45:28.872Z INFO bandwidth Performing bandwidth usage rollups {“Process”: “storagenode”}
2022-08-28T07:45:28.874Z WARN piecestore:monitor Disk space is less than requested. Allocated space is {“Process”: “storagenode”, “bytes”: 172524158976}
2022-08-28T07:45:28.874Z ERROR piecestore:monitor Total disk space is less than required minimum {“Process”: “storagenode”, “bytes”: 500000000000}
2022-08-28T07:45:28.874Z ERROR services unexpected shutdown of a runner {“Process”: “storagenode”, “name”: “piecestore:monitor”, “error”: “piecestore monitor: disk space requirement not met”, “errorVerbose”: “piecestore monitor: disk space requirement not met\n\tstorj.io/storj/storagenode/monitor.(*Service).Run:125\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:40\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.875Z ERROR bandwidth Could not rollup bandwidth usage {“Process”: “storagenode”, “error”: “sql: transaction has already been committed or rolled back”}
2022-08-28T07:45:28.875Z ERROR gracefulexit:chore error retrieving satellites. {“Process”: “storagenode”, “error”: “satellitesdb: context canceled”, “errorVerbose”: “satellitesdb: context canceled\n\tstorj.io/storj/storagenode/storagenodedb.(*satellitesDB).ListGracefulExits.func1:152\n\tstorj.io/storj/storagenode/storagenodedb.(*satellitesDB).ListGracefulExits:164\n\tstorj.io/storj/storagenode/gracefulexit.(*Service).ListPendingExits:58\n\tstorj.io/storj/storagenode/gracefulexit.(*Chore).AddMissing:58\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/storj/storagenode/gracefulexit.(*Chore).Run:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:40\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.876Z ERROR nodestats:cache Get pricing-model/join date failed {“Process”: “storagenode”, “error”: “context canceled”}
2022-08-28T07:45:28.880Z ERROR piecestore:cache error getting current used space: {“Process”: “storagenode”, “error”: “context canceled; context canceled; context canceled; context canceled; context canceled; context canceled; context canceled”, “errorVerbose”: “group:\n--- context canceled\n--- context canceled\n--- context canceled\n--- context canceled\n--- context canceled\n--- context canceled\n--- context canceled”}
2022-08-28T07:45:28.887Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.888Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.890Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.891Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.892Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
2022-08-28T07:45:28.893Z ERROR pieces:trash emptying trash failed {“Process”: “storagenode”, “error”: “pieces error: filestore error: context canceled”, “errorVerbose”: “pieces error: filestore error: context canceled\n\tstorj.io/storj/storage/filestore.(*blobStore).EmptyTrash:154\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).EmptyTrash:316\n\tstorj.io/storj/storagenode/pieces.(*Store).EmptyTrash:367\n\tstorj.io/storj/storagenode/pieces.(*TrashChore).Run.func1:51\n\tstorj.io/common/sync2.(*Cycle).Run:99\n\tstorj.io/common/sync2.(*Cycle).Start.func1:77\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:57”}
Error: piecestore monitor: disk space requirement not met
2022-08-28 07:45:29,037 INFO stopped: storagenode (exit status 1)
2022-08-28 07:45:29,041 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

2022-08-28T07:48:21.285Z INFO Invalid configuration file value for key {“Process”: “storagenode-updater”, “Key”: “log.output”}
2022-08-28T07:48:21.285Z INFO Invalid configuration file value for key {“Process”: “storagenode-updater”, “Key”: “log.caller”}
2022-08-28T07:48:21.285Z INFO Invalid configuration file value for key {“Process”: “storagenode-updater”, “Key”: “log.level”}
2022-08-28T07:48:21.286Z INFO Anonymized tracing enabled {“Process”: “storagenode-updater”}
2022-08-28T07:48:21.306Z INFO Running on version {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”, “Version”: “v1.62.3”}
2022-08-28T07:48:21.307Z INFO Downloading versions. {“Process”: “storagenode-updater”, “Server Address”: “https://version.storj.io”}
2022-08-28T07:48:21.483Z INFO Configuration loaded {“Process”: “storagenode”, “Location”: “/app/config/config.yaml”}
2022-08-28T07:48:21.484Z INFO Anonymized tracing enabled {“Process”: “storagenode”}
2022-08-28T07:48:21.501Z INFO Operator email {“Process”: “storagenode”, “Address”: “XXXXXXX”}
2022-08-28T07:48:21.502Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “XXXXXXXXX”}
2022-08-28T07:48:21.957Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode”, “Version”: “v1.62.3”}
2022-08-28T07:48:21.957Z INFO Version is up to date {“Process”: “storagenode-updater”, “Service”: “storagenode”}
2022-08-28T07:48:22.032Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”, “Version”: “v1.62.3”}
2022-08-28T07:48:22.032Z INFO Version is up to date {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”}
2022-08-28T07:48:22.293Z INFO Telemetry enabled {“Process”: “storagenode”, “instance ID”: “1AGh7wuwcYghY1dwvXnqp2V4PziCPcjZbJuniCJWrs1PakK4m8”}
2022-08-28 07:48:22,294 INFO success: processes-exit-eventlistener entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-08-28 07:48:22,294 INFO success: storagenode entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-08-28 07:48:22,295 INFO success: storagenode-updater entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2022-08-28T07:48:22.391Z INFO db.migration Database Version {“Process”: “storagenode”, “version”: 54}
2022-08-28T07:48:22.658Z INFO preflight:localtime start checking local system clock with trusted satellites’ system clock. {“Process”: “storagenode”}

This error indicates that you do not have enough free disk space.

If the data location is correct, the node cannot recognize its data. This can happen if the storage_usage.db database is damaged or has been recreated. Did you repair this database?
For the node to fill in the missing information, you need to temporarily lower the threshold of the storage2.monitor.minimum-disk-space: option in the configuration file to 170 GB or less. After that, you need to recreate the container: stop and remove the container, then run it again with all your options.
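
To make the numbers concrete, here is a small Python sketch of the arithmetic behind the 170 GB figure. It compares the byte count from the WARN line (172524158976) with the minimum from the ERROR line (500000000000). The helper `meets_minimum` and the "at least" comparison are assumptions for illustration, not the node's actual code.

```python
GB = 1000 ** 3  # decimal gigabyte, assumed from the round 500000000000 in the log


def meets_minimum(available_bytes, minimum_gb):
    """True when the reported space is at least the configured minimum."""
    return available_bytes >= minimum_gb * GB


reported = 172_524_158_976  # "Allocated space is" value from the WARN line

print(round(reported / GB, 1))       # 172.5
print(meets_minimum(reported, 500))  # False: the 500 GB minimum from the ERROR line
print(meets_minimum(reported, 170))  # True: the lowered threshold
```

The node reports roughly 172.5 GB, so any threshold at or below 170 GB clears the check while the usage information is rebuilt.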

That's right, the corrupted databases were bandwidth, heldamount, piece_expirations and storage_usage.
I applied https://support.storj.io/hc/en-us/articles/360029309111-How-to-fix-a-database-disk-image-is-malformed- to all four databases, then https://support.storj.io/hc/en-us/articles/4403032417044-How-to-fix-database-file-is-not-a-database-error as requested earlier.
I have just modified storage2.monitor.minimum-disk-space

  --log.output string                can be stdout, stderr, or a filename (default "stderr")
  --log.stack                        if true, log stack traces
  --metrics.addr string              address(es) to send telemetry to (comma-separated) (default "collectora.storj.io:9000")
  --metrics.app string               application name for telemetry identification. Ignored for certain applications. (default "storagenode")
  --metrics.app-suffix string        application suffix. Ignored for certain applications. (default "-release")
  --metrics.instance-prefix string   instance id prefix
  --metrics.interval duration        how frequently to send up telemetry. Ignored for certain applications. (default 1m0s)
  --tracing.agent-addr string        address for jaeger agent (default "agent.tracing.datasci.storj.io:5775")
  --tracing.app string               application name for tracing identification (default "storagenode")
  --tracing.app-suffix string        application suffix (default "-release")
  --tracing.buffer-size int          buffer size for collector batch packet size
  --tracing.enabled                  whether tracing collector is enabled (default true)
  --tracing.interval duration        how frequently to flush traces to tracing agent (default 0s)
  --tracing.queue-size int           buffer size for collector queue size
  --tracing.sample float             how frequent to sample traces

2022-08-28 08:34:57,612 WARN received SIGQUIT indicating exit request
2022-08-28 08:34:57,612 INFO waiting for storagenode, processes-exit-eventlistener to die
2022-08-28 08:34:57,617 INFO exited: storagenode (exit status 1; not expected)
2022-08-28 08:34:58,622 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

Could you please show me exactly how you specified it?
The error message indicates that you specified it incorrectly or used unreadable characters (if you used quotes, they must be straight, not curly).
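
One way to hunt for such characters is to scan the text for typographic quotes and dashes. The sketch below uses only the Python standard library. The function name `find_suspect_chars`, the character list, and the sample line are illustrative assumptions, not part of any Storj tooling.

```python
import unicodedata

# Typographic characters that forum software and word processors often
# substitute for plain ASCII quotes and hyphens.
SUSPECT = "\u201c\u201d\u2018\u2019\u00ab\u00bb\u2013\u2014"


def find_suspect_chars(text):
    """Return (line number, column, Unicode name) for each suspect character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPECT:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


# Hypothetical config line with curly quotes pasted in by mistake.
sample = "storage2.monitor.minimum-disk-space: \u201c170.0 GB\u201d"
for hit in find_suspect_chars(sample):
    print(hit)
```

The same function can be run on the contents of config.yaml or on a saved copy of the docker run command.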

# how much disk space a node at minimum has to advertise

storage2.monitor.minimum-disk-space: 170.0 GB

# how long after OrderLimit creation date are OrderLimits no longer accepted

Clearly I messed something up while restoring the databases! Is there a way to start from scratch with the stored data, using fresh databases?

2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “storage.allocated-disk-space”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “contact.external-address”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “operator.email”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “server.private-address”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “server.address”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “storage.allocated-bandwidth”}
2022-08-28T14:48:07.497Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “storage2.monitor.minimum-disk-space”}
2022-08-28T14:48:07.498Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “operator.wallet”}
2022-08-28T14:48:07.498Z INFO Invalid configuration file key {“Process”: “storagenode-updater”, “Key”: “storage2.database-dir”}
2022-08-28T14:48:07.499Z INFO Anonymized tracing enabled {“Process”: “storagenode-updater”}
2022-08-28T14:48:07.532Z INFO Running on version {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”, “Version”: “v1.62.3”}
2022-08-28T14:48:07.532Z INFO Downloading versions. {“Process”: “storagenode-updater”, “Server Address”: “https://version.storj.io”}
2022-08-28T14:48:07.725Z INFO Configuration loaded {“Process”: “storagenode”, “Location”: “/app/config/config.yaml”}
2022-08-28T14:48:07.727Z INFO Anonymized tracing enabled {“Process”: “storagenode”}
2022-08-28T14:48:07.745Z INFO Operator email {“Process”: “storagenode”, “Address”: “XX@XX”}
2022-08-28T14:48:07.746Z INFO Operator wallet {“Process”: “storagenode”, “Address”: “XX”}
2022-08-28T14:48:08.190Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode”, “Version”: “v1.62.3”}
2022-08-28T14:48:08.190Z INFO Version is up to date {“Process”: “storagenode-updater”, “Service”: “storagenode”}
2022-08-28T14:48:08.248Z INFO Current binary version {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”, “Version”: “v1.62.3”}
2022-08-28T14:48:08.248Z INFO Version is up to date {“Process”: “storagenode-updater”, “Service”: “storagenode-updater”}

2022-08-28T14:49:18.354Z INFO Telemetry enabled {“Process”: “storagenode”, “instance ID”: “1AGh7wuwcYghY1dwvXnqp2V4PziCPcjZbJuniCJWrs1PakK4m8”}
2022-08-28T14:49:18.416Z INFO db.migration.26 Add Trash column to pieceExpirationDB {“Process”: “storagenode”}
Error: Error creating tables for master database on storagenode: migrate: no such table: piece_expirations
storj.io/storj/private/migrate.SQL.Run:296
storj.io/storj/private/migrate.(*Migration).Run.func1:197
storj.io/private/dbutil/txutil.withTxOnce:75
storj.io/private/dbutil/txutil.WithTx:36
storj.io/storj/private/migrate.(*Migration).Run:196
storj.io/storj/storagenode/storagenodedb.(*DB).MigrateToLatest:347
main.cmdRun:226
storj.io/private/process.cleanup.func1.4:378
storj.io/private/process.cleanup.func1:396
github.com/spf13/cobra.(*Command).execute:852
github.com/spf13/cobra.(*Command).ExecuteC:960
github.com/spf13/cobra.(*Command).Execute:897
storj.io/private/process.ExecWithCustomConfigAndLogger:93
main.main:479
runtime.main:255
2022-08-28 14:49:18,435 INFO stopped: storagenode (exit status 1)
2022-08-28 14:49:18,437 INFO stopped: processes-exit-eventlistener (terminated by SIGTERM)

Yes, you can. You will need to delete each of the damaged ones and recreate them following this guide: https://support.storj.io/hc/en-us/articles/4403032417044-How-to-fix-database-file-is-not-a-database-error
or try to restore them following this guide: https://support.storj.io/hc/en-us/articles/360029309111-How-to-fix-a-database-disk-image-is-malformed-
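
As a cautious first step before deleting anything, the damaged files can be moved aside instead. This is a minimal Python sketch of that idea, not the procedure from the guides, which remain the authority. The function name `set_aside` and both directory arguments are hypothetical. The file names come from the list of corrupted databases given earlier in the thread, and the node is assumed to be stopped first.

```python
import shutil
from pathlib import Path

# Database names mentioned in the thread; the ".db" suffix follows
# piece_expirations.db and storage_usage.db as they appear above.
DAMAGED = ["bandwidth.db", "heldamount.db", "piece_expirations.db", "storage_usage.db"]


def set_aside(db_dir, backup_dir, names=DAMAGED):
    """Move the named files from db_dir to backup_dir; return the names moved."""
    db_dir, backup_dir = Path(db_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for name in names:
        src = db_dir / name
        if src.is_file():
            # Moving instead of deleting keeps a copy in case a later repair works.
            shutil.move(str(src), str(backup_dir / name))
            moved.append(name)
    return moved
```

A call such as `set_aside("/path/to/databases", "/path/to/backup")` leaves every other file in the database directory untouched.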

We agree, but what if these two procedures do not work, or no longer work? What can I do now? Given that my databases are beyond repair, what are my options? Format everything and start over?