Greetings,
The retain process is not currently running on my US storage node. I have been monitoring the retain run cycle on my nodes and noted that the process is running for my other three regions, but not the US one.
Upon reviewing the logs, it looks like the 3/1/2024 run hit errors: it moved every flagged piece to trash except six, which failed with "file does not exist" errors, and the retain process has not run since.
I restarted the node on 3/10/2024, but the retain process did not run as expected on 3/15/2024. See the attached screenshot summarizing the retain runs from the previous two weeks. There are no errors associated with the retain process for the US node dated after 3/1/2024.
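In case it helps, this is roughly the script I used to confirm that. It pulls every retain/gc-filewalker entry for the US satellite out of the node log; the log path below is from my install, so adjust it for yours:

# check_retain.py - list retain / gc-filewalker log entries for the US satellite
# NOTE: LOG_PATH is my install's location; change it to match your setup
from datetime import datetime

LOG_PATH = r"C:\Program Files\Storj\Storage Node\storagenode.log"
US_SATELLITE = "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"
CUTOFF = datetime(2024, 3, 1)

with open(LOG_PATH, encoding="utf-8", errors="replace") as log:
    for line in log:
        # keep only retain / gc-filewalker entries for the US satellite
        if US_SATELLITE not in line:
            continue
        if "retain" not in line and "gc-filewalker" not in line:
            continue
        # the timestamp is the first whitespace-separated token,
        # e.g. 2024-03-01T11:23:11-06:00
        stamp = datetime.fromisoformat(line.split()[0]).replace(tzinfo=None)
        if stamp >= CUTOFF:
            print(line.rstrip())

With the cutoff moved to 3/2/2024, it prints nothing at all for the US satellite.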
The node runs on Windows, with the data on a Synology NAS (SHR-2 RAID) attached via iSCSI.
I would appreciate your advice on debugging this issue.
Thanks in advance.
Here are the relevant logs from the errors noted during the 3/1/2024 run:
2024-03-01T11:23:11-06:00 INFO retain Prepared to run a Retain request. {"Created Before": "2024-02-21T17:59:59Z", "Filter Size": 4100003, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-01T11:23:11-06:00 INFO lazyfilewalker.gc-filewalker starting subprocess {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-01T11:23:11-06:00 INFO lazyfilewalker.gc-filewalker subprocess started {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-01T11:23:11-06:00 INFO piecestore download started {"Piece ID": "7X7Y5MSVCUJNRHKRAPR5ICOZDOM47BCI3ZJVMLVOSYFDV4ISCNHQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET", "Offset": 0, "Size": 217600, "Remote Address": "79.127.223.129:52630"}
2024-03-01T11:23:12-06:00 INFO lazyfilewalker.gc-filewalker.subprocess Database started {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "process": "storagenode"}
2024-03-01T11:23:12-06:00 INFO lazyfilewalker.gc-filewalker.subprocess gc-filewalker started {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "process": "storagenode", "createdBefore": "2024-02-21T17:59:59Z", "bloomFilterSize": 4100003}
2024-03-01T12:02:28-06:00 INFO lazyfilewalker.gc-filewalker.subprocess gc-filewalker completed {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "piecesCount": 34948349, "piecesSkippedCount": 0, "process": "storagenode"}
2024-03-01T12:02:29-06:00 INFO lazyfilewalker.gc-filewalker subprocess finished successfully {"satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-03-01T12:10:25-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T12:10:25-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "4DK27CWN232HFKYML3OVV76OAAPLI545KDQB46DDFXJBLA3YD5GA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T12:43:24-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T12:43:24-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "HFLK3WGVOCC7TTXNRCC3CICYP2MFXQG3XUCY7A5KTWJFZ334Q5ZA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T12:53:48-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T12:53:48-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "KTKX4KVTA6TWM2RXPPEDEVH33DM6UR2JONH3H232P7MDU5HWEY3A", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T12:54:07-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T12:54:07-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "KWSRNMTO4VIZ7OC2R2NDWMGYVRP3BBH5N625T3SZPLKHXERDVAUA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T13:03:31-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T13:03:31-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "NP33MQY6MYXGWNWVJFXXFW2ZH5SZUW3DEWU4RR5YE2WGPOR22DYA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T13:20:04-06:00 WARN pieces failed to migrate v0 piece. Piece may not be recoverable
2024-03-01T13:20:04-06:00 WARN retain failed to delete piece {"Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "SX2PXXJHZBJ3DDKAEOLH6JJYBTIXQCDPPPQNZCWJVCLNM66MF7UQ", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:373\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2024-03-01T13:42:37-06:00 INFO retain Moved pieces to trash during retain {"Deleted pieces": 1736436, "Failed to delete": 6, "Pieces failed to read": 0, "Pieces count": 34948349, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "2h19m25.3197935s", "Retain Status": "enabled"}