Us1.storj.io average disk usage 30% lower than actual disk usage

This seems to be related purely to the filesystem: if the garbage collector is unable to finish its work, the deleted data remains on the disk.
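To quantify the gap, the actual on-disk usage can be summed directly from the blobs directory and compared with what the dashboard reports. A minimal Go sketch, assuming a typical mount layout (the path below is just an example, adjust it for your node):

package main

import (
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

func main() {
	// Assumed path to the node's blob storage; adjust for your setup.
	root := "/mnt/storagenode/storage/blobs"

	var total int64
	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil {
			return walkErr
		}
		if d.IsDir() {
			return nil
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		total += info.Size()
		return nil
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "walk failed:", err)
		os.Exit(1)
	}
	fmt.Printf("actual blob usage: %.2f GB\n", float64(total)/1e9)
}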
I also see an error on my node:

2023-11-01T14:36:45Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-10-25T17:59:59Z", "Filter Size": 2097155, "Satellite ID":"12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-11-02T05:03:31Z   WARN    retain  failed to delete piece  {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:364\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

but no finishing message afterward, unlike for the other satellites, e.g.

2023-11-02T15:19:50Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-09-19T17:00:07Z", "Filter Size": 364472, "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-11-02T15:23:54Z   INFO    retain  Moved pieces to trash during retain {"process": "storagenode", "num deleted": 0, "Retain Status": "enabled"}

The full sequence:

2023-10-28T22:25:29Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-10-24T17:59:59Z", "Filter Size": 1835488, "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-10-29T00:53:37Z   INFO    retain  Moved pieces to trash during retain {"process": "storagenode", "num deleted": 140893, "Retain Status": "enabled"}
2023-11-01T14:36:45Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-10-25T17:59:59Z", "Filter Size": 2097155, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2023-11-02T05:03:31Z   WARN    retain  failed to delete piece  {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:364\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-11-02T15:19:50Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-09-19T17:00:07Z", "Filter Size": 364472, "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"}
2023-11-02T15:23:54Z   INFO    retain  Moved pieces to trash during retain {"process": "storagenode", "num deleted": 0, "Retain Status": "enabled"}
2023-11-03T09:15:06Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-10-30T17:59:58Z", "Filter Size": 554450, "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"}
2023-11-03T09:36:32Z   INFO    retain  Moved pieces to trash during retain {"process": "storagenode", "num deleted": 16467, "Retain Status": "enabled"}
2023-11-04T22:49:08Z   INFO    retain  Prepared to run a Retain request. {"process": "storagenode", "Created Before": "2023-10-31T17:59:59Z", "Filter Size": 1847037, "Satellite ID": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"}
2023-11-05T01:21:20Z   INFO    retain  Moved pieces to trash during retain {"process": "storagenode", "num deleted": 95503, "Retain Status": "enabled"}

So, I believe we have a bug in the retain process: it silently aborts when a file is not found, without ever logging a completion message.
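What I would expect instead is for the retain walk to treat a missing file as already gone, log it, and keep going, so the run can still finish and print its "Moved pieces to trash" summary. A rough sketch of that behavior (the function shape and error check are my assumptions, not the actual storagenode code):

package retain

import (
	"errors"
	"io/fs"
	"log"
)

// retainPieces is an illustrative walk that skips pieces which are
// already gone instead of aborting the whole run on the first
// "file does not exist" error.
func retainPieces(pieceIDs []string, trashPiece func(id string) error) (deleted int, err error) {
	for _, id := range pieceIDs {
		if err := trashPiece(id); err != nil {
			if errors.Is(err, fs.ErrNotExist) {
				// Already deleted (e.g. by the expiration collector);
				// nothing left to trash, so continue with the next piece.
				log.Printf("retain: piece %s already gone, skipping", id)
				continue
			}
			return deleted, err // a real failure: stop and report it
		}
		deleted++
	}
	return deleted, nil
}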

And here is the history of that missing piece:

2023-10-24T20:29:13Z    INFO    piecestore      upload started  {"process": "storagenode", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Available Space": 588397721856, "Remote Address": "172.18.0.1:42534"}
2023-10-24T20:29:13Z    INFO    piecestore      uploaded        {"process": "storagenode", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "PUT", "Size": 768, "Remote Address": "172.18.0.1:42534"}
2023-11-01T21:22:35Z    INFO    collector       deleted expired piece   {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA"}
2023-11-02T05:03:31Z    WARN    retain  failed to delete piece  {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "23AJFTOZUDJCEUQRQRDG7FHIM63FKPYTO6XVLUMXBCQCKZ4PSMMA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:110\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:245\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).Trash:290\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:404\n\tstorj.io/storj/storagenode/retain.(*Service).trash:364\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces:341\n\tstorj.io/storj/storagenode/retain.(*Service).Run.func2:221\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}

The expired piece had already been deleted by the collector; then the retain process tried to delete the same piece again and its run aborted.
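If that is what happens, then at the level of the trash move itself the error is harmless: moving a piece to trash is idempotent, so "file does not exist" could simply count as success. A hedged sketch of that idea (the names are mine, not the node's API):

package retain

import (
	"errors"
	"io/fs"
	"os"
)

// trashIfPresent moves a piece file into the trash directory and treats
// "already gone" as success, since the expiration collector may have
// removed the piece after the bloom filter was created.
func trashIfPresent(piecePath, trashPath string) error {
	err := os.Rename(piecePath, trashPath)
	if errors.Is(err, fs.ErrNotExist) {
		return nil // already deleted; nothing to do
	}
	return err
}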
