Pieces failed to migrate v0 piece. Piece may not be recoverable

I’m getting lots of these warnings on my only node that is on 1.110. Haven’t seen those before. Anything of concern?

2024-08-11T20:50:19Z	WARN	pieces	failed to migrate v0 piece. Piece may not be recoverable	{"Process": "storagenode"}
2024-08-11T20:50:19Z	WARN	retain	failed to trash piece	{"Process": "storagenode", "cachePath": "config/retain", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "74MBENAXPOZNXY2MNI37UFR2FBK3SCJ27SKVWCCFKKP7WMA2PHQQ", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:124\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:340\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).TrashWithStorageFormat:406\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:422\n\tstorj.io/storj/storagenode/retain.(*Service).trash:428\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces.func1:387\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).processTrashPiece:112\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).writeLine:99\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).Write:78\n\tio.copyBuffer:431\n\tio.Copy:388\n\tos.genericWriteTo:269\n\tos.(*File).WriteTo:247\n\tio.copyBuffer:411\n\tio.Copy:388\n\tos/exec.(*Cmd).writerDescriptor.func1:578\n\tos/exec.(*Cmd).Start.func2:728"}
2024-08-11T20:50:19Z	WARN	pieces	failed to migrate v0 piece. Piece may not be recoverable	{"Process": "storagenode"}
2024-08-11T20:50:19Z	WARN	retain	failed to trash piece	{"Process": "storagenode", "cachePath": "config/retain", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "754TS5J5DYBC7G746J4DYIVJ4DRSZOG7C27LQLV4VMAIOSGYUSAA", "error": "pieces error: filestore error: file does not exist", "errorVerbose": "pieces error: filestore error: file does not exist\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobStore).Stat:124\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).pieceSizes:340\n\tstorj.io/storj/storagenode/pieces.(*BlobsUsageCache).TrashWithStorageFormat:406\n\tstorj.io/storj/storagenode/pieces.(*Store).Trash:422\n\tstorj.io/storj/storagenode/retain.(*Service).trash:428\n\tstorj.io/storj/storagenode/retain.(*Service).retainPieces.func1:387\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).processTrashPiece:112\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).writeLine:99\n\tstorj.io/storj/storagenode/pieces/lazyfilewalker.(*TrashHandler).Write:78\n\tio.copyBuffer:431\n\tio.Copy:388\n\tos.genericWriteTo:269\n\tos.(*File).WriteTo:247\n\tio.copyBuffer:411\n\tio.Copy:388\n\tos/exec.(*Cmd).writerDescriptor.func1:578\n\tos/exec.(*Cmd).Start.func2:728"}
2024-08-11T20:50:19Z	INFO	lazyfilewalker.gc-filewalker	subprocess finished successfully	{"Process": "storagenode", "satelliteID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"}
2024-08-11T20:50:19Z	INFO	retain	Moved pieces to trash during retain	{"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 83551, "Failed to delete": -83551, "Pieces failed to read": 0, "Pieces count": 3479638, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "8m0.71244498s", "Retain Status": "enabled"}

That’s a funny total: "Deleted pieces": 83551, "Failed to delete": -83551
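For reference, a rough way to cross-check that number against the warnings themselves (Docker setup and container name are assumptions here; adjust to however you log):

# count the paired "failed to trash piece" warnings for this satellite
docker logs storagenode 2>&1 | grep "failed to trash piece" | grep -c 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S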

These ones - yes, the piece has been corrupted.

These are not. It means that TTL data was moved to the trash by the garbage collector before it expired. The customer can delete TTL data before the expiration date, so it's kind of "normal".

Yes, this is interesting.

I’m having the same logs since upgrade.

@Alexey Here you said that it's OK: Failed to migrate v0 piece. Piece may not be recoverable - #7 by Alexey

If it happens when trying to delete them? Yes, it's OK. However, if you got the same error for GET_AUDIT or GET_REPAIR, it would mean a failed audit.
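For example, you could check whether the same error ever shows up for audits or repairs (the node.log path is just an example, adjust to your setup):

# any hits here would point to real audit/repair failures, not just deletions
grep -E "GET_AUDIT|GET_REPAIR" node.log | grep "file does not exist"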

But you should be worried because this means that there may be damage in other pieces.

My nodes don't have any issues with the HDD filesystem or the SSD databases. Other people are seeing the same type of problem, so I would expect a software bug.

Did you have them before the upgrade to 1.110.x?
Because my nodes (1.109.2) don't have such an error, and they are quite old (from the time when v0 pieces were still uploaded).
Maybe related:

Then this warning could just be a consequence of an attempt to remove a piece that was already moved to the trash by the garbage collector.

Yes. Had them here in Nov 23: Piecedeleter could not send delete piece to trash .. "error": "pieces error: v0pieceinfodb: sql: no rows in result set" - #27 by flo82

Node is still healthy.

Good to know. Then it is related to some corrupted v0 pieces, and the good thing is that they appear only on deletions.
I had them on one of my nodes for the Saltlake satellite, where that node was eventually disqualified (the auditor found them). No errors have occurred since I moved that node back to Windows; before that, v0 pieces got corrupted due to kernel panics while using NTFS under Linux.

You monster! :astonished:

:stuck_out_tongue_winking_eye:

Thanks, I did check that my audit is still at 100% :sweat_smile:. They consistently come in pairs with the failed to trash piece warnings and stop exactly at the end of the retain/gc-filewalker job. So it doesn't sound like a random corruption problem?

Still getting the same entries this morning in the logs, on a different satellite. I count 30000 pairs of warnings, with a total of 31989 in the last entry:

2024-08-12T11:17:16Z	INFO	retain	Moved pieces to trash during retain	{"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 31989, "Failed to delete": -31989, "Pieces failed to read": 0, "Pieces count": 16548404, "Satellite ID": "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE", "Duration": "31m55.699918659s", "Retain Status": "enabled"}

Edit: Did an e2fsck, all good. Since they're warnings I won't worry too much.
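(For anyone wanting to do the same, the rough sequence is: stop the node, unmount the data volume, then run the check. The device path below is only a placeholder, and the container name may differ:)

docker stop storagenode
umount /mnt/storj
e2fsck -f /dev/sdX1    # forces a full check; use -n instead to only report problems without fixing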

Otherwise how would I know that using NTFS under Linux is a horrible idea?
But actually I tried to migrate to Linux and convert the NTFS disk with the data in-place:

Unfortunately my setup has a hardware issue that is incompatible with Linux, so it works normally only under Windows (and only after two or three patches of every new Windows version).

I confused these warnings from retain with the warnings from the TTL collector. The ones from retain are new.
So, definitely an interesting issue. I need to wait until my nodes are upgraded to 1.110.3; maybe it's related to that version.

I see them on a 1.110.3 node too:

2024-08-13T02:11:06Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:06Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:06Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:07Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:08Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:08Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:08Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:08Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:08Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}
2024-08-13T02:11:09Z    WARN    pieces  failed to migrate v0 piece. Piece may not be recoverable        {"Process": "storagenode"}

and there are many of them:

docker logs storagenode | grep migrate | wc -l
253785
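
If they line up with retain, the matching "failed to trash piece" warnings should carry the actual error text, e.g.:

docker logs storagenode 2>&1 | grep "failed to trash piece" | tail -n 1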

I noticed the same errors for the first time on one of my nodes running v1.110.3. The node's audit score also went down a fraction of a percent on a few satellites, then went back up to 100%.

@mtone @jammerdan @TWH
Could you please share the OS and platform?

The node is running in a Debian 11.10 VM using docker.

Ubuntu 24.04, docker 27.1.x, ext4 on lvm cache, DBs on SSD, no badger, lazy enabled. Also a VM on Windows hyper-v. 100% audit & 99.x% online on all nodes atm.

I can't remember on which node I saw it. Some Debian Linux (10, 11, or 12).

I just checked and I also had over 300K entries for that error message, and also the "failed to delete" with the negative number, but only between 2024-08-11 and 2024-08-16.
I only see this on my node that stores data for the US1 satellite.

pi@Four:/seagate4/storj2us/storagenode $ grep -i "failed to delete" node.log
2024-07-25T03:05:21Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 232093, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 944055, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "7m52.93481316s", "Retain Status": "enabled"}
2024-07-27T18:29:12Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 68683, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 1741430, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "16m40.792855444s", "Retain Status": "enabled"}
2024-07-29T17:32:15Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 37983, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 1892800, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "14m18.100645327s", "Retain Status": "enabled"}
2024-07-31T17:44:51Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 41527, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 2260114, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "23m35.78412322s", "Retain Status": "enabled"}
2024-08-03T03:34:13Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 132689, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 2598168, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "22m52.403881233s", "Retain Status": "enabled"}
2024-08-04T08:09:37Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 60524, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 2714337, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "22m21.1962619s", "Retain Status": "enabled"}
2024-08-06T13:33:45Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 150743, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 2909073, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "31m34.026949571s", "Retain Status": "enabled"}
2024-08-08T14:08:24Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 114531, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3027871, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "22m46.246403346s", "Retain Status": "enabled"}
2024-08-10T03:31:44Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 101697, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3090103, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "1h44m30.828066249s", "Retain Status": "enabled"}
2024-08-11T22:52:00Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 93382, "Failed to delete": -93382, "Pieces failed to read": 0, "Pieces count": 3319424, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "41m31.783135804s", "Retain Status": "enabled"}
2024-08-13T14:43:00Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 144342, "Failed to delete": -144342, "Pieces failed to read": 0, "Pieces count": 3480229, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "39m18.865287763s", "Retain Status": "enabled"}
2024-08-16T02:12:24Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 102729, "Failed to delete": -102729, "Pieces failed to read": 0, "Pieces count": 3737899, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "57m57.583758772s", "Retain Status": "enabled"}
2024-08-17T20:52:59Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 148267, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3907217, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "32m8.716430777s", "Retain Status": "enabled"}
2024-08-20T07:10:14Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 256161, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3970789, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "29m50.056435378s", "Retain Status": "enabled"}
2024-08-22T11:45:56Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 119612, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3785692, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "26m33.822417085s", "Retain Status": "enabled"}
2024-08-24T19:17:06Z    INFO    retain  Moved pieces to trash during retain     {"Process": "storagenode", "cachePath": "config/retain", "Deleted pieces": 114100, "Failed to delete": 0, "Pieces failed to read": 0, "Pieces count": 3662110, "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Duration": "26m43.135920421s", "Retain Status": "enabled"}
pi@Four:/seagate4/storj2us/storagenode $ cat /etc/os-release
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Could you please check what the reason for the failed deletions was?
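For example, something like this should summarize the distinct error reasons (using the node.log path from your grep above):

grep "failed to trash piece" node.log | grep -o '"error": "[^"]*"' | sort | uniq -c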