“error”: “pieces error: filestore error: rename config/storage/temp/blob-yyyy.partial config/storage/blobs/xxxx.sj1: input/output error

The same mysterious thing kept happening to me for the last few days. My node is running on Unraid, and I had previously removed one disk from the array. The data from the removed disk was properly migrated and there was no data inconsistency after the migration. I found the following error records in my container log:

2023-11-23T02:18:26-08:00 WARN console:service unable to get Satellite URL {“process”: “storagenode”, “Satellite ID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”, “error”: “console: trust: satellite is untrusted”, “errorVerbose”: “console: trust: satellite is untrusted\n\tstorj.io/storj/storagenode/trust.init:29\n\truntime.doInit1:6740\n\truntime.doInit:6707\n\truntime.main:249”}

Then another one, this time yellow

2023-11-23T02:26:19-08:00 ERROR piecestore upload failed {“process”: “storagenode”, “Piece ID”: “55NZUZ23EGNIY6QWYT3HNYHNI4V5WWZQDBUEB2WYE767OCDO75BA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “error”: “pieces error: filestore error: rename config/storage/temp/blob-1769455123.partial config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/55/nzuz23egniy6qwyt3hnyhni4v5wwzqdbueb2wye767ocdo75ba.sj1: input/output error”, “errorVerbose”: “pieces error: filestore error: rename config/storage/temp/blob-1769455123.partial config/storage/blobs/ukfu6bhbboxilvt7jrwlqk7y2tapb5d2r2tsmj2sjxvw5qaaaaaa/55/nzuz23egniy6qwyt3hnyhni4v5wwzqdbueb2wye767ocdo75ba.sj1: input/output error\n\tstorj.io/storj/storagenode/blobstore/filestore.(*blobWriter).Commit:133\n\tstorj.io/storj/storagenode/pieces.(*MonitoredBlobWriter).Commit:69\n\tstorj.io/storj/storagenode/pieces.(*Writer).Commit.func1:131\n\tstorj.io/storj/storagenode/pieces.(*Writer).Commit:199\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload.func6:478\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Upload:541\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func1:243\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35”, “Size”: 289536, “Remote Address”: “5.161.128.79:12194”}

Hello @Bigfoot,
Welcome to the forum!

This satellite is decommissioned:

You may forget it (and the second one too):

You need to stop and remove the container, unmount the disk, check and fix errors, then mount it back and try to run the container with all your parameters (do not run it with the SETUP step, otherwise you may destroy the node; use the normal full docker run command instead), then check your logs.
If it still produces the same error, check permissions. If you use --user $(id -u):$(id -g) in your docker run command, make sure that this user is the owner of the data location and has full permissions.
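For example, a minimal ownership fix could look like this (the /mnt/storj/storagenode path is only an assumption; use your actual data location):

sudo chown -R $(id -u):$(id -g) /mnt/storj/storagenode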


Hi Alexey. Please forgive me, but I'm new to Linux and not very proficient with the CLI and Docker. Are there more detailed instructions for these steps somewhere?

Perhaps it would be better if you showed your script.

This error most probably points to a file system problem. So your drive disconnected, got damaged because of insufficient bandwidth and/or power, or something similar…

Reboot the whole system if possible. If the problem still exists after the restart, then check the file system.


To stop and remove the container:

docker stop -t 300 storagenode
docker rm storagenode

To unmount a disk (replace /mnt/storj with your actual mount point):

sudo umount /mnt/storj

To check the disk and fix issues (replace /dev/sdb with your actual drive):

sudo fsck /dev/sdb

See also: How to Use Fsck in Linux
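If you are not sure which /dev/… device corresponds to your Storj disk, you can list all block devices with their filesystems and mount points first (standard Linux tooling; run it before unmounting so the mount point is still visible):

lsblk -f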

Mount the disk back (I assume, that you followed How do I setup static mount via /etc/fstab for Linux? - Storj Docs):

sudo mount -a
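For reference, the static mount from that guide boils down to a single line in /etc/fstab; a sketch with a made-up UUID (find the real one with sudo blkid) and an assumed mount point:

UUID=12345678-abcd-4321-9876-0123456789ab /mnt/storj ext4 defaults 0 2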

Then execute your full docker run command with all your parameters and check the logs:

docker logs --tail 20 storagenode
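If the last 20 lines are not enough, you can also filter the log for errors only (plain docker and grep usage; the 2>&1 redirect is there in case the node writes its log to stderr):

docker logs storagenode 2>&1 | grep ERROR | tail -n 50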

This looks much clearer, thanks a lot. I will report on the progress.

One of the latest error logs:
2023-11-25T01:33:20-08:00 ERROR piecestore upload failed {“process”: “storagenode”, “Piece ID”: “UVRQG7EAWBCHKA6AJLDYGMJR3VPL5QOA6PARQBUS6TSLJL6UJRRA”, “Satellite ID”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “Action”: “PUT”, “error”: “manager closed: unexpected EOF”, “errorVerbose”: “manager closed: unexpected EOF\n\tgithub.com/jtolio/noiseconn.(*Conn).readMsg:225\n\tgithub.com/jtolio/noiseconn.(*Conn).Read:171\n\tstorj.io/drpc/drpcwire.(*Reader).ReadPacketUsing:96\n\tstorj.io/drpc/drpcmanager.(*Manager).manageReader:226”, “Size”: 196608, “Remote Address”: “5.161.143.41:27784”}

Satellite 12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S is not among the decommissioned ones, right?

Right, it is not. Only the satellites which are not present in the trust list are decommissioned: https://www.storj.io/dcs-satellites

The provided error is the usual "context canceled" kind of error: your node was too slow and lost the race for the piece.
It cannot be close to every customer in the world, so this is normal.


I suspect I skipped that step; I was young and foolish then and thought it wasn't necessary when running the node in Docker. How can I check the status of the mount, and if my suspicion is right, how should I correct it?

Try to follow this guide; it has all the necessary steps.
Moreover, I now suspect that you tried to mount the device into the docker container. That will not work. You must mount the device to a mount point in your OS, format the drive to ext4, and then (only then) run the docker container.
To avoid mounting it manually every time you reboot the system, you need to make this mount automatic and permanent, as described in the guide above.
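To check whether the disk is currently mounted where you expect (standard commands; replace /mnt/storj with your actual mount point):

findmnt /mnt/storj
df -h /mnt/storj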


OK, my update so far. I used all the little free time I had in the last few days, but could not make the node forget the unused satellites and clean up the data. After the recent update the node became more stable, restarting every few hours instead of every 20 minutes. As suggested, I decided to deal with the file corruption first and then move the node to a dedicated mounted disk. The ZFS tool indicated this (I'm using Unraid with an array of individual ZFS disks):

I tried to repair or remove these 3 folders without any luck. I believe all that is left for me to do is to transfer the node to a new unassigned device (disk). I'm waiting for a new drive to arrive in a couple of days. One problem here: it seems like Unraid doesn't offer a formatting option for ext4. And another question: what is the best way to move the node to a new disk?

Do you know how much data is in these folders?
Try to copy these folders to some other place, then copy them back. After that, try to run the scrub again.
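A rough sketch of that copy-out/copy-back (the pool name and the folder path are hypothetical; substitute the folders your scrub reported):

rsync -aI /mnt/disk1/storj/config/storage/blobs/abcd/ /tmp/storj-rescue/abcd/
rsync -aI /tmp/storj-rescue/abcd/ /mnt/disk1/storj/config/storage/blobs/abcd/
zpool scrub disk1

The -I flag makes rsync re-copy files even if size and timestamp look unchanged, so the data really gets rewritten on the way back.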

If you want to migrate to a new disk, you may use these instructions:
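Roughly speaking, that guide boils down to several rsync passes while the node keeps running and a final pass with the node stopped; a sketch with hypothetical source and destination paths:

rsync -aP /mnt/user/storj_01/ /mnt/new_disk/storj/
docker stop -t 300 storagenode
rsync -aP --delete /mnt/user/storj_01/ /mnt/new_disk/storj/

Repeat the first command until very little changes between runs, and only then stop the node for the final pass.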

You did not say that you use Unraid. It supports formatting to xfs by default and mounting ext4 disks, so on Unraid it seems you may format either to xfs or to zfs. zfs is a good choice, but without redundancy (parity or mirroring) it cannot resist corruption, and it is slower than xfs and ext4. If you don't use zfs features, you might be better off using xfs.
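If you ever format the new drive manually from a shell instead of through the Unraid UI, the xfs case would look something like this (assuming the whole /dev/sdX device is dedicated to the node and holds nothing you care about; double-check the device name with lsblk first, as this wipes it):

sudo mkfs.xfs /dev/sdX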

You need to run the forget-satellite command:
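A sketch of it on a docker node (the container name, the binary path, and the two satellite IDs are taken from elsewhere in this thread; adjust them to your setup):

docker exec -it storagenode /app/storagenode forget-satellite --force 12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB 12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo --config-dir /app/config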

Please note the --force flag: it will remove data. This command will exit when it finishes its work.

I get this every time (in the Storj docker terminal), what am I doing wrong?

Perhaps the docker package is not installed?
Are you using WSL2 under Windows? If so, you need to unpause the Docker Desktop app; it goes into a sleep mode from time to time… very annoying.

No, the standard Unraid docker package is installed. Anyone here with Unraid? I would appreciate your feedback.

Do you mean that you opened a terminal to the container? If so, your command should not include docker exec -it; you need to run the command directly, i.e.:
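For example (IDs and paths taken from this thread; run from the /app directory inside the container):

./storagenode forget-satellite --force 12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB 12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo --config-dir /app/config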

My mistake, sorry.
But now I'm getting this:


What does this mean, did it work or what?

No, because you don't seem to be aware of which environment you're executing the command in.

When you attach to the docker container, your current directory is the /app directory. In that directory you have a 'config' and an 'identity' folder, onto which your /mnt/user/storj_01/(…) folders have been mounted. These folders are reachable by the names they have inside your docker environment, but not by using the /mnt/user/storj_01… host paths.

Essentially this means you shouldn't adapt the paths in the command. I used this:

docker exec -it storagenode /app/storagenode forget-satellite --force 12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB 12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo --config-dir /app/config

This shows the directory structure that's available. As far as I'm aware, it doesn't need your identity directory (because its path is already in your config.yaml).

Please, next time just copy your output instead of posting images. Then we're more easily able to correct and help you.


Now it worked, though it seems that the satellites were already removed from the trust list, but not the data. THANKS A LOT!!

# ./storagenode forget-satellite --force 12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB 12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo --config-dir /app/config --identity-dir /app/identity/
2023-12-04T09:26:39-08:00 INFO Configuration loaded {“process”: “storagenode”, “Location”: “/app/config/config.yaml”}
2023-12-04T09:26:39-08:00 INFO Anonymized tracing enabled {“process”: “storagenode”}
2023-12-04T09:26:40-08:00 INFO Identity loaded. {“process”: “storagenode”, “Node ID”: “12UnrByKkfGWRrERBPMUjMdk5H4ZsScj6VzkjvBJzDRRRY4AqRG”}
2023-12-04T09:26:46-08:00 WARN Satellite not found in satelliteDB cache. Forcing removal of satellite data. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Removing satellite from trust cache. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Cleaning up satellite data. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Cleaning up the trash. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Removing satellite info from reputation DB. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Removing satellite v0 pieces if any. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 INFO Removing satellite from satellites DB. {“process”: “storagenode”, “satelliteID”: “12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB”}
2023-12-04T09:26:46-08:00 WARN Satellite not found in satelliteDB cache. Forcing removal of satellite data. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Removing satellite from trust cache. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Cleaning up satellite data. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Cleaning up the trash. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Removing satellite info from reputation DB. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Removing satellite v0 pieces if any. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}
2023-12-04T09:26:46-08:00 INFO Removing satellite from satellites DB. {“process”: “storagenode”, “satelliteID”: “12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo”}