Disqualification & Suspension

I’m running this node on a TrueNAS system. I tried the commands you mentioned without success. The logs I mentioned are available in the TrueNAS app. Otherwise, maybe I can access these logs from the shell? I have access to that as well. It seems that this TrueNAS build is not the best (I ran into problems at setup too).

You can access logs from the shell:

  1. Open a shell from System Settings
  2. List the containers:
sudo docker ps
  3. Find the container with storj in its name and copy its container ID
  4. Filter the logs for failed audits:
sudo docker logs c55ff406647a 2>&1 | grep GET_AUDIT | grep failed

However, since you didn’t redirect the logs to a file, they can be lost if the container is re-created.
You can also check the logs in the web UI (look at the storj container), but searching for failed audits there is not convenient. Alternatively, you can download the logs and search the file locally.
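
If you want to keep a searchable copy before the container is re-created, you can dump its current logs to a file first (a minimal sketch; the container ID is the one from the example above and the output path is just an example):

# save everything the container has logged so far, stderr included
sudo docker logs c55ff406647a > /mnt/data/storagenode.log 2>&1
# then search the saved file locally
grep GET_AUDIT /mnt/data/storagenode.log | grep failed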

Alexey:

Thank you.

I did have my node configured with a working email. I’ve not received any emails about being offline in the last month, and I’ve never received any suspension warnings. Lastly, I checked in on my node daily; it was always Status: Online, QUIC OK (though yesterday it said it had only been online for 8 hours, which makes no sense, since I received no emails).

The log output from your commands is as follows:

2023-09-30T11:23:33Z ERROR piecestore download failed {"process": "storagenode", "Piece ID": "IMTCD6B4FKWQIIIUST3NNKSZYHOEDKD7I6T7LYRBADANHOTUQT4A", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "Offset": 8448, "Size": 0, "Remote Address": "34.146.139.227:58876", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:671\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}
2023-09-30T13:09:17Z ERROR piecestore download failed {"process": "storagenode", "Piece ID": "Y6CK5XLVCEAOUFXIYQFJBXYC7ZM5HF3HCLAB4LF64Q6MKWQTGORQ", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Action": "GET_AUDIT", "Offset": 1354496, "Size": 0, "Remote Address": "34.148.62.121:55084", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:671\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}
2023-09-30T16:46:03Z ERROR piecestore download failed {"process": "storagenode", "Piece ID": "6QTWSBJFU42FKD4S52KXI5JG4P2M5SOD5YWXH7VBJXH7ICS6GCLA", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "Offset": 884224, "Size": 0, "Remote Address": "34.146.139.227:36776", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:671\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}

The bad parts are the "Action": "GET_AUDIT" combined with the "file does not exist" error.
It means that the satellite has sent audit requests to your node to verify that your node has the data pieces that the satellite expects it to hold.
But your node responded that it does not have the requested files at all. That is the worst possible scenario.
So either they have been deleted or the node software does not have access to the correct location.
Depending on how much data you have already lost, this means that your node may get disqualified.
From the satellite IDs you can see that your node has lost data for different satellites.
You need to check what was/is the cause of the data loss. My first suggestion would be to take the node offline: keep it running, but cut it off from the internet. Why? Because the node can be offline for up to 30 days, but if it remains online and keeps getting audited for lost pieces, it might get disqualified on all satellites much faster, possibly even before you have been able to diagnose and fix the underlying issue.

OK, thank you; taking the node offline is a good suggestion. It’s really stressful when the proverbial clock is ticking. I simply removed the port forwarding.

So I have been reading about various reasons for this situation, and they seem abundant. And yet, none that I have read so far seem applicable/likely.

The drives are all healthy, according to TrueNAS. And they are top quality, nearly new HDDs (about four months old).

I have made no changes since the initial setup that could either screw up permissions or change directory locations.

I only have this one node, so it isn’t a case of two nodes with the same identity.

If relevant, I have this node set up with a ZFS RAIDZ1 configuration.

Next steps?

I would pick a piece ID that has been logged as a failed audit and search for it in the logs.
Then you can check whether anything else was logged about that file, like what happened when it was uploaded, and maybe some other related log messages.
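
For example (a minimal sketch, reusing the container ID from earlier and the piece ID from the third failed audit above; substitute your own values):

sudo docker logs c55ff406647a 2>&1 | grep 6QTWSBJFU42FKD4S52KXI5JG4P2M5SOD5YWXH7VBJXH7ICS6GCLA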

Second, I would check in the storage folder whether the piece is truly not there.
From the satellite ID you can get the storage blobs folder: Satellite info (Address, ID, Blobs folder, Hex).
The first 2 characters of the piece ID are the name of the subfolder. Check both the storage and the trash folders to see if the piece is there.
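
A minimal sketch of that check, assuming the node’s storage folder is reachable from the shell (/path/to/storage is a placeholder; the case-insensitive prefix search avoids depending on the exact on-disk filename encoding):

# search both blobs and trash for the piece from the failed audit above
find /path/to/storage/blobs -iname '6qtwsbjfu42*'
find /path/to/storage/trash -iname '6qtwsbjfu42*'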

If it has been successfully uploaded and it’s not there, then you have a problem. And if it is there, you have another problem.
Maybe it is a file system problem and you need to run a file system check and repair.

OK, excellent. I will look into these matters. Quick follow-up: is it sufficient to remove the port forwarding? The dashboard still shows the node online, though QUIC shows as misconfigured.

Searching the logs for other errors involving the problematic piece IDs, I find the following (as an example). Is it of any use? I’m not able to make any sense of it, beyond it confirming that the piece does not exist.

2023-09-30T16:46:03Z ERROR piecestore download failed {"process": "storagenode", "Piece ID": "6QTWSBJFU42FKD4S52KXI5JG4P2M5SOD5YWXH7VBJXH7ICS6GCLA", "Satellite ID": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "Action": "GET_AUDIT", "Offset": 884224, "Size": 0, "Remote Address": "34.146.139.227:36776", "error": "file does not exist", "errorVerbose": "file does not exist\n\tstorj.io/common/rpc/rpcstatus.Wrap:75\n\tstorj.io/storj/storagenode/piecestore.(*Endpoint).Download:671\n\tstorj.io/common/pb.DRPCPiecestoreDescription.Method.func2:251\n\tstorj.io/drpc/drpcmux.(*Mux).HandleRPC:33\n\tstorj.io/common/rpc/rpctracing.(*Handler).HandleRPC:61\n\tstorj.io/common/experiment.(*Handler).HandleRPC:42\n\tstorj.io/drpc/drpcserver.(*Server).handleRPC:124\n\tstorj.io/drpc/drpcserver.(*Server).ServeOne:66\n\tstorj.io/drpc/drpcserver.(*Server).Serve.func2:114\n\tstorj.io/drpc/drpcctx.(*Tracker).track:35"}

That is just the message about the failed audit again.
Maybe the upload messages for these specific pieces are no longer present in the log, since the log is normally deleted when you remove the container.

But as said, you additionally have to check in the storage and trash folders whether the piece is there or not.

Would it be reasonable to assume that, at this point, knowing Docker is necessary for any possible troubleshooting here? I had planned to slowly learn Docker over the next year, but maybe I jumped the gun (e.g., I can’t even find the trash!). I discovered the Storj app in the TrueNAS charts, loved the project, and gave it a shot. But I only started studying Linux less than a year ago, for fun.

I’d be open to a paid tutor, if anyone’s interested 🙂

You can go through Storj’s docs at docs.storj.io and learn whatever you need or ask in the forum.

I searched the docs for how to find the trash, but found nothing (I’m guessing that is just obvious to most everyone).

Well, if people are willing to help with my oh-so-basic questions: how do I find the trash in the Storj docker container?

Unfortunately I am not on TrueNAS so I am not familiar with how the node setup looks on such a device or where to look for the underlying folders.
But I am sure others will be able to tell you exactly where you need to look for them.

I understand. It’s been a bit of a pickle: those in the TrueNAS Discord don’t know Storj, and those here don’t know TrueNAS well, though Alexey seems generally familiar with it.

Generally there are many knowledgeable users on here. I am sure somebody will be able to tell you exactly where to look on a TrueNAS.

It’s the weekend, so you will have to wait until others see your post 🙂

Thank you all so much for the help and encouragement!

I’ve fully shut my node down to be sure I’m not online. That gives me 30 days to work on this without stressing.

Hey, @Alexey: able to give me a hand with this? Do you know TrueNAS well enough to problem-solve it with me?

The node’s data folders are located in the dataset that your TrueNAS uses to store the application’s data. The path is not hardcoded, so you need to search for it.
You can run this command in the shell:

df -T --si

It shows the mount points, so you can locate where your dataset is mounted. Then navigate to the folder with the node’s data.
For example, for a dataset named data, the output may look like:

Filesystem                                                          Type      Size  Used Avail Use% Mounted on
udev                                                                devtmpfs  819M     0  819M   0% /dev
tmpfs                                                               tmpfs     201M  8.1M  193M   5% /run
boot-pool/ROOT/22.12.2                                              zfs       115G  2.9G  112G   3% /
tmpfs                                                               tmpfs     1.1G  107k  1.1G   1% /dev/shm
tmpfs                                                               tmpfs     105M     0  105M   0% /run/lock
tmpfs                                                               tmpfs     1.1G   13k  1.1G   1% /tmp
boot-pool/grub                                                      zfs       112G  8.7M  112G   1% /boot/grub
data                                                                zfs       127G  132k  127G   1% /mnt/data
data/ix-applications                                                zfs       127G  132k  127G   1% /mnt/data/ix-applications
data/ix-applications/k3s                                            zfs       127G  141M  127G   1% /mnt/data/ix-applications/k3s
data/ix-applications/docker                                         zfs       129G  2.2G  127G   2% /mnt/data/ix-applications/docker
data/ix-applications/catalogs                                       zfs       127G   59M  127G   1% /mnt/data/ix-applications/catalogs
data/ix-applications/releases                                       zfs       127G  132k  127G   1% /mnt/data/ix-applications/releases
data/ix-applications/default_volumes                                zfs       127G  132k  127G   1% /mnt/data/ix-applications/default_volumes
data/ix-applications/releases/storj                                 zfs       127G  132k  127G   1% /mnt/data/ix-applications/releases/storj
data/ix-applications/releases/storj/volumes                         zfs       127G  132k  127G   1% /mnt/data/ix-applications/releases/storj/volumes
data/ix-applications/releases/storj/charts                          zfs       127G  394k  127G   1% /mnt/data/ix-applications/releases/storj/charts
data/ix-applications/releases/storj/volumes/ix_volumes              zfs       127G  132k  127G   1% /mnt/data/ix-applications/releases/storj/volumes/ix_volumes
data/ix-applications/releases/storj/volumes/ix_volumes/ix_data      zfs       127G  263k  127G   1% /mnt/data/ix-applications/releases/storj/volumes/ix_volumes/ix_data
data/ix-applications/releases/storj/volumes/ix_volumes/ix_identity  zfs       127G  263k  127G   1% /mnt/data/ix-applications/releases/storj/volumes/ix_volumes/ix_identity
...

So, your data for that example should be in /mnt/data/ix-applications/releases/storj/volumes/ix_volumes:

[-]$ ls -l /mnt/data/ix-applications/releases/storj/volumes/ix_volumes          
total 17
drwxr-xr-x 4 apps apps 7 Sep 30 18:28 ix_data
drwxr-xr-x 2 apps apps 8 May  1 01:48 ix_identity

ix_data should contain all the node’s folders, including blobs and trash.
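
For example, with the mount point from the listing above (your pool and dataset names may differ):

ls -l /mnt/data/ix-applications/releases/storj/volumes/ix_volumes/ix_data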

Hi, Alexey:
Thank you for working through this with me. Sorry about the delay getting back to this.

So, I have no directories within /mnt/Main/ix-applications/releases/storj/volumes/ix_volumes

I have attached a screenshot, so you know I am not making an error in following your directions.