Error: operator wallet address isn't valid. Is it possible to fix?

Description of the situation.
I have noticed that my computer with 5.5 TB of storage gives less profit than a weaker computer with 2 TB of storage on the same network.
I checked this computer and found it was outdated. It was running Debian 10, and the current startup command for storagenode at docs.storj.io has changed format a bit since I set it up.

So I did the following:

  1. Upgraded Debian 10 to Debian 11

According to the instructions from docs.storj.io, I removed Docker and reinstalled it for the new OS version

  3. Executed setup once:
    docker run --rm -e SETUP="true" \
    --user $(id -u):$(id -g) \
    --mount type=bind,source="me-path",destination=/app/identity \
    --mount type=bind,source="/mnt/store",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest

  4. Tried to start the node.
    Got errors that I believe were related to storage permissions.

  5. Set 777 permissions on all folders in the storage with the command
    find . -type d -exec chmod 777 '{}' \;

  6. Tried to start the node again.
    Now I get the error "operator wallet address isn't valid":

2022-12-11T10:19:19.463Z INFO Configuration loaded {"Process": "storagenode", "Location": "/app/config/config.yaml"}
2022-12-11T10:19:19.464Z INFO Anonymized tracing enabled {"Process": "storagenode"}
2022-12-11T10:19:19.467Z INFO Operator email {"Process": "storagenode", "Address": "--my-email--"}
2022-12-11T10:19:19.467Z ERROR Invalid configuration. {"Process": "storagenode", "error": "operator wallet address isn't valid"}
Error: operator wallet address is not valid
2022-12-11 10:19:19,470 INFO exited: storagenode (exit status 1; not expected)
2022-12-11 10:19:22,479 INFO spawned: 'storagenode' with pid 56

The wallet address is correct. It is the same as before and the same as on my other computers.
Therefore, my question is: is it possible to somehow correct the situation?

Hi @kmserg1

Please post your full docker run command. Remove any information you wish to keep private; there is most likely a typo or a fault in it.

Do you want to migrate your old node or set up a new, empty node? This command will set up a new node. Are you sure you want to get your old node disqualified like that?

Most people copy and paste without a second thought and expect the software to do the right thing. Not only is this SETUP flag extremely dangerous, it is also ultimately useless: it introduces avoidable risk and gives no reward in return.

Instead, the startup script could figure out whether the identity has already been created, the node authorized, and the setup done, and only do what hasn't been done yet. And it should definitely never overwrite user data. If the user wanted to set up a new node, they could delete the identity folder manually, and on the next start the container could create a new identity.

There have been too many posts already from people who accidentally ran with SETUP=1 due to a mistake, a misunderstanding, etc. Remove that flag and the problem is solved. It's literally two lines in the shell script to check for the existence of the files, something like the sketch below.
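A minimal sketch of such a guard, assuming the container's entrypoint uses the /app/config and /app/identity paths (the paths and the message here are assumptions):

    # Hypothetical guard: refuse to run setup if the node already appears set up.
    if [ -f /app/config/config.yaml ] || ls /app/identity/identity.*.cert >/dev/null 2>&1; then
        echo "the node seems to be already set up; refusing to run setup again" >&2
        exit 1
    fi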

Same outcome. They would still get disqualified for mounting the wrong folder. I don't see how that would fix anything.

That's how it used to be. At that time the problem was that the startup script sometimes did not notice that setup had already been done and overwrote parts of a perfectly good node, losing data in the process. It turns out it is actually quite difficult to figure out automatically whether setup has been done. So an explicit flag was introduced, putting the operator fully in control of whether they want to create a new node or have the code verify whether anything is missing.


This won't happen; the documentation recommends moving the identity to the storage folder. I would go as far as to suggest that this should not be just a recommendation (which most won't read); the container should follow that recommendation and create the identity in the storage folder by default.

That way users would only need to mount one folder, which contains the whole node.

That way it would be impossible to screw up while using everything in the default config. If advanced users want to tweak things and move the databases elsewhere, they still can, but that would be on them. The default should be rock solid and safe.
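As a sketch of that idea (the layout and destination path here are assumptions, not the current defaults), the whole node would live under one mount:

    # Hypothetical single-mount layout; identity lives inside the storage folder:
    #   /mnt/store/identity/  <- identity files, created here by default
    #   /mnt/store/storage/   <- blobs, trash, databases
    docker run -d \
        --mount type=bind,source="/mnt/store",destination=/app/config \
        --name storagenode storjlabs/storagenode:latest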

User errors will always outperform the app :wink:


Well, then it was a bug that needed to be fixed. A solution that moves the responsibility to the user is a cop-out.

I'm not aware of the prior history. Can you point me to threads or summarize it here? My understanding was that:

  1. if identity.cert exists, the identity was created,
  2. if identity.*.cert exists, the storage node was authorized,
  3. if config.yaml exists, the storagenode setup was done.

(I do this in my script, as sketched below, but if I'm missing or misunderstanding something, I'd really love to know!)
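A minimal shell sketch of those three checks (the directory variables are hypothetical; adjust to your own layout):

    # Hypothetical detection of how far setup has progressed.
    IDENTITY_DIR=/app/identity
    CONFIG_DIR=/app/config
    if [ ! -f "$IDENTITY_DIR/identity.cert" ]; then
        echo "identity not created yet"
    elif ! ls "$IDENTITY_DIR"/identity.*.cert >/dev/null 2>&1; then
        echo "identity created, but node not authorized yet"
    elif [ ! -f "$CONFIG_DIR/config.yaml" ]; then
        echo "node authorized, but setup not done yet"
    else
        echo "setup appears to be complete"
    fi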

That's fine, but that flag should not just wipe out everything without asking, because mistakes do happen, and this is thermonuclear node annihilation with no confirmation. At the very least it should save the existing identity to a backup folder, so users have a chance to recover.

LOL, of course. That's precisely why the "attack surface" should be reduced: no user-visible options besides those absolutely necessary in the setup flow, and no matter what the user does, no data shall be lost. Even keeping the SETUP flag can be made to work: let it still check for the presence of config.yaml and identity.*.cert, and if either is there, abort and fail with the message "the node seems to be already set up" (then remove "seems to be" once the detection is polished). That would at least give the user a hint that the thing they are trying to do is failing, and why, rather than punishing them for a copy-paste error.

If for some reason your data drive isn't mounted, the node will see an empty folder and thus run setup with the same node ID, instantly disqualifying the original node because none of the data is there. Having the setup step separate prevents this from happening by accident.


But for the container use case the storage folder is always a mount point, so it's trivial to check whether it is one, and abort if it isn't.

The goal is to prevent user error from having catastrophic consequences.
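A sketch of that check, assuming a host-side script and the /mnt/store path (both are assumptions; as noted below, this is less useful inside a container, where a bind mount always looks like a mount point):

    # Hypothetical pre-start guard: refuse to run if the storage path is not a mount point.
    if ! mountpoint -q /mnt/store; then
        echo "/mnt/store is not mounted; refusing to start" >&2
        exit 1
    fi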

I would differentiate between my storage nodes and other people's storage nodes.

For my own storage node the setup flag works great. There is a big red warning in the documentation that gives me a very good idea of what this flag is used for. As long as I keep that warning in mind, the storage node will catch any other user error I have made so far. And I have tried some of these user errors more than once. For example, my whole ZFS pool had a problem and was not mounted. Luckily the storage node did not try to create a new folder; it just failed, Grafana fired an email alert, and I was able to fix it within an hour. I would like to keep it that way. I would like to keep my own storage node healthy.

I can't control what other users are doing. They might ignore the red warning in the documentation. But what you are describing is still the behavior of that flag: if the user makes no other mistake, the storage node will just recreate the config file and otherwise continue just fine.

There is a group of people who try really hard to get disqualified. Initially they don't have the setup flag set. The storage node will fail to start, to prevent the disqualification. Instead of checking whether they mounted everything correctly, these people start to play trial and error: they add the setup flag and also disable all the other checks that would still prevent them from getting disqualified. So the reality is that some people will manage to get disqualified no matter how hard you make it for them.


The actual data could easily be in a subfolder rather than at the actual mount point. Though that doesn't really matter, since I'm pretty sure that inside the container it will always look like a mount point, regardless of whether the drive is actually mounted outside the container.

I guess the setup command could instead add a file to the identity folder signifying that setup has run for that identity, and refuse to run again if that file already exists.
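A sketch of that sentinel-file idea (the file name and paths are hypothetical):

    # Hypothetical sentinel marking that setup already ran for this identity.
    SENTINEL=/app/identity/.setup-done
    if [ -e "$SENTINEL" ]; then
        echo "setup has already run for this identity; refusing to run again" >&2
        exit 1
    fi
    # ... run the actual setup here ...
    touch "$SENTINEL"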


Oh, I misunderstood then. I thought that if you ran setup again on an otherwise properly mounted and configured node, it would destroy it by re-creating everything.

Ok, I guess the present behavior is the lesser of two evils.

Indeed, this can definitely be the case in some scenarios.

This would not help when the node's drive is unmounted; and if the config folder is mounted, it seems the current behavior is still safe.

So essentially the issue is avoiding accidentally initializing a new node when nothing is mounted.
The SETUP flag looks like a good explicit solution to that. I'm convinced!

That was one of the user errors I tried. I once mounted the wrong subfolder, but since the folder existed, Docker itself didn't complain.

Yeah, that's why I use subfolders by default. If the drive isn't mounted, Docker won't even start the container. It's just that you can't assume every node operator will do that.

Additionally, I use unique subfolder names per node, just in case my Synology decides to switch mounts for some reason. That has saved my butt at least once, when I moved an HDD to a different bay in my external 10-bay USB case. These days the storage node software would also detect the node ID vs identity mismatch, but I figured it's better to have more safeguards anyway.
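For illustration, the subfolder trick looks something like this (the paths and node name are hypothetical). With --mount type=bind, Docker refuses to start the container if the source path does not exist, which is exactly what you want when the drive failed to mount:

    # /mnt/disk1 is the mount point; node-abc123 is a unique per-node subfolder.
    # If /mnt/disk1 is not mounted, /mnt/disk1/node-abc123 does not exist and
    # docker run fails instead of starting the node against an empty folder.
    docker run -d \
        --mount type=bind,source="/mnt/disk1/node-abc123",destination=/app/config \
        --name storagenode-abc123 storjlabs/storagenode:latest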

First of all: never run the setup step for a working node; if you mess up the folders, you can disqualify your node. The setup step should be performed only for a new node.

Since you have already messed it up, you need to find where your data is actually located and provide the correct paths in your full docker run command, with all your parameters, including ADDRESS, WALLET and STORAGE. And do not run the setup step again, of course.
So please show your current full docker run command (you may mask the private information).
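For reference, a full run command typically looks roughly like this (the values in angle brackets are placeholders for your own settings; exact flags may differ depending on the docs.storj.io version):

    docker run -d --restart unless-stopped --stop-timeout 300 \
        -p 28967:28967/tcp -p 28967:28967/udp \
        -e WALLET="0x<your-wallet>" \
        -e EMAIL="<your-email>" \
        -e ADDRESS="<external-address>:28967" \
        -e STORAGE="2TB" \
        --user $(id -u):$(id -g) \
        --mount type=bind,source="<identity-dir>",destination=/app/identity \
        --mount type=bind,source="<storage-dir>",destination=/app/config \
        --name storagenode storjlabs/storagenode:latest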


So, yeah, just for the record: BrightSilence in post 11 explained the most common failure point at that time. The identity is there, but no data: no config.yaml, no databases, no directory structure with blobs, storage, trash, etc.

What was being wiped out then was not physical state kept on disk, but reputation, in the form of failing audits. My original post was unclear on that point, sorry!
