Use this script to generate bulk StorJ Identities

Ottetal · January 8, 2024, 7:27pm

Github link: https://github.com/Jikkelsen/StorjBulkIdentityCreator/blob/1a8ba4d0086fd9491c2aa13d0529cbff1ce2f0f0/Readme.md

Create StorJ Identities in bulk

Are you trying to create bulk StorJ identities for a project you’re working on? I had the same issue, and have automated the task - so fear not, you’ve come to the right place: StorjBulkIdentityCreator

Quick Start

Download the files
Populate the NodeInfo.csv file
Run StorjBulkIdentityCreator.ps1
Copy the files to their final destination
Start your node

Video walkthrough

I’ve made a short video showing how the script works. That can be seen below ↓
… at some point. I copied this text from the Github - video is not live

Step-by-step guide

Download the files from this github repository. Either use git, or download the files as a .zip archive directly from here
Populate the NodeInfo.csv file. I’ve included sample text, that you must replace with your own information. Below is the information you should replace in the file:

NodeName - This is your own choice, call your nodes whatever you want, as long as there are no duplicates
Token - Copy/paste the token you get from the Signup page (Sign up and host a Node on Storj)
Dashboard Port - The port of the dashboard, that each token will have
External Port - The external port of the node, that it will communicate with the world on
IP - The external IP of your node
Wallet - Your StorJ wallet address.
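Since unique node names are the one hard rule in that list, a quick check before running the script may help. This is only a minimal sketch, assuming NodeName is the first comma-separated column and the first row is a header; the real layout of NodeInfo.csv may differ, and `find_duplicate_names` is a made-up helper name, not part of the repository.

```shell
# Print every NodeName that appears more than once in the CSV.
# Assumption: NodeName is column 1 and row 1 is the header row.
find_duplicate_names() {
    tail -n +2 "$1" | cut -d, -f1 | sort | uniq -d
}
```

Calling `find_duplicate_names NodeInfo.csv` prints nothing when every name is unique.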

Open PowerShell
Navigate to the directory of your downloaded files
Run the script by issuing .\StorjBulkIdentityCreator.ps1. If you don’t supply any options, the script defaults to looking for the nodeinfo.csv, TEMPLATE_docker-compose.yaml and TEMPLATE_setup.commands files in the same directory as the main script. If you’ve not moved any files around, this is where they will be. The script will dump your newly created identities in $HOME\documents\StorjIdentitycreator, so that all identities are in the same place for future reference. You can override this behavior by passing the -WorkingDirectory flag with a new directory
The script will now create your identities one by one. I have not implemented any multithreading, but will work on that in the future
Copy or move your identities to their target locations

Known issues

“I get an error akin to: File cannot be loaded because the execution of scripts is disabled on this system. Please see "get-help about_signing" for more details”

Open PowerShell as Administrator
Paste in the command below:

Set-ExecutionPolicy Unrestricted -confirm:$false

Rerun the script

Alexey · January 9, 2024, 3:58am

Do you know that many new nodes will affect each other? The vetting process could take as many times longer as there are vetting nodes behind the same /24 subnet of public IPs.

Running multiple nodes will also not give you more traffic: all nodes behind the same /24 subnet of public IPs together get the same amount of traffic as a single node. This is why the vetting period is likely to be longer. To get 100 audits, a node needs auditable data, but that data is distributed between all such nodes, so each node may get fewer audits because it holds too little auditable data.
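As a rough back-of-the-envelope illustration of that scaling, here is a minimal sketch. It assumes vetting time grows linearly with the number of vetting nodes sharing the /24, which is only the approximation described above. The 30-day figure in the usage note is a made-up example rather than a measured value, and `estimate_vetting_days` is a hypothetical helper.

```shell
# estimate_vetting_days SINGLE_NODE_DAYS NODE_COUNT
# Linear approximation: N vetting nodes share the audits one node would get,
# so each of them needs roughly N times as long to reach the same audit count.
estimate_vetting_days() {
    echo $(( $1 * $2 ))
}
```

With those made-up numbers, `estimate_vetting_days 30 5` prints 150.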

So what’s the point in bulk identity generation, especially if the authorization token will be the same until you use it?

Ottetal · January 9, 2024, 6:42am

Hiya friend. In my case, the reason for creating the script is threefold

I like to plan ahead. I don’t have oodles and boodles of compute power (or luck), so I like to generate a few identities into the future. That way I can immediately create a new node when I want
I found a bunch of old drives in sizes ranging between 500 and 1000GB. I think it would be fun to create a handful of “trash nodes”… for which I need multiple identities. The goal here is not really vetting the nodes as fast as possible, or generating as much revenue as possible for that matter; the goal is just to have fun
I have not contributed to the Github project yet - this was a fun challenge that could still help the community

Roxor · January 16, 2024, 6:12pm

I like the idea of having a directory of 10 or so spare Identities sitting around. You may never need them: but if you do want to spin up a few test nodes, or start vetting on small ones, or need to replace a DQ’d one… then you don’t have to wait. Nice job!

Toyoo · January 16, 2024, 9:00pm

I find it interesting just how much code it took in PowerShell to do a simple loop. I recall doing the following on Linux (as a sort of CPU benchmark ^^):

for i in `seq 100`; do ./identity create node$i; done

s-t-o-r-j-user · January 16, 2024, 9:19pm

Yours seems to be a sequential loop.

How about such a piece?

for i in `seq 100`; do
    ./identity create node$i &
done
# Wait for all background processes to finish
wait

I am not sure, but the last time I tried creating identities, I recall that not all cores were utilized, so maybe the ampersand provides a solution?
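A middle ground between the sequential loop and backgrounding all 100 jobs at once would be to cap how many run at the same time. A minimal sketch, assuming an xargs that supports -P (GNU and BSD xargs do); `create_identities` and the `IDENTITY_BIN` variable are names made up for illustration:

```shell
# Path to the identity binary; override via the environment if it lives elsewhere.
IDENTITY_BIN="${IDENTITY_BIN:-./identity}"

# create_identities COUNT JOBS
# Runs "$IDENTITY_BIN create node<i>" for i = 1..COUNT, at most JOBS at a time.
create_identities() {
    seq "$1" | xargs -P "$2" -I{} "$IDENTITY_BIN" create "node{}"
}
```

For example, `create_identities 100 4` would create node1 through node100 with no more than four identity processes alive at once.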

Toyoo · January 16, 2024, 9:40pm

I don’t remember either (-:

s-t-o-r-j-user · January 16, 2024, 9:49pm

I just took a look at my notes. Hard to say, it’s subjective, but to be on the safe side, I would probably go for the ampersand and leave it running overnight.

Alexey · January 17, 2024, 3:30am

./identity create --help | grep conc

      --concurrency uint                              number of concurrent workers for certificate authority generation (default 4)

Ottetal · January 17, 2024, 5:03am

While it’s a funny message, it’s not entirely true is it?

The PowerShell code could be boiled down like below, with no error handling, no files, no backing up or moving the generated identities … and worst of all - commented in K&R. Not my cup of tea ↓


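# Requires PowerShell 7+ for ForEach-Object -Parallel
$CSV = Import-Csv PathToACSV
$CSV | ForEach-Object -ThrottleLimit 8 -Parallel {
    $Token = $_.TOKEN
    # $using: is needed to read a variable from the calling scope inside a parallel block
    Start-Process -NoNewWindow -Wait "$using:ProgramFolderPath\identity.exe" -ArgumentList "create storagenode"
    Start-Process -NoNewWindow -Wait "$using:ProgramFolderPath\identity.exe" -ArgumentList "authorize storagenode $Token" }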

The problem here, at least on Windows, is that all newly generated identities will end up in %AppData%, and will be overwritten by the next iteration.
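One way around the overwriting is to move each identity folder somewhere safe before the next one is created. Here is a minimal shell sketch of that idea for a Linux box. `stash_identity` is a hypothetical helper, and the source folder is passed in as an argument because where the identity binary writes its output depends on your system.

```shell
# stash_identity SRC DEST_ROOT NAME
# Moves the freshly generated identity folder SRC to DEST_ROOT/NAME and
# refuses to overwrite a folder that is already there.
stash_identity() {
    if [ -e "$2/$3" ]; then
        echo "refusing to overwrite $2/$3" >&2
        return 1
    fi
    mkdir -p "$2" && mv "$1" "$2/$3"
}
```

Calling it once after every create step, with a different NAME each time, keeps earlier identities from being clobbered.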

Short and efficient code, but not really that effective

Toyoo · January 17, 2024, 5:06am

I see. Sorry, I didn’t expect more after the thread title.

Ottetal · January 17, 2024, 5:08am

My mistake lad. It could have been more descriptive

s-t-o-r-j-user · January 19, 2024, 11:28am

So to do a kind of CPU benchmark, or in reality to speed up the computation on modern computers, it seems the snippet on Linux would be:

for i in `seq 100`; do ./identity create --concurrency <uint> node$i; done

where <uint> equals the number of physical cores on the machine.
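To fill in that number without guessing, the physical core count can be derived from lscpu. A minimal sketch, assuming a Linux machine with util-linux's lscpu; `count_physical_cores` is a made-up helper that reads the parseable output on stdin:

```shell
# Reads `lscpu -p=CORE,SOCKET` output on stdin and prints the number of
# distinct (core, socket) pairs, i.e. the number of physical cores.
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}

# Usage:
#   cores=$(lscpu -p=CORE,SOCKET | count_physical_cores)
#   for i in $(seq 100); do ./identity create --concurrency "$cores" "node$i"; done
```

Hyper-threaded siblings share a core and socket ID, so they collapse into a single line after `sort -u`.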

In addition, for setups with multiple processors, particularly processors with Hyper-Threading Technology, it might be advisable to use the Likwid (“Like I Knew What I’m Doing”) Performance Tools to pin threads to physical cores and correctly handle all available NUMA nodes.

Do you agree with it @Toyoo and @Alexey?

@Ottetal As far as I understand your code, it might be advisable to add a disclaimer that one needs multiple, different e-mail addresses to follow your code, as Storj requires a previously generated token to be used before allowing a new one to be generated (to be confirmed, as I have not checked Storj’s policy on this recently).

Roxor · January 19, 2024, 12:42pm

I haven’t read the scripts mentioned by OP, but the “identity create” step is the only one that’s CPU heavy and takes time (and no email). The “identity authorize” part is instant, based on the code you grab from www.storj.io/host-a-node after giving an email and completing a captcha.

So you can definitely fill a directory full of Identities to use later. Then just authorize them as-needed.

Ottetal · January 19, 2024, 1:07pm

Hiya friend. Huh, that is true. I’ve always done my testing using different email addresses, but you’re right, using my script will need different emails for each node. I do need to add a disclaimer

Alexey · January 20, 2024, 1:26am

This is correct.
Hyper-V is not related; every vendor invents its own method of using the CPU for multiple VMs.

s-t-o-r-j-user · January 20, 2024, 9:42am

@Roxor

So you can definitely fill a directory full of Identities to use later. Then just authorize them as-needed.

Yeah, that’s why we are talking about the CPU benchmark snippet proposed by @Toyoo, and actually slightly improving it.

@Ottetal
Always happy to make friends.

@Alexey

Hyper-V is not related; every vendor invents its own method of using the CPU for multiple VMs.

Hey, I can’t recall talking about Hyper-V on this forum. I was referring to HT (Hyper-Threading Technology). To be more explicit, let’s assume you have two Intel CPUs on your board, each with 64 physical cores. That gives you 128 physical cores, which means 256 Hyper-Threaded cores visible in your BIOS (if HT is on) and the same 256 cores visible in your OS. What I’m implying is: i) HT cores are probably beneficial for running storagenodes on your machine, as they provide additional hardware-separated threads, and ii) they make no positive difference if you want to run CPU benchmarks as suggested by @Toyoo (they may actually slow the calculations down). HT cores will very probably slow your calculations down if you a) don’t switch them off in your BIOS or b) don’t pin your software threads to physical cores, because of the way the OS handles and schedules all the cores available to it. If you do neither a) nor b), your calculations, instead of utilizing 128 physical cores (the optimal scenario), may end up utilizing, let’s say, 100 physical and 28 HT cores. Hope that provides some additional info.
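For option b), a lighter-weight alternative to Likwid on Linux would be taskset from util-linux. This is only a sketch of the idea, not something benchmarked here: `first_cpu_per_core` is a made-up helper that picks one logical CPU per physical core from lscpu's parseable output, and it does nothing about NUMA placement.

```shell
# Reads `lscpu -p=CPU,CORE,SOCKET` output on stdin and prints a comma-separated
# list holding the first logical CPU seen for every physical core.
first_cpu_per_core() {
    awk -F, '!/^#/ && !seen[$2 "," $3]++ { printf "%s%s", sep, $1; sep = "," } END { print "" }'
}

# Usage:
#   cpus=$(lscpu -p=CPU,CORE,SOCKET | first_cpu_per_core)
#   taskset -c "$cpus" ./identity create node1
```

The resulting list restricts the process to one hardware thread per core, so the scheduler cannot place two workers on HT siblings of the same core.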

Alexey · January 20, 2024, 9:46am

You are correct. I’m just stating that it’s not related to Hyper-V at all.

s-t-o-r-j-user · January 20, 2024, 9:49am

I guess almost nobody was talking Hyper-V in relation to CPU benchmarks proposed here by @Toyoo. :- )