Use this script to generate bulk StorJ Identities

Github link:

Create StorJ Identities in bulk

Are you trying to create bulk StorJ identities for a project you’re working on? I had the same issue, and have automated the task - so fear not, you’ve come to the right place: StorjBulkIdentityCreator

Quick Start

  • Download the files
  • Populate the NodeInfo.csv file
  • Run StorjBulkIdentityCreator.ps1
  • Copy the files to their final destination
  • Start your node

Video walkthrough

I’ve made a short video showing how the script works. That can be seen below ↓
… at some point. I copied this text from the Github - video is not live

Step-by-step guide

  1. Download the files from this github repository. Either use git, or download the files as a .zip archive directly from here
  2. Populate the NodeInfo.csv file. I’ve included sample text that you must replace with your own information. Below is the information you should replace in the file:
  • NodeName - This is your own choice. Call your nodes whatever you want, as long as there are no duplicates
  • Token - Copy/paste the token you get from the Signup page (Sign up and host a Node on Storj)
  • Dashboard Port - The port that each node’s dashboard will use
  • External Port - The external port that the node will use to communicate with the world
  • IP - The external IP of your node
  • Wallet - Your StorJ wallet address.
  3. Open PowerShell
  4. Navigate to the directory of your downloaded files
  5. Run the script by issuing .\StorjBulkIdentityCreator.ps1. If you don’t supply any options, the script defaults to looking for the nodeinfo.csv, TEMPLATE_docker-compose.yaml and TEMPLATE_setup.commands files in the same directory as the main script. If you’ve not moved any files around, this is where they will be. The script will dump your newly created identities in $HOME\documents\StorjIdentitycreator. This is to ensure that all identities are in the same place for future reference. You can override this behavior by issuing the -WorkingDirectory flag with a new directory
  6. The script will now create your identities one by one. I have not implemented any multithreading, but will work on that in the future :slight_smile:
  7. Copy or move your identities to their target locations
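
Since NodeName values must not contain duplicates, a quick pre-flight check on the CSV can save a failed run. The following is a minimal shell sketch, not part of the script itself. It assumes the file has a header row, that NodeName is the first column, and that no field contains a quoted comma; the function name is made up for illustration.

```shell
# Print every NodeName that occurs more than once in a CSV read from stdin.
# Assumptions: line 1 is a header, NodeName is column 1, no quoted commas.
find_duplicate_names() {
    # seen[name]++ is 0 on the first sighting and 1 on the second,
    # so each duplicated name is printed exactly once.
    awk -F, 'NR > 1 && seen[$1]++ == 1 { print $1 }'
}
```

Running `find_duplicate_names < NodeInfo.csv` should print nothing when all names are unique.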

Known issues

“I get an error akin to: File cannot be loaded because the execution of scripts is disabled on this system. Please see "get-help about_signing" for more details”

  • Open PowerShell as Administrator
  • Paste in the command below:

Set-ExecutionPolicy Unrestricted -confirm:$false

  • Rerun the script

Did you know that many new nodes will affect each other? The vetting process can take as many times longer as there are vetting nodes behind the same /24 subnet of public IPs.

Running multiple nodes also will not give you more traffic: all nodes behind the same /24 subnet of public IPs get the same amount of traffic in total as a single node. (This is why the vetting period is likely to be longer: to pass 100 audits a node needs auditable data, but that data is distributed across all such nodes, so each node may get fewer audits because it holds too little auditable data.)

So what’s the point of bulk identity generation, especially if the authorization token stays the same until you use it?
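
To see how many of your nodes would share traffic in this way, you can group their public IPs by /24 prefix. This is only a rough sketch: it assumes plain IPv4 addresses, one per line, and the function name is made up for illustration.

```shell
# Count nodes per /24 subnet from IPv4 addresses read from stdin (one per line).
# Prints "<node count> <subnet>" for every subnet seen.
count_per_subnet() {
    awk -F. 'NF == 4 { n[$1 "." $2 "." $3]++ }
             END { for (k in n) print n[k], k ".0/24" }' | sort
}
```

A count of N for a subnet means those N nodes split the traffic of one node, so each of them sees roughly 1/N of it.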


Hiya friend. In my case, the reason for creating the script is threefold

  1. I like to plan ahead. I don’t have oodles and boodles of compute power (or luck), so I like to generate a few identities into the future. That way I can immediately create a new node when I want
  2. I found a bunch of old drives in sizes ranging between 500 and 1000GB. I think it would be fun to create a handful of “trash nodes”… for which I need multiple identities. The goal here is not really vetting the nodes as fast as possible, or generating as much revenue as possible for that matter; the goal is just to have fun
  3. I have not contributed to the Github project yet - this was a fun challenge that could still help the community :slight_smile:

I like the idea of having a directory of 10 or so spare Identities sitting around. You may never need them: but if you do want to spin up a few test nodes, or start vetting on small ones, or need to replace a DQ’d one… then you don’t have to wait. Nice job!


I find it interesting just how much code it took in PowerShell to do a simple loop. I recall doing the following on Linux (as a sort of CPU benchmark ^^):

for i in `seq 100`; do ./identity create node$i; done

Yours seems to be a sequential loop.

How about such a piece?

for i in `seq 100`; do
    ./identity create node$i &
done
# Wait for all background processes to finish
wait

I am not sure, but I recall that the last time I tried creating identities not all cores were utilized, so maybe the ampersand provides a solution?

I don’t remember either (-:

I just took a look at my notes. It’s hard to say and it’s subjective, but to be on the safe side I would probably go for the ampersand and leave it running for a night.

./identity create --help | grep conc

      --concurrency uint    number of concurrent workers for certificate authority generation (default 4)
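
Given that each `identity create` already runs several workers by default, backgrounding 100 of them at once with the ampersand would probably oversubscribe most machines. A middle ground is to cap the number of simultaneous jobs. Here is a minimal sketch using `xargs -P` (a GNU/BSD extension); the function name and the IDENTITY_BIN variable are hypothetical and only there so you can do a dry run.

```shell
# Create identities node1..nodeN with at most a fixed number running at once.
# $1 = how many identities, $2 = how many to run in parallel.
# IDENTITY_BIN is a hypothetical override, handy for a dry run with `echo`.
create_identities() {
    seq 1 "$1" | xargs -P "$2" -I{} "${IDENTITY_BIN:-./identity}" create "node{}"
}
```

For example, `IDENTITY_BIN=echo create_identities 5 2` only prints the five commands it would run, while `create_identities 100 2` would run the real binary two at a time.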

While it’s a funny message, it’s not entirely true is it?

The PowerShell code could be boiled down like below, with no error handling, no files, no backing up or moving the generated identities … and worst of all - commented in K&R. Not my cup of tea ↓

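# Requires PowerShell 7+ (ForEach-Object -Parallel)
$CSV = Import-Csv PathToACSV
$CSV | ForEach-Object -Parallel {
    $Token = $_.TOKEN
    # $using: is needed to read outer-scope variables inside a parallel script block
    Start-Process -NoNewWindow -Wait "$($using:ProgramFolderPath)\identity.exe" -ArgumentList "create storagenode"
    Start-Process -NoNewWindow -Wait "$($using:ProgramFolderPath)\identity.exe" -ArgumentList "authorize storagenode $Token"
} -ThrottleLimit 8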

The problem here, at least on Windows, is that all newly generated identities will end up in %AppData%, and will be overwritten by the next iteration.
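
The usual way around the overwrite problem is to move each identity to its own directory right after it is generated. Below is a rough shell analogue of that step, to show the idea rather than what the PowerShell script actually does. The function name and all paths are hypothetical.

```shell
# Move a freshly generated identity directory to a per-node destination.
# $1 = source directory, $2 = destination root, $3 = node name.
# Refuses to overwrite an existing destination, so earlier identities stay safe.
stash_identity() {
    [ -d "$1" ] || { echo "missing source: $1" >&2; return 1; }
    [ ! -e "$2/$3" ] || { echo "already exists: $2/$3" >&2; return 1; }
    mkdir -p "$2" && mv "$1" "$2/$3"
}
```

Calling it once per loop iteration, for example `stash_identity "$generated_dir" "$HOME/identities" node1`, keeps every identity in its own folder.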

Short and efficient code, but not really that effective

Look ma!

I’ve created a video explaining the script. It’s not really good, it takes too long to get to the point, and I look terrible in that thumbnail. Check it out

I see. Sorry, I didn’t expect more after the thread title.

My mistake lad. It could have been more descriptive :slight_smile:


So, as a kind of CPU benchmark (or, really, to speed up the computation on modern computers), it seems the snippet on Linux would be:

for i in `seq 100`; do ./identity create --concurrency <uint> node$i; done

where <uint> equals the number of physical cores on the machine.
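
If you are not sure how many physical cores a Linux machine has, you can derive the number from `lscpu`. The helper below is a small sketch (the function name is made up): it expects the parseable output of `lscpu -p=Core,Socket` on stdin, where hyper-threads of the same core repeat the same core,socket pair.

```shell
# Count physical cores from `lscpu -p=Core,Socket` style output on stdin.
# Comment lines start with '#'; every other line is "core,socket".
count_physical_cores() {
    grep -v '^#' | sort -u | wc -l | tr -d ' '
}
```

Usage would look like `lscpu -p=Core,Socket | count_physical_cores`, with the result passed to --concurrency.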

In addition, for setups with multiple processors, particularly processors with Hyper-Threading Technology, it might be advisable to use the Likwid (“Like I Knew What I’m Doing”) Performance Tools to pin threads to physical cores and correctly handle all available NUMA nodes.

Do you agree with it @Toyoo and @Alexey? :slight_smile:

@Ottetal As far as I understand your code, it might be advisable to add a disclaimer that one needs multiple, different e-mail addresses to follow your code, as Storj requires a previously generated token to be used before it allows generating a new one (to be confirmed, as I have not checked Storj’s policy on this recently). :slight_smile:

I haven’t read the scripts mentioned by OP, but the “identity create” step is the only one that’s CPU heavy and takes time (and no email). The “identity authorize” part is instant, based on the code you grab from after giving an email and completing a captcha.

So you can definitely fill a directory full of identities to use later. Then just authorize them as needed.
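
If you keep a directory of spare identities, it helps to be able to tell which ones have already been authorized. The sketch below relies on an assumption taken from Storj’s node setup guide as I remember it: a signed identity has 2 certificates in ca.cert and 3 in identity.cert. Please verify that against the current documentation; the function name is made up.

```shell
# Return success if the identity in directory $1 looks authorized.
# Assumption: an authorized identity has 2 "BEGIN" blocks in ca.cert
# and 3 in identity.cert (fewer means it was created but not yet authorized).
is_authorized() {
    [ -f "$1/ca.cert" ] && [ -f "$1/identity.cert" ] || return 1
    [ "$(grep -c BEGIN "$1/ca.cert")" -eq 2 ] &&
        [ "$(grep -c BEGIN "$1/identity.cert")" -eq 3 ]
}
```

For example, `is_authorized "$dir" || echo "$dir still needs a token"` inside a loop over your identity folders.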

Hiya friend. Huh, that is true. I’ve always done my testing using different email addresses, but you’re right, using my script will need different emails for each node. I do need to add a disclaimer

This is correct.
Hyper-V is not related; every vendor invents its own method of using the CPU for multiple VMs.


So you can definitely fill a directory full of identities to use later. Then just authorize them as needed.

Yeah, that’s why we are talking about the CPU benchmark snippet proposed by @Toyoo, and actually slightly improving it.

Always happy to make friends.


Hyper-V is not related; every vendor invents its own method of using the CPU for multiple VMs.

Hey, I can’t recall talking about Hyper-V on this forum. I was referring to HT (Hyper-Threading Technology). To be more explicit, let’s assume you have two Intel CPUs on your board, each with 64 physical cores. That gives you 128 physical cores, which means 256 hyper-threaded (logical) cores visible in your BIOS (if HT is on) and the same 256 visible in your OS. What I’m implying is: i) HT cores are probably beneficial for running storagenodes on your machine, as they provide additional hardware-separated threads, and ii) they do not make any positive difference if you want to run CPU benchmarks [ :slight_smile: ] as suggested by @Toyoo (actually, they may slow down the calculations). The HT cores will quite probably slow down your calculations unless you either a) switch them off in your BIOS or b) pin your software threads to physical cores, because of the way the OS handles and schedules all the cores available to it. If you do neither, your calculations may end up using, let’s say, 100 physical and 28 HT cores instead of 128 physical cores (the optimal scenario). Hope this additional info helps.
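
Option b) can be approximated without Likwid. The sketch below swaps in `taskset` from util-linux, which is a different tool from the one I mentioned, and builds a CPU list containing one logical CPU per physical core. It expects `lscpu -p=CPU,Core,Socket` output on stdin; the function name is made up for illustration.

```shell
# Build a comma-separated list with one logical CPU per physical core,
# from `lscpu -p=CPU,Core,Socket` style output on stdin.
one_cpu_per_core() {
    # Keep only the first logical CPU seen for each (core, socket) pair.
    awk -F, '!/^#/ && !seen[$2 "," $3]++ { list = list (list == "" ? "" : ",") $1 }
             END { print list }'
}
```

A pinned run could then look like `taskset -c "$(lscpu -p=CPU,Core,Socket | one_cpu_per_core)" ./identity create node1`. This does not handle NUMA placement the way Likwid can.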


You are correct. I just state it’s not related to Hyper-V at all.

I guess almost nobody was talking Hyper-V in relation to CPU benchmarks proposed here by @Toyoo. :- )
