Help with Running Multiple Storj Nodes on Docker – Identity Certificate Issues

Hi everyone, I’ve been trying to run multiple Storj nodes on a single system using Docker. Here’s my setup:

System specs:
Intel Core i5-12600K (10 cores / 16 threads), 64 GB DDR4 RAM, ASUS ROG Strix Z690-I Gaming WiFi motherboard, ZFS pool floki with 5 × 10TB HDDs (RAID-Z1), SLOG, and L2ARC.

Nodes: storj1, storj2, storj3

Error messages:
When I try to start the nodes, they fail with errors saying “Failed to load identity: file or directory not found: open identity/identity.cert.”

What I’ve tried so far:

  • Configured each node with unique ports and identity directories.
  • Ran identity authorize for each node, but still seeing the same error.
  • Checked file permissions and paths, verified the identity.cert file is present in the directories.
  • Restarted Docker and the containers multiple times.
  • Tried running Storj on my TrueNAS box again, but it crashes with the same error message.

Additional Info:

  • I’m running this setup in Docker on my TrueNAS box and also on a separate Proxmox host, but I keep encountering issues with the identity loading.
  • Logs show multiple “waiting for Storj to be set up…” and “Failed to load identity” messages.

Could anyone point me in the right direction? I’m not sure if this is an issue with my Docker setup, file paths, or something else. Any help is appreciated!

You need to set up each node from scratch:

  • Get a new token
  • Create a new identity (rough sketch below)
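
Per node, the flow looks roughly like this (a sketch using the standard identity CLI; the binary name, identity name, and token placeholder below are examples, not your actual values):

# generate a brand-new identity for the node (CPU-heavy, can take a while)
identity create storagenode2

# bind it to a fresh auth token from the dashboard (email:token)
identity authorize storagenode2 your@email.com:your-auth-token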

Bad idea.

Show your docker command

Hi there,

Thanks for the feedback. I appreciate the reminder about setting up each node from scratch, and I did indeed create new tokens and identities for each node.

For context:

  • I’ve created fresh identities for each of my three nodes.
  • I used the identity create storagenode command with the recommended concurrency, and I’ve made sure the identity directories are in place for each node.
  • I set up a separate volume for each node’s identity and storage.

Regarding the SLOG (970 Pro NVMe), I’ve read that its write performance can be quite impactful. In my case, I’ve got the NVMe dedicated as SLOG to improve write throughput.

The error message Failed to load identity: file or directory not found: open identity/identity.cert persists even after verifying that the identity.cert file is present in the identity directory.

I’ll include my Docker run commands below to ensure everything is correctly configured:


services:
  storagenode1:
    container_name: storagenode1
    environment:
      - WALLET=your_wallet_address_here
      - EMAIL=your_email_here
      - ADDRESS=*.ddns.net:28967
      - BANDWIDTH=1TB
      - STORAGE=5TB
    volumes:
      - /mnt/floki/storj/storj1/identity/storagenode:/app/identity
      - /mnt/floki/storj/storj1/storage:/storage
      - /mnt/floki/storj/storj1/config:/etc/storj
    ports:
      - 14002:14002
      - 28967:28967
    image: storjlabs/storagenode:latest
    networks: []

  storagenode2:
    container_name: storagenode2
    environment:
      - WALLET=your_wallet_address_here
      - EMAIL=your_email_here
      - ADDRESS=*.ddns.net:28968
      - BANDWIDTH=1TB
      - STORAGE=5TB
    volumes:
      - /mnt/floki/storj/storj2/identity/storagenode:/app/identity
      - /mnt/floki/storj/storj2/storage:/storage
      - /mnt/floki/storj/storj2/config:/etc/storj
    ports:
      - 14003:14002
      - 28968:28967
    image: storjlabs/storagenode:latest
    networks: []

  storagenode3:
    container_name: storagenode3
    environment:
      - WALLET=your_wallet_address_here
      - EMAIL=your_email_here
      - ADDRESS=*.ddns.net:28969
      - BANDWIDTH=1TB
      - STORAGE=5TB
    volumes:
      - /mnt/floki/storj/storj3/identity/storagenode:/app/identity
      - /mnt/floki/storj/storj3/storage:/storage
      - /mnt/floki/storj/storj3/config:/etc/storj
    ports:
      - 14004:14002
      - 28969:28967
    image: storjlabs/storagenode:latest
    networks: []

networks: {}

Would you suggest re-creating the identities or verifying specific paths in the setup? I’ve double-checked the permissions, and the files seem to be in place. Appreciate your insights!

Was this done from within the container?

BTW:
Identity authorization can still be done but it is not required anymore.

Where did /etc/storj come from?

SLOG helps with sync writes. Storj does not do sync writes. Pretty much no other software does either, and the few exceptions can have sync disabled as long as you have a UPS. Furthermore, the whole point of a SLOG is that it must survive power loss, and yet your SSD does not have PLP. If you need a SLOG, use a small Optane device (even 16 GB is overkill). And if you wanted write performance, you would not have chosen a single raidz. It’s also helpful to have a special vdev in addition to L2ARC: the latter eventually accelerates repeat metadata and data access, while the former accelerates metadata from the very first access. For storj this makes L2ARC pretty much irrelevant, but I can see needing it for other uses.
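
If you want to sanity-check this on your own pool, a rough sketch (pool name from your setup; the NVMe device paths are placeholders):

# show the pool's sync setting (standard / always / disabled); storj's writes are async regardless
zfs get sync floki

# add a mirrored special vdev for metadata (placeholder device names; a lost special vdev loses
# the pool, and only metadata written after adding it lands there)
zpool add floki special mirror /dev/disk/by-id/nvme-AAA /dev/disk/by-id/nvme-BBB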

asus rant

ROG STRIX Z690-I GAMING WIFI

Intel® Z690 LGA 1700 ITX motherboard with PCIe® 5.0, 10+1 power stages, Two-Way AI Noise Cancelation, AI Overclocking, AI Cooling, AI Networking, WiFi 6E (802.11ax), Intel® 2.5 Gb Ethernet, two M.2 slots with heatsinks and backplates, two Thunderbolt™ 4 USB Type-C®, SATA and Aura Sync RGB lighting

At one point asus was the motherboard vendor to use, but it has progressively, and at an accelerating rate, turned into a bullshit-churning marketing machine.

Why the fuck is AI in every sentence?! A PI controller is now AI? What the fuck is AI Networking?

It’s just vomit inducing bullshit.

I tell you what: sell this crap to some unsuspecting “gamer” on eBay, let them deal with Armoury Crate and other security holes, and buy yourself a Supermicro or ASRock Rack board at a fraction of the cost, with none of the bullshit and significantly better reliability.

I’m so tired of this shit, and it’s a pity asus was overrun by morons.

it must be

      - /mnt/floki/storj/storj1/:/app/config

unless you also changed the storage location, either in config.yaml or via arguments (neither of which is present in your current docker-compose.yaml).

Please show the content of the identity directory:

ls -l /mnt/floki/storj/storj1/identity/storagenode

Is that against the rules?
You have a single RAIDZ1 pool, which counts as a single disk for Storj, so you should only be running a single node.

Or have I got that wrong?

The OP's setup isn't broken because of any terms-of-service issue. It's because they should have left the identity/config paths inside the container alone (that's what the app expects) and only changed the referenced paths outside (basically what Alexey is saying).
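
The quickest way to confirm what the app actually sees is to check from inside the container (container name taken from the compose file above):

docker exec -it storagenode1 ls -la /app/identity /app/config
# identity.cert and config.yaml should appear under those exact paths,
# no matter where the data lives on the host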

Why would several disks count as one? And besides, it's hard to take the ToS seriously in its current state, with errors, repetitions, and outdated data. If storj does not care enough to keep it clean, why should we?

Because it's one “disk” to lose.

  1. If the OP loses the pool, then he loses multiple nodes.
  2. From an IO PoV, multiple nodes on a Z1 (or a Z2/Z3 for that matter) isn’t ideal. RAIDZ is not meant for high IO loads. Note that I admit I have no idea (haven’t looked) how many nodes are sensible for a RAIDZ array of HDDs.

The OP clearly doesn’t understand ZFS, given his “read that its write performance can be quite impactful”. And whilst he is technically correct, it only (as was pointed out) applies to sync writes, not the more usual async writes.

  • If OP loses a machine, they lose all nodes
  • If OP has a house fire, they lose all nodes
  • If OP's city is swallowed by a sinkhole, then more than one node will be affected.

Either way storj will be fine; it is part of risk management. On the other hand, having 4 nodes on a single array is more reliable than having 4 nodes on 4 individual disks. And since one should not be bringing new hardware online just for storj, and nobody has any use for single hard drives, people use arrays. That’s what modern drives are designed to be used in, not solitarily.

Correct. However, storagenode does not produce a high IO load. Not even close. It does produce a fair amount of peak metadata IO, but metadata should be on an SSD, so this is not relevant. The remaining IO is packaged into transaction groups and written at once. I have 6 nodes on the same array with a few raidz1 vdevs, and I don’t notice their presence. Even when all six decide to run filewalkers at once, I see 15-20k IOPS on my SSD for half an hour, but that’s the only symptom visible if I look.
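
If anyone wants to see where that load actually lands, watching per-vdev stats during a filewalker run is a simple enough check (pool name from the OP's setup):

# per-vdev IOPS and bandwidth every 5 seconds; the metadata SSDs take the filewalker hit,
# while the raidz data vdevs barely notice
zpool iostat -v floki 5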

Not everyone is born all-knowing. The first step toward becoming an expert in something is to suck at it, not know much, and make a lot of mistakes. Kudos to OP.

A bit off-topic, but I understand why Storj says you don’t need resiliency if erasure coding provides it at the service level, and why they want to show as much available space as possible. And I can understand it for a SNO too: they may not want to lose any possible extra paid space by using some drives for mirroring/parity.

However…

…since so many SNOs never fill their HDD: even over years… it’s more likely a failed drive loses those years of progress. Mirror/parity drives are unlikely to ever cap your earnings… it’s more likely they’ll protect you from having to start over.

I would love to have the problem of a RAID array completely filling with Storj data… and I felt my earnings were being constrained, and that I “wasted” some possible-paid-space on resiliency :money_mouth_face:

Re: AR’s Optane suggestion (a much better option for your use case).

Just a note: personally I’ve had the worst experience with Samsung Pro & Plus drives in the past: 4 dead out of 5, none of which were used under excessive load.
I haven’t even bothered with warranty returns, that’s how little I think of them.

2 cents,
Julio

Of over a dozen brands of currently running SSDs, across 15+ years, I’ve only had two WD SATA, two OCZ SATA, and two Silicon Power M.2s fail. Nine Samsungs (SATA/M.2/U.2) have been running like tops. My oldest are three Intel X25-M’s from 2009 running as OS boot drives: they’ll probably outlive me :winking_face_with_tongue:

I think sometimes you just get unlucky.

Re: Support: I’m around 6 weeks into a WD SATA SSD RMA. When I bought the drive WD was still doing consumer SSDs… but now they’ve handed everything consumer over to Sandisk… who have support systems that don’t even accept WD serial numbers. Yeah. But the ticket is still slowly moving…

Well I’ll certainly give you that, I don’t recall a Samsung SATA3 SSD failing yet, even after tossing many out the door to resurrect/regurgitate old laptops, etc., for people. Nobody’s ever complained.
Yet sometimes it behooves one to check for past firmware bugs on M.2s, and ensure you’re as up to date as can be, before it’s too late.
P.S. Pretty sure Silicon Power M.2s are rebranded Xiansing whatever stuff. (Which incidentally have had production runs totally lacking unique IDs, making it only possible to run one in any given system - lol)

2 more cents,
Julio

Thanks everyone for the detailed responses — especially @Alexey, @arrogantrabbit, @Roxor, and @aardvarkl for the guidance and insights.

:puzzle_piece: Issue Summary

The core problem was related to identity path expectations inside the container.
The container expects the identity files at /app/identity/identity.cert; on the host, the identity tool generates them inside a folder named storagenode. At some point I had flattened that identity directory (moving the contents out of storagenode) without updating my volume mapping, and that broke everything.

:white_check_mark: What Fixed It

Restoring the correct folder hierarchy and updating my volume mappings like so:

      - /mnt/floki/storj/storj1/config:/app/config
      - /mnt/floki/storj/storj1/identity:/app/identity

This allowed the container to locate identity.cert in the expected location. Once updated, the node booted cleanly. I’ve since applied the same fix across all three of my nodes.

:memo: Other Notes

  • All permission checks and file operations were done from inside the container to ensure UID alignment and hidden file visibility.
  • The problem wasn’t related to permissions or ownership — it was entirely path-based.
  • All three nodes are running on a ZFS RAIDZ1 array with 64 GB of RAM and an L2ARC device, which significantly improves read performance.
  • We had a side discussion about SLOG endurance and NVMe wear — interesting topic, but in this case not relevant. My 970 Pro is still in great shape.
  • As @Alexey noted, identity authorization is no longer required for new node setups — good info for future deployments.

Thanks again to the community — this was a great learning experience. Everything is working smoothly now.

root@docker:~# ls -l /mnt/floki/storj/storj1/identity/storagenode
total 27
-rw------- 1 1000 1000 558 Jul 25 01:29 ca.1753406965.cert
-rw------- 1 1000 1000 1088 Jul 25 01:29 ca.cert
-rw------- 1 1000 1000 241 Jul 25 01:29 ca.key
-rw------- 1 1000 1000 1096 Jul 25 01:29 identity.1753406965.cert
-rw------- 1 1000 1000 1626 Jul 25 01:29 identity.cert
-rw------- 1 1000 1000 241 Jul 25 01:29 identity.key
root@docker:~#

:white_check_mark: Final Update — All Nodes Online & Healthy

Thanks again to @Alexey, @arrogantrabbit, @Roxor, @aardvarkl, and others for the support. Everything is up and stable now.


Issue Summary:

My Storj node container was failing to start due to missing expected internal paths, triggering errors like:

Error opening database on storagenode: group:
--- stat config/storage/blobs: no such file or directory

It wasn’t a permissions issue — it was purely a volume mapping problem. I had mistakenly flattened the identity structure and misaligned the config path.


What Fixed It:

Once I corrected the volume mounts to explicitly point to the correct directories, the container came up cleanly:

volumes:
  - /mnt/floki/storj/storj1/identity:/app/identity
  - /mnt/floki/storj/storj1/config:/app/config

This ensured that:

  • /app/identity/identity.cert existed inside the container
  • /app/config/storage pointed to the actual database and data folders

After applying this fix across all nodes, everything started correctly.


Key Lessons:

  • Identity paths matter: The Storj container expects /app/identity/identity.cert, so structure accordingly.
  • Container pathing is absolute: Host-side assumptions mean nothing if the container’s internals don’t line up.
  • Inspect from inside the container: Bash in and check the actual paths. Permissions, files, and layout — validate them from the container’s POV.
  • Standardize storage layout: I now use the layout below (with a small setup sketch after this list):
storj1/
  ├── config/
  │   └── storage/
  └── identity/
      └── identity.cert, etc.
  • ZFS tuned setup: These nodes run on a 5x10TB RAIDZ1 array with 64GB RAM, L2ARC, and a lightly used 970 Pro SLOG. The setup is well-suited to read-heavy workloads like Storj.
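
For reference, a rough sketch of how I now lay out a new node on the host (the identity source path is the identity tool's default on Linux, and storj4/storagenode4 are hypothetical names; adjust to your own paths):

NODE=/mnt/floki/storj/storj4
mkdir -p "$NODE/config" "$NODE/identity"
# copy the freshly generated identity into the node's own identity dir
cp -a "$HOME/.local/share/storj/identity/storagenode4/." "$NODE/identity/"
# then run the container once with -e SETUP="true" to generate config.yaml and the storage layout under config/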

QUIC Status & Recovery:

QUIC is now green (“OK”) across all nodes after the container structure was corrected. One node had startup issues due to an improperly mapped config/storage path; once fixed, QUIC recovered without further action.


Current Status:

All nodes are online, accepting ingress, and entering vetting. A few screenshots below show them ramping up. Recovery will take some time, but everything’s functioning correctly.


Final Thought:

This was a classic case of Docker volume mappings silently breaking container logic. Storj is rock-solid once configured properly, but the startup logic is extremely sensitive to the directory layout. Hope this helps someone avoid the same multi-hour detour.

Thanks again to everyone who contributed.

3 nodes, but all running on the same array, and presumably behind the same IP address?

  1. It won’t do you any good from a “getting more data” perspective. Storj will divide your inbound data across all nodes on the same IP.

  2. It’s against Storj rules to run multiple nodes on the same “disk” (or array).

  3. If the array poops out then you lose all three nodes.

Better to run three separate nodes on three separate disks, or just run a single node. Less hassle.

As mentioned, the SLOG doesn’t help with storj at all; it’s all async writes.

L2ARC for metadata is indeed very useful for storj.
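
(For anyone who wants to try that, restricting L2ARC to metadata is a one-liner; pool name taken from the OP's setup:)

# keep only metadata in L2ARC so bulk piece data doesn't evict it
zfs set secondarycache=metadata floki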

A special vdev is the MOST useful, but it adds an extra point of failure, so often the special vdev itself should be mirrored SSDs (at least if you’re using it for your “real” data as well, not just storj).

Note this is all with the traditional storj “piece store”, which uses millions of files. Storj is working on a hash store, which uses far fewer files and would make the per-file overhead less significant. Eventually.

#2 nobody cares about - really :squinting_face_with_tongue:

#3 he’d have to lose two disks - and if he really is just using Storj to fill spare space: you want RAIDZ/Z2 for your day-to-day data these days anyways. Space is so cheap!

+1 to metadata acceleration for Storj: it’s really the only ZFS feature that makes a big impact.