Problems with Synology secondary nodes

Hi All, I have a DS1621+ and a DS918+ and am trying to add secondary nodes on them. After adding them, within 24-48 hours they are running at 100% utilization and Docker becomes non-responsive. Both nodes have been running flawlessly for over 2 years, but with all the pay cutbacks and such I thought I’d add more space to them.

I found this article [My docker run commands for multinodes on Synology NAS], but when trying the configurations I get “command not found” on all the options below the “--name” line. Below is the config I’m trying.

Node 1

sudo docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967/tcp \
    -p 28967:28967/udp \
    -p 14002:14002 \
    -e WALLET="xx27451401df" \
    -e EMAIL="xxeak@protonmail.com" \
    -e ADDRESS="xxs.net:28967" \
    -e STORAGE="16.5TB" \
    --mount type=bind,source="/volume2/Storj/storagenode/identity",destination=/app/identity \
    --mount type=bind,source="/volume2/Storj/storagenode/",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest
    --server.address=":28967" \
    --log-opt max-size=10m \
    --log-opt max-file=3 \
    --log.level=error \
    --filestore.write-buffer-size 4MiB \
    --pieces.write-prealloc-size 4MiB \

Node 2

sudo docker run -d --restart unless-stopped --stop-timeout 300 \
   -p 28969:28967/tcp \
   -p 28969:28967/udp \
   -p 14004:14002 \
   -e WALLET="xx27451401df" \
   -e EMAIL="xxeak@protonmail.com" \
   -e ADDRESS="xxs.net:28969" \
   -e STORAGE="3TB" \
   --mount type=bind,source="/volume3/SJ2/identity/storagenode",destination=/app/identity \
   --mount type=bind,source="/volume3/SJ2/storagenode",destination=/app/config \
   --name storagenode2 storjlabs/storagenode:latest
   --server.address=":28969" \
   --log-opt max-size=10m \
   --log-opt max-file=3 \
   --log.level=error \
   --filestore.write-buffer-size 4MiB \
   --pieces.write-prealloc-size 4MiB \

Any suggestions/comments would be greatly appreciated. I’m running the latest DSMs on both NASs with EXT4 filesystem.

When you post something like code (preformatted text), put it between the special code markers; there is a button for it that looks like <>.
Please post your docker command for the first node.
Did you take all the steps for starting a node? New token, new identity, etc.?
Did you run the install step for the second node too, but with a different container name?
You can skip those last 2 lines. I believe one is deprecated.
The last line should not end in \ or /.
Use unique ports for each node, not the default ones, like I posted.
If you want tcp_fastopen to work, use network host mode. I will come back tomorrow with my updated commands.

I updated the code above.

Yes, I ran all the installation steps. I have 13 Ubuntu boxes running many nodes (34 nodes total now) with no problems. Some are very old machines (15 years on 3 :slight_smile: )

Anyway, it’s just DSM giving me the problems. I haven’t attempted to get fastopen working yet.

Thanks in advance for your help.

Do you see the difference?

--mount type=bind,source="/volume2/Storj/storagenode/identity",destination=/app/identity \
--mount type=bind,source="/volume3/SJ2/identity/storagenode",destination=/app/identity \

Yes, in this NAS there are actually 4 different volumes. After looking at this for days, I finally noticed after posting everything here that I didn’t have a \ after “storagenode:latest” in either file, so putting it in fixed some of my problem… LOL… I can’t tell you how many times I looked at the files. I’m still anxious to see snorkel’s latest config. I still have to take out the following lines to start the nodes. I just don’t know if the buffer settings will stop the node from loading down and Docker turning non-responsive. I should know in the next couple of days.

   --server.address=":28967" \
   --log-opt max-size=10m \
   --log-opt max-file=3 \
   --log.level=error \
   --server.address=":28969" \
   --log-opt max-size=10m \
   --log-opt max-file=3 \
   --log.level=error \

You missed a backslash (\) at the end of the line with the image name (storjlabs/storagenode:latest), so everything below it is executed as separate commands in your shell (which of course fails).
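
A broken line continuation is easy to miss by eye. As a minimal sketch (check_continuations and the file name run-node1.sh are hypothetical helpers for illustration, not part of Docker or Storj), a saved run script could be scanned before it is executed:

```shell
# check_continuations FILE: report option lines whose continuation looks broken.
check_continuations() {
    awk '
        # A backslash followed by trailing spaces or tabs does not continue the line.
        /\\[ \t]+$/ { printf "%d: whitespace after backslash\n", NR }
        # An option line (starting with "-") right after a line that has no
        # trailing backslash would be run by the shell as a separate command.
        NR > 1 && $0 ~ /^[ \t]*-/ && prev !~ /\\[ \t]*$/ {
            printf "%d: missing backslash on previous line\n", NR
        }
        { prev = $0 }
    ' "$1"
}

# Usage (hypothetical file name):
#   check_continuations run-node1.sh
```

It only looks at lines that begin with a dash, so treat it as a rough check rather than a shell parser.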

Yes, I caught that as I mentioned in the post before yours. One of those things you look at for days and still miss it. Hoping the last two lines in the config help fix the Docker issue. I’ll know in 24-48 hours.

Thank you as always for looking!

You are right, I overlooked your last post. The forum sometimes doesn’t load something (my current WiFi is to blame…).

Out of curiosity, what file system and RAID type did you choose?
You should read all my posts in that thread, because I didn’t take the time to update the first one properly, but all the info is there, including settings that work with Synology. It’s a little different than Ubuntu or other Linux distros. Believe me, I tested many, many things, and with the help of forum members, especially BrightSilence, who also runs Synology and is one of the greatest here, I found how things work better, and I constantly updated that thread.
Also check my Synology Memory Guide. You should put as much RAM as you can in those machines. And don’t follow the official recommendations.
If you don’t use the json format for logs, those size params won’t work. But if you use json, you won’t see the logs in DSM anymore. You have to use the CLI.
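
Related to the log size parameters: --log-driver and --log-opt are options of docker run itself, so they have to appear before the image name. Anything written after storjlabs/storagenode:latest is passed to the storagenode process instead. A rough sketch of a check for that ordering, assuming the image name starts with storjlabs/storagenode: (check_option_order is a hypothetical helper and its option list is not exhaustive):

```shell
# check_option_order ARGS...: takes the arguments you would give to `docker run`
# and reports docker-level options that appear after the image name.
check_option_order() {
    seen_image=0
    status=0
    for arg in "$@"; do
        if [ "$seen_image" -eq 0 ]; then
            # Assumption: the image is the first argument matching this pattern.
            case $arg in
                storjlabs/storagenode:*) seen_image=1 ;;
            esac
            continue
        fi
        case $arg in
            --log-opt|--log-opt=*|--log-driver|--log-driver=*|--mount|--mount=*|--network|--network=*|--restart|--restart=*|--name|--name=*|-e|-p|-d)
                echo "docker option after image name: $arg"
                status=1
                ;;
        esac
    done
    return "$status"
}

# Usage:
#   check_option_order -d --name storagenode storjlabs/storagenode:latest \
#       --log-opt max-size=10m --log.level=error
```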

I was talking about the identity folder. Once you used storagenode/identity and once identity/storagenode. Is that correct?

Thank you for telling me about the different logging. I’ll just leave it alone for now because I’ve never had problems with it until this point.

On the 1621 I chose EXT4 and RAID 10 for my 16.5TB node, and EXT4 JBOD for the new node. The DS918 nodes are both JBOD and EXT4. I’ve had only 4GB RAM in these NAS boxes because it has always seemed to work fine. Even now, both the DS1621 and DS918 are only using 1.5GB running the 2 nodes. However, if Docker keeps messing up I’ll see if adding RAM helps.

It’s not Docker itself, it’s a local implementation of it. In this case, Synology’s.
However, the general principle is still valid: you need to run one node per HDD/disk pool.
So, if you use one pool for all nodes, you need to stop here. Each node should have its own disk/pool. Now you can understand why we do not promote RAID setups…
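
As a first check of the one-node-per-disk rule, the filesystems behind two node directories can be compared. This is only a sketch under assumptions: device_of and share_device are hypothetical helper names, and the check compares filesystems as reported by df, so two volumes carved from the same storage pool would still look separate.

```shell
# device_of PATH: print the filesystem (first column of `df -P`) that holds PATH.
device_of() {
    df -P "$1" | awk 'NR == 2 { print $1 }'
}

# share_device PATH_A PATH_B: succeed when both paths are on the same filesystem.
share_device() {
    [ "$(device_of "$1")" = "$(device_of "$2")" ]
}

# Usage, with the data paths from the first post:
#   if share_device /volume2/Storj/storagenode /volume3/SJ2/storagenode; then
#       echo "warning: both nodes are on the same volume"
#   fi
```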

The directories are set up correctly.

For Synology, the best option is ext4 with the Basic RAID type and no access time recording.
JBOD from Linux is called Basic in DSM.
Also run only one node per disk, and keep its identity on that disk.
With the latest improvements in the node software, I think you don’t need to move the databases. It’s up to you.
Add RAM. No need to wait, observe, or question whether it’s worth it… Add more RAM! I’ve done a lot of testing so that you don’t have to. Linux uses all of it for buffers and cache, which seems to be more and more important.
No need to use the OEM one. Use Kingston, Crucial, Samsung, or whatever. Samsung seems to be the most compatible.

Gotcha… Back in the day when I set up the node, I had always read that RAID 10 was a good thing for performance. I had plenty of large drives to use, so I started this array with four 12TB drives. Since the mass deletion of data started, that 24TB setup went down to 9TB of data used, so I am in the process of migrating over to a single drive with an EXT4 FS. Filewalker has yet to complete a full scan, and my dashboard still shows the node using 21.5TB of data. However, the OS shows only 9TB of data on the array. I have it set to 16TB right now so the node won’t try to put any more data there. It is only sending data out at this point in order to help filewalker finish, but Docker keeps going non-responsive after a couple of days…

More RAM is the key. :shushing_face:

This is not needed unless you run several nodes on the same pool… If they are younger than 6 months, just nuke them all.

Due to the changed payout schedule these days, the four 12TB drives would be much better used individually: 48TB worth of nodes instead of 24. This particular node is just shy of 5 years old… :slight_smile:

Docker run commands for 2 nodes, 2 drives (Exos), on the same Synology DS220+, DSM 7.x, 18GB RAM, using network host mode, and databases moved to USB flash drive:

Startup scripts in task scheduler (triggered, root, boot-up):

sysctl -w net.core.rmem_max=2500000
sysctl -w net.core.wmem_max=2500000
sysctl -w net.ipv4.tcp_fastopen=3
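
The value 3 for net.ipv4.tcp_fastopen is a bitmask: bit 0x1 enables the client side and bit 0x2 the server side. A small sketch that decodes the two low bits (describe_tfo is a hypothetical helper; higher bits of the setting are ignored here):

```shell
# describe_tfo VALUE: explain the client/server bits of net.ipv4.tcp_fastopen.
describe_tfo() {
    value=$1
    client=$((value & 1))
    server=$((value & 2))
    if [ "$client" -ne 0 ] && [ "$server" -ne 0 ]; then
        echo "client and server"
    elif [ "$client" -ne 0 ]; then
        echo "client only"
    elif [ "$server" -ne 0 ]; then
        echo "server only"
    else
        echo "disabled"
    fi
}

# Usage on the NAS, to confirm the startup script took effect:
#   describe_tfo "$(cat /proc/sys/net/ipv4/tcp_fastopen)"
```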

Pre-Setup - only one time:

sudo su

echo "net.core.rmem_max=2500000" >> /etc/sysctl.conf
sysctl -w net.core.rmem_max=2500000
echo "net.core.wmem_max=2500000" >> /etc/sysctl.conf
sysctl -w net.core.wmem_max=2500000
echo "net.ipv4.tcp_fastopen=3" >> /etc/sysctl.conf
sysctl -w net.ipv4.tcp_fastopen=3
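
Because >> always appends, running the pre-setup twice leaves duplicate lines in /etc/sysctl.conf. A minimal sketch of an append that skips keys already present (ensure_sysctl_line is a hypothetical helper; it does not change an existing value, it only avoids adding a second line):

```shell
# ensure_sysctl_line KEY VALUE [FILE]: append KEY=VALUE unless KEY is already set in FILE.
ensure_sysctl_line() {
    key=$1
    value=$2
    file=${3:-/etc/sysctl.conf}
    # awk compares the text before the first "=" with the key, exactly.
    if [ -f "$file" ] && awk -F= -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$file"; then
        echo "$key already present in $file, leaving it unchanged"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root):
#   ensure_sysctl_line net.core.rmem_max 2500000
#   ensure_sysctl_line net.core.wmem_max 2500000
#   ensure_sysctl_line net.ipv4.tcp_fastopen 3
```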

Setup - only one time before you start the node, to set up the directories, etc.

sudo su

docker pull storjlabs/storagenode:latest

docker run --rm -e SETUP="true" \
	--mount type=bind,source="/volume1/Storj1/Identity/storagenode/",destination=/app/identity \
	--mount type=bind,source="/volume1/Storj1/",destination=/app/config \
	--name storagenode1 storjlabs/storagenode:latest

docker run --rm -e SETUP="true" \
	--mount type=bind,source="/volume2/Storj2/Identity/storagenode/",destination=/app/identity \
	--mount type=bind,source="/volume2/Storj2/",destination=/app/config \
	--name storagenode2 storjlabs/storagenode:latest

Node 1:

sudo su

docker run -d --restart unless-stopped \
	--stop-timeout 300 \
	--network host \
	-e WALLET="xxx" \
	-e EMAIL="xxx" \
	-e ADDRESS="xxx:28961" \
	-e STORAGE="xxTB" \
	--mount type=bind,source="/volume1/Storj1/Identity/storagenode/",destination=/app/identity \
	--mount type=bind,source="/volume1/Storj1/",destination=/app/config \
	--mount type=bind,source="/volumeUSB1/usbshare/storjdbs1/",destination=/app/dbs \
	--log-driver json-file \
	--log-opt max-size=10m \
	--log-opt max-file=3 \
	--name storagenode1 storjlabs/storagenode:latest \
	--server.address=":28961" \
	--console.address=":14011" \
	--server.private-address="127.0.0.1:14021" \
	--debug.addr=":6001" \
	--storage2.database-dir=dbs \
	--log.level=info \
	--log.custom-level=piecestore=FATAL,collector=WARN \
	--pieces.enable-lazy-filewalker=false \
	--storage2.piece-scan-on-startup=false

Node 2:

sudo su

docker run -d --restart unless-stopped \
	--stop-timeout 300 \
	--network host \
	-e WALLET="xxx" \
	-e EMAIL="xxx" \
	-e ADDRESS="xxx:28962" \
	-e STORAGE="xxTB" \
	--mount type=bind,source="/volume2/Storj2/Identity/storagenode/",destination=/app/identity \
	--mount type=bind,source="/volume2/Storj2/",destination=/app/config \
	--mount type=bind,source="/volumeUSB1/usbshare/storjdbs2/",destination=/app/dbs \
	--log-driver json-file \
	--log-opt max-size=10m \
	--log-opt max-file=3 \
	--name storagenode2 storjlabs/storagenode:latest \
	--server.address=":28962" \
	--console.address=":14012" \
	--server.private-address="127.0.0.1:14022" \
	--debug.addr=":6002" \
	--storage2.database-dir=dbs \
	--log.level=info \
	--log.custom-level=piecestore=FATAL,collector=WARN \
	--pieces.enable-lazy-filewalker=false \
	--storage2.piece-scan-on-startup=false
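
The two commands appear to follow a simple numbering scheme in which the last digit of every port is the node number. A small sketch of that convention (node_ports is a hypothetical helper; the scheme is inferred from the two commands above and only covers nodes 1 to 9):

```shell
# node_ports N: print the ports node N would use under the scheme above.
node_ports() {
    n=$1
    case $n in
        [1-9]) ;;
        *) echo "node number must be between 1 and 9" >&2; return 1 ;;
    esac
    echo "server=2896$n console=1401$n private=1402$n debug=600$n"
}

# Usage:
#   node_ports 2
```

With network host mode there is no -p mapping, so every node on the NAS needs its own set of ports like this.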

Log files:

sudo su
docker logs storagenode1 2>&1
docker logs storagenode2 2>&1
docker logs watchtower 2>&1

docker logs storagenode1 2>&1 | grep "retain"
docker logs storagenode1 2>&1 | grep "pieces:trash"

# SL satellite:
docker logs storagenode1 2>&1 | grep "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"
# AP1 satellite:
docker logs storagenode1 2>&1 | grep "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"
# US1 satellite:
docker logs storagenode1 2>&1 | grep "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"
# EU1 satellite:
docker logs storagenode1 2>&1 | grep "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"
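
Instead of one grep per satellite, the log can be read once and the matching lines counted per satellite ID. A sketch using the four IDs listed above (count_by_satellite is a hypothetical helper; it counts lines that mention an ID and makes no assumption about the rest of the log format):

```shell
# count_by_satellite: read log text on stdin, print "<name> <matching lines>" per satellite.
count_by_satellite() {
    awk '
        BEGIN {
            n = split("SL AP1 US1 EU1", name, " ")
            id["SL"]  = "1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE"
            id["AP1"] = "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6"
            id["US1"] = "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S"
            id["EU1"] = "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs"
        }
        {
            # index() is a plain substring search, so no regex escaping is needed.
            for (i = 1; i <= n; i++)
                if (index($0, id[name[i]]) > 0) count[name[i]]++
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s %d\n", name[i], count[name[i]]
        }
    '
}

# Usage:
#   docker logs storagenode1 2>&1 | count_by_satellite
```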

Path to logs:

# Path to log files:
sudo su
docker ps -a	# get the instance ID, the beginning of the container ID
ls -l			# get dir and file details

# Synology:
sudo su
cd /volume1/@docker/containers/
# log file: <containerID>/<containerID>-json.log

# Ubuntu:
sudo su
cd /var/lib/docker/containers/
# log file: <containerID>/<containerID>-json.log
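
The directory name under containers/ is the full container ID, which docker ps -a --no-trunc prints. A small sketch that assembles the path described above (json_log_path is a hypothetical helper; the root directory is whichever of the two locations above applies):

```shell
# json_log_path ROOT ID: build the json-file log path for a full container ID.
json_log_path() {
    root=$1
    id=$2
    # ${root%/} drops one trailing slash so the result has no "//".
    printf '%s/%s/%s-json.log\n' "${root%/}" "$id" "$id"
}

# Usage (take the full ID from `docker ps -a --no-trunc`):
#   json_log_path /volume1/@docker/containers/ <containerID>
# Docker can also report the path directly:
#   docker inspect --format '{{.LogPath}}' storagenode1
```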

Help manuals:

sudo su
docker exec -it storagenode1 ./storagenode setup --help
docker logs --help

If databases are moved to USB:

# This startup script works on Synology (triggered, root, boot-up):
mount -o remount,noatime "/volumeUSB1/usbshare"
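
To confirm the remount took effect, the mount options can be read back from /proc/mounts. A minimal sketch (has_noatime is a hypothetical helper; it expects the usual /proc/mounts layout of device, mount point, type, options on stdin, and the mount point must be written exactly as it appears there, without a trailing slash):

```shell
# has_noatime MOUNTPOINT: succeed when MOUNTPOINT is listed on stdin with the noatime option.
has_noatime() {
    awk -v target="$1" '
        $2 == target {
            n = split($4, opt, ",")
            for (i = 1; i <= n; i++)
                if (opt[i] == "noatime") found = 1
        }
        END { exit !found }
    '
}

# Usage:
#   has_noatime /volumeUSB1/usbshare < /proc/mounts && echo "noatime is active"
```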

Graceful Exit:

sudo su

# NODE 1:
docker exec -it storagenode1 /app/storagenode exit-satellite --config-dir /app/config --identity-dir /app/identity --server.private-address 127.0.0.1:14021
docker exec -it storagenode1 /app/storagenode exit-status --config-dir /app/config --identity-dir /app/identity --server.private-address 127.0.0.1:14021

# NODE 2:
docker exec -it storagenode2 /app/storagenode exit-satellite --config-dir /app/config --identity-dir /app/identity --server.private-address 127.0.0.1:14022
docker exec -it storagenode2 /app/storagenode exit-status --config-dir /app/config --identity-dir /app/identity --server.private-address 127.0.0.1:14022

Forget satellites:

sudo su

# Forget untrusted or exited satellites:
docker exec -it storagenode1 /app/storagenode forget-satellite \
--force \
12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB \
12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo \
--config-dir /app/config \
--identity-dir /app/identity \
--server.private-address 127.0.0.1:14021

docker exec -it storagenode2 /app/storagenode forget-satellite \
--force \
12rfG3sh9NCWiX3ivPjq2HtdLmbqCrvHVEzJubnzFzosMuawymB \
12tRQrMTWUWwzwGh18i7Fqs67kmdhH9t6aToeiwbo5mfS2rUmo \
--config-dir /app/config \
--identity-dir /app/identity \
--server.private-address 127.0.0.1:14022

# check status:
docker exec -it storagenode1 /app/storagenode forget-satellite-status \
--config-dir /app/config \
--identity-dir /app/identity \
--server.private-address 127.0.0.1:14021

docker exec -it storagenode2 /app/storagenode forget-satellite-status \
--config-dir /app/config \
--identity-dir /app/identity \
--server.private-address 127.0.0.1:14022

# after Success status, wait 2 minutes and restart the node:
docker stop -t 300 storagenode1
docker restart -t 300 storagenode1

docker stop -t 300 storagenode2
docker restart -t 300 storagenode2

If you change parameters in config.yaml, you only need to restart the node.
If you change parameters in run command, you have to recreate the container:

sudo su

docker stop -t 300 storagenode1
docker rm storagenode1
docker run...

docker stop -t 300 storagenode2
docker rm storagenode2
docker run...
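
The stop, remove, run sequence can be wrapped so it is always done in the same order. This is only a sketch: recreate_node and run-node1.sh are hypothetical names, and the script is assumed to contain your own full docker run command for that node. Setting DRY_RUN prints the commands instead of running them.

```shell
# recreate_node NAME RUN_SCRIPT: stop and remove the container, then run the
# script that holds the docker run command for that node.
recreate_node() {
    name=$1
    run_script=$2
    # With DRY_RUN set, each command is prefixed with echo and nothing is executed.
    prefix=${DRY_RUN:+echo}
    $prefix docker stop -t 300 "$name" || return 1
    $prefix docker rm "$name" || return 1
    $prefix sh "$run_script"
}

# Usage:
#   DRY_RUN=1 recreate_node storagenode1 ./run-node1.sh   # preview only
#   recreate_node storagenode1 ./run-node1.sh
```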

fatality!
Why do you run this every time? To make sure that your node is definitely disqualified in case of any user error?