I am running 8 nodes on 8 HDDs connected to a Pi5. As you can imagine this is kind of borderline, but it works surprisingly well (still testing; I am going to share it in a different thread in a few days). One of the challenges is to stretch out maintenance jobs to make sure ongoing uploads are not impacted. CPU time is scarce, so let's use it as efficiently as possible.
Here is my script to run the filewalker on all 8 nodes, but only 1 at a time. You might want to copy and adapt it for your needs. It does require access to the debug endpoint of the storage node.
for i in {8..1}
do
  echo $i
  # enable the used space filewalker on startup
  sed -i 's/storage2.piece-scan-on-startup: false/storage2.piece-scan-on-startup: true/g' /mnt/sn$i/storagenode$i/storagenode/config.yaml
  sudo systemctl restart storagenode$i
  sleep 30
  # flip the flag back right away so an unplanned restart doesn't rescan
  sed -i 's/storage2.piece-scan-on-startup: true/storage2.piece-scan-on-startup: false/g' /mnt/sn$i/storagenode$i/storagenode/config.yaml
  # poll the debug endpoint until the filewalker is no longer running
  while :
  do
    if (( $(curl -s 127.0.0.1:$((16000+$i))/mon/ps | grep SpaceUsedTotalAndBySatellite | wc -c) <= 0 ))
    then
      break
    else
      sleep 30
    fi
  done
done
Now there is a chance that one of the storage nodes failed to run it for whatever reason, maybe because of a restart. The following two code snippets will print out more details.
for i in {8..1}
do
  echo $i
  # print the filewalker call stats (success/error counters) for each node
  curl -s 127.0.0.1:$((16000+$i))/mon/funcs | grep -A 2 SpaceUsedTotalAndBySatellite
done
Example Output:
8
[5437966273498512387] storj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite
parents: 631927965255748052
current: 0, highwater: 1, success: 1, errors: 0, panics: 0
7
[488868047661742421] storj.io/storj/storagenode/pieces.(*Store).SpaceUsedTotalAndBySatellite
parents: 2052289714535893614
current: 0, highwater: 1, success: 1, errors: 0, panics: 0
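A quick way to read that output: success: 1, errors: 0 means the used space filewalker completed once without errors; a non-zero errors or panics count would point at a failed run. If you just want the counters per node, a convenience variant (my own addition, not part of the script above) would be:

for i in {8..1}
do
  # print only the success/error/panic counters for each node's filewalker
  echo "node $i: $(curl -s 127.0.0.1:$((16000+$i))/mon/funcs | grep -A 2 SpaceUsedTotalAndBySatellite | grep -oE '(success|errors|panics): [0-9]+' | tr '\n' ' ')"
done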
That is unrelated to this topic. The Pi5 has 4 cores. To be safe, I would still recommend running just 4 nodes on it, not 8 like I am doing right now. It doesn't change much in the script above; I would still run just a single used space filewalker at a time.
You might have a wrong impression of why I am running this Pi5 experiment. I am not recommending 8 nodes on a Pi5. I am sharing my script with you and it will work even with just 2 nodes.
It looks like this script assumes use of the native Linux client configured for management with systemctl. Could something similar be done for docker installs, replacing the systemctl entry with a docker restart of the container? I'm unsure if the docker container will re-read the yaml on restart or if it would take a full stop/run to do that.
I guess the for i in loop would also need to be swapped for some logic that gets the container list from docker ps -a -q as well…
Thanks for sharing your script! I’ll be watching for your RPi thread as I’ll want to jump in with some of my recent observations of my RPi node.
EDIT: hmm… the curl line throws a twist in there for docker as well. I am interested in figuring this out for docker nodes and you’ve given me a great start to investigate!
MORE EDIT: I also realized I am not yet making the debug available on my nodes. For those like me that need this info, here’s littleskunk’s previous post on enabling debug, at least for docker nodes:
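From skimming that post, the gist seems to be giving each node a fixed debug address in config.yaml and, for docker, also publishing that port to the host. The key name debug.addr and the 16000+node-number port scheme below are my guess, matching the ports the script polls; check the linked post for the authoritative version:

# assumed config.yaml entry; node 1 gets 16001, node 2 gets 16002, and so on
debug.addr: 127.0.0.1:16001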
You already nailed it. Yes, this should work with docker as well. Instead of obtaining the list of storage nodes, I just named them storagenode1 - storagenode8. It should work the same way with docker; just replace the number in the name with $i.
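Something like this should be all that changes (untested sketch on my side; it assumes the containers are named storagenode1 - storagenode8, that config.yaml sits at the same host paths, and that each node's debug port is published to the host on 16000+$i):

for i in {8..1}
do
  echo $i
  # enable the used space filewalker on startup
  sed -i 's/storage2.piece-scan-on-startup: false/storage2.piece-scan-on-startup: true/g' /mnt/sn$i/storagenode$i/storagenode/config.yaml
  # docker restart restarts the node process, which re-reads config.yaml from the mounted volume
  docker restart storagenode$i
  sleep 30
  sed -i 's/storage2.piece-scan-on-startup: true/storage2.piece-scan-on-startup: false/g' /mnt/sn$i/storagenode$i/storagenode/config.yaml
  # wait until the filewalker is no longer listed as a running function
  while :
  do
    if (( $(curl -s 127.0.0.1:$((16000+$i))/mon/ps | grep SpaceUsedTotalAndBySatellite | wc -c) <= 0 ))
    then
      break
    else
      sleep 30
    fi
  done
done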
Sigh… and today I learned that you can manage docker containers by name. This entire time I've been managing them by the container ID value. I swear at one point I tried to do something via the docker container name and it wasn't allowed, forcing me to use the container ID instead. Well, this is going to make my life a lot easier, including tweaking this script for docker systems.
Would you mind expanding on this? I looked at the man page for flock and it seems to be for higher-level programming than a shell script. Maybe you meant to link to flock(1)? I've never used this on Linux, but from my knowledge of object locks on other platforms (I'm an IBM i (AS/400) admin/engineer by trade) I can definitely see how the logic could work. Thanks!
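If I'm reading flock(1) right, the shell pattern would be something like this (untested sketch on my part; the lock file path is just an example). Each per-node script tries to take the same lock, so only one filewalker runs at a time even if the scripts are started independently:

(
  # block here until we hold an exclusive lock on file descriptor 9
  flock -x 9
  # ... enable the scan flag, restart the node, and wait for the
  # filewalker to finish, as in the script above ...
) 9>/var/lock/storagenode-filewalker.lock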
This script is not modifying the software in any way; all it's doing is coordinating the enabling/disabling of the piece scan on startup flag in the configuration file. Changing values in a configuration file is not modification of the software. Modifying the software would be changing the actual code to alter specific behaviours.
Sorry, I don’t have time to create a PR out of it. Besides, I have only implemented a Linux version of it; I don’t know how to do this on Windows. However, it’s simple enough that I’m pretty sure any Storj developer could reimplement it pretty quickly.