Storj Node expansion

Thank you @cdhowie, I’ll do some reading up on this. I have some time before the nodes fill up completely, so I’m going to take my time and do it right. It’s probably answered in the article, but how easy is it to expand an md-raid array? I have some experience expanding an ext4 volume on VPSes and VMs, and while that wasn’t something I’d consider beginner-level CLI work, it wasn’t that hard.

@Vadim Yea, that is how I understand it as well. I’m considering getting back down to 1 node, and just giving it all of the space I currently have between the nodes, as well as the extra space I was going to give it.

@BrightSilence As always I appreciate your input, and it sounds like you understand exactly what I’m getting at. I’m not too worried about the overhead of running extra VMs. Ubuntu is pretty lightweight (compared to Windows VMs) and I have resources to spare in CPU and RAM. I agree I could run more than one node under the same VM, but I’m kind of old school when it comes to services per VM: always one service per VM. I only started using Docker last year for my “linux ISOs” and it has been hard to break myself of the single-service-per-VM habit.

As for expanding the current ones, that’s not really an option at the moment. They sit on individual WD Red drives, and while I could go purchase some larger drives for these and then copy over the vmdk files, I’m pressing up against the drive limit for that system. It’s also running a virtualized FreeNAS ZFS array. I’ve considered putting the vmdk files on the array and then passing it back to ESXi as an iSCSI target, as the array is very quick and lightly used. Hmmm… that might be the way to go. I have about 20 TB free on the array right now, so giving up 10 TB for this would be easy. I’d just want to get back down to 1 node.

Sorry guys, I know I’m all over the place here, but this discussion is helping me. I hope that it also helps someone else who might be in the same boat as me.


I understand, but with Docker, containers kind of take over that role, without any overhead. That said, ESXi these days does a lot to avoid overhead as well and if you have the resources to spare it doesn’t matter too much.

In the end it sounds to me like you have a system the exception applies to. If you have a large array that isn’t too heavily used and are able to use excess space on that array for your single node, it may indeed be best to just use that array and expand it when you want to add more space. I use a similar setup with an SHR2 array on my Synology, which on the backend is just several RAID6 mdraid arrays with LVM combining them into one volume. It’s very easy to expand, even with different disk sizes.
So yes, I would keep your most productive node and gracefully exit the other one. Move the vmdk to the ZFS array and expand from there. If possible, use the disks you free up to expand the array. :wink: But I think with ZFS they have to match in size (haven’t used it yet myself).


You are correct. Expanding a ZFS array isn’t as easy as just adding a single drive, but it’s not impossible. It’s best to use the same size drives, and you expand the pool by adding another whole group (vdev) of drives. In my case, I’d need to add 5 drives at the same time and then expand the pool. I think I might go this path and just exit the other node. I can then recover the other drives and get a few more to expand the array when needed.
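For anyone following along, here is a minimal sketch of that kind of expansion, assuming a raidz2 pool; the pool name `tank` and the device names are purely placeholders:

```shell
# Sketch only: pool name "tank" and device names are placeholders.
# A classic ZFS pool grows by adding a whole new vdev, which should match
# the redundancy level of the existing vdevs (here raidz2, i.e. RAID6-like).
zpool status tank                              # review the current layout first
zpool add -n tank raidz2 sdf sdg sdh sdi sdj   # -n = dry run, prints the resulting layout
zpool add tank raidz2 sdf sdg sdh sdi sdj      # commit the new vdev
zpool list tank                                # capacity now includes the new vdev
```

Note that on classic ZFS adding a vdev is effectively permanent, so the dry run is worth doing first.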

I guess I have my path forward. Thank you everyone!


It depends on what you mean by “expand” as well as the array type, but assuming you are talking about RAID5 or RAID6:

  • If you are replacing all of the disks with larger disks, you would use --replace to swap out each disk one at a time with the larger replacement disk. --replace has the same final effect as removing a disk from the array and adding the larger disk, but the old disk is not removed from service until its contents have been mirrored to the new disk. Therefore, --replace allows you to hot-replace disks without a period of reduced redundancy, and without exercising the other disks in the array to rebuild the missing data. When all disks have been replaced, --grow --size=max asks the system to increase the capacity of the array as much as possible to actually use the new capacity.
  • If you are just adding new drives to the array, you can just --add them as spares and then increase --raid-devices appropriately. The array will be reshaped to include the spares. Because this operation requires moving data around, it is better to add all of the new disks in a single operation. This way the data can be shuffled around only one time. The array will remain active while being reshaped, though – similar to any other check/repair-type operation – the reshape will consume a lot of IOPS, so the array’s performance will be worse until the reshape completes.
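A minimal sketch of both paths, assuming an array at `/dev/md0`; the partition names are placeholders and your device names will differ:

```shell
# Path 1: swap every member for a larger disk without losing redundancy.
mdadm /dev/md0 --add /dev/sde1                       # new, larger disk joins as a spare
mdadm /dev/md0 --replace /dev/sda1 --with /dev/sde1  # old disk stays active until mirrored
# ...repeat --add/--replace for each member, then claim the new capacity:
mdadm --grow /dev/md0 --size=max
resize2fs /dev/md0                                   # the filesystem must be grown too (ext4 here)

# Path 2: add new disks and reshape the array to include them in one pass.
mdadm /dev/md0 --add /dev/sdf1 /dev/sdg1             # both new disks added as spares together
mdadm --grow /dev/md0 --raid-devices=6               # e.g. from 4 devices to 6
cat /proc/mdstat                                     # watch the reshape progress
```

Adding all the new disks before the single `--grow`, as described above, is what ensures the data only gets shuffled once.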

@cdhowie Perfect. I’m going to read over the doc to get a better understanding of md-raid. I appreciate the info and the time you took to explain this to me. I think for now I’m going to iSCSI to my FreeNAS for one large node and exit the other one.

@BrightSilence I expect the copy will take a day or so using rsync. I guess the best way to do this is rsync a few times till the amount of changes is very low, then take the node down and do a final rsync and then change the docker run command.

Just for sanity’s sake, take a look at the command below and see if I have anything wrong. I’m confused by the Docker tag (beta); I think my nodes are still using alpha. Don’t worry about the paths, they are correct, and yes, I have “data” in the path twice. I might fix it once I’m done.

docker run -d --restart unless-stopped -p 28967:28967 -e WALLET=“XXXXXX” -e EMAIL="" -e ADDRESS=“XXXXt:28967” -e BANDWIDTH=“40TB” -e STORAGE=“8TB” --mount type=bind,source="/mnt/sdc/data/storagenode",destination=/app/identity --mount type=bind,source="/mnt/sdc/data/data/storage",destination=/app/config --name storagenode storjlabs/storagenode:beta

That’s the correct approach with rsync. Make sure you run rsync with the delete option when you run it the last time after stopping the node.

The run command looks fine to me other than the fact that the forum replaced some of your straight quotes with curly ones. Make sure you don’t have those in the actual command and you should be good. The beta tag has been around for a while and should be the one you use, but I’m pretty sure alpha still worked until now as well. Good luck with everything!
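For reference, here is the same command reflowed with straight quotes, with the XXXX placeholders and paths kept exactly as posted:

```shell
docker run -d --restart unless-stopped \
  -p 28967:28967 \
  -e WALLET="XXXXXX" \
  -e EMAIL="" \
  -e ADDRESS="XXXXt:28967" \
  -e BANDWIDTH="40TB" \
  -e STORAGE="8TB" \
  --mount type=bind,source="/mnt/sdc/data/storagenode",destination=/app/identity \
  --mount type=bind,source="/mnt/sdc/data/data/storage",destination=/app/config \
  --name storagenode storjlabs/storagenode:beta
```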

Don’t forget the --delete on the final rsync, otherwise pieces that have already been deleted will linger in your storagenode. But I guess garbage collection will take care of that nowadays.

@BrightSilence @donald.m.motsinger Oh yea will not forget the delete option for each run after the first.


For node expansion, should I just rsync the old node’s data to a larger HDD and then simply change -e STORAGE="9TB" to -e STORAGE="13TB" in the docker run command?

So should SNOs take care of their local storage redundancy? I have seen that many are militantly opposed to this, defending the principle of “one node - one HDD”… :man_shrugging:

For me it’s working fine: 1 HDD = 1 node. A node can die for reasons other than HDD failure, and this way you can run more nodes. Storj already has redundancy, there’s no need to add more.


It’s not worth it in my opinion. I agree with @Vadim: it’s better to have several nodes, each on its own HDD, than wasted space on redundancy.
See also the RAID vs No RAID choice discussion.

OK, thanks, but what about my first question (about exact steps on node expansion)?

That’s an interesting point. Do we still need the --delete option when running rsync?

Yes, because when you rsync your running node, there will be temporary system files for the SQLite DBs, like *.db-wal and *.db-shm.
If you don’t run the final rsync with --delete, these files will remain (they don’t exist when the storagenode is gracefully stopped), and when you start your migrated node, the SQLite engine will try to replay these journals, which will more than likely break the databases. In the best case they will be malformed; in the worst case they will not be databases at all anymore.

Ok, but these are only in one directory right? Rsync can be restricted to run only on that directory. So the question would be is --delete still required for the other folders or would garbage collection take care of those?

Yes, you need to rsync your data several times, then stop and remove the container, run the final rsync extended with the --delete option, and start the node back up using your full docker run command with all your parameters, including the changed ones.
See for details


I would not overcomplicate things. But you are technically right, if you don’t mind keeping deleted data a few weeks longer.

I was just thinking of ways the rsync transfer process might be improved or sped up.
Rsync does its job, but it’s not optimal for moving a node with the storagenode software, especially larger ones.

Not necessarily, but it does open you up to a lot more ways to make a mistake. I was actually dealing with a similar issue. I had one node running on an old Drobo unit connected to my NAS, but it was USB 2.0, SATA II internally, and used NTFS on a Linux system. It worked for a long time, but the file walker processes and garbage collection always took ages, more than a day in many cases. And the worst part was that the IO wait was causing the entire system to slow to a crawl.

I finally got fed up with it, but I knew rsync would likely take weeks, and I wouldn’t be able to get that down below a week with this setup. Since I didn’t want to cause a week of downtime, I transferred it in quite an unconventional way. The allocated size had already been lowered to well below the amount of data stored, to prevent the node from growing and hopefully shrink it slightly, though the shrinking had basically stopped. But because it wasn’t receiving any new data, I could just cp the blobs folder once and be done with it. That would miss some deletes, but garbage collection would take care of those. After that was finally done (it took more than a week), I stopped the node and copied the rest of the files. The DBs were already on the internal array, so I didn’t need to copy those. I changed the paths for data and identity, started the node back up, and all seems to be working fine.

I wouldn’t necessarily recommend this method to anyone, as it is quite error prone. But it does seem like it works just fine if you are in a bind with horribly slow storage like I was.