Co-located Storj node

I am setting up an Unraid server, or rather 2, that will store data and provide redundancy in the event that one location goes down. On this server I’ll have a lot of extra HDD space, and I wanted to set up Storj in a docker container with it. I originally used Storj back in V1, but my setup wasn’t reliable enough for it to work well long term. Since I will now have a setup that I plan to keep up as close to 24/7 as I can, I wanted to know if there was any way for me to set up Storj to fall back from my primary server to my backup. I already plan to be syncing all my data and have failover set up for my other processes, but I wasn’t sure how, or if, I could accomplish this with Storj.

I realize that this is an odd use case, and that I could run 2 separate nodes instead of backing up the data like this, but I’m curious if this setup is possible with Storj if I wanted to try it.

Besides the fact that it won’t make much sense, as you already mentioned: yes, I think it should be possible. In the end, once everything is properly synced (data/DBs/identity), you only need to update the host address during docker startup.

–> https://documentation.storj.io/resources/faq/migrate-my-node
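Just to make that concrete, here is roughly what the docker startup looks like (placeholder wallet, email, paths and hostname; the exact current command and image tag are in the linked docs). The identity and the storage/config mounts are what would have to be synced 1:1 between the two machines; ADDRESS is basically the only thing you’d change when bringing the node up at the other location:

```
# Example values only -- wallet, email, paths and hostname are placeholders.
# ADDRESS must point at whichever machine the node is currently running on.
docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967 \
    -p 127.0.0.1:14002:14002 \
    -e WALLET="0x0000000000000000000000000000000000000000" \
    -e EMAIL="you@example.com" \
    -e ADDRESS="your-ddns-or-ip.example.com:28967" \
    -e STORAGE="2TB" \
    --mount type=bind,source="/path/to/identity/storagenode",destination=/app/identity \
    --mount type=bind,source="/path/to/storage",destination=/app/config \
    --name storagenode storjlabs/storagenode:latest
```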

Sorry, but doing a live backup and using some sort of failover storage node solution is just a bad idea in so many ways…

The only reason to do that would be if you like to experiment with that kind of stuff and want something that’s actually live to test on… but from a hardware perspective it’s mostly a waste of resources.

Don’t get me wrong… I recently set storage nodes up on mirror “raid” setups (raid1), and in the case of something like ZFS that comes with some advantages… I get double the read IOPS and bandwidth (though bandwidth is kind of irrelevant in this case), with the same level of hardware your solution would require.

I get redundancy, and with checksums it’s nearly impossible for the data to get damaged, so only software issues will cause problems with my node. Your live backup solution would merely copy any errors the primary system makes onto the backup, which means it only helps in the case of a power failure or a catastrophic disk failure that doesn’t start spewing out random/corrupt data… anything less clean than that would just get written to your backup as well…

Any minor errors, like say 5% of the data going bad from a bad cable, will kill your node… while downtime is highly unlikely to get your node disqualified (DQ).

So basically your setup protects you against the least likely way for your node to become DQ, and not against the most likely ways nodes get DQ, while using the same level of hardware. And on top of that it uses bandwidth for the live backup… which has to be so accurate that it literally has to be live… and thus may even cause more performance loss…

So really I can’t recommend that setup unless you really want to test it out; it’s complicated and most likely increases the odds of your node eventually failing, with no real benefit… but it might be a fun experiment.

If you have stability issues, you might be able to do some sort of SAS-connected interlink between multiple computers and one shared storage solution or server, so that if a container fails, another one takes over using the shared storage…

But that’s an over-complicated solution, though at least it makes better use of the hardware, since the storage is shared: if one server’s OS fails or whatever, the second server simply takes over and relaunches the storage node from the shared SAS DAS type solution.

You could then also have storage redundancy, like running raid6, and thus get past the whole bit rot issue with HDDs, bad cables and whatnot…

Good to know, thanks!

Thank you for the helpful info. I figured it would be an overly complicated setup, though I do have a question about your setup. First, I’ll say I’m very new to hardware and the “server” world. I’m a software developer by trade, so setting up this kind of system is new to me. That being said, I’m not sure I understand how your mirror (raid 1) setup is all that different from what I’m describing. How does it give you double the iops and bandwidth vs keeping a live backup? Again, I’m sure I’m just misunderstanding something because this isn’t my wheelhouse, but I’m curious if you wouldn’t mind trying to explain it in more detail.

RAID is a solution that combines multiple disks to act as one…

raid0 writes across all drives in the array; speed and capacity are multiplied by the number of drives, but IOPS stays about the same.

raid1, or a mirror, is 2 drives getting the same data written to them. This means that when something needs to be read, it can be read from either of the drives, so read performance is doubled, while writes remain the same as for 1 drive.

raid5 should not be used… except if no other choice exists, or if it’s one of the newer hybrid versions that solve some of its issues… The minimum number of disks is 3, of which one is redundant… so in case of a disk failure, you can simply replace the disk, fix the broken cable, or whatever…

raid6 is basically the same as raid5, except there are 2 redundant disks; the minimum number of drives is 4, of which 2 are redundant. This means that whenever there is a failure, the system can calculate which disk is in error… while raid5, when 1 drive goes bad, basically has to guess.

Both raid5 and raid6 get increases in read and write speeds depending on the number of drives (not counting the redundant drives), however IOPS remains the same as for 1 disk.
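As a rough worked example with 10TB drives (counting only the parity/mirror overhead, ignoring filesystem overhead):

```
# raid0, 4 drives: 4 x 10TB = 40TB usable, no drive may fail
# raid1, 2 drives: 1 x 10TB = 10TB usable, 1 drive may fail (reads served from either)
# raid5, 4 drives: (4-1) x 10TB = 30TB usable, any 1 drive may fail
# raid6, 4 drives: (4-2) x 10TB = 20TB usable, any 2 drives may fail
```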

My mirror setup writes the data to both disks when storing it, so if there is a write error on either disk, the other one will still have the data.

Your suggested setup would basically be a live copy, meaning if there are write errors or basically any sort of issue, they will simply be copied to the second system.

Let’s start with what hardware you have to work with… then I’m sure some of us here can help you figure out what a good recommended setup would be… Having multiple servers isn’t really a requirement, and if you really want to do the crazy stuff there… then a server cluster actually requires 3 servers :smiley:

But that’s crazy-performance stuff for high-end enterprise, stock-broker type workloads…

You might be able to do some container/VM level stuff without all that though… so that if a VM crashes on one server, or one server is shut down, the VM on the other server will start up.
But yeah, I digress…

Let’s start at the beginning and figure out what you are aiming for and what you have to work with.

Thank you very much for the detailed breakdown, that helps me understand it much better. Basically, your raid1 setup is a similar kind of protection to my Unraid setup, though implemented differently. The main similarity being that they are both local backup, vs having 2 physical locations. Am I understanding that correctly? I know the actual way the data is protected is different with raid and Unraid, but at the end of the day they’re both local redundancy only (if I understand correctly).

As far as my setup, if I ever build something specifically for Storj I’ll definitely check back here for advice, as there’s definitely a lot more for me to learn. Currently though, Storj is an addition to my setup, which I’ve decided should be an Unraid server, or rather 2, based on the other non-Storj-related processes I’ll be running.

If I misunderstood your raid setup, feel free to correct me as I’d like to learn as much as possible about these processes.

If you’ve got two servers with extra space on both of them, then run separate nodes on each of them. Involving storage nodes in a live backup isn’t a good idea, and I’m not even sure it would work by default…

Maybe… but just setting up multiple nodes in something like docker should be very straightforward… no need to worry about much aside from port routing, and that you shouldn’t run more than 1 storage node per HDD or array.
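Roughly speaking, a second node on the same host only needs a few things of its own (example ports and names, not exact flags; again, the linked docs have the full command):

```
# second node on the same IP: same image, but it needs its own
#   - identity (generate a new one, never copy the first node's)
#   - disk/array and storage path (one node per HDD or array)
#   - ports, e.g.  -p 28968:28967  plus  -e ADDRESS="your-host:28968"
#   - container name, e.g.  --name storagenode2
```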

I’m not sure how Unraid actually works… I’m familiar with it… but not a ton… I did assume it could do raid though… just kind of figured it was software raid instead of the more conventional hardware raid, and that it’s a NAS-type solution… the other “competitor” being FreeNAS, or TrueNAS or whatever it’s named these days, which is based on FreeBSD while Unraid is Linux based… both using ZFS I would almost assume… like so much Linux stuff today.

ZFS is a great solution, I use it myself… I’ll have to take a look and see what Unraid can actually do in regard to that kind of redundancy stuff…

My main pool/raid array is ZFS, 2x raidz1. I really like ZFS… though it’s a bit demanding on the learning side… but it’s made by very smart people… most of it is designed so well that if you fail at configuring something, it will just fall back to sane defaults xD so you can basically fail your way into a working setup lol
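For reference, “2x raidz1” just means a pool made of two raidz1 vdevs striped together; it would be created with something like this (hypothetical pool name and disk names, and the disk count per vdev is just an example):

```
# one pool ("tank") built from two raidz1 vdevs of 3 disks each
zpool create tank \
    raidz1 /dev/sda /dev/sdb /dev/sdc \
    raidz1 /dev/sdd /dev/sde /dev/sdf
```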

Sorry, I went off on a bit of a tangent there…

Just do a regular node setup on each server and use docker if you can… dunno if that’s compatible with Unraid, but docker is Linux and Unraid is Linux, so… it should be… I would assume.

Start here, or at the auth token generation if you haven’t done that yet… it’s a bit further up in the list in the sidebar of the website:
https://documentation.storj.io/setup/cli

Thanks very much for all your help!

Unraid is actually a slightly different technology than raid; the main difference (in my opinion) is how the data is stored. With Unraid, you store data files on a single drive at a time (to some extent) rather than striping them across drives. The 2 big advantages of that are: 1. No need to spin up all drives for a read or write. 2. Data on the remaining drives is still recoverable even if more drives fail than the parity can cover. You can also run docker containers and VMs like with other solutions.

Back on topic a bit more: I definitely plan to run 2 separate nodes now that I have info on how difficult, and more importantly futile, running a live backup would be. I do have another question, if you know the answer; if not, I’ll look elsewhere on the forum. Since I’ll have a substantial amount of free space, is there a limit on the space that can be offered per node? I believe I read somewhere it was 8TB and you’d need to make another node on the same network if that first node got filled (which I know takes time). I hope that’s not the case, but I’m curious if you know how that works.

Thanks again!

The recommended max size is 24TB for each node… however, this works more on a per-IP-subnet basis.
The 24TB figure is a guess; there isn’t enough live data yet on how Tardigrade users will work with their data… but if we say 5% is deleted each month on average…

then when you get to 20TB, that would mean 1TB of data is deleted each month. Over the last month I got about 500GB of ingress; sure, at the moment the node is growing… but since the network is new, a good deal of the data will be test data, so nobody knows how it will work out eventually.

You can set it however you like; however, the max capacity ends up being determined by the ratio between the % of data deleted per month and the new ingress to the network… The recommended range is a minimum of 500GB up to 24TB, plus 10% extra for trash and basic node data operations… though maybe at the higher end of the spectrum one can go a lot lower on the over-allocation %, but the recommendation is 10%.
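Rough napkin math with the numbers above (all of these figures are guesses, the deletion % especially):

```
# ~5% of stored data deleted per month, ~500GB/month ingress:
# the node stops growing roughly where deletions catch up with ingress,
#   0.05 x size ≈ 0.5TB  =>  size ≈ 0.5 / 0.05 = 10TB
# at 20TB stored, 5% would be ~1TB deleted per month, i.e. more than the
# current ingress, so a node that size would actually shrink.
```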

Like, say you’ve got a 40TB node… then 4TB of free space for whatever it has to do seems… excessive, to say the least…

Sometimes data will flow in fast, sometimes slow… it’s not easy to predict, and I bet Storj Labs likes it that way… at least with their test data… The network is slowly getting some media attention, has a few interesting partnerships, and performance seems to be great… so it should just be a matter of time before some serious data starts to flood in…

Even with two nodes you run into the same ratio issue; the only way to get around that is using multiple IPs to increase the ingress and thus reduce the net drain from deletions over time.

But the network has been very quiet lately… so presently it’s barely an advantage…

In regard to Unraid, it sounds very much like a replication-type deal… maybe with some layered storage involved… which is without a doubt a very cool solution… very redundant on a file-by-file basis, because you can select the level of backup/replication you want for each file or folder.

It has some advantages, of course; the redundant data stored would then, to my understanding, take up more space than when using raid solutions… but the IOPS and such would be much greater. Without a doubt a worthwhile setup to be running, especially if one has very specific and variable data redundancy demands and maybe high utilization…

I’m kind of guessing a bit here… but those should be its strong suits, if it is indeed layered storage with replication-type redundancy.

Sounds cool. I did strongly consider such a solution when I was deciding on my storage setup; I looked at Windows Storage Spaces at the time because I wasn’t really into the whole Linux aspect of things back then…

I ended up learning about ZFS and just had to have it because of all the amazing things about it… checksums, copy-on-write, L2ARC, SLOG and such goodness.
It can be a bit heavy to work with sometimes though…

Good info to know! 24TB is a lot more reasonable than 8TB, so I’m happy with that. Excited to try out Storj again after I build my new setup.

This is interesting. I run a few nodes off my Unraid server, although I use docker-compose rather than Unraid’s Docker management GUI.

Regarding your question about failover from the primary to the secondary server, Storj doesn’t provide any leeway for lost or missing data. From a backup and recovery perspective, node operators have a Recovery Point Objective (RPO) of zero, i.e. if your node is lost, you cannot recover from backup, or start the node on the secondary server using backups, without being liable for failing audit checks on data that was lost between your last backup to the secondary server and the time when the primary node was lost. The only way to set something like this up is with some form of high-availability storage combined with a high-availability app setup. Unraid doesn’t (easily) provide any of the typical high-availability storage protocols (e.g. iSCSI, Ceph, etc.).

If you have both servers located in different geographical regions and on different /24 IPv4 subnets, you would be much better off running 2 Storj nodes, one on each server. This way you leverage the erasure coding data integrity protection built-in to the Storj network and likely net higher revenues compared to just running a node on one server.

As the OP is using Unraid, I don’t think ZFS is pertinent to this conversation.

Thanks for the feedback/info! I’ve decided to run 2 separate nodes based on all the info I’ve heard here, which I figured I’d end up doing.

Well, I had to go check then, and Unraid actually does support ZFS, just as a plugin, which is what I expected since it’s Linux based, so it was most certainly pertinent. But since the OP sounded like they have a more advanced setup used for something like enterprise purposes, I let it go, even if I think replication redundancy and tiered storage are crude and wasteful.

I did, however, mistakenly call it “layered storage” earlier; tiered storage is the term I meant.