How to move DB’s to SSD on Docker

Odmin · May 22, 2020, 1:09pm

Before you begin, please make sure that your SSD has good endurance (MLC is preferred). I personally recommend using an SSD mirror.

Look into the official documentation and make sure that you are using the --mount type=bind parameter in your docker run string.
Prepare a folder on the mounted SSD outside of <storage-dir> from the official documentation (that is your folder with pieces).
Add a new mount string to your docker run string:

Now we have:

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967 \
    -p 127.0.0.1:14002:14002 \
    -e WALLET="0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
    -e EMAIL="user@example.com" \
    -e ADDRESS="domain.ddns.net:28967" \
    -e STORAGE="2TB" \
    --mount type=bind,source="<identity-dir>",destination=/app/identity \
    --mount type=bind,source="<storage-dir>",destination=/app/config \
    --name storagenode storjlabs/storagenode:beta

should be:

docker run -d --restart unless-stopped --stop-timeout 300 \
    -p 28967:28967 \
    -p 127.0.0.1:14002:14002 \
    -e WALLET="0xXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX" \
    -e EMAIL="user@example.com" \
    -e ADDRESS="domain.ddns.net:28967" \
    -e STORAGE="2TB" \
    --mount type=bind,source="<identity-dir>",destination=/app/identity \
    --mount type=bind,source="<storage-dir>",destination=/app/config \
    --mount type=bind,source="<database-dir>",destination=/app/dbs \
    --name storagenode storjlabs/storagenode:beta
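
Before starting the container with the extra mount, it may be worth confirming that the database folder really sits outside the storage folder. The following is only a minimal sketch: the function name db_dir_is_outside_storage is made up for illustration, and it only compares resolved paths, so it does not prove that the folder is actually on the SSD.

```shell
# Hypothetical helper (not part of storagenode or Docker):
# succeeds only if $2 (database dir) is neither the same as,
# nor nested inside, $1 (storage dir). Both directories must exist.
db_dir_is_outside_storage() {
    storage=$(cd "$1" && pwd -P) || return 2   # resolve symlinks
    dbdir=$(cd "$2" && pwd -P) || return 2
    case "$dbdir/" in
        "$storage"/*) return 1 ;;   # same directory or somewhere below it
        *) return 0 ;;
    esac
}
```

It could be called as db_dir_is_outside_storage "<storage-dir>" "<database-dir>" && echo OK, with the same placeholders as in the docker run string above.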

Add or change this parameter in your config.yaml:

# directory to store databases. if empty, uses data path
# storage2.database-dir: ""
storage2.database-dir: "dbs"
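
A typo in this line is easy to miss, so a quick check of what the file actually sets can help. This is a rough sketch under stated assumptions: the function name active_database_dir is hypothetical, and it uses a plain sed match rather than a real YAML parser, so it only understands the simple one-line form shown above.

```shell
# Hypothetical helper: print the active (uncommented) value of
# storage2.database-dir from the config file given as $1.
# Commented-out lines are ignored, and so is a line that lacks the
# space after the colon; in both cases nothing is printed.
active_database_dir() {
    sed -n 's/^storage2\.database-dir:[[:space:]][[:space:]]*//p' "$1" \
        | tr -d '"' \
        | tail -n 1
}
```

For the config lines shown above, active_database_dir config.yaml should print dbs; an empty result suggests the setting is still commented out or mistyped.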

Stop and remove your storagenode container
docker stop storagenode -t 300 && docker rm storagenode
Copy all databases from “storage-dir\storage” to the new location “database-dir”. Do not move them! (If something goes wrong, we can simply start again with the databases in the old location instead of letting the storagenode recreate them.)
It’s recommended to copy with permissions preserved; otherwise you should reapply them after the copy is done. The related Linux command is

cp -p *.db /destination/path
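
If you would rather have the copy checked as well, the cp -p step can be wrapped in a small function. This is a sketch, not an official procedure: copy_dbs_verified is a made-up name, and it assumes the node is already stopped so the files do not change during the copy.

```shell
# Hypothetical helper: copy every *.db file from $1 to $2 with cp -p
# (keeps mode and timestamps, and ownership where permitted), then
# compare each copy with its original byte for byte.
# The originals are left in place.
copy_dbs_verified() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    for db in "$src"/*.db; do
        # If the glob matched nothing, $db is the literal pattern.
        [ -e "$db" ] || { echo "no .db files in $src" >&2; return 1; }
        cp -p "$db" "$dst/" || return 1
        cmp -s "$db" "$dst/$(basename "$db")" || {
            echo "copy of $db differs from the original" >&2
            return 1
        }
    done
}
```

An example call would be copy_dbs_verified "<storage-dir>/storage" "<database-dir>". A non-zero exit status means something was not copied cleanly and the old location should keep being used.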

Start your new docker run .... string
Make sure that in database-dir you see files with the .db-shm and .db-wal extensions.

Summary

If you can see files .db-shm and .db-wal on the new location “database-dir”, now you can delete database files from the old location “storage-dir\storage”.
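
The check for .db-shm and .db-wal files can also be scripted. A minimal sketch, assuming the node has been running for a short while; the function name has_wal_and_shm is hypothetical.

```shell
# Hypothetical helper: succeeds if directory $1 contains at least one
# *.db-wal file and at least one *.db-shm file.
# ls exits non-zero when a pattern matches nothing.
has_wal_and_shm() {
    ls "$1"/*.db-wal >/dev/null 2>&1 && ls "$1"/*.db-shm >/dev/null 2>&1
}
```

For example: has_wal_and_shm "<database-dir>" && echo "node is using the new location". If it fails, keep the old database files where they are and recheck the mount and the config.yaml setting.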

BrightSilence · May 22, 2020, 5:07pm

I’ve been pondering making an instruction for the same reason. I kind of feel if you don’t know how to figure this one out by yourself, you probably shouldn’t do it. Additionally, while it helps with performance, it definitely adds an additional point of failure to your node. Because now HDD failure is not the only thing that can take your node down. The SSD failing can as well.

So I think if we are going to make an instruction available as a separate post, we should include these warnings and caveats. And probably dissuade people from doing this unless they know it’s absolutely necessary for their setup. I mean, I know what needs to be done and have done it for 2 nodes now. But… I was a little in a hurry on the second one and forgot to copy the db’s. Caught it quickly enough and merged them together again (don’t ask how… it’s a lot of manual work, just avoid this). It’s too easy to mess up and not easy to fix problems afterwards. The only reason I’m doing this is because I’m using devices that almost certainly won’t be able to keep up otherwise after they are vetted and getting the full load. Additionally, I use RAID on the db disk, so there is protection against drive failure to mitigate the risk of relying on more than 1 disk for 1 node.

Cmdrd · May 22, 2020, 5:37pm

Made the change myself yesterday and it was pretty easy to figure out after a small amount of trial and error (forgot a space in the config), but it does require some critical thinking and a base understanding beyond just copying commands. I think a setting like this does need to have some sort of technical hurdle to implement due to the risks. Put it on my 3x replicated CEPH SSD pool, so hopefully that affords enough protection in the event of taking a hardware hit.

As an update, whereas before I would see locking multiple times an hour, after moving it 24 hours ago no database locking errors, so I would say that is a success. Seeing some pretty heavy bursts of IO periodically on the SSD volume so that definitely explains why it was locked, the HDD pool was having a bit of a tough time with latency requirements for responding fast enough for that.

cdhowie · May 22, 2020, 5:43pm

Are all of the databases actually critical for storagenode operation? I thought many of them were just for the dashboard.

If none of them are required (e.g. if you stop the node, delete them, then start the node and everything works fine even if some orders are lost) then it would be a point of failure with regards to uptime, but not durability. That is, if the SSD dies then the node gets I/O errors on the databases, but stopping the node and pointing the DB storage at an empty, working storage location would fix the problem.

If some of the databases are actually required for operation, I wonder if it could be possible in the future to move only the ephemeral/not-strictly-needed databases to different storage. Those databases could even be stored on a ramdisk.

nerdatwork · May 22, 2020, 5:48pm

From what I remember about @littleskunk’s post, your node can survive losing all dbs as long as you have all the data.

cdhowie · May 22, 2020, 5:50pm

Nice. So then what I said is true – it’s a point of failure for uptime (the node would crash and probably refuse to restart while there are I/O errors) but not durability.

Using a ramdisk for the databases is then actually quite feasible, at least for Linux systems that do not reboot frequently! I might investigate making this change on one of my smaller nodes and seeing if it has any impact on metrics.

I wonder if audits would be impacted by I/O errors on the database. If they do then a node could be suspended/disqualified needlessly. Audits IMO should not even touch the databases since audit traffic isn’t paid anyway. There’s no reason for audits to be waiting on databases or even attempting to read/write on the DB files, no?

uwe.88 · June 7, 2020, 11:41am

nice!
work fine

ACarneiro · June 10, 2020, 7:57am

Thank you, this is really interesting.
Is the increase in performance worth the (small but real) risk of adding another point of failure?
Are we looking at a 10% improvement? 20%? 100%? 2%?

Thank you

SGC · June 10, 2020, 8:12am

i haven’t tested this… but if your system isn’t affected by a lack of IOPS for the db load… then it might actually decrease your overall performance, ofc this is very unlikely but because you spread the data out over more storage media you might affect internal bandwidth or whatever…

however if you are critically affected by lack of IOPS for the storagenode… then you might see 1000% improvement in what your node can keep up with… and it could be much better than this…

this comes down to the fact that when a conventional hdd or smr hdd gets behind, the latency adds up into the range of seconds… while a normal seek latency should be around 6ms

so that alone could reduce your latency by a factor of 200, and then when the smr writes at its slowest you get like 700kb/s… so you could essentially end up waiting 1.2 sec to write out a few kb … it can slow your system to a crawl…
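
The factor of 200 appears to come from comparing the 1.2 sec stall with the 6ms normal seek latency mentioned above. As a quick sanity check of that arithmetic (the variable names are only illustrative, and the pairing of these two numbers is an assumption):

```shell
# Both values in milliseconds: a stalled write (1.2 s) versus a normal seek (6 ms).
stalled_ms=1200
normal_seek_ms=6
ratio=$((stalled_ms / normal_seek_ms))
echo "$ratio"   # prints 200
```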

but if you don’t have that issue… and everything runs smooth… you most likely will feel a limited effect…

ACarneiro · June 10, 2020, 8:17am

Thank you for your insight.
No SMR on any of my drives and I’m gradually changing them from 5900RPM to 7200RPM Exos ones, so hopefully that’ll help a fraction.

Might put in an SSD for the db when the nodes start getting fuller

SGC · June 10, 2020, 8:32am

i run zfs with a slog ssd with sync=always on the storagenode dataset… so essentially my system is already writing db changes to an ssd, ofc it ends up on the spindles… but i also run dual raidz1 on 7200rpm drives so i get double the IOPS of somebody running a single 7200 per node.

the ssd option is a great solution for somebody running many nodes located on many hdd’s on one system… ofc putting multiple databases on the same ssd… well thats a collective point of failure.
so one has to be able to really depend on it, if one runs many nodes like that…

if i was to do that beyond 3-4 nodes i would run a mirrored ssd setup and keep it well monitored in case of failure, maybe backup the db to the individual node drive every hour… or so… ofc that in itself may defeat the point, but a write like that should be sequential and thus not take more than a sec or two out of an hour of full performance.

in the end all this stuff sort of becomes math… but then again so does everything.

BrightSilence · June 10, 2020, 10:47am

If you’re not having actual issues, it’s definitely not worth it. If you’re having issues, wait until you have 1.6.3 to see if those issues remain as one of the major issues with slower HDD’s will be solved in that one.

The last thing you want to do is move all db’s of several nodes to one SSD. That would create a single point of failure and you could lose all your nodes in one go.

In short, don’t fix things that aren’t broken.

ACarneiro · June 10, 2020, 12:49pm

Fair point, I’d forgotten about the changes coming in 1.6.3

We just want to eke out every last little bit of performance out of our systems, I guess

Pac · June 13, 2020, 7:35pm

True that!
Well actually, some of us just want their system not to crash…

kevink · December 31, 2020, 8:34am

So I tried this now on one small node and seems to work so far but I was wondering about the directory of “orders”. I was wondering if I could move that too.

Odmin · December 31, 2020, 8:39am

Yes, you can, if you already did the database move to the SSD you can use the same folder and create a subfolder for orders.

Just stop the storage node, move the folder with orders to the new location, and change the path in config.yaml:
storage2.orders.path: dbs/orders
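
The move itself could look roughly like the sketch below. It is only an illustration: move_orders_dir is a made-up name, both paths are host-side paths you have to supply yourself, and the node must be stopped first.

```shell
# Hypothetical helper: move the orders folder $1 to the new location $2
# (for example a subfolder of the database directory on the SSD).
# Refuses to run if the source is missing or the target already exists.
move_orders_dir() {
    src=$1
    dst=$2
    [ -d "$src" ] || { echo "orders folder $src not found" >&2; return 1; }
    [ ! -e "$dst" ] || { echo "$dst already exists" >&2; return 1; }
    mkdir -p "$(dirname "$dst")" && mv "$src" "$dst"
}
```

An example call, with the placeholders standing in for your own paths: move_orders_dir "<old-orders-dir>" "<database-dir>/orders".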

kevink · December 31, 2020, 8:42am

oh thanks! somehow I didn’t see that option… just too many options in here

BrightSilence · December 31, 2020, 8:17pm

It was added later, the config.yaml isn’t updated with new options automatically.

kevink · December 31, 2020, 8:36pm

yes but I started with my latest node where the option was already there, just didn’t see it
But then I retrofitted the older nodes with those options.
Not that I see any significant difference on my disks since… But wanted to try it anyway since my SSD is running in a mirror. But since there’s not much ingress at the moment, there’s not much to see anyway. Just wanted to take some load off my HDDs, especially as I wasn’t running the DBs in an efficient way due to them running in a zfs dataset with recordsize 512KB with all the storagenode pieces. So I was curious if a change was even visible without additional tools. But I guess the logging needs more iops than the DBs xD So for SMRs setting the log level to warn would probably make more of a difference
But I’m getting off-topic now, sorry

BrightSilence · December 31, 2020, 10:19pm

Yeah, I haven’t bothered with it. It’s nothing like the db load. Though I have redirected my logs to the db location for my slow external HDD nodes.