Here is a good example of how HDD fragmentation affects databases:
And of course, it significantly affects storage for pieces too (a lot of IO operations).
I think fragmentation on HDD is the root cause of database locks and performance degradation.
If that is the case, then that explains why vacuuming the database fixes the problem: SQLite vacuum works by reading in the entire database while simultaneously writing out a new version of the database with the same effective contents into a new file, then moving the new file over the old file when that process is done. Releasing deleted records and clustering the data are the primary functions of vacuum, but filesystem-level defragmentation also happens as a byproduct of how vacuuming is performed.
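The rewrite-and-swap behavior is easy to observe with the sqlite3 CLI. A minimal sketch (throwaway path, requires sqlite3 and GNU stat; not the storagenode's actual databases):

```shell
# Create a throwaway database, fill it, then delete everything.
# Deleted pages only go to the freelist, so the file stays large.
db=/tmp/vacuum_demo.db
rm -f "$db"
sqlite3 "$db" "CREATE TABLE t(x BLOB);
               INSERT INTO t VALUES (randomblob(1048576));
               DELETE FROM t;"
before=$(stat -c%s "$db")

# VACUUM rewrites the contents into a new file and swaps it in,
# shrinking the file and defragmenting it as a side effect.
sqlite3 "$db" "VACUUM;"
after=$(stat -c%s "$db")
echo "before=$before after=$after"
```

The "after" size drops from over a megabyte to a few pages, since only live data is copied into the new file.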
I’d be particularly interested to see filefrag output on the database files before and after vacuuming.
Since vacuuming writes a new file, I assume fragmentation will be reduced, unless your disk is nearly full.
I defragment all databases daily. With BTRFS it can be done even if the files are in use. I haven’t vacuumed the databases yet though.
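For what it’s worth, a daily job for that could be a cron script along these lines (the path and file glob are illustrative; btrfs defragment can operate on files that are in use):

```shell
#!/bin/sh
# /etc/cron.daily/defrag-storj-dbs (illustrative path)
# Defragment the storagenode databases in place on a btrfs volume.
btrfs filesystem defragment -v /mnt/storj/storage/*.db
```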
I wrote about vacuum a few posts ago.
Please give me some time to collect statistics, and I will write another post about “how the weather is on SSD with databases”.
I solved locked/malformed SQLite databases by dumping each one and recreating it. Easy, fast (and safe if you know what you are doing), and I guess it reduces fragmentation too.
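The dump-and-recreate cycle looks roughly like this (demo path standing in for e.g. orders.db; assumes the node is stopped and sqlite3 is installed):

```shell
# Build a small demo database standing in for a storagenode db
db=/tmp/dump_demo.db
rm -f "$db" "$db.new"
sqlite3 "$db" "CREATE TABLE t(x); INSERT INTO t VALUES (1),(2),(3);"

# Dump to SQL text and replay it into a fresh file -- this can also
# recover from malformed-database states that VACUUM cannot.
sqlite3 "$db" ".dump" | sqlite3 "$db.new"

# Swap the rebuilt file in place of the original
mv "$db.new" "$db"
sqlite3 "$db" "SELECT count(*) FROM t;"   # prints 3
```

Because the replacement file is written sequentially in one pass, it also comes out far less fragmented than the original.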
$ rsync -ah --progress orders.db orders.db.test
sending incremental file list
orders.db
571.36M 100% 32.52MB/s 0:00:16 (xfr#1, to-chk=0/1)
Far from perfect, but it seems to do just fine; maybe that’s because my orders.db is only a couple of months old.
It turned out the low maximum transfer speed was due to other factors rather than fragmentation.
This is the same as the vacuum command.
I loved the increased cache_size with my old >4 GB info.db.
btw vacuum never fixed malformed databases
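For context, SQLite’s page cache is set per connection with PRAGMA cache_size, so the storagenode process itself has to issue it; a quick illustration with a throwaway database (the size here is an arbitrary example, not a recommendation):

```shell
db=/tmp/cache_demo.db
rm -f "$db"
# A negative cache_size is a size in KiB; -2000000 is roughly 2 GiB
# of page cache. Both statements run on the same connection, and the
# second one reads the value back.
sqlite3 "$db" "PRAGMA cache_size=-2000000; PRAGMA cache_size;"   # prints -2000000
```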
Of course, but you talked about defragmentation
I migrated my databases to the SSD and collected statistics. As I wrote before, the databases were heavily fragmented, but after migration to the SSD this problem is no longer relevant.
Let’s look into statistics of SSD with databases:
Let’s look in more detail at one of these spikes:
Let’s also look into other interesting parameters of these spikes:
Here we can see that one operation takes about 2ms.
Here we can see the average bandwidth per operation.
So, let’s summarize and make a conclusion.
What I expected before starting the migration to the SSD:
Conclusion: Moving the databases to an SSD is an awesome option that we should have, especially if you are an advanced storage node operator with a big node. A lot of small IO operations can easily kill the performance of any single HDD, or of any RAID of HDDs, and produce problems like database locks, because the HDD will be too busy handling these operations.
That’s a very impressive and detailed study. Good job!
But wouldn’t copying the database remove the fragmentation? As for my SSD: with my current zfs setup my old SSD will only do 4k IOPS, due to a Q1T1 bottleneck caused by a misconfiguration of the LUNs.
But I’m not seeing any of that. I get the hourly spike, but it doesn’t create any sort of backlog on any of my drives, which at the moment peak at 8 ms.
Of course I don’t really get fragmented files, because I turned that off by using sync=always in zfs. On another note, I haven’t seen any of those “database is locked” messages for a while, but that may be because I finally got around to not rebooting my system again and again…
Also, am I missing something, or don’t you state what hardware you are testing on? Not that it really matters, but I did wonder a bit whether my 2x raidz1 of 4 and 5 drives is part of why it can keep up with the read/write IOPS for the db committing or verifying everything to disk, or whatever it’s doing every hour…
I might try to move the db eventually, just to see if my numbers change, but currently I don’t really see a need… Of course node size might play a very large role in db fragmentation and load on the system.
I would assume that if one doubles the capacity of a node, the db sizes, and thus the load of working with them, would also double, even though the load from node data ingress/egress would stay pretty much the same.
So, what’s the accepted way of doing it with docker? As I understand it, the paths in config.yaml are relative, so I probably need to mount the SSD under /storj/node/ssd and specify “ssd” in the config file?
I am not super familiar with docker.
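One plausible shape for this, assuming a storagenode version that supports a separate database directory via the storage2.database-dir option (verify the option name against your version’s config.yaml), and with /mnt/ssd/storj-dbs as an assumed SSD path:

```shell
# Stop the node, copy the .db files to the SSD first, then restart
# with an extra bind mount for the database directory (paths assumed):
docker run -d --name storagenode \
  --mount type=bind,source=/mnt/hdd/storagenode,destination=/app/config \
  --mount type=bind,source=/mnt/ssd/storj-dbs,destination=/app/dbs \
  storjlabs/storagenode:latest
```

Then in config.yaml, set storage2.database-dir to the path as seen inside the container, e.g. /app/dbs. This is a sketch of the idea, not tested instructions.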
I see. I think I should prepare a short instruction on how to move the DBs with docker.
(I am on docker too.)
Please give me some time.
I think this should be a new topic so this can be categorized as SNO->FAQ
Yes, I agree, but I think this instruction is for advanced SNOs only; a newbie could kill their node…
Let’s ask @Alexey to review it and make a decision for the move or not.
This is the part I’m not sure about. If it doesn’t crash, it could stay online but fail to respond to audits. It could get into trouble quickly. It would be nice if it just crashed and would only be offline, especially if you’re monitoring that with Uptime Robot.
Right. Crashing would be a best-case scenario. I use Zabbix to monitor all of my nodes so that would get caught quickly when it stops responding to API requests.
Loss of the databases isn’t a trivial matter and can hurt your storagenode, maybe even get it DQed in worst-case scenarios, so I would stick to using a proper SSD.
Remember the databases control a lot of stuff locally, so them getting corrupted or lost can most likely lead to some pretty messy situations… Even if they can be restored, one has to catch the error before it goes too far for either the data or the satellites to accept.
Though it might be an option to introduce automatic database restoration: if the system detects corruption, simply put the database back from a known-good point. Or it could keep the database in multiple locations and update one of them less frequently, in one big sequential write…