I have a bunch of nodes and tested the runtime of vacuuming the databases in-place against copying them to a ramdisk, vacuuming them there, and moving them back to the disk. The results are quite impressive.
The orders database takes the overwhelming majority of the time so I’m including its size before and after. Both nodes run on the same host (two Docker containers). They run on different HDDs; both are exactly the same model.
Node 1: 300MB orders database. Vacuuming all databases in-place took 129 seconds. After, the orders database was 157MB.
Node 2: 380MB orders database. Moving all databases to a ramdisk, vacuuming them, and moving them back took 9 seconds. After, the orders database was 95MB.
That’s a whopping 93% reduction in the downtime required to vacuum the databases. Based on this, I would guess that the majority of the in-place vacuum time is spent waiting for the HDD to seek back and forth between the old and new databases, and possibly performing out-of-order reads on the old database. Even with the overhead of moving the databases to and from a ramdisk, the ramdisk approach is substantially faster, most likely because each file is read or written exactly once and in order.
In-place vacuum script:
# Stop the node (allowing up to 300 seconds for a clean shutdown),
# vacuum each database in place, then start the node again.
docker-compose stop -t 300 && \
for i in data/storage/*.db; do
    echo "$i"
    sqlite3 "$i" 'VACUUM;'
done && \
docker-compose up -d
Ramdisk vacuum script (assumes that /dev/shm is a tmpfs, which is the default on many Linux distributions):
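The ramdisk script itself is not reproduced here, but based on the description (and the layout used by the in-place script above), a minimal sketch of the approach could look like the following. The /dev/shm/vacuum path and the data/storage layout are assumptions, not the original script, and a later post in this thread describes a WAL-handling fix that should be applied before relying on it:

# Sketch only: copy the databases to a tmpfs, vacuum them there, move them back.
# See the WAL fix later in this thread before using this as-is.
docker-compose stop -t 300 && \
mkdir -p /dev/shm/vacuum && \
cp data/storage/*.db /dev/shm/vacuum/ && \
for i in /dev/shm/vacuum/*.db; do
    echo "$i"
    sqlite3 "$i" 'VACUUM;'
done && \
mv /dev/shm/vacuum/*.db data/storage/ && \
docker-compose up -d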
What would be the real impact on a node? You also assume your HDs are a benchmark for all SNOs. You also assume that vacuuming a 300MB database down to 157MB would make a huge real-world difference on your node. Where is that proof?
What is the purpose behind such a combative reply when all I did is share information about what happened on my own nodes? Please back off a bit.
Less downtime… means more uptime. Which is good for obvious reasons. You can’t earn egress traffic or acquire new data while the node is offline.
I never said that anywhere and I did not intend to imply that. I’m showing stats from what happened on my own hardware with my specific databases. Other SNOs can compare performance of the approaches on their own nodes if they would like. In fact I would love to see other SNOs test and see which is faster and report their metrics.
My expectation is that SNOs storing their databases on HDDs will see similar results but I have no evidence that is the case, and – again – I am not claiming that to be true. Though it would not surprise me if it were true.
I don’t expect SSDs would necessarily see much improvement. Some, perhaps, but not 90+%.
I don’t know what this sentence means. If you are asking whether vacuuming is even necessary: it can reduce fragmentation of the databases and thereby improve access times. This has been documented in other forum posts where slow write performance due to fragmentation caused excessive lock times.
The reduction in the size of the database is not the primary benefit, but it does illustrate the amount of garbage-collection that took place. This is necessary to compare the duration of each approach as a wildly different database size could also contribute to the difference in vacuum duration.
For example, vacuuming the second database probably required less I/O since the resulting database size was smaller, indicating less data had to be read/written. However, that should not account for a 93% reduction in duration… maybe 30% based on the relative sizes.
Thanks. I just ran a modified version of your in-place script. It finished pretty fast, but my DBs are not very old and I had defragmented them recently as well. I had planned to eventually vacuum them one at a time when I got bored enough, but this script made it easy to do them all at once.
Not at all, but creating a strawman (“You also assume your HDs are a benchmark for all SNOs”) and then attacking that is rude and fallacious.
I never set out to prove anything, only provide information about what happened on my systems. Making demands for proof is silly in this context.
Try it yourself and see what happens! I’m interested to see numbers on the difference between the two approaches from other SNOs, regardless of which approach turns out to be generally faster. Include whether it was an HDD or SSD.
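If you do, a consistent way to measure is to time only the offline portion so the numbers are comparable across nodes. A rough sketch, reusing the in-place script from the top of the thread (paths are assumptions for your own setup):

# Time just the vacuum work between stopping and restarting the node.
docker-compose stop -t 300
time (
    for i in data/storage/*.db; do
        sqlite3 "$i" 'VACUUM;'
    done
)
docker-compose up -d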
The split comments were questioning whether vacuuming is even necessary, while this topic is about vacuuming the databases in a ramdisk to make the vacuum process faster. They didn’t belong in this thread.
I’m not sure why I can’t edit the original post anymore (ping @Alexey) but the ramdisk-vacuum script is dangerous and needs to be fixed. If stopping the storagenode container times out, one or more WALs may still exist. This is not currently accounted for in the script and this causes two very bad things to happen:
1. Anything in the WAL that could be replayed is lost during the vacuum.
2. When storagenode starts back up after the databases are moved back, sqlite tries to replay the old WAL against the newly-vacuumed database and this corrupts the database.
This happened to me and corrupted my bandwidth DB. I was able to fix it with .dump and manually cleaning up corruption in the generated script, then rebuilding the database from that script.
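For anyone who hits the same corruption, the recovery is essentially SQLite’s standard dump-and-rebuild procedure. This is only a sketch; the file names are examples, and the dump may need manual cleanup as described above:

# Dump the corrupt database to SQL, remove any corrupt statements by hand,
# then rebuild a fresh database from the cleaned dump.
sqlite3 bandwidth.db ".dump" > bandwidth.sql
sqlite3 bandwidth_fixed.db < bandwidth.sql
mv bandwidth.db bandwidth.db.corrupt
mv bandwidth_fixed.db bandwidth.db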
The vacuum script is fixed by:
1. Copying any WALs to the ramdisk (replace *.db with *.db*).
2. Removing all database files prior to moving the vacuumed databases back, because the WAL will have been deleted by the vacuum process (add rm *.db*).
To make (2) safe in the case of an unexpected shutdown, instead of deleting the on-disk databases we move them to a subdirectory of the storagenode data directory; if a poweroff happens, the database files are guaranteed to still be there (but may need to be manually moved back). A sketch of the corrected script follows below.
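Putting those fixes together, a corrected sketch might look like this. Again, the paths, the pre-vacuum subdirectory name, and the compose setup are assumptions rather than the exact script:

# Sketch only. Copy *.db* (including any WAL/SHM files) to the tmpfs so sqlite
# can replay the WAL before vacuuming. Move the stale on-disk *.db* files aside
# before moving the vacuumed copies back, otherwise sqlite would replay the old
# WAL against the vacuumed database and corrupt it.
docker-compose stop -t 300 && \
mkdir -p /dev/shm/vacuum data/storage/pre-vacuum && \
cp data/storage/*.db* /dev/shm/vacuum/ && \
for i in /dev/shm/vacuum/*.db; do
    echo "$i"
    sqlite3 "$i" 'VACUUM;'
done && \
mv data/storage/*.db* data/storage/pre-vacuum/ && \
mv /dev/shm/vacuum/*.db data/storage/ && \
rm -rf /dev/shm/vacuum && \
docker-compose up -d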
@cdhowie
Hi, I just made my first attempt to vacuum one of the databases, the biggest one: orders.db.
It reduced the size from 500MB to 24KB … are you sure that this process is safe?
I am not copying it back (still running the old “before vacuum” version) as the size reduction seems too good to be true.
Vacuuming a DB rebuilds it, reclaiming the free space left behind by prior row operations such as deletes and updates. None of the current data in the database tables is affected.
Storj nodes will generally see a very large reduction in database size because of the large amount of free space those prior row operations leave behind.
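If you want extra reassurance before swapping the vacuumed copy back in, SQLite’s built-in checks are a quick sanity test. This is just an illustrative example; the file name is a placeholder for whichever database you vacuumed:

# Run against the vacuumed copy before moving it back; integrity_check should print "ok".
sqlite3 orders.db 'PRAGMA integrity_check;'
# freelist_count should be at or near 0 right after a VACUUM.
sqlite3 orders.db 'PRAGMA page_count; PRAGMA freelist_count;'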