Vacuum databases in a ramdisk to reduce downtime

DB discussions can get contentious quickly…

However, sqlite has the following in its Vacuum page:

However, using auto_vacuum can lead to extra database file fragmentation. And auto_vacuum does not compact partially filled pages of the database as VACUUM does.

https://www.sqlite.org/lang_vacuum.html

The vacuum issues were mostly focused around the orders.db database. And that database has been redesigned as a flat file:

https://forum.storj.io/t/changelog-v1-11-1/8773

So…

@shoofar should have no issues going forward. I didn’t see a post on which version was currently being vacuumed. It’s possible that the orders.db file was no longer being used… just left behind after the update to V1.11.1 … in that case, the file is smaller, but wasn’t being used anyway.

However, it’s not a good idea to delete files that you are unsure about needing.

1 Like

Cool - wasn’t paying attention to last change as it almost happened while I was away.
Great to know that.
Thanx!

Yep. My 300MB+ file compacted to a few KB as well. It likely wasn’t deleted as part of the DB migrations because that would throw out any unsent orders you had when you did the update. A usual part of these kinds of updates is migrating the old data to the new format, and I suspect that the DB migration scripts run first.

Presumably a future update could remove the file altogether, but there isn’t a compelling reason to do so.

This is not strictly true. You can vacuum a database that is in use so long as you run the vacuum process in a context where storagenode and the vacuum process can see each other’s locks and you run the vacuum in-place (not copying to a ramdrive). The vacuum process will lock the database for the entire duration of the operation.

As long as they can see each other’s locks, the worst case scenario is that the vacuum takes a long time and storagenode times out some operations waiting for a lock.

If they can’t see each other’s locks then this process will almost certainly destroy the database.

The likely reason for storagenode itself not vacuuming on a regular basis is that it can be a several-minute operation as the databases tend to have a lot of internal fragmentation. This can disrupt service.

I am doing it as part of the node’s (re-)start procedure, so it’s vacuumed every time there’s upgrade. That should be enough given the recent frequency of updates.

I think this might be quite the “dangerous” operation to do, so a power outtage might cause severe failure - it’s probably better for some people in the world to not have it auto-start during a tornado or heavy thunderstorm of any sort.

All sqlite operations should be safe from a power cut, provided the underlying filesystem and storage device don’t lie about flush requests. Vacuum is actually one of the safer operations at it uses a temporary file to rebuild the database.

2 Likes

i just optimized my system for long term database/large dataset type loads.
so nearly 7 months in an i have yet to deal with a database maintenance issue.

a few key points in this is sizable and big l2arc, all writes are sync writes and thus goes through a slog device / ram until its flushed to the hdd’s in sequential writes thus reducing fragmentation of the data on drives.

ofc these features will not clean bad database structure, they will only help keep the utilization smooth and snappy, enough so that there should be enough other nodes with issues before i feel them and thus storj should in theory take care of it :smiley:

thats my approach anyways…
if doing vacuuming on a db is safe… well thats a larger debate, which i don’t know a massive amount about, i did watch some lectures about it… in which it sounded a bit like… one cannot really avoid it at times, but it will remove stuff from the database and thus atleast in theory will never make the data safer or more stable.

but isn’t it always like that, performance vs safety
can’t have both

personally i like to compare vacuum to trim on ssd’s, it also helps performance, but will ruin your ability to restore recently deleted data, and tho not the same as vacuum it is fairly similar from a conceptual point of view.

i do use trim on my ssd’s tho :smiley:
never really did do much db stuff, so took the hardware approach to the problem, instead of learning new stuff lol