Vacuum databases in a ramdisk to reduce downtime

DB discussions can get contentious quickly…

However, SQLite has the following note on its VACUUM page:

However, using auto_vacuum can lead to extra database file fragmentation. And auto_vacuum does not compact partially filled pages of the database as VACUUM does.

https://www.sqlite.org/lang_vacuum.html
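For anyone curious what that tradeoff looks like in practice: auto_vacuum is a per-database setting rather than a one-off command, and switching an existing database over to it only takes effect after a full VACUUM. A minimal sketch using Python's sqlite3 module (the path is hypothetical):

```python
import sqlite3

# Hypothetical path to one of the node's databases.
conn = sqlite3.connect("/storage/bandwidth.db")

# auto_vacuum is stored in the database file itself; changing it on an
# existing, non-empty database only takes effect after a full VACUUM.
conn.execute("PRAGMA auto_vacuum = INCREMENTAL")
conn.execute("VACUUM")

# With incremental auto_vacuum enabled, free pages can later be released
# in small batches without rebuilding the whole file (and, per the quote
# above, without compacting partially filled pages).
conn.execute("PRAGMA incremental_vacuum(100)")  # free up to 100 pages
conn.close()
```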

The vacuum issues were mostly focused on the orders.db database, and that database has since been redesigned as a flat file:

https://forum.storj.io/t/changelog-v1-11-1/8773

So…

@shoofar should have no issues going forward. I didn’t see a post saying which version was being vacuumed. It’s possible that the orders.db file was no longer being used… just left behind after the update to v1.11.1… in that case, the file is smaller, but it wasn’t being used anyway.

However, it’s not a good idea to delete files that you are unsure about needing.


Cool - I wasn’t paying attention to the last change, as it happened pretty much while I was away.
Great to know that.
Thanx!

Yep. My 300MB+ file compacted to a few KB as well. It likely wasn’t deleted as part of the DB migrations because that would throw out any unsent orders you had when you did the update. A usual part of these kinds of updates is migrating the old data to the new format, and I suspect that the DB migration scripts run first.

Presumably a future update could remove the file altogether, but there isn’t a compelling reason to do so.

This is not strictly true. You can vacuum a database that is in use so long as you run the vacuum process in a context where storagenode and the vacuum process can see each other’s locks and you run the vacuum in-place (not copying to a ramdrive). The vacuum process will lock the database for the entire duration of the operation.

As long as they can see each other’s locks, the worst case scenario is that the vacuum takes a long time and storagenode times out some operations waiting for a lock.

If they can’t see each other’s locks then this process will almost certainly destroy the database.
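To make that concrete, here is a minimal sketch of an in-place vacuum using Python's sqlite3 module (the database path is hypothetical). Because it opens the same file on the same filesystem as storagenode, both processes see each other's locks, and the worst case is waiting or timing out rather than corruption:

```python
import sqlite3

DB_PATH = "/storage/orders.db"  # hypothetical; any of the node's .db files

# timeout: how long to wait for storagenode to release its locks before
# giving up. VACUUM itself then holds a lock for its whole duration.
conn = sqlite3.connect(DB_PATH, timeout=300)
try:
    conn.execute("VACUUM")  # rebuilds the file in place via a temporary file
finally:
    conn.close()
```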

The likely reason for storagenode itself not vacuuming on a regular basis is that it can be a several-minute operation as the databases tend to have a lot of internal fragmentation. This can disrupt service.

I am doing it as part of the node’s (re-)start procedure, so it’s vacuumed every time there’s an upgrade. That should be enough given the recent frequency of updates.
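In case it helps anyone, a rough sketch of that step (hypothetical data path, meant to run from the restart script while the node is stopped so nothing else holds locks on the files):

```python
import sqlite3
from pathlib import Path

DATA_DIR = Path("/storage")  # hypothetical node data directory

# Vacuum every SQLite database in the data directory before the node
# is started again; with the node stopped there is no lock contention.
for db_file in sorted(DATA_DIR.glob("*.db")):
    conn = sqlite3.connect(str(db_file))
    try:
        conn.execute("VACUUM")
        print(f"vacuumed {db_file.name}")
    finally:
        conn.close()
```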

I think this might be quite a “dangerous” operation to do - a power outage mid-vacuum might cause severe failure. It’s probably better for some people in the world not to have it run automatically on startup during a tornado or a heavy thunderstorm.

All sqlite operations should be safe from a power cut, provided the underlying filesystem and storage device don’t lie about flush requests. Vacuum is actually one of the safer operations, as it uses a temporary file to rebuild the database.
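If you want to see what durability settings a given database is actually using, a quick sketch like this (hypothetical path) reads the relevant PRAGMAs - with the same caveat that the filesystem and disk have to honor flush requests for any of it to hold:

```python
import sqlite3

conn = sqlite3.connect("/storage/bandwidth.db")  # hypothetical path
journal_mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
synchronous = conn.execute("PRAGMA synchronous").fetchone()[0]
conn.close()

# journal_mode is e.g. "wal" or "delete"; synchronous is 0=OFF, 1=NORMAL, 2=FULL.
print(f"journal_mode={journal_mode}, synchronous={synchronous}")
```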


I just optimized my system for long-term database / large-dataset type loads,
so nearly 7 months in I have yet to deal with a database maintenance issue.

A few key points in this: a sizable L2ARC, and all writes are sync writes and thus go through a SLOG device / RAM until they are flushed to the HDDs as sequential writes, reducing fragmentation of the data on the drives.

Of course these features will not clean up a bad database structure; they will only help keep utilization smooth and snappy - enough so that there should be plenty of other nodes with issues before I feel them, and thus Storj should in theory take care of it :smiley:

That’s my approach anyway…
Whether vacuuming a DB is safe… well, that’s a larger debate, which I don’t know a massive amount about. I did watch some lectures about it… in which it sounded a bit like one cannot really avoid it at times, but it removes unused pages from the database and thus, at least in theory, will never make the data safer or more stable.

But isn’t it always like that: performance vs. safety -
you can’t have both.

Personally I like to compare vacuum to TRIM on SSDs: it also helps performance, but it will ruin your ability to restore recently deleted data, and though not the same as vacuum, it is fairly similar from a conceptual point of view.

I do use TRIM on my SSDs though :smiley:
I never really did much DB stuff, so I took the hardware approach to the problem instead of learning new stuff lol

Do I run this from the data directory? Or specify a specific node? Or does it run for all nodes on my machine?

There should be no reason to do this anymore at all. Why do you want to do this?


I still have a huge userdb from ages ago on some nodes. I was just trying to do some housekeeping.

As long as your DBs are not approaching 1% of the size used by your node, I wouldn’t bother. This procedure involves risk, and doing it without knowing exactly what you’re doing involves more risk. It’s not worth it to save a few GB.

There used to be some reason to do this because transfers could trip over slow database processing. But I think that is mostly solved. So as long as you see no issues, just leave the db files alone.
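If you want to sanity-check that rule of thumb, here is a small sketch (hypothetical data path) that compares the combined size of the .db files against everything the node stores:

```python
from pathlib import Path

DATA_DIR = Path("/storage")  # hypothetical node data directory

db_bytes = sum(f.stat().st_size for f in DATA_DIR.glob("*.db"))
total_bytes = sum(f.stat().st_size for f in DATA_DIR.rglob("*") if f.is_file())

print(f"databases: {db_bytes / 1e9:.2f} GB "
      f"({db_bytes / total_bytes:.2%} of {total_bytes / 1e9:.2f} GB)")
```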

Yeah, I’m in full agreement with brightsilence on this; SNOs shouldn’t touch the DB files…
If you think there is something wrong, then make a troubleshooting thread for it and get some of the storjlings to look at it…

Vacuuming a database is basically throwing out unused space… but figuring out what is actually unused… well, that’s the trick, and if you are just a little bit off you might ruin the entire database.

If you do have userdb files that might require vacuuming, you will be far from alone, so raising it as an issue and getting it solved via the storjling route could save tons of people tons of trouble in the future.

Vacuuming databases can be highly effective though… but watching a single lecture on it hardly makes me qualified to speak to that lol… I can say without a doubt, it can also go wrong…

I’d suggest that a SNO is responsible for their databases and as such can do maintenance on them in line with Storj recommendations, or as a knowledgeable expert if they wish to take the risk. Storj has published some database repair procedures, two of which I’ve used - one because the vacuum procedure alerted me to a database corruption. That said, I don’t recommend regular vacuuming unless the database is growing unusually fast, which may point to an issue.
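For reference, one way to check for that kind of corruption before (or instead of) vacuuming is SQLite’s built-in integrity check; a minimal sketch with a hypothetical path, which only rebuilds a database that checks out cleanly:

```python
import sqlite3

DB_PATH = "/storage/piece_expiration.db"  # hypothetical path

conn = sqlite3.connect(DB_PATH)
try:
    result = conn.execute("PRAGMA integrity_check").fetchone()[0]
    if result == "ok":
        conn.execute("VACUUM")  # only rebuild a database that checks out
    else:
        print(f"integrity_check failed: {result}")  # see Storj's repair docs
finally:
    conn.close()
```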