Has it ever been considered to use in-memory databases?

Something like what is suggested here:

Wouldn’t that require a UPS? Then it’s a no-go.

1 Like

I don’t think it would require a UPS.
My idea is:

  1. On node start, create the databases in memory and import the existing databases from disk
  2. Periodically (maybe at an interval configurable by the SNO), back up the databases to disk
  3. On node shutdown, export the in-memory databases to disk

I mean, the databases are nice to have but not strictly mandatory. In case of issues (power loss, unexpected shutdown, or whatever), the SNO would only lose the data since the last periodic backup.

I don’t know if that would work, I am not a coder but that’s my idea.
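The three steps above can be sketched with Python's built-in `sqlite3` module and SQLite's online backup API (a rough illustration, not storagenode code; the file name and table are made up):

```python
import sqlite3

DB_PATH = "bandwidth.db"  # hypothetical on-disk database

# 1. On node start: create an in-memory database and import the on-disk one.
disk = sqlite3.connect(DB_PATH)
mem = sqlite3.connect(":memory:")
disk.backup(mem)  # copy disk -> memory via SQLite's online backup API
disk.close()

# ...the node now reads and writes only the in-memory copy...
mem.execute("CREATE TABLE IF NOT EXISTS usage (day TEXT, bytes INTEGER)")
mem.execute("INSERT INTO usage VALUES ('2024-01-01', 123)")

def flush_to_disk():
    """2./3. Periodic backup and shutdown export: copy memory -> disk."""
    target = sqlite3.connect(DB_PATH)
    mem.backup(target)  # copy memory -> disk
    target.close()

flush_to_disk()  # run this on a timer and again on shutdown
```

After a crash, the on-disk file would hold the state as of the last `flush_to_disk()` call, which is exactly the "lose only the data since the last backup" trade-off described above.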

1 Like

That increases RAM usage a lot, especially on low-RAM devices like a Raspberry Pi or a NAS.

1 Like

How much would that be?
Very often we read the advice to move the databases to an SSD for better performance.
Yes, maybe on some devices it would require more RAM, but on others maybe it would improve performance a lot?

My estimate is 500 MB per node; I’m no coder either.
But I think a switch-on solution is not worth the resources.
If the device has enough RAM, there are also existing solutions: PrimoCache, ARC, OS-level caching, special devices, RAM-disk software, etc…

1 Like

As far as I understand, there is a lot of old historical data in the databases. This does not sound like data that has to be in memory at all times. There should be an archive database for old data (e.g. previous months) and a database for current data (the current month), which could then go into memory.
This could reduce the memory footprint a bit.

2 Likes

The really big DBs already do archiving; the other DBs use WAL mode. I think the biggest problem is the fragmentation of the DBs.

I don’t know which of them are the most I/O-intensive ones, but those at least should be the ones running in memory.
Maybe this could even be made optional, so the SNO could choose how to run it.

1 Like

That’s exactly how RAM disks and (partially) PrimoCache work. I don’t know about Linux, but on Windows such software has been around for a long time.
Since we can already edit the orders and database paths, again, it’s not worth re-implementing existing software in the node’s own code, “just in case”.

1 Like

If storagenode implements something like this, it must be independent from OS and 3rd party tools.

It’s not worth reinventing the fifth wheel, IMHO.
WAL mode is implemented in SQLite, and storagenode uses it.
https://www.sqlite.org/wal.html
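For reference, WAL mode is just a pragma away; a minimal sketch (the file name is hypothetical):

```python
import sqlite3

conn = sqlite3.connect("node.db")  # hypothetical database file
# Switch to write-ahead logging: readers no longer block the writer,
# and commits append sequentially to the -wal file instead of
# rewriting pages in place.
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
conn.close()
```

The pragma returns the resulting journal mode (`wal` for a file-backed database), and the setting persists in the database file across connections.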

1 Like

We see the suggestion to move databases to an SSD very frequently. So IMHO it would be worth it to remove this bottleneck.

You can get the same effect with a careful configuration of rclone and a RAM disk.
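For the curious, a rough sketch of that approach on Linux (the paths are hypothetical, root is required, and Windows would need 3rd-party RAM-disk software instead of tmpfs):

```shell
# Create a RAM disk and seed it with the existing databases.
mkdir -p /mnt/db-ram
mount -t tmpfs -o size=512m tmpfs /mnt/db-ram
cp /storage/dbs/*.db /mnt/db-ram/

# Point the node's database directory at the RAM disk, then periodically
# sync the databases back to persistent storage, e.g. from a cron job:
rclone sync /mnt/db-ram /storage/dbs
```

Anything written after the last sync is lost on power failure, which is the same trade-off as the in-memory proposal above.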

There are probably always ways to hack something together, but such hacks tend to break on every update if something changes, and they rely on 3rd-party tools. AFAIK there is no native way in Windows to create a RAM disk. I haven’t seen anybody here deploying such a solution; people are buying SSDs instead. That tells it all.
But keeping the databases at the RAM level would be the fastest way, eliminating the IOPS bottleneck and the need to move them to SSDs. And using SQLite’s native in-memory and backup functions, such a solution would be available on all operating systems.
So I believe this would be the best, most reliable, and most resilient solution.

I think so, because it would require a UPS to be resilient.
A €20 USB drive is probably the most cost-efficient solution (mine is still running fine), and most PCs have an SSD anyway.

And this is how it is implemented right now: there is some in-memory data, but we flush it to the persistent databases periodically.
This is why all these filewalkers start from scratch, by the way…

How do you know it would be the most reliable and resilient solution if it is more complex, and hence more prone to bugs, than the current code base?

Please read what I have written.

This is interesting. Still, the databases on disk can pose a bottleneck.
So if it is already (partly?) implemented this way, maybe more databases could be kept in memory, or the interval between periodic flushes could be extended?