Changelog v0.21.4

v0.22.0 will contain a storage node database migration that needs several minutes to hours. To reduce the time we are splitting the migration into 2 parts and release part one with v0.21.4.

Important:
We don’t want all storage nodes to update at the same time because that would mean the network goes offline for a few minutes. Our target is an update window of 12 hours. If your storage node doesn’t install the update right away please be patient and wait. Do not force an update and take offline the network that way

4 Likes

Looking at the source code on github…

It seems the database is transitioning from sqlite to postgresql.

Is that accurate?

If so, this is great news! And will there be a db backup function enabling daily db export external to the docker container?

I think that a node with a day old db would fail a lot of audits, since there probably is no way to signal the satellite “hey, I had to restore a backup that’s x hours old”.

Can you show me the code line? I am not aware of any postgres migration.

External db backup as well as external log backup is very useful for troubleshooting and general record keeping. It’s also very useful for interface development using actual network acquired data.

That code line was added 11 month ago. I don’t see any affect on storage nodes. Yes we are using postgres for the satellite for a long time now.

That’s good to know.

Well… I, for one, vote for postgres migration for storagenodes.

In any case, I was paging through the source code to figure out what exactly was meant by “database migration” … Can you elaborate so we know what to expect as far as changes to the db? What are the general schema changes to be expected? Will this change optimize queries? Why is the change necessary?

1 Like

You have 2 options.

  1. You can implement postgres for storagenodes and we will most likely merge it even if it will be slower compared to sqlite3
  2. You can show us how slow sqlite3 is and the improvement you would expect from postgres. I recommend starting here: Guide to debug my storage node

A simple vote without any data to proof your point doesn’t work :wink:

We split the sqlite DB into multiple sqlite files. That way we can lock them individualy. That will allow us to scale better on bigger storage nodes.

Doesn’t this answer the first part of response to my post about voting for postgresql?

I’m not complaining about sqlite3. It’s probably fine. However, there should be little doubt that postgres is a better fit for the application… especially since it’s currently being used for satellites.

I’m not that much of a coder. I know ANSI C and various Assembly languages quite well. However, my main employment is EE. But, I can give Go and go and see how far I get.

I have to disagree. For the storage nodes there should be no doubt that posgres is slower than sqlite3. Postgres will open a network connection even to localhost. Doing that time sqlite3 delivered the data already.
Sqlite3 gets a problem dealing with multiple parallel requests. That is why we don’t use it on the satellite side. On the storage node side the access pattern is different.

You are wellcome. Some other developer might want to join you and I can test it any time. Do you want to open a post in the forum to form a team?

Unix socket connection… not network.

A unix socket is a file.

Have there been any speed tests on the comparison? If not… I could start on a test scenario next week.

EDIT:

I set up a github for any experiments and/or code additions related to adding postgresql. Please note that this is quite part-time for me… and a bit of a curiosity side project. Perhaps it will be become a major project… I don’t quite know yet.

For now, my experimental Storj will be here:

Please let me know if the repo name is not allowed per official Storj policy, and I’ll change it to something else.

Just letting you know the Stories roadmap Storj V3 Public Road Map isn’t updated on the Aha site. @littleskunk

There also seem to be database corruption problems with sqlite after an unclean shutdown. This shouldn’t be happening anyway (sqlite should use a journal) but I’d expect postgres to be more resilient to unclean shutdowns as it’s used in production environments with high durability guarantees; sqlite is not.

I’d vote for postgres over sqlite just from a reliability perspective.

Same deal. Go for it and proof your statement. Every database can get corrupted :wink:

Every vehicle safety system can fail as well. But the better-engineered ones will fail less often than the worse-engineered ones. Your reply would seem to argue that the worse-engineered one is good enough because “the better-engineered one can also fail.”

While technically a true statement, it completely misses the point.

The number of threads on this very forum that talk about a corrupted database after an unclean shutdown should be proof enough that sqlite is not reliable enough. Something is definitely wrong either with sqlite or the way that storagenode runs it, as forcibly terminating the storagenode container alone should never corrupt the database, yet this seems to happen frequently enough to be concerning.

Either storagenode doesn’t enable journaling for the database, or the sqlite journal implementation is faulty. Or, somehow, every person complaining about a corrupted database also has faulty HDD or RAM. (Possible, but unlikely.)

If SIGKILL can corrupt a database, the database engine is poorly-designed.

I manage several production postgres servers and they’ve all experienced an unclean shutdown at one time or another, and none of them has been corrupted purely as a result of the unclean termination.

1 Like

Can you help Storj by proving what you said can be implemented for storjnode ? More explanation on how it is great won’t help Storj decide to migrate to postgres. Devs have done their due diligence in choosing what they have now and we (devs, Storj, SNOs) want a stable system too so please submit your code proving this in the github repo.

1 Like

Golang is turing-complete. A turing-complete language can perform any task that any other turing-complete language can perform. Other turing-complete languages have postgres clients and can use postgres. Therefore, it is possible to implement in golang (and almost certainly already has).

Of course it can be implemented for storagenode. Asking for proof that something can be implemented in a turing-complete language is a bit silly.

This is true, but the posts I’m responding to weren’t saying “pull requests are welcome if someone does the work” they were saying “we have no proof that postgres is more reliable than sqlite.” That is what I was responding to.

Every objection to postgres so far is easily refuted except for the argument that it will take time to implement. I don’t dispute this, and I’d be interested in working on such a patch if we decide at the outset that it would be considered instead of dismissed for incorrect reasons.

Another example:

Connection pooling eliminates this concern. A dedicated RDBMS is also able to make better decisions about caching data and query plans, and is able to provide more fine-grained locking for writes. (Sqlite does not support concurrent writes at all, which is likely part of the reason for splitting the databases apart. In postgres this feature would have been “free” from the beginning.)

Postgres might very well wind up being slower (though I suspect not) but if it is slower, it won’t be for the reasons brought up here.

However, my bigger concern is that the sqlite implementation seems to have a reliability problem. That should be a substantially bigger issue than concerns about a slight reduction in performance.

1 Like

I second the reliability question. While I have not made extensive tests and cannot provide solid proof, I have not seen a Postgres or MySQL (InnoDB) database get corrupt just because the server lost power (hell, not even running out of space seems to corrupt the db). I have seen corrupt databases, but they were due to bad sectors, deleted files etc.
The number of “I lost power and now the db is corrupt” questions here (or in the chat, when it was used) suggest that either sqlite is not as reliable as Postgres of MySQL or it is used here slightly incorrectly and that is what makes it get corrupted.

2 Likes

A lot of assumptions in here. Again we wellcome PRs and proof of your statements.

1 Like