Storage node loading screen performance

I noticed that some storage node operators reported that they have to wait 10 seconds or more for the loading screen. My bet is on the bandwidth graph: it should be the only part of the dashboard that queries a noticeable amount of data from a sqlite3 database. Even if the developer team removed the loading screen, that wouldn't solve the root cause of the delay.

I have looked into the code and found the query that the storage node is executing. I believe we are missing an index. My own storage node doesn't have this issue, so I would need some volunteers to help me verify it.

First of all, I want to document my steps so that you might be able to solve the next puzzle on your own.

On the storage node dashboard we are showing a bandwidth graph. There are other graphs that would be worth checking as well, but for the moment I focused on the bandwidth graph because its table holds quite a few more rows than the tables behind the other graphs.

In the code you can find the SQL query: https://github.com/storj/storj/blob/080ba47a06ded71b0844d908cf9f682f5d8003a2/storagenode/storagenodedb/bandwidthdb.go#L139-L149

Now let's analyse that query with sqlite3:

sqlite> .eqp on
sqlite> SELECT action, sum(a) amount from(
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage
   ...> WHERE datetime('2020-10-01') <= datetime(created_at) AND datetime(created_at) <= datetime('2020-11-01')
   ...> GROUP BY action
   ...> UNION ALL
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage_rollups
   ...> WHERE datetime('2020-10-01') <= datetime(interval_start) AND datetime(interval_start) <= datetime('2020-11-01')
   ...> GROUP BY action
   ...> ) GROUP BY action;
QUERY PLAN
|--CO-ROUTINE 2
|  `--COMPOUND QUERY
|     |--LEFT-MOST SUBQUERY
|     |  |--SCAN TABLE bandwidth_usage
|     |  `--USE TEMP B-TREE FOR GROUP BY
|     `--UNION ALL
|        |--SCAN TABLE bandwidth_usage_rollups
|        `--USE TEMP B-TREE FOR GROUP BY
|--SCAN SUBQUERY 2
`--USE TEMP B-TREE FOR GROUP BY

That is a full table scan on both tables, without using any index.
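
If you want to check how slow that scan actually is on your node, the sqlite3 shell has a built-in timer. Treat the numbers only as a before/after comparison, since they depend on your hardware and on how many rows your tables hold:

sqlite> .timer on
sqlite> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage_rollups
   ...> WHERE datetime('2020-10-01') <= datetime(interval_start) AND datetime(interval_start) <= datetime('2020-11-01')
   ...> GROUP BY action;

The shell prints a Run Time line after the result. Keep the timer on for the rest of the session so you can compare against the fixed query later.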

Some other useful commands that you might need:

sqlite> .schema
CREATE TABLE bandwidth_usage (
        satellite_id  BLOB      NOT NULL,
        action        INTEGER   NOT NULL,
        amount        BIGINT    NOT NULL,
        created_at    TIMESTAMP NOT NULL
);
CREATE TABLE bandwidth_usage_rollups (
        interval_start  TIMESTAMP NOT NULL,
        satellite_id    BLOB      NOT NULL,
        action          INTEGER   NOT NULL,
        amount          BIGINT    NOT NULL,
        PRIMARY KEY ( interval_start, satellite_id, action )
);
CREATE INDEX idx_bandwidth_usage_satellite ON bandwidth_usage(satellite_id);
CREATE INDEX idx_bandwidth_usage_created   ON bandwidth_usage(created_at);
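
A few more shell commands that come in handy for this kind of digging; I'm leaving the output out because it depends on the database version of your node:

sqlite> .tables
sqlite> .indexes bandwidth_usage
sqlite> .indexes bandwidth_usage_rollups

.tables lists all tables, and .indexes shows only the indexes of the given table, which is less noisy than scrolling through the full .schema output.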

And last but not least: is the table holding enough data to make a full table scan slow?

sqlite> SELECT count(*) from bandwidth_usage;
1847
sqlite> SELECT count(*) from bandwidth_usage_rollups;
108099
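
It also helps to know how many of those rows actually fall inside the one-month window the dashboard asks for, because that is roughly the number of rows an index lookup would have to touch instead of the whole table (your numbers will differ):

sqlite> SELECT count(*) FROM bandwidth_usage_rollups
   ...> WHERE datetime('2020-10-01') <= datetime(interval_start) AND datetime(interval_start) <= datetime('2020-11-01');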

Never mind, I already found the solution. The query doesn't use the indexes: because created_at and interval_start are wrapped in datetime(), sqlite cannot use a plain index on those columns for the comparison. This is how it should look:

sqlite> .eqp on
sqlite> SELECT action, sum(a) amount from(
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage
   ...> WHERE '2020-10-01' <= created_at AND created_at <= '2020-11-01'
   ...> GROUP BY action
   ...> UNION ALL
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage_rollups
   ...> WHERE '2020-10-01' <= interval_start AND interval_start <= '2020-11-01'
   ...> GROUP BY action
   ...> ) GROUP BY action;
QUERY PLAN
|--CO-ROUTINE 2
|  `--COMPOUND QUERY
|     |--LEFT-MOST SUBQUERY
|     |  |--SEARCH TABLE bandwidth_usage USING INDEX idx_bandwidth_usage_created (created_at>? AND created_at<?)
|     |  `--USE TEMP B-TREE FOR GROUP BY
|     `--UNION ALL
|        |--SEARCH TABLE bandwidth_usage_rollups USING INDEX sqlite_autoindex_bandwidth_usage_rollups_1 (interval_start>? AND interval_start<?)
|        `--USE TEMP B-TREE FOR GROUP BY
|--SCAN SUBQUERY 2
`--USE TEMP B-TREE FOR GROUP BY

I am not sure how we can test this code change on a real storage node. For the moment I will simply open a PR and get it into the next version. Let's see if that helps; we can still open a new thread if it doesn't.
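
What we can do right away is compare the two queries against a copy of a real node's database. This is only a sketch: I am assuming the file is called bandwidth.db in your storage directory (adjust the path for your setup), and the copy should be taken while the node is stopped so that it is consistent.

$ cp /path/to/storage/bandwidth.db /tmp/bandwidth-test.db
$ sqlite3 /tmp/bandwidth-test.db
sqlite> .eqp on
sqlite> .timer on

Then run the original query and the fixed one back to back and compare the query plans and the Run Time lines.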


Are you sure it doesn’t do an implicit text type comparison if implemented that way?

I suggest this:

sqlite> .eqp on
sqlite> SELECT action, sum(a) amount from(
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage
   ...> WHERE datetime('2020-05-01') <= created_at AND created_at <= datetime('2020-06-01')
   ...> GROUP BY action
   ...> UNION ALL
   ...> SELECT action, sum(amount) a
   ...> FROM bandwidth_usage_rollups
   ...> WHERE datetime('2020-05-01') <= interval_start AND interval_start <= datetime('2020-06-01')
   ...> GROUP BY action
   ...> ) GROUP BY action;
QUERY PLAN
|--CO-ROUTINE 2
|  `--COMPOUND QUERY
|     |--LEFT-MOST SUBQUERY
|     |  |--SEARCH TABLE bandwidth_usage USING INDEX idx_bandwidth_usage_created (created_at>? AND created_at<?)
|     |  `--USE TEMP B-TREE FOR GROUP BY
|     `--UNION ALL
|        |--SEARCH TABLE bandwidth_usage_rollups USING INDEX sqlite_autoindex_bandwidth_usage_rollups_1 (interval_start>? AND interval_start<?)
|        `--USE TEMP B-TREE FOR GROUP BY
|--SCAN SUBQUERY 2
`--USE TEMP B-TREE FOR GROUP BY
1|68009246976
2|3622305536
3|120832
5|4447795200

The column is already a timestamp, so it doesn't require conversion to begin with, and as you can see, converting only the constants with datetime() doesn't obstruct the use of the indexes.
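
If you want to double check what the comparison actually operates on, you can ask sqlite3 for the storage class of the stored values. I would expect text in ISO 8601 format here (which is why the plain string comparison sorts correctly), but it is worth verifying on your own node:

sqlite> SELECT typeof(created_at), created_at FROM bandwidth_usage LIMIT 1;
sqlite> SELECT typeof(interval_start), interval_start FROM bandwidth_usage_rollups LIMIT 1;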