Quick growth of piece_expiration.db

Hello everyone,

I started a new node only a few months ago and its piece_expiration.db is already ~200 MB. Can somebody explain the high growth rate and how to shrink it? The filewalker is not active.

Thanks and kind regards,

Well, without the filewalker running to purge expired segments, you are going to build them up (and not get paid for storing them), which subsequently increases the database size.
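
If you want a rough sense of how many already-expired entries are still sitting in the database, a quick count works. This is only a sketch: it assumes the standard piece_expirations table layout and timestamps that compare as text, and that you open the file with the sqlite3 CLI while the node is stopped:

-- count entries whose expiration time has already passed
SELECT COUNT(*)
FROM piece_expirations
WHERE piece_expiration < datetime('now');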


(screenshot of piece_expiration.db query results)

Considering the above data, does it mean I have 435 pieces left behind from the year 2022?

I also have much older nodes that have only about 60 MB. I don't know if it is filewalker-related.

Fair enough. I don't know the database retention rules off the top of my head; I will ask an engineer.


Well, my filewalker is running for sure, and I have the same thing: a 250 MB .db on a node with 3 TB of data.

Maybe clients are storing data for only a short time (short expiration dates)?

Try vacuuming it.


Thanks a lot for the vacuum tip.

I have tried it on a 440 MB piece_expiration.db. It only freed about 40 MB, or roughly 10%, bringing it down to 400 MB. I will have to delete the piece_expiration.db, as it is clearly the reason for the Grafana dashboard failures and very likely for the slow read rate.

I also have a very large WAL file for bandwidth.db:

When the databases are moved to an SSD, the problem is solved. Unfortunately, it also happens on a single node on a 20 TB CMR enterprise HDD.

https://www.sqlite.org/wal.html

@Alexey, your opinion on this, please.

I have no idea; I usually respond only when I think I know the answer.


I appreciate and admire that about you, hence the title of Awesome :slight_smile:


I thought it was only me and I was going to ignore it, but I also have that DB being way too big. I remember checking all the DBs a few months ago and the biggest was the bandwidth one; now the piece…db is the biggest. So this is a recent thing. I'll edit this post with all the sizes from my nodes in an hour or so. The FW was run on all nodes last week.
How can I “vacuum” them? Is it dangerous? Do I have to stop the nodes?

You can vacuum them the usual way for SQLite:

  1. Stop and remove the container (or stop the service, in the case of Windows/Linux GUI)
  2. Run the VACUUM command (see the sketch below)
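
For step 2, here is a minimal sketch, assuming the sqlite3 CLI is installed on the host and /path/to/storage is your node's storage location (adjust the path to your setup):

-- open the database with the sqlite3 CLI and run VACUUM, e.g.:
-- sqlite3 /path/to/storage/piece_expiration.db
VACUUM;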

However, I think you should not fix what's not broken.


I don't know, but something fishy is happening. The piece_expiration.db is growing FAST!
These are all my nodes; note that the small ones were started on 30-31 December, so I don't even know if they are fully vetted, yet they also have huge DBs for their age. The FW was run.
Synology NASes, Exos drives, ext4, no SSD.

Node: space occupied / piece_expiration.db

node11: 12.3TB / 527MiB
node12: 431GB / 77MiB

node21: 14.3TB / 512MiB
node22: 426GB / 77MiB

node31: 6.8TB / 160MiB
node32: 6.8TB / 255MiB

node41: 11.1TB / 522MiB
node42: 425GB / 78MiB

node51: 12.8TB / 530MiB
node52: 433GB / 78MiB

node61: 14.4TB / 418MiB

node71: 14.5TB / 324MiB
node72: 654GB / 105MiB

node81: 14.4TB / 516MiB
node82: 427GB / 78MiB

I also have several 400-500 MB piece_expiration DBs. Like I said, the only solution was to move the DBs to an SSD and to change the --mount order, following this guide:

After moving the DBs, the dashboard and the STORJ-Exporter logs are fine.


It certainly seems like something is wrong. The collector service on your node should be continually trying to get rid of any entries older than the current time. My only theory is that it is experiencing some error trying to delete those files. If that’s the case, it would keep those entries around so it could try again later. Do you have any errors in the log relating to a service called “collector”?

Another thing to check would be whether the deletion_failed_at column is set for those entries:

SELECT
    satellite_id, piece_id, piece_expiration, deletion_failed_at
FROM
    piece_expirations
WHERE
    piece_expiration < '2023-01-01 00:00:00'
ORDER BY
    piece_expiration;

The deletion_failed_at column marks the last time the node failed while trying to delete those pieces. If it's blank, something altogether different is going on.
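
To gauge the size of the backlog and how much of it has hit delete failures, a follow-up query like this may help (just a sketch against the same piece_expirations table; it assumes the timestamps compare cleanly as text, as in the query above):

SELECT
    satellite_id,
    COUNT(*) AS expired_entries,
    SUM(CASE WHEN deletion_failed_at IS NOT NULL THEN 1 ELSE 0 END) AS failed_deletions
FROM
    piece_expirations
WHERE
    piece_expiration < datetime('now')
GROUP BY
    satellite_id;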


The semicolon terminated the WHERE clause, so I edited it :nerd_face:

I will work on that and edit this post.

Edit:

2023-12-25T02:17:48Z    ERROR   collector       unable to update piece info     {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "2XLSFR7RKZXCZZUBAATNNJRFNZG7DTSXQGKQ3UILLHTHO3AZ3FKQ", "error": "pieceexpirationdb: database is locked", "errorVerbose": "pieceexpirationdb: database is locked\n\tstorj.io/storj/storagenode/storagenodedb.(*pieceExpirationDB).DeleteFailed:99\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteFailed:597\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:109\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}
2023-12-25T02:17:48Z    ERROR   collector       unable to delete piece  {"process": "storagenode", "Satellite ID": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "Piece ID": "2XLSFR7RKZXCZZUBAATNNJRFNZG7DTSXQGKQ3UILLHTHO3AZ3FKQ", "error": "pieces error: database is locked", "errorVerbose": "pieces error: database is locked\n\tstorj.io/storj/storagenode/pieces.(*Store).DeleteExpired:365\n\tstorj.io/storj/storagenode/pieces.(*Store).Delete:344\n\tstorj.io/storj/storagenode/collector.(*Service).Collect:97\n\tstorj.io/storj/storagenode/collector.(*Service).Run.func1:57\n\tstorj.io/common/sync2.(*Cycle).Run:160\n\tstorj.io/storj/storagenode/collector.(*Service).Run:53\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2.1:87\n\truntime/pprof.Do:51\n\tstorj.io/storj/private/lifecycle.(*Group).Run.func2:86\n\tgolang.org/x/sync/errgroup.(*Group).Go.func1:75"}