Release preparation v1.113

I believe it has been changed?

Dunno, going by the config comment.

According to https://version.storj.io/, we might start the v1.113 rollout soon (no cursor yet). This includes the flat file piece expiration fix, and it seems to be enabled by default. I also noticed that it doesn’t seem to include the chained piece store that checks both the DB and the flat files for expired pieces.

Is my observation correct and will this not be a problem?

Edit: it seems we have now started the rollout of v1.113.

It should include the scan of both databases as far as I know.
Oh, it seems commit storj/storj@ba68d91 (storagenode/pieces: read+delete piece expirations from both stores) is not included in v1.113.2.

Exactly. Does this mean we will need to rely on GC to clean up the expired pieces from the db?

If the rollout continues, then likely yes, until the next release.
But I have asked about this.

By the way, do you have any TTL records with a TTL like 2 weeks from now?


This is what I get for a small node that already updated to v1.113.2:

sqlite3 /storage/piece_expiration.db "SELECT DATE(piece_expiration) AS expiration_date,COUNT(*) AS piece_count FROM piece_expirations GROUP BY DATE(piece_expiration) ORDER BY expiration_date;"
2024-09-24|15
2024-09-25|436
2024-09-26|572
2024-09-27|13624
2024-09-28|12782
2024-09-29|14758
2024-09-30|15669
2024-10-01|14531
2024-10-02|2
2024-10-03|1
2024-10-04|6
2024-10-05|1
2024-10-06|8
2024-10-07|7
2024-10-08|3
2024-10-26|1
9999-12-31|1431
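As a side note, a query in the same style can show how many of those entries are already past their expiration; this is a hedged sketch assuming the `piece_expirations` schema shown above, with the database path as an example:

```shell
# Count entries whose TTL has already passed but which are still in the DB
# (assumes the piece_expirations table queried above; adjust the path).
sqlite3 /storage/piece_expiration.db \
  "SELECT COUNT(*) FROM piece_expirations WHERE piece_expiration < DATETIME('now');"
```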

@Alexey Does this answer your question? I can check other nodes too if you like.

So, when you get this update, you likely won’t have a lot of old entries to delete before the next release, right?

I would prefer that we get all the upcoming TTL data into the new fast storage (it’s not only one customer who is uploading a lot with TTL) rather than postpone a release to get the scan of both databases.

Well, that was just one of my smallest nodes. One of my bigger nodes returns:

sqlite3 /storage/piece_expiration.db "SELECT DATE(piece_expiration) AS expiration_date,COUNT(*) AS piece_count FROM piece_expirations GROUP BY DATE(piece_expiration) ORDER BY expiration_date;"
2024-09-25|21944
2024-09-26|27692
2024-09-27|21932
2024-09-28|16427
2024-09-29|14226
2024-09-30|23041
2024-10-01|13251
2024-10-02|2608
2024-10-03|3
2024-10-04|8
2024-10-05|1
2024-10-06|8
2024-10-07|2
2024-10-08|6
2024-10-09|1
2024-10-11|1
2024-10-13|1
2024-10-14|3
2024-11-13|1
2024-11-22|1
2024-11-23|1
9999-12-31|10970

So there is somewhat more data that will expire before the next release.

I understand. For me it is not the end of the world, now that I know GC works properly and will clean this up in a couple of weeks’ time. I cannot speak for other SNOs with bigger nodes though.

The release will likely reach 100% in the next few days, and I expect that by then most of the TTL data will already have been deleted using the old storage. As for the new TTL data that will likely hit your node: remember, we haven’t uploaded test data for at least two weeks, so these records are likely from those other customers, not tests, and I’m sure they will continue their strategy as before. That data will be stored in the new fast storage, so less GC.
So I would vote to continue the release.

If I’m following you correctly (and I hope I am, otherwise I might need more coffee :smile:), you’re saying that since we haven’t uploaded much test TTL data recently, it makes sense to keep rolling out the flat file solution so that at least new TTL data gets registered properly. The old TTL data will naturally expire over the next few weeks and should be easily handled by the garbage collector. If that’s the gist, then I’m on board! :+1:

It’s just a shame that the chained solution didn’t make it into this release—especially since it’s been on the radar for weeks:

But on a positive note: at least the flat file solution is released! :tada:

No, at least not fully. The TTL data that is already in the TTL database should be removed by the TTL collector before your node gets the update; sure, some single-digit counts of records will remain, but by that time new TTL data will likely be arriving. So I prefer that it be handled properly (i.e., registered fully in the new TTL database, without the random skips we have now). That means less GC in the future.

But yes, the missing (skipped) TTL data will be collected by a GC.

Ahead of the official post again, with a juicy update in v1.114: :heart_eyes:

  • 7b4be35 storagenode/monitor: support fully dedicated disks with simplified space calculations
  • 21d58e7 storagenode/pieces: save-state-resume feature for used space filewalker

These are very very helpful! Show some love for the team. :heart:


Any idea what

7b4be35 storagenode/monitor: support fully dedicated disks with simplified space calculations

will change? Will we get to just say “Dedicated Disk” and not deal with telling it any size?


"This patch identifies the available space in a different way. It ignores all the calculated used space, but checks the available space in the running partition.

UI will show fake numbers, but it should work."

Essentially, because the whole hard drive is dedicated to Storj, the available space can be fetched from the partition instead of relying on the used-space calculations.

It would seem that at the moment nothing else changes.
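In shell terms, the difference between the two approaches is roughly the difference between these two commands; the paths are examples, not what the node actually runs:

```shell
# Old approach (used-space filewalker): walk millions of piece files.
# du -sb /storage/blobs          # slow, heavy disk IO

# New dedicated-disk approach: ask the filesystem directly.
df -B1 --output=avail /storage   # instant, a single statfs() call
```

The second command returns immediately regardless of how many files the disk holds, which is the whole point of the change.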


Nodes already check this anyway; they just lower the size of the node if necessary. I guess this will just let the node fill anything it finds.

I don’t get it. What is “dedicated disk”?

Available space is by definition the smaller of available storage space and [allocated storage space in the config file]. It does not matter how much is used, it only matters how much space is left on the current volume.

Whether this is dedicated disk, or a dedicated disk with quota, or a partition on a disk, or a dedicated array, or a dataset on existing pool, or a dataset with quota imposed is completely and entirely irrelevant.

Very strange change.

I think it just allows a node to assume it’s the only app using space on a filesystem. So it wouldn’t have to add up the millions of files in the blobs directory anymore (like the used-space-filewalker does). It could just look at what your filesystem is reporting as size and free… subtract the space it uses in the non-blobs folders… and what’s left must be “used space”.

So… way less disk IO determining used-space?


Ah, it’s for used space calculation.

UI will show fake numbers, but it should work.

Why would then UI show fake numbers?

Also, this mechanism is already used to prevent running out of space (the 3 GB buffer, or whatever it is today).

If it shows you a number for “used space”… but it didn’t actually add up all the .sj1 files (like it does now)… it wouldn’t be a real number, would it? It’s at best an educated guess. Maybe fake is too strong a word for such an estimate?