[Tech Preview] Hashstore backend for storage nodes

I have migrated my nodes to memtable. They are running almost silently now (= more sequential writes). Here are my settings:

Environment  = STORJ_HASHSTORE_COMPACTION_PROBABILITY_POWER=2 #will become the new default value next release
Environment  = STORJ_HASHSTORE_COMPACTION_REWRITE_MULTIPLE=10 #will become the new default value next release
Environment  = STORJ_HASHSTORE_SYNC_LIFO=true
Environment  = STORJ_HASHSTORE_STORE_FLUSH_SEMAPHORE=1
Environment  = STORJ_HASHSTORE_TABLE_DEFAULT_KIND=memtbl
Environment  = STORJ_HASHSTORE_MEMTBL_MMAP=true
Environment  = STORJ_HASHSTORE_MEMTBL_MLOCK=false

I copied the settings used for the Storj Select nodes. Whatever works best for them should also be good enough for my node.

I haven't moved the hashtable to SSD. That seems unnecessary to me. Memtable with the hashtable on HDD is working great for me, and in case of a hardware failure I can get my nodes back online in a short time.

7 Likes

They get paid for the whole space, not only for the used space, if I remember right. So you are probably going to waste some of your space.
P.S.: maybe you can update the 1st post with memtable migration instructions?

Any time I’ve heard that repeated: nobody could provide a link. I don’t think it’s true.

But regardless: any tuning that helps Select, helps Storj… and Storj remaining-in-business helps all SNOs :money_mouth_face:

2 Likes

Do these settings assume 1 GB of RAM for each 1 TB of HDD space?
What if there is not enough RAM?
What should I change for a scenario with 1 GB RAM per 10 TB HDD?
Should I set STORJ_HASHSTORE_MEMTBL_MMAP=false ?

I haven't tried that. My system is currently using 16 out of 32 GB of available memory for 60 TB of used space. That includes some other services I am running in the background.

If I remember correctly, the switch from hashtable to memtable consumed about 8 GB or so. I haven't tested whether disabling the MMAP flag changes anything.
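A rough way to estimate the table footprint up front: the compaction logs quoted further down in this thread show TableSize/NumSlots coming out at 64 bytes per slot, and the tables are what memtbl holds in memory (assuming, with mmap enabled, roughly the whole table ends up resident, which may not be exactly how it works internally). A minimal sketch with the slot counts from that two-node example:

# Slot counts taken from the "finished compaction" logs posted below (Synology, 2 nodes).
slots_per_store = [33554432, 16777216, 4194304, 4194304, 1048576, 1048576]
bytes_per_slot = 64  # assumption; matches the TableSize/NumSlots ratio in those logs
total_bytes = sum(slots_per_store) * bytes_per_slot
print(f"~{total_bytes / 1024**3:.1f} GiB of hash table data")  # ~3.6 GiB for this example

That lines up with the TableSize values reported there (2 GiB + 1 GiB + 2 x 256 MiB + 2 x 64 MiB).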

1 Like

How dangerous is this with regard to crashes, power outages, etc.?
Is there a mechanism in place already to reconstruct what might get destroyed by a crash?

Let's say I haven't managed to corrupt my system, and I have had to hard reset it a few times. I am using an Orange Pi 5 Plus. I sometimes have the feeling the tiny fan might give up any minute and I will have to buy a replacement board. My setup is designed to deal with that. Setting up a new one would take just a few hours: install the OS, install some basic software, mount the drives, and the nodes should be up and running again. Both options, hashtable and memtable, with an HDD for the metadata should allow that. I am not moving the metadata to the SSD because I have too many drives that would depend on a single SSD. I expect the SSD to die one day and I would prefer to get my nodes up and running in just a few hours. That means the SSD is for logfiles and stuff, but not for metadata. And it also looks like memtable doesn't care that it is running on spinning metal.

1 Like

I like your thinking. If the goal is to isolate "one node per drive", then it's important to keep only one drive per node as well.

1 Like

Synology with 8GB RAM, 2 nodes on 2 drives.
Before applying those settings:

2025-05-07T06:13:20Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "store": "s0", "duration": "2h10m30.054595622s", "stats": {"NumLogs":1963,"LenLogs":"1.9 TiB","NumLogsTTL":56,"LenLogsTTL":"28.1 GiB","SetPercent":0.9883434219887592,"TrashPercent":0.03489197160339147,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20215,"LastCompact":20215,"LogsRewritten":605,"DataRewritten":"535.5 GiB","Table":{"NumSet":11069730,"LenSet":"1.9 TiB","AvgSet":185544.04615866873,"NumTrash":349076,"LenTrash":"67.5 GiB","AvgTrash":207721.61697739174,"NumSlots":33554432,"TableSize":"2.0 GiB","Load":0.32990366220474243,"Created":20215,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-08T08:29:11Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "store": "s1", "duration": "57m57.911509404s", "stats": {"NumLogs":1310,"LenLogs":"1.2 TiB","NumLogsTTL":53,"LenLogsTTL":"12.1 GiB","SetPercent":0.9869292524953207,"TrashPercent":0.028067687299006265,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20216,"LastCompact":20216,"LogsRewritten":309,"DataRewritten":"258.7 GiB","Table":{"NumSet":6577320,"LenSet":"1.2 TiB","AvgSet":204466.52798890733,"NumTrash":195733,"LenTrash":"35.6 GiB","AvgTrash":195401.43522042784,"NumSlots":16777216,"TableSize":"1.0 GiB","Load":0.39203882217407227,"Created":20216,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-07T04:05:05Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "store": "s0", "duration": "20m5.769544164s", "stats": {"NumLogs":544,"LenLogs":"539.6 GiB","NumLogsTTL":12,"LenLogsTTL":"7.5 GiB","SetPercent":0.981840243999205,"TrashPercent":0.1964608939790932,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20215,"LastCompact":20215,"LogsRewritten":80,"DataRewritten":"32.6 GiB","Table":{"NumSet":1739553,"LenSet":"529.8 GiB","AvgSet":327024.58848221356,"NumTrash":125432,"LenTrash":"106.0 GiB","AvgTrash":907496.6127941833,"NumSlots":4194304,"TableSize":"256.0 MiB","Load":0.41474175453186035,"Created":20215,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-07T02:04:58Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "store": "s1", "duration": "44m1.730366318s", "stats": {"NumLogs":526,"LenLogs":"519.0 GiB","NumLogsTTL":12,"LenLogsTTL":"5.0 GiB","SetPercent":0.9827181423439193,"TrashPercent":0.08675520988373377,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20215,"LastCompact":20215,"LogsRewritten":86,"DataRewritten":"44.6 GiB","Table":{"NumSet":1864605,"LenSet":"510.0 GiB","AvgSet":293699.54837619764,"NumTrash":92818,"LenTrash":"45.0 GiB","AvgTrash":520864.199271693,"NumSlots":4194304,"TableSize":"256.0 MiB","Load":0.44455647468566895,"Created":20215,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-07T21:05:52Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "store": "s0", "duration": "1m12.70430121s", "stats": {"NumLogs":100,"LenLogs":"94.8 GiB","NumLogsTTL":4,"LenLogsTTL":"163.6 MiB","SetPercent":0.9353051118543166,"TrashPercent":0.0336619228811009,"Compacting":false,"Compactions":0,"TableFull":0,"Today":20215,"LastCompact":20215,"LogsRewritten":5,"DataRewritten":"2.2 GiB","Table":{"NumSet":409780,"LenSet":"88.6 GiB","AvgSet":232245.03606813413,"NumTrash":15301,"LenTrash":"3.2 GiB","AvgTrash":223853.03967060975,"NumSlots":1048576,"TableSize":"64.0 MiB","Load":0.3907966613769531,"Created":20215,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-07T19:50:48Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "store": "s1", "duration": "9m4.16553993s", "stats": {"NumLogs":104,"LenLogs":"100.3 GiB","NumLogsTTL":4,"LenLogsTTL":"442.6 MiB","SetPercent":0.9436417494837688,"TrashPercent":0.03038700151106748,"Compacting":false,"Compactions":0,"TableFull":0,"Today":20215,"LastCompact":20215,"LogsRewritten":15,"DataRewritten":"9.7 GiB","Table":{"NumSet":406093,"LenSet":"94.6 GiB","AvgSet":250183.36556404567,"NumTrash":14612,"LenTrash":"3.0 GiB","AvgTrash":223900.452231043,"NumSlots":1048576,"TableSize":"64.0 MiB","Load":0.3872804641723633,"Created":20215,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}

After applying those settings:

2025-05-11T04:33:16Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "store": "s0", "duration": "6m24.828077758s", "stats": {"NumLogs":1956,"LenLogs":"1.9 TiB","NumLogsTTL":34,"LenLogsTTL":"4.8 GiB","SetPercent":0.9784594992800277,"TrashPercent":0.017776678129570303,"Compacting":false,"Compactions":0,"TableFull":0,"Today":20219,"LastCompact":20219,"LogsRewritten":31,"DataRewritten":"0.9 GiB","Table":{"NumSet":10870074,"LenSet":"1.8 TiB","AvgSet":185762.9418283629,"NumTrash":153940,"LenTrash":"34.2 GiB","AvgTrash":238313.0657658828,"NumSlots":33554432,"TableSize":"2.0 GiB","Load":0.32395344972610474,"Created":20219,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-12T12:11:18Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S", "store": "s1", "duration": "7m45.11335536s", "stats": {"NumLogs":1358,"LenLogs":"1.3 TiB","NumLogsTTL":49,"LenLogsTTL":"11.0 GiB","SetPercent":0.9775337468295083,"TrashPercent":0.03617512880683782,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20220,"LastCompact":20220,"LogsRewritten":25,"DataRewritten":"2.0 GiB","Table":{"NumSet":6665580,"LenSet":"1.3 TiB","AvgSet":207150.09557577886,"NumTrash":252577,"LenTrash":"47.6 GiB","AvgTrash":202305.46255597303,"NumSlots":16777216,"TableSize":"1.0 GiB","Load":0.39729952812194824,"Created":20220,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-12T22:54:04Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "store": "s0", "duration": "5m57.45661745s", "stats": {"NumLogs":539,"LenLogs":"530.3 GiB","NumLogsTTL":8,"LenLogsTTL":"2.4 GiB","SetPercent":0.965312026476717,"TrashPercent":0.27159819746220193,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20220,"LastCompact":20220,"LogsRewritten":20,"DataRewritten":"2.5 GiB","Table":{"NumSet":1715365,"LenSet":"511.9 GiB","AvgSet":320401.17771319806,"NumTrash":159583,"LenTrash":"144.0 GiB","AvgTrash":968998.6892338156,"NumSlots":4194304,"TableSize":"256.0 MiB","Load":0.40897488594055176,"Created":20220,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-12T21:15:41Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs", "store": "s1", "duration": "3m47.232352285s", "stats": {"NumLogs":557,"LenLogs":"545.6 GiB","NumLogsTTL":12,"LenLogsTTL":"3.9 GiB","SetPercent":0.9749799641157659,"TrashPercent":0.11036510010813567,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20220,"LastCompact":20220,"LogsRewritten":31,"DataRewritten":"7.5 GiB","Table":{"NumSet":1952493,"LenSet":"532.0 GiB","AvgSet":292553.6167842855,"NumTrash":107950,"LenTrash":"60.2 GiB","AvgTrash":598974.5652616952,"NumSlots":4194304,"TableSize":"256.0 MiB","Load":0.46551060676574707,"Created":20220,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-13T20:24:42Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "store": "s0", "duration": "3.242498697s", "stats": {"NumLogs":113,"LenLogs":"99.2 GiB","NumLogsTTL":10,"LenLogsTTL":"57.1 MiB","SetPercent":0.9167599411657675,"TrashPercent":0.015949679883927516,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20221,"LastCompact":20221,"LogsRewritten":6,"DataRewritten":"0 B","Table":{"NumSet":410082,"LenSet":"91.0 GiB","AvgSet":238217.88118473865,"NumTrash":6737,"LenTrash":"1.6 GiB","AvgTrash":252275.37479590322,"NumSlots":1048576,"TableSize":"64.0 MiB","Load":0.3910846710205078,"Created":20221,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}
2025-05-14T00:05:18Z    INFO    hashstore       finished compaction     {"Process": "storagenode", "satellite": "121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6", "store": "s1", "duration": "3m40.422357812s", "stats": {"NumLogs":110,"LenLogs":"102.0 GiB","NumLogsTTL":5,"LenLogsTTL":"37.2 MiB","SetPercent":0.9322463009275932,"TrashPercent":0.04363072487660159,"Compacting":false,"Compactions":1,"TableFull":0,"Today":20222,"LastCompact":20222,"LogsRewritten":7,"DataRewritten":"0 B","Table":{"NumSet":406924,"LenSet":"95.0 GiB","AvgSet":250793.8949582723,"NumTrash":15789,"LenTrash":"4.4 GiB","AvgTrash":302508.37317119516,"NumSlots":1048576,"TableSize":"64.0 MiB","Load":0.3880729675292969,"Created":20222,"Kind":0},"Compaction":{"Elapsed":0,"Remaining":0,"TotalRecords":0,"ProcessedRecords":0}}}

Huge improvement in compaction time.
Is there a downside? Should I watch out for something? Someone said something about more trash?

3 Likes

There won’t be more trash but there may be more reclaimable space in the log files. The variables do two things:

  1. a higher rewrite multiple allows rewriting more log files per “small” compaction. this means more free space is necessary in the worst case to complete a compaction, but makes compactions complete faster because it doesn’t need to do as many “small” compactions.
  2. a higher probability power steepens the log rewrite probability curve which makes it so that log files that are less “profitable” to reclaim are chosen less during a small compaction. this means that we do less log rewriting (the most expensive part of compactions) and so there’s potentially more reclaimable space. that said, there’s always going to be some reclaimable space: making that value zero at all times is effectively the old piece store and requires many more iops to do.

hope that explanation helps some. if you want something to watch, computing LenLogs-LenSet from the log lines (or the /mon/stats debug endpoint) gives an indication of the amount of reclaimable space. that may hover at a higher baseline after.

edit: oh, also the SetPercent field is LenSet/LenLogs so that may hover around a lower value.
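If you want to track that over time, here is a minimal sketch (assuming your log lines look like the "finished compaction" entries quoted in this thread) that pulls LenLogs and LenSet out of each entry and prints the reclaimable space and SetPercent per store:

import json, sys

def to_bytes(size):  # parse strings like "1.9 TiB" or "0 B" into bytes
    value, unit = size.split()
    return float(value) * {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}[unit]

for line in sys.stdin:
    if "finished compaction" not in line:
        continue
    entry = json.loads(line[line.index("{"):])  # the JSON payload after the log prefix
    stats = entry["stats"]
    logs, live = to_bytes(stats["LenLogs"]), to_bytes(stats["LenSet"])
    reclaimable_gib = (logs - live) / 1024**3
    set_percent = live / logs if logs else 1.0
    print(entry["satellite"][:12], entry["store"],
          f"reclaimable ~{reclaimable_gib:.1f} GiB", f"SetPercent {set_percent:.3f}")

Pipe your node's log output into it and watch whether the reclaimable figure settles at a higher baseline after the change.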

6 Likes

Anyone with a full hashstore node care to share what percentage of the disk is “wasted” by trash in the log files, especially with the new defaults in 1.129? Duration of trash cleanup would be great as well.

Also, it's been a while since the start of this new backend; is it ready for primetime?

Don't use it if you don't have a UPS to prevent unclean shutdowns.
I will stick with Badger Cache and piecestore.

I'm currently using more aggressive values (rewrite sooner) than the default ones in 1.129, and it comes to 9.59% trash data across multiple nodes. I was aiming at between 10% and no more than 20% with probability 0.8 and a multiple of 5.
I would assume with the new defaults it might be roughly twice as much.

Probability 0.8 multiple 5:
(The y axis should be the probability to rewrite the file, where 1 = 100%, 0.1 = 10% and so on; the x axis is the percentage of stale data, if I understand it correctly.)

New defaults:
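Since the charts don't carry over here, a purely illustrative sketch of how the two parameter sets might compare, assuming the rewrite probability is simply the stale-data fraction raised to the probability power (an assumption for illustration; the actual formula in the storagenode code may differ):

# Hypothetical curve: p(rewrite) = stale_fraction ** probability_power (assumption, not the real code).
for power, label in [(0.8, "probability 0.8"), (2.0, "new default 2.0")]:
    row = ", ".join(f"{x:.1f}->{x ** power:.2f}" for x in (0.1, 0.3, 0.5, 0.7, 0.9))
    print(f"{label}: {row}")
# With the higher power, logs holding little stale data are far less likely to be rewritten.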

The longer rewrites do take between 2 and 3 hours on a 7 TB node on a dedicated disk running 1.128 on ext4.
How frequent these are you can see here, where the success rate is lower over this 7-day span:

Personally, I haven't had any issues with the hashstore so far, and I have migrated a bunch of nodes already.
I would say it is; they are probably using it heavily on Select.

3 Likes

Can someone write instructions on how to migrate a node to memtable? Is it ready for testing on a couple of nodes? )

You have a periodic dip in success rate, which means you have a periodic dip in your server performance. How is that not an issue? You have very high, unhidable, periodic IO for absolutely no reason. (You also now have correlation between uncorrelated chunks, and a single point of failure, but nobody remembers history.)

None of this is true with piece store. Why are you using hashstore?

2 Likes

Really. Piecestore had no periodic dip in disk (and node) performance. It was just almost constant :grin: (How do you solve slow file deletion on ext4?) Even if we don't mention file deletion - have you forgotten the filewalker? It took hours, sometimes even days, of intensive disk reading on typical filesystems just to calculate the occupied space and trash correctly.

I'm happy I moved all my nodes to hashstore. Now I can use disks with "standard" filesystems ext4 or xfs without dealing with ZFS and vdevs wearing out SSDs. I can even migrate a node from one HDD to another and it takes hours, not weeks as it was with piecestore. Servers with nodes now have lower RAM usage and much lower load average.

The only reason not to move from piecestore is a dying disk. Where you could lose only a few pieces with piecestore, you can lose a big log file or even the whole hashtable due to bad blocks.

1 Like

TLDR: We moved to storing data in files because it's the most reliable way to do that. Previous attempts to keep stuff in databases caused data loss and were abandoned. Adding anything on top of a plain filesystem adds unnecessary complexity and risk for no benefit in return. Modern filesystems are already very good at shuffling trillions of files. Moreover, hiding data inside logs hinders the filesystem's ability to optimize access to the data inside those logs, like caching.

Not the case for me. It’s barely noticeable.

There is no hurry in deleting. It's faster than defragmenting logs anyway.

It takes minutes on a 10 TB node, and the traffic only goes to the SSD, where the metadata lives.

Copying file by file is the worst way to transfer anything, let alone a node. The performance of copying the whole partition or dataset is constant; it does not depend on file size. And even if it did - optimizing for the worst possible way of node migration is silly.

Here we go. “The only downside is data loss”. I’ll show myself out.

(But it's not the only one: performance regression is still a downside. The need to compact logs creates fragmentation and adds correlation between independent segments. It's horrible in all respects, except maybe a few Raspberry Pi nodes will be slightly better for longer and completely dead periodically. I don't know if optimizing for shoddy nodes (that are also more prone to data loss) while sacrificing performance, reliability, and responsiveness of better nodes is the right course of action here - and by that I mean it's definitely not.)

3 Likes

Obviously, not the commonly used default filesystems. EXT4, XFS and NTFS are not ready. I know it from my own experience since the SLC tests, when it was faster for me to drop a few nodes and start from scratch than to wait months until the test trash was deleted. ZFS was good enough, but only with additional hardware (intensively wearing out SSDs) and a very specific configuration (and this is absolutely not the "use what you have" way).

Probably you forgot how often SNOs on this forum asked "Why is the trash still not deleted?", "How do I disable the piece scan on startup?" and so on.

There is. Because when deleting is several times slower than new data is uploaded (exactly the case I had with piecestore on ext4 after the tests ended), your node will finally end up with close to 100% trash and zero income.
And no, it's not faster. Random-access reading/writing cannot be faster than the almost linear compactions on mechanical HDDs.
Housekeeping (compactions) on my largest node now takes much less than an hour per day. In most cases it's about ~10 minutes.
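To put rough numbers on that comparison, a back-of-envelope sketch with assumed HDD figures and hypothetical workload sizes (none of these values are measurements from my nodes):

random_iops = 150          # assumed random IOPS of a typical HDD
seq_mb_per_s = 150         # assumed sequential throughput of the same HDD
pieces_to_delete = 50_000  # hypothetical number of expired pieces in piecestore
rewrite_gib = 8            # hypothetical amount of log data rewritten by one compaction
delete_minutes = pieces_to_delete / random_iops / 60      # each delete costs roughly one random metadata update
rewrite_minutes = rewrite_gib * 1024 / seq_mb_per_s / 60  # the log rewrite streams mostly sequentially
print(f"random deletes: ~{delete_minutes:.0f} min, log rewrite: ~{rewrite_minutes:.0f} min")  # ~6 min vs ~1 min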

Logs

2025-05-28T05:58:53Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s0”, “duration”: “10.763869ms”, “stats”: {“NumLogs”:1,“LenLogs”:“160.3 MiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:1,“TrashPercent”:0.3762777394761016,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20236,“LastCompact”:20236,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:235,“LenSet”:“160.3 MiB”,“AvgSet”:715385.7361702127,“NumTrash”:97,“LenTrash”:“60.3 MiB”,“AvgTrash”:652146.1443298969,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.01434326171875,“Created”:20236,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-28T18:11:17Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “store”: “s1”, “duration”: “7m20.300614876s”, “stats”: {“NumLogs”:971,“LenLogs”:“0.9 TiB”,“NumLogsTTL”:27,“LenLogsTTL”:“11.7 GiB”,“SetPercent”:0.955044672228656,“TrashPercent”:0.07246763025265117,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20236,“LastCompact”:20236,“LogsRewritten”:28,“DataRewritten”:“8.1 GiB”,“Table”:{“NumSet”:3994883,“LenSet”:“0.9 TiB”,“AvgSet”:245349.25109296068,“NumTrash”:279271,“LenTrash”:“69.3 GiB”,“AvgTrash”:266307.4613547415,“NumSlots”:8388608,“TableSize”:“512.0 MiB”,“Load”:0.47622716426849365,“Created”:20236,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-29T02:37:36Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “store”: “s0”, “duration”: “6.524384281s”, “stats”: {“NumLogs”:965,“LenLogs”:“0.9 TiB”,“NumLogsTTL”:20,“LenLogsTTL”:“7.4 GiB”,“SetPercent”:0.9805470528840831,“TrashPercent”:0.05582580072450234,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20237,“LastCompact”:20237,“LogsRewritten”:6,“DataRewritten”:“0 B”,“Table”:{“NumSet”:4481904,“LenSet”:“0.9 TiB”,“AvgSet”:223683.81130341033,“NumTrash”:225714,“LenTrash”:“53.2 GiB”,“AvgTrash”:252874.55353234624,“NumSlots”:16777216,“TableSize”:“1.0 GiB”,“Load”:0.26714229583740234,“Created”:20237,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-29T20:20:22Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “store”: “s1”, “duration”: “4.398557218s”, “stats”: {“NumLogs”:65,“LenLogs”:“58.7 GiB”,“NumLogsTTL”:6,“LenLogsTTL”:“486.2 MiB”,“SetPercent”:0.8978828041522576,“TrashPercent”:0.2142398397134952,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20237,“LastCompact”:20237,“LogsRewritten”:2,“DataRewritten”:“0 B”,“Table”:{“NumSet”:547604,“LenSet”:“52.7 GiB”,“AvgSet”:103333.21042212987,“NumTrash”:26296,“LenTrash”:“12.6 GiB”,“AvgTrash”:513449.13173106173,“NumSlots”:2097152,“TableSize”:“128.0 MiB”,“Load”:0.26111793518066406,“Created”:20237,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-29T23:43:11Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s1”, “duration”: “5.698819ms”, “stats”: {“NumLogs”:2,“LenLogs”:“1.6 GiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:0.9776289693553504,“TrashPercent”:0.05453414744583196,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20237,“LastCompact”:20237,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:2266,“LenSet”:“1.6 GiB”,“AvgSet”:757326.0088261253,“NumTrash”:128,“LenTrash”:“91.3 MiB”,“AvgTrash”:747872,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.1383056640625,“Created”:20235,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-30T02:47:40Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “store”: “s0”, “duration”: “27.001067324s”, “stats”: {“NumLogs”:585,“LenLogs”:“574.0 GiB”,“NumLogsTTL”:18,“LenLogsTTL”:“7.2 GiB”,“SetPercent”:0.9730061044319791,“TrashPercent”:0.08402392634386999,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20238,“LastCompact”:20238,“LogsRewritten”:5,“DataRewritten”:“1.8 GiB”,“Table”:{“NumSet”:2111253,“LenSet”:“558.5 GiB”,“AvgSet”:284021.38542751624,“NumTrash”:86406,“LenTrash”:“48.2 GiB”,“AvgTrash”:599286.9591463556,“NumSlots”:8388608,“TableSize”:“512.0 MiB”,“Load”:0.25168097019195557,“Created”:20238,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-30T19:30:53Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “store”: “s1”, “duration”: “56.823994271s”, “stats”: {“NumLogs”:462,“LenLogs”:“452.0 GiB”,“NumLogsTTL”:12,“LenLogsTTL”:“2.3 GiB”,“SetPercent”:0.9621326397578146,“TrashPercent”:0.07508195673764045,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20238,“LastCompact”:20238,“LogsRewritten”:10,“DataRewritten”:“1.7 GiB”,“Table”:{“NumSet”:1707605,“LenSet”:“434.9 GiB”,“AvgSet”:273468.5470984215,“NumTrash”:70996,“LenTrash”:“33.9 GiB”,“AvgTrash”:513288.5241985464,“NumSlots”:4194304,“TableSize”:“256.0 MiB”,“Load”:0.40712475776672363,“Created”:20238,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-30T21:18:38Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “store”: “s1”, “duration”: “10m21.864054788s”, “stats”: {“NumLogs”:994,“LenLogs”:“1.0 TiB”,“NumLogsTTL”:28,“LenLogsTTL”:“12.5 GiB”,“SetPercent”:0.9452661335918452,“TrashPercent”:0.05842124917028121,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20238,“LastCompact”:20238,“LogsRewritten”:30,“DataRewritten”:“9.7 GiB”,“Table”:{“NumSet”:4065269,“LenSet”:“0.9 TiB”,“AvgSet”:244367.51879297532,“NumTrash”:269755,“LenTrash”:“57.2 GiB”,“AvgTrash”:227604.05992103947,“NumSlots”:8388608,“TableSize”:“512.0 MiB”,“Load”:0.48461782932281494,“Created”:20238,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-31T04:01:08Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s1”, “duration”: “108.960252ms”, “stats”: {“NumLogs”:2,“LenLogs”:“1.6 GiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:0.9777181814134441,“TrashPercent”:0.05431667407112214,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20239,“LastCompact”:20239,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:2301,“LenSet”:“1.6 GiB”,“AvgSet”:748860.8848326814,“NumTrash”:128,“LenTrash”:“91.3 MiB”,“AvgTrash”:747872,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.14044189453125,“Created”:20235,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-31T10:37:26Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “store”: “s0”, “duration”: “1.803013667s”, “stats”: {“NumLogs”:139,“LenLogs”:“129.9 GiB”,“NumLogsTTL”:9,“LenLogsTTL”:“308.1 MiB”,“SetPercent”:0.9195301655736997,“TrashPercent”:0.02588993983015171,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20239,“LastCompact”:20239,“LogsRewritten”:3,“DataRewritten”:“0 B”,“Table”:{“NumSet”:602947,“LenSet”:“119.5 GiB”,“AvgSet”:212746.64414616872,“NumTrash”:12527,“LenTrash”:“3.4 GiB”,“AvgTrash”:288310.08860860544,“NumSlots”:2097152,“TableSize”:“128.0 MiB”,“Load”:0.2875075340270996,“Created”:20239,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-05-31T13:33:34Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s0”, “duration”: “6.218477ms”, “stats”: {“NumLogs”:1,“LenLogs”:“163.0 MiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:1,“TrashPercent”:0.37021088044514794,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20239,“LastCompact”:20239,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:246,“LenSet”:“163.0 MiB”,“AvgSet”:694596.162601626,“NumTrash”:97,“LenTrash”:“60.3 MiB”,“AvgTrash”:652146.1443298969,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.0150146484375,“Created”:20236,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-01T00:37:31Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “store”: “s1”, “duration”: “6m48.188734546s”, “stats”: {“NumLogs”:66,“LenLogs”:“59.1 GiB”,“NumLogsTTL”:7,“LenLogsTTL”:“239.2 MiB”,“SetPercent”:0.8851167444228896,“TrashPercent”:0.1782289760904676,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20240,“LastCompact”:20240,“LogsRewritten”:4,“DataRewritten”:“1.0 GiB”,“Table”:{“NumSet”:542910,“LenSet”:“52.3 GiB”,“AvgSet”:103384.63052808016,“NumTrash”:17717,“LenTrash”:“10.5 GiB”,“AvgTrash”:637927.5985776373,“NumSlots”:2097152,“TableSize”:“128.0 MiB”,“Load”:0.2588796615600586,“Created”:20240,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-01T08:04:38Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s1”, “duration”: “11.741764ms”, “stats”: {“NumLogs”:2,“LenLogs”:“1.6 GiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:0.9777496902250361,“TrashPercent”:0.05423986463822237,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20240,“LastCompact”:20240,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:2309,“LenSet”:“1.6 GiB”,“AvgSet”:747347.1805976613,“NumTrash”:128,“LenTrash”:“91.3 MiB”,“AvgTrash”:747872,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.14093017578125,“Created”:20235,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-01T16:55:25Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s0”, “duration”: “4.752852ms”, “stats”: {“NumLogs”:1,“LenLogs”:“163.0 MiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:1,“TrashPercent”:0.37011203987767394,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20240,“LastCompact”:20240,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:247,“LenSet”:“163.0 MiB”,“AvgSet”:691968.7773279352,“NumTrash”:97,“LenTrash”:“60.3 MiB”,“AvgTrash”:652146.1443298969,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.01507568359375,“Created”:20236,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-01T18:49:10Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “store”: “s1”, “duration”: “58.926029045s”, “stats”: {“NumLogs”:464,“LenLogs”:“454.3 GiB”,“NumLogsTTL”:11,“LenLogsTTL”:“1.3 GiB”,“SetPercent”:0.9594100327344534,“TrashPercent”:0.08587034477566682,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20240,“LastCompact”:20240,“LogsRewritten”:7,“DataRewritten”:“3.1 GiB”,“Table”:{“NumSet”:1711217,“LenSet”:“435.9 GiB”,“AvgSet”:273517.1077332682,“NumTrash”:91398,“LenTrash”:“39.0 GiB”,“AvgTrash”:458344.3054771439,“NumSlots”:4194304,“TableSize”:“256.0 MiB”,“Load”:0.4079859256744385,“Created”:20240,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-02T03:33:55Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “store”: “s0”, “duration”: “7m31.502830245s”, “stats”: {“NumLogs”:578,“LenLogs”:“565.0 GiB”,“NumLogsTTL”:15,“LenLogsTTL”:“2.2 GiB”,“SetPercent”:0.9449261941393364,“TrashPercent”:0.03041130896228977,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20241,“LastCompact”:20241,“LogsRewritten”:37,“DataRewritten”:“8.2 GiB”,“Table”:{“NumSet”:2103730,“LenSet”:“533.8 GiB”,“AvgSet”:272476.4207422055,“NumTrash”:49970,“LenTrash”:“17.2 GiB”,“AvgTrash”:369187.34152491495,“NumSlots”:8388608,“TableSize”:“512.0 MiB”,“Load”:0.25078415870666504,“Created”:20241,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-02T15:47:57Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “store”: “s1”, “duration”: “7m50.501219457s”, “stats”: {“NumLogs”:1012,“LenLogs”:“1.0 TiB”,“NumLogsTTL”:28,“LenLogsTTL”:“12.5 GiB”,“SetPercent”:0.9448096487052571,“TrashPercent”:0.060749674579114725,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20241,“LastCompact”:20241,“LogsRewritten”:23,“DataRewritten”:“8.2 GiB”,“Table”:{“NumSet”:4144031,“LenSet”:“0.9 TiB”,“AvgSet”:243938.64476930795,“NumTrash”:400198,“LenTrash”:“60.5 GiB”,“AvgTrash”:162415.81824996628,“NumSlots”:8388608,“TableSize”:“512.0 MiB”,“Load”:0.4940069913864136,“Created”:20241,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-02T20:09:28Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “store”: “s0”, “duration”: “9.673009447s”, “stats”: {“NumLogs”:142,“LenLogs”:“132.8 GiB”,“NumLogsTTL”:9,“LenLogsTTL”:“318.5 MiB”,“SetPercent”:0.9165650635438674,“TrashPercent”:0.04569411697517113,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20241,“LastCompact”:20241,“LogsRewritten”:3,“DataRewritten”:“0.7 GiB”,“Table”:{“NumSet”:607058,“LenSet”:“121.8 GiB”,“AvgSet”:215349.56193312665,“NumTrash”:24055,“LenTrash”:“6.1 GiB”,“AvgTrash”:270935.43429640407,“NumSlots”:2097152,“TableSize”:“128.0 MiB”,“Load”:0.28946781158447266,“Created”:20241,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-03T14:01:39Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12L9ZFwhzVpuEKMUNUqkaTLGzwY9G24tbiigLiXpmZWKwmcNDDs”, “store”: “s1”, “duration”: “2m56.377522406s”, “stats”: {“NumLogs”:468,“LenLogs”:“456.8 GiB”,“NumLogsTTL”:13,“LenLogsTTL”:“1.6 GiB”,“SetPercent”:0.9516115713488352,“TrashPercent”:0.07355705058878811,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20242,“LastCompact”:20242,“LogsRewritten”:12,“DataRewritten”:“4.1 GiB”,“Table”:{“NumSet”:1718637,“LenSet”:“434.7 GiB”,“AvgSet”:271577.84619323333,“NumTrash”:86711,“LenTrash”:“33.6 GiB”,“AvgTrash”:416072.3875863501,“NumSlots”:4194304,“TableSize”:“256.0 MiB”,“Load”:0.40975499153137207,“Created”:20242,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-03T14:50:11Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “12EayRS2V1kEsWESU9QMRseFhdxYxKicsiFmxrsLZHeLUtdps3S”, “store”: “s0”, “duration”: “2m27.376401566s”, “stats”: {“NumLogs”:983,“LenLogs”:“0.9 TiB”,“NumLogsTTL”:22,“LenLogsTTL”:“5.7 GiB”,“SetPercent”:0.9677798034482467,“TrashPercent”:0.06202964215611196,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20242,“LastCompact”:20242,“LogsRewritten”:14,“DataRewritten”:“3.1 GiB”,“Table”:{“NumSet”:4485063,“LenSet”:“0.9 TiB”,“AvgSet”:224040.93191422283,“NumTrash”:406887,“LenTrash”:“60.0 GiB”,“AvgTrash”:158286.84583188946,“NumSlots”:16777216,“TableSize”:“1.0 GiB”,“Load”:0.2673305869102478,“Created”:20242,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-03T15:31:09Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “1wFTAgs9DP5RSnCqKV1eLf6N9wtk4EAtmN5DpSxcs8EjT69tGE”, “store”: “s1”, “duration”: “70.172208ms”, “stats”: {“NumLogs”:2,“LenLogs”:“1.6 GiB”,“NumLogsTTL”:0,“LenLogsTTL”:“0 B”,“SetPercent”:0.977791547001344,“TrashPercent”:0.05413782983942201,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20242,“LastCompact”:20242,“LogsRewritten”:0,“DataRewritten”:“0 B”,“Table”:{“NumSet”:2323,“LenSet”:“1.6 GiB”,“AvgSet”:744275.0650021523,“NumTrash”:128,“LenTrash”:“91.3 MiB”,“AvgTrash”:747872,“NumSlots”:16384,“TableSize”:“1.0 MiB”,“Load”:0.14178466796875,“Created”:20235,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}
2025-06-03T18:33:09Z INFO hashstore finished compaction {“Process”: “storagenode”, “satellite”: “121RTSDpyNZVcEU84Ticf2L1ntiuUimbWgfATz21tuvgk3vzoA6”, “store”: “s1”, “duration”: “28.674130488s”, “stats”: {“NumLogs”:65,“LenLogs”:“57.9 GiB”,“NumLogsTTL”:7,“LenLogsTTL”:“152.0 MiB”,“SetPercent”:0.8778711290745664,“TrashPercent”:0.1882246153883497,“Compacting”:false,“Compactions”:0,“TableFull”:0,“Today”:20242,“LastCompact”:20242,“LogsRewritten”:4,“DataRewritten”:“0.8 GiB”,“Table”:{“NumSet”:538981,“LenSet”:“50.8 GiB”,“AvgSet”:101264.57587187675,“NumTrash”:19447,“LenTrash”:“10.9 GiB”,“AvgTrash”:601761.0053992904,“NumSlots”:2097152,“TableSize”:“128.0 MiB”,“Load”:0.2570061683654785,“Created”:20242,“Kind”:0},“Compaction”:{“Elapsed”:0,“Remaining”:0,“TotalRecords”:0,“ProcessedRecords”:0}}}

Worst or not, it works perfectly with hashstore, without any need for special tools for that.

Isn't losing a vdev on ZFS = losing the whole pool? Not just some percent of the data.
Using an already dying disk, you should be ready for it to die at any moment without warning anyway, so data loss should be expected in this case.

There is no performance regression compared to piecestore. I'd say there is really huge progress.

Storing small files randomly creates fragmentation too. Probably even more than linearly (re)writing log files.

1 Like

Right. Those are old filesystems not designed for handling modern workloads.

I think we have different understandings of what "use what you have" entails. The idea is to use already-online, underutilized capacity – not to bring back online ancient disks from the landfill just because you had them lying around and they are not good for anything but chomping on power.

So that if you already have 200 TB of free space online, regardless of Storj – you can share some with Storj. Those 200 TB, I guarantee you, are not on some Raspberry Pi with 2 GB of RAM. They are most likely on ZFS or some proprietary filesystem that can absolutely handle the Storj workload with ease. Having metadata accelerated – be that via a ZFS special device or another cache solution on top of ext4 or btrfs – is pretty much a must today, for the sake of your users. SSDs are cheap, human time is priceless. Not throwing in a cheap SSD to make the life of your users better is just silly.

How so? Modern SSDs (those made in the last 10 years) withstand several full rewrites every day for half a decade. How are you going to even make a dent in that endurance with the metadata updates caused by Storj? It just does not make sense as an argument.
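As a rough illustration of that claim, with assumed (not measured) numbers for both the drive rating and the node's metadata churn:

ssd_tb = 1                    # hypothetical 1 TB SSD used as a metadata/special device
dwpd, years = 1, 5            # assumed endurance rating: 1 drive write per day for 5 years
rated_writes_tb = ssd_tb * dwpd * 365 * years           # ~1825 TB of rated writes
metadata_gb_per_day = 20                                 # assumed daily metadata writes from the node
years_to_exhaust = rated_writes_tb * 1000 / metadata_gb_per_day / 365
print(f"~{years_to_exhaust:.0f} years to exhaust the rated endurance")  # ~250 years with these assumptions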

We hear the vocal minority. Those that have no issues don't need to post. They are the majority. Those like myself who have no issues and yet can't shut up are probably also a handful.

Well, that's clearly not good. Did you find the bottleneck? ZFS is slower at deleting files than ext4 (significantly so), but I did not see any issues during that time.

Of course it can. By a couple of orders of magnitude! If the random access goes to metadata in RAM or on SSD, while compaction by definition has to touch the slow HDD.

Well, that is expensive disk IO that is 100% avoided with piecestore.

What special tools? I'd argue rsync is a special tool, as opposed to zfs send, which is part of the filesystem :slight_smile:

That's fair, both result in fragmentation. It remains to be seen which one is worse. However, most Storj files are very small, under 16k. They can occupy one record each. Only metadata gets fragmented – and it's on SSD, so it does not matter.

Logs are huge. Their fragmentation will have an impact on access.

Basically, I not only don't see the value, I only see downsides, cost, and extra risk, to save a few underpowered nodes that are ill-suited for the task.

Yes, for me buying additional SSDs exclusively for Storj ZFS pool vdevs, plus additional cards with SATA ports for these SSDs, is not "use what you have". Because this is additional hardware I need only for Storj with piecestore.

Modern SSDs (especially in the consumer segment) are more often QLC-based, with less than 1000 rewrite cycles per cell. With several rewrites per day their resource will run out in less than 1 year. After a few months of using them as vdevs I had 10-20% wear level in SMART.

We hear the minority who actively use the forum and monitor their nodes. Those who don't post don't necessarily have no problems. Maybe they don't even know that their nodes have problems and think it should work like this.

Yes. It's millions of files (= piecestore) on an HDD without additional hardware for metadata acceleration.

Piecestore has to delete tens of thousands of pieces uploaded with TTL, while hashstore only has to delete one log file holding the TTL data combined. Then, when it's time to write new data, piecestore will deal with tens of thousands of "small holes" of free space, amplifying fragmentation and requiring more and more random access.

It's not only not avoided with piecestore. It's at least on the same level when using vdevs, and requires much more random disk IO without them. So, as I already mentioned, I prefer 10 minutes of expensive disk IO with hashstore (and without any hardware that is unnecessary for me) to at least the same 10 minutes (with piecestore + ZFS + additional SSDs and ports used), and especially to hours and days in the case of piecestore without metadata acceleration.

ZFS is often not preinstalled software and requires installation plus some knowledge about using zfs send. Rsync is a more common tool and many Linux users used it before becoming SNOs.

1 Like