Setup Bcache on Debian 11 amd64 with mdadm devices

Hi all. After moving the data onto the bcache device and also moving the db files to SSD, I have come to realize that bcache isn’t really a good solution after all. Why?

  • The cache is already filled with 7.1G (written) + 1.3G (metadata). I think that for a real performance gain in my configuration (remember that I use the “writearound” cache mode) I would need a very large SSD before “cache_readaheads” pays off: maybe 500GB of SSD cache for 20TB of data, plus a performance loss from reading data twice while the 500GB cache fills, and the “cache_readaheads” ratio is still hypothetical… (the snippet after this list shows where these counters live).
  • My HDDs’ read response time is now roughly 50% higher. To serve a piece, the data is read once for the “upload_piece” itself and once more to “copy the piece into the cache” for read caching. I was at 40–50ms average read response time on the HDDs; now it is around 80–100ms.
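For reference, here is a minimal way to inspect what the cache is actually doing. I’m assuming the device shows up as /dev/bcache0; adjust the name to your layout:

```
# Current cache mode (writethrough / writeback / writearound / none):
cat /sys/block/bcache0/bcache/cache_mode

# Hit/miss/readahead/bypass counters since registration:
for f in /sys/block/bcache0/bcache/stats_total/*; do
    printf '%s: %s\n' "$(basename "$f")" "$(cat "$f")"
done

# Data and btree metadata written to the cache device itself:
grep . /sys/fs/bcache/*/cache0/written /sys/fs/bcache/*/cache0/btree_written
```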

I have a question for Storj, because I don’t remember exactly: does a read operation (the upload_piece between storage node and customer) transit through the satellite, or does it go directly from node to customer? If the data transits through the satellite, the satellite may already have a read cache, so the same upload_piece might not be requested many times while the satellite’s cache holds that data?

It seems that setting up bcache for reads only (writearound) is pretty useless after all!
I do not recommend using bcache in writeback mode unless your system is backed by a UPS (APC or similar) or can otherwise shut down properly after an electrical issue (power outage…).
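If someone still wants to experiment, the cache mode can be switched at runtime through sysfs. A minimal sketch, again assuming the device is /dev/bcache0:

```
# Switch the cache mode; valid values are writethrough, writeback,
# writearound and none.
echo writearound > /sys/block/bcache0/bcache/cache_mode

# If you do try writeback, verify that dirty data has been flushed
# back to the HDDs before any planned shutdown:
cat /sys/block/bcache0/bcache/state        # should report "clean"
cat /sys/block/bcache0/bcache/dirty_data   # should be 0
```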

I will resync the data to a backup and then run mkfs.xfs /dev/md0 again directly :wink:
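For anyone following the same path back, the teardown looks roughly like this. This is a sketch assuming /dev/md0 is the backing device behind /dev/bcache0; it destroys the data on md0, so make sure the backup is complete first:

```
# Detach the backing device from the cache set and unregister bcache,
# then wipe the bcache superblock and reformat the raw array.
echo 1 > /sys/block/bcache0/bcache/detach   # drop the cache set
echo 1 > /sys/block/bcache0/bcache/stop     # unregister /dev/bcache0
wipefs -a /dev/md0                          # remove the bcache superblock
mkfs.xfs -f /dev/md0                        # fresh xfs on the raw array
```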

Thanks to all! Regards, Eioz.

The satellite is an address book, audit, repair, and payments processor. Data flows directly between nodes and customers.

Thank you for the answer @Alexey :slight_smile:!

So a read cache could make sense for data that customers download multiple times, but I don’t know how high the “cache_readaheads” value could get, since that data would be served from the SSD cache if the cache is sufficiently large.

Does Storj have statistics on which piece IDs are downloaded by customers? Maybe on the storage node itself? If yes, and the statistics are global, do you know the ratio of pieces downloaded multiple times by customers (uploaded by the node multiple times in a day or a week, for example)?

With approximately 30–50GB per day of storage node upload bandwidth, the cache would need to be at least 7 × 50GB = 350GB to keep data in cache for 7 days; if data is requested multiple times within those 7 days, the cache gets used. So I do not recommend caching unless you have a large SSD, 350GB or maybe 700GB to keep data in cache for about 15 days (it depends on the upload bandwidth your node currently has). A rough sizing sketch follows after this post.

That would hypothetically cover node uploads from the cache, but only hypothetically: I don’t know how high the “cache_readaheads” value could get, and I don’t know whether customers download their data multiple times, since I have no statistics for the moment. Given that customers pay for the bandwidth to retrieve their data (as other clouds charge too), it would surprise me if data were downloaded multiple times (to avoid paying the transfer “bandwidth” several times).

Again, I do not recommend caching, because filling the cache degrades HDD performance by reading the data twice on the storage node: once for the upload piece (when the customer downloads it), and once to fill the cache (writing the piece to the SSD). See my previous answer about bcache and HDD response times around 100ms with bcache enabled and good parameters…

Best regards to the whole Storj community :wink:
PS: I’m speaking about the “writearound” cache mode…
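Here is that back-of-the-envelope sizing as a tiny script, so anyone can plug in their own node’s numbers. The 50GB/day figure is just my node’s upload (egress) bandwidth, not a general value:

```
#!/bin/sh
# Rough read-cache sizing: retention_days * daily_egress_gb.
daily_egress_gb=50   # my node's upload bandwidth; use your own
retention_days=7     # how long a piece should stay cached
echo "$((retention_days * daily_egress_gb))GB of SSD cache to hold ${retention_days} days of egress"
```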

It’s unlikely that this is collected, especially in real time. The satellite could also select a different batch of nodes for different locations. The satellite has a cache of resolved node IP addresses, but that’s a different thing.

Why use bcache when you can just use zfs and its native RAM and SSD caching?
I have a 10.3TB node on a Seagate 16TB CMR drive formatted to ext4, and I never felt it was struggling with Storj’s downloads and uploads.
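For anyone curious, adding an SSD read cache under zfs is indeed short. This is only a sketch with placeholder device names (tank, /dev/sdb, /dev/nvme0n1); substitute your own:

```
# Create a pool on the HDD and attach an SSD as an L2ARC read cache.
zpool create tank /dev/sdb
zpool add tank cache /dev/nvme0n1

# Verify the cache device shows up:
zpool status tank
```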

I don’t see how you arrive at your conclusing that you’d need a large SSD. bcache is pretty good at evicting less used data from its cache, and the working set for a Storj node is not that big. bcache should help a lot with metadata (e.g. the file walker process), but file contents would be usually read from HDD anyway.

Something’s wrong here. You shouldn’t be observing worse latency for reads with bcache. What model are those SSDs? You said you use them in RAID1, how did you configure it?