I recommend prewarming the cache by running fsck if you add it from scratch. It's faster than waiting for the cache to fill itself with metadata, since fsck knows the filesystem layout and can read inodes sequentially. In theory this should also leave the cache itself better organized, though I don't have any practical way to check that. It is, however, the easiest way to check exactly how much cache you need at a given point.
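A minimal sketch for an ext4-backed node (the device name is hypothetical); a forced, read-only check is enough to walk all the inodes and pull the metadata through the cache:

```
# Hypothetical device; -f forces a check even on a clean filesystem,
# -n keeps it read-only, so it just reads all the inode metadata.
e2fsck -fn /dev/sdX1
```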
I'm migrating 2x 7TB nodes using rsync. Looks like it's going to take… weeks… One drive is ext4 and the other is zfs with a metadata-only l2arc.
While the ext4 drive often shows a higher "read speed", the zfs drive is a bit ahead on the sync: 2TB vs 1.5TB.
As far as I remember, 95% was assigned to storj and it was full…
However, at that time I was testing running 2 nodes per HDD and it was still capable of handling that load. I no longer do this, so please, no TOS discussion.
The performance a node requires is capped by the speed of the Internet connection, which is always much slower than what even a full ZFS filesystem on an HDD can deliver. Leave a couple hundred GB free, but otherwise fill things up for max rewards.
I'm moving 20-ish nodes from ext4 / xfs to zfs. How long it takes really varies. The main factors are the source disk, the destination disk, and the copy method used (interestingly, rsync is faster than rclone copy when the source disk is an SSD; with an HDD as the source it is usually the opposite).
I usually try to move the filesystem as a whole to an SSD first if possible (<4TB used disk space), and then from the SSD to the zfs filesystem on the HDD. That way I often get speeds of 30-150MB/s, whereas a direct copy only manages 3-8MB/s.
Are you doing something like dd/ddrescue'ing the source partition to an image file on your SSD, then loopback-mounting it… then copying from that temporary mount to your destination HDD? I could see that working, since the dd should be a sequential transfer running pretty much at the max speed of your disk.
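Something along these lines, I'd guess (devices and paths are made up):

```
# Hypothetical devices/paths: image the source partition to the SSD,
# loopback-mount the image read-only, then copy to the zfs destination.
dd if=/dev/sdb1 of=/mnt/ssd/node.img bs=1M status=progress
mount -o loop,ro /mnt/ssd/node.img /mnt/node-image
rsync -a /mnt/node-image/ /mnt/zfs-node/
```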
Today I finished moving a small node from ext4 to my new zfs setup with rsync. This time there was no network involved; both disks are local. It was still rather slow, roughly 1 TB / day. The random read performance of the source HDD seems to be the limit.
My migration of two nodes to new drives got interrupted. So the first half was with rsync, and the second half with rclone, which tries to run four parallel transfers.
It's been 10 or 14 days or so working on just the initial copy.
The zfs source drive finished during this time and copied 5.8TB.
The ext4 source drive is still running over the same time and has only done 2.8TB.
The destination drives were identical 14TB 3.5" drives with ZFS.
So in my migration case, zfs was faster. It probably helped that I had fairly ample caching: 24GB committed to the ARC cache (shared amongst a few drives) and a metadata-only l2arc on SSD as well.
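For reference, the rclone leg was just a local copy along these lines (paths are made up; four transfers is also rclone's default):

```
# Hypothetical paths; parallel transfers help hide the per-file
# latency of copying lots of small blobs.
rclone copy /mnt/old-node/storage /mnt/new-node/storage --transfers 4 --progress
```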
I found rsync --no-inc-recursive to be faster than regular rsync for moving nodes on ext4, which would match your observations, as the --no-inc-recursive flag effectively pre-warms the metadata caches.
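Something like this, with made-up paths:

```
# Hypothetical paths; --no-inc-recursive builds the complete file list
# up front instead of incrementally, so all directory metadata is read first.
rsync -a --no-inc-recursive --info=progress2 /mnt/old-node/ /mnt/new-node/
```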
The peaks at the end of July were final move operations, where I moved entire nodes from cached disks to other disks (their final location), pre-allocated some new vhdx caches, etc. The peaks in the last few days are likely TTL deletions.
My SSD is a consumer Kingston KC3000 2TB with currently nothing on it but the caches. It is rated for 1.6 PBW, which translates to 894GB/day over 5 years.
All else being equal, at 130GB/day the SSD should support current cache usage for about 30 years. When full, assuming cache usage goes up to 300GB/day, for about 14 years.
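The back-of-the-envelope math, treating 1.6 PBW as roughly 1,600,000 GB of total writes:

```
# Rough endurance estimate: rated total writes / daily writes / 365
echo "1600000/130/365" | bc -l   # ~33.7 years at 130 GB/day
echo "1600000/300/365" | bc -l   # ~14.6 years at 300 GB/day
```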
Given all the benefits of this setup (i.e. it solved everything), I'm definitely sticking with it without worry.
Last thing: I'm currently using oversized 256GB caches (18GB/TB). From my previous post I estimate that 128GB should be more than enough (4-8GB/TB). At some point later I'll recreate the caches and compare again.
I don't have all the numbers and reasoning written up, but I would like to share my relative success with zfs + l2arc.
I have 8 nodes on 8 disks, totalling about 43TB. The highest amount stored on a single disk is 12TB.
Each disk is formatted as zfs with the following settings (example commands right after the list):
compression=on (lz4)
secondarycache=metadata
redundant_metadata=some
atime=off
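For illustration, with a hypothetical dataset name, these are applied like so:

```
# Hypothetical dataset "tank/storj"; matches the settings listed above.
# (On recent OpenZFS, compression=on defaults to lz4 anyway.)
zfs set compression=lz4 tank/storj
zfs set secondarycache=metadata tank/storj
zfs set redundant_metadata=some tank/storj
zfs set atime=off tank/storj
```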
I have up to 26GB of RAM allocated to the ARC cache. According to arc_summary, the cache hit rate is 97% or 98%.
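On Linux the ARC ceiling is just a module parameter; a sketch with my ~26GB figure (paths assume OpenZFS on Linux):

```
# Cap the ARC at 26 GiB (value in bytes); to persist across reboots,
# set "options zfs zfs_arc_max=..." in /etc/modprobe.d/zfs.conf instead.
echo $((26 * 1024 * 1024 * 1024)) | sudo tee /sys/module/zfs/parameters/zfs_arc_max
arc_summary        # check the reported cache hit ratio in the output
```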
I also have a single SSD that I'm using for the l2arc cache. It's a 5-year-old used MLC enterprise SAS drive; with its endurance it will last until the heat death of the universe. I have it split into 8 partitions, created by hand, which I use as a separate cache for each drive, allocating 5GB for each TB of disk space. So far the l2arc data gets compressed, so I have not filled up any partition for any drive.
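Attaching one of those partitions as a cache device is a one-liner per pool (pool and partition names are made up):

```
# Hypothetical pool/partition names; one SSD partition per pool as L2ARC.
# With secondarycache=metadata on the dataset, only metadata lands here.
zpool add tank1 cache /dev/disk/by-id/ata-SomeSSD-part1
```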
The performance so far has been… pretty good! The disks used to be pegged at 100% on pretty much any activity. Now the only drives showing over 90% utilization during filewalkers are the two that are more than 50% fragmented. The rest are busy, but more chill.
The SSD actually gets worked pretty hard when all the filewalkers kick off at once; it shows around 50% busy when they all start up. The write activity to the SSD is very modest, though, partly because the l2arc will only write metadata so fast, and also because things like filewalkers are really read-intensive, so very little is written to the SSD at all after the metadata initially populates the l2arc cache.
According to arc_summary, the nominal size of the l2arc is 1.3TB, but it compresses down to 285GB on the disks and requires 3.5GB of RAM for the headers. The reported hit rate is usually around 70%-80%.
But anyway, filewalkers seem to finish in a more reasonable amount of time (hours instead of days), and the drives seem capable of keeping up on days with high incoming test traffic.