A major performance issue with using or recommending rsync to copy 4 TB of data stored in 9.4 million Storj files on an HDD is that rsync doesn't read the file data in physical disk order (see the FS_IOC_FIEMAP and BTRFS_IOC_TREE_SEARCH_V2 ioctls). If rsync were able to read the file data in disk order, it could reach read speeds of 100-200 MB/s instead of the 5-10 MB/s it currently achieves, because it would avoid most of the random-access seeks on the HDD.
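To make the disk-order idea concrete, here is a minimal Python sketch (not anything rsync actually does) that asks the kernel for the first extent of each file via FS_IOC_FIEMAP and sorts the paths by that offset. The helper names first_physical_offset and sort_by_disk_order are made up for illustration. The ioctl number and struct layouts are taken from the Linux linux/fiemap.h header and assume a Linux system. Files for which the ioctl fails (unsupported filesystem, empty file, permission error) are simply placed last.

```python
import fcntl
import os
import struct

# _IOWR('f', 11, struct fiemap) as defined in <linux/fs.h> (Linux only).
FS_IOC_FIEMAP = 0xC020660B
# struct fiemap header: fm_start, fm_length, fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved (32 bytes).
FIEMAP_HEADER = struct.Struct("=QQLLLL")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2], fe_flags, fe_reserved[3] (56 bytes).
FIEMAP_EXTENT = struct.Struct("=QQQQQLLLL")


def first_physical_offset(path):
    """Return the physical offset of the file's first extent, or None."""
    buf = bytearray(FIEMAP_HEADER.size + FIEMAP_EXTENT.size)
    # Map the whole file, but leave room for only one extent in the reply.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, 1, 0)
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError:
        return None
    try:
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # kernel fills buf in place
    except OSError:
        return None  # e.g. filesystem without FIEMAP support
    finally:
        os.close(fd)
    mapped_extents = FIEMAP_HEADER.unpack_from(buf, 0)[3]
    if mapped_extents == 0:
        return None  # empty or fully sparse file
    return FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HEADER.size)[1]


def sort_by_disk_order(paths, offset_of=first_physical_offset):
    """Sort paths by first-extent offset; unknown offsets go last."""
    def key(path):
        offset = offset_of(path)
        return (offset is None, offset or 0, path)
    return sorted(paths, key=key)
```

A copy tool could then read the files in the order returned by sort_by_disk_order(list_of_paths). This only looks at the first extent, so heavily fragmented files would still cause seeks. On btrfs, as far as I understand, the offsets FIEMAP reports are in btrfs's own logical address space rather than raw device offsets, which is presumably where BTRFS_IOC_TREE_SEARCH_V2 comes in; that part is not covered here.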
Another performance issue is that (as far as I know) rsync doesn't take advantage of io_uring on Linux. A related issue is that, for the Linux kernel to read 9.4 million files from an HDD more efficiently, rsync would have to keep at least 20000 files open at all times (see RLIMIT_NOFILE and “ulimit -n”).
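As a rough illustration of the RLIMIT_NOFILE point, this Python sketch shows how a process could raise its own soft open-file limit toward 20000 without needing a prior “ulimit -n” in the shell. The function name raise_nofile_limit is hypothetical, and the sketch assumes a Unix-like system where the hard limit is already high enough; an unprivileged process cannot go above the hard limit.

```python
import resource


def raise_nofile_limit(wanted=20000):
    """Raise the soft RLIMIT_NOFILE toward `wanted`; return the new soft limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    unlimited = resource.RLIM_INFINITY
    if soft == unlimited:
        return soft  # nothing to raise
    # An unprivileged process may only raise the soft limit up to the hard limit.
    target = wanted if hard == unlimited else min(wanted, hard)
    if target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
        soft = target
    return soft
```

If the returned value is below 20000, the hard limit is the bottleneck and would have to be raised by an administrator (for example through the system's limits configuration).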