Today I want to write about my problem with slow file deletion on the ext4 filesystem (mostly for SLC folders):
How do you solve slow file deletion on ext4?
Right now Storj uses the folder structure “/ab/filename.sj1” with 1024 subfolders for every Satellite. During intensive tests SLC sends tens of millions of files. I see more than 50 million records in piece_expiration.db, so the number of files in each folder can reach 50 000 (50 000 000 / 1024 ≈ 48 800) or even 100 000. And we hope that one day Satellites working with customers’ data will surpass these numbers.
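As a quick sanity check of that per-folder estimate (my own arithmetic, not Storj internals):

```php
<?php
// Back-of-the-envelope check: two base32 characters [2-7a-z]
// give 32 * 32 = 1024 prefix folders per satellite.
$pieces = 50000000;    // ~records seen in piece_expiration.db
$folders = 32 * 32;    // 1024 folders
printf("~%d files per folder on average\n", intdiv($pieces, $folders)); // ~48828
```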
So I decided to run some simple tests to understand how the number of files in a folder affects deletion speed.
Test steps:
- Create a folder with N files: random 50-character Storj-like filenames, random content, and random sizes from 4 KB to 1 MB.
- Pick 1000 random files in random order.
- Remount the filesystem to reset caches.
- Call “rm -f” with the list of picked files and measure how long it takes to finish (steps 2 and 4 are sketched in the code right after this list).
- Repeat for N = 1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000, 50000, 60000, 70000, 80000, 90000, 100000.
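Here is a minimal sketch of steps 2 and 4 in isolation, assuming step 1 has already filled a placeholder folder /tmp/test/ with *.test files (the full script is at the end of the post). One subtlety: PHP’s array_rand() returns the picked keys in their original order, so an explicit shuffle() is needed to delete in truly random order.

```php
<?php
// Pick up to 1000 random files from the test folder and time one "rm -f".
$dir = '/tmp/test/';                       // placeholder path, adjust as needed
$files = array_map('basename', glob($dir . '*.test'));
$picked = (array)array_rand($files, min(1000, count($files)));
shuffle($picked);                          // array_rand() keeps key order
$command = 'rm -f';
foreach ($picked as $i) {
    $command .= ' ' . $dir . $files[$i];
}
$time_start = microtime(true);
exec($command);
printf("rm finished in %.3f seconds\n", microtime(true) - $time_start);
```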
Here are the results (the test is long and I had no time to run it a few times and average, so there are some fluctuations):
N (number of files in folder) | t (seconds to delete 1000 random files) |
---|---|
1000 | 1.58 |
1500 | 2.73 |
2000 | 3.07 |
2500 | 4.6 |
3000 | 4.95 |
4000 | 5.55 |
5000 | 6.13 |
6000 | 7.63 |
7000 | 7.87 |
8000 | 7.07 |
9000 | 9.12 |
10000 | 8.18 |
15000 | 9.79 |
20000 | 9.92 |
25000 | 12.87 |
30000 | 10.32 |
35000 | 12.51 |
40000 | 17.13 |
45000 | 14.6 |
50000 | 13.79 |
60000 | 18.25 |
70000 | 22.06 |
80000 | 24.11 |
90000 | 26.32 |
100000 | 29.75 |
The results show that the total number of files in a directory significantly affects how long it takes to delete each individual file.
I would like to note that the file deletion rate in this synthetic test is several times higher than on real Storj expired pieces. This is probably because the test files were created together within a short interval of time and sit close to each other on disk, unlike real data, which is written into different folders over the course of a month. For this reason, it is likely that increasing the number of files in the Storj folders has an even greater effect than my test shows.
So my suggestion is to add one more subfolder level [2-7a-z] to the folder structure, making it “/ab/c/filename.sj1”. This way each of the 1024 folders gets 32 subfolders, and the number of files per folder decreases by a factor of 32 on average (50 000 drops to ~1 500, 100 000 to ~3 000).
This relatively simple change could speed up the deletion of old files by at least several times.
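To make the proposal concrete, here is a hypothetical sketch of the path mapping. The function names are mine, not Storj code, and I’m assuming the directory characters are simply the leading characters of the piece filename:

```php
<?php
// Current layout: the first two characters select one of 1024 folders.
function currentPath(string $name): string {
    return substr($name, 0, 2) . '/' . $name;
}
// Proposed layout: the third character adds one of 32 subfolders,
// cutting the average per-folder file count by ~32x.
function proposedPath(string $name): string {
    return substr($name, 0, 2) . '/' . substr($name, 2, 1) . '/' . $name;
}
print currentPath('abcdefg.sj1') . "\n";   // ab/abcdefg.sj1
print proposedPath('abcdefg.sj1') . "\n";  // ab/c/abcdefg.sj1
```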
Here’s the script I used for the tests, in case anyone wants to reproduce them quickly. I used the language I’m most familiar with. Sorry, it’s PHP.
But it only measures the time of the system “rm -f” call, so the language doesn’t matter.
test.php
```php
<?php
// Test configuration: adjust these to your environment.
$folder = '/home/storj2/node/2/';  // base path on the filesystem under test
$dev = '/dev/sdc1';                // device to remount between phases
$alphabet = 'abcdefghijklmnopqrstuvwxyz0123456789';
$testNumbers = array(1000, 1500, 2000, 2500, 3000, 4000, 5000, 6000, 7000,
    8000, 9000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000,
    50000, 60000, 70000, 80000, 90000, 100000);

foreach ($testNumbers as $numFilesInFolder) {
    $numFilesToDelete = 1000;
    print "Test folder with $numFilesInFolder files\n";

    // Step 1: create N files with random 50-character names and
    // random sizes from 4 KB to 1 MB (1..256 blocks of 4096 bytes).
    $totalSize = 0;
    $files = array();
    exec('rm -rf ' . $folder . 'test/');
    mkdir($folder . 'test/');
    $time_start = microtime(true);
    for ($i = 1; $i <= $numFilesInFolder; $i++) {
        $filename = '';
        for ($j = 1; $j <= 50; $j++) {
            $r = rand(0, mb_strlen($alphabet) - 1);
            $filename .= mb_substr($alphabet, $r, 1);
        }
        $filename .= '.test';
        $files[] = $filename;
        $filesize = rand(1, 256);
        $totalSize += $filesize;
        $fl = fopen($folder . 'test/' . $filename, 'w');
        for ($j = 1; $j <= $filesize; $j++) {
            fwrite($fl, openssl_random_pseudo_bytes(4096));
        }
        fclose($fl);
    }
    $time_end = microtime(true);
    print "$numFilesInFolder files with total size " . round($totalSize * 4096 / 1024 / 1024) .
        " MB are created in " . ($time_end - $time_start) . " seconds\n";

    // Step 3: let the disk cache finish writes, then remount to reset caches.
    print "Let's give some time for disk cache to finish writes and remount filesystem...\n";
    sleep(15);
    exec('umount ' . $dev);
    sleep(5);
    exec('mount -a');
    sleep(10);

    // Step 2: pick 1000 random files; array_rand() returns keys in their
    // original order, so shuffle to delete in truly random order.
    $itemsToDelete = array_rand($files, $numFilesToDelete);
    shuffle($itemsToDelete);
    $filesToDelete = array();
    foreach ($itemsToDelete as $item) {
        $filesToDelete[] = $files[$item];
    }

    // Step 4: time a single "rm -f" over all picked files.
    $command = 'rm -f ';
    foreach ($filesToDelete as $file) {
        $command .= $folder . 'test/' . $file . ' ';
    }
    $time_start = microtime(true);
    exec($command);
    $time_end = microtime(true);
    print "RM command test: delete $numFilesToDelete random files from folder with " .
        "$numFilesInFolder files finished in " . ($time_end - $time_start) . " seconds\n";

    exec('rm -rf ' . $folder . 'test/');
}
```
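To reproduce: point $folder at a path on the filesystem under test and $dev at its device, make sure the device has an /etc/fstab entry (the script remounts it with “mount -a”), and run the script as root: php test.php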