Disk fragmentation is inevitable... Do we need to prepare?

What about defragmentation for SN…
A defragmentation tool in the background, without much load on the array, would facilitate faster read/write access.

Why? My personal opinion is this: a conventional HDD, and even a RAID array, will become heavily fragmented. During node operation files are constantly written and erased (especially after the introduction of the cleaner), disk volumes are large, and over time reads and writes will take longer and longer on HDD mechanics.

As an option, it could be Diskeeper in automatic mode (this is NOT an advertisement).

What does everyone think about this?

p.s. I’m a newbie, and I cobbled my node together on my knee with Windows and a 1TB HDD.
I plan to expand the capacity further with RAID6.

Have you seen the number of files that a node holds per 1TB? I am pretty sure the node’s db has indexes to locate the correct piece and retrieve it for processing. Defragmentation would put unnecessary load on the CPU. Storj is meant to be usable even on a Raspberry Pi 3.

Maybe I don’t understand indexes correctly, but I will say it as I understand it.
If there is an index, the database finds the file quickly; there is no problem there, the database knows where the file lies. But the file must not only be found, it must also be read and sent, and given that the file is fragmented, the HDD mechanics have to do extra work, which in turn slows down the response.

Conventional defragmentation tools do, yes, create a large load on the disk subsystem while they run. That’s why I wrote that we need a background tool that consumes little and does not aggravate the fragmentation situation.

Judging by reviews of the defragmenter mentioned above, its customers are very happy with how the response times of their heavily loaded servers improved.
In our case, if the network is adopted worldwide, operators’ nodes will be very heavily loaded with I/O, and if the disk is a mess, everything will slow down; at least that is how it seems to me…

I think you are worrying about a potential bottleneck that isn’t likely there.


I looked at this tool. It does caching in RAM, which is extremely dangerous for us, because a small power cut will cause data loss.

I’ve seen that too, but that feature is easily disabled in the settings. The rest of its functions would be very useful to us, such as the preventive way its logic operates, for one.

But I am interested in hearing exactly the opinion of the community and the engineers on this issue.
It does not necessarily have to be this tool (I cited this defragmenter only as an example that seemed suitable to me); it can be another program.

I use zfs (ext4 on a zvol), so I hope that it can allocate space efficiently. I also use SSD cache for writes and reads. If it so happens that my node stores some files that generate a lot of traffic, the cache would help. However, if all files in the node are accessed equally frequently, then the performance will not be great with HDDs no matter what.

Since the node only creates and deletes files (never modifies them), the fragmentation should not be that big as long as there is enough free space.


Precisely because files are not only stored but also deleted, the layout becomes fragmented over time if it is left to take its course.
Although I have only a small amount of data stored so far (I am passing the audit stage at the moment), judging by the tests the disks fill up heavily.
I also think a lot depends on the filesystem, but the average node operator has an ordinary disk with NTFS.

Worse would be if the node extended a single file. When the node wants to create a new file, the filesystem will find a non-fragmented block to put the file in, as long as there is enough free space. So each file should not be fragmented that much.
The files themselves will, of course, be scattered all over the disk, but they have no relation to each other, and there is no way to predict which files will be accessed frequently (to keep those together).
So, I do not think that this will be a big problem.


On an almost full 3.5TB node I had a problem with a lot of “ordersdb error: database is locked” errors and 99% I/O with 4 threads. The orders.db file was heavily fragmented (~180,000 extents). After defragmenting all the databases, the I/O is back to normal and so far there are no more errors.


How long did that take?

About 5 minutes on BTRFS. They are still not all contiguous, but 56 is much better than 180000…

Edit: In case it was not clear, I am talking about defragmenting the database files on the filesystem, not defragmenting the databases.
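For reference, this kind of file-level rewrite can be approximated without filesystem-specific tools: copying a file sequentially lets the filesystem allocate fresh (usually contiguous) extents, and you then atomically swap the copy into place. A minimal sketch in Python, under the assumption that the node is stopped first; the demo filename is made up:

```python
import os
import shutil

def rewrite_file(path):
    """Rewrite a file sequentially so the filesystem can allocate
    fresh, usually contiguous, extents for it. Only safe while the
    process owning the file (the node) is stopped."""
    tmp = path + ".defrag"
    shutil.copyfile(path, tmp)   # sequential copy -> newly allocated extents
    shutil.copystat(path, tmp)   # keep permissions/timestamps
    os.replace(tmp, path)        # atomic swap on the same filesystem

# demo on a throwaway file standing in for a node database
with open("demo.db", "wb") as f:
    f.write(os.urandom(4096))
rewrite_file("demo.db")
print(os.path.getsize("demo.db"))  # → 4096
```

Whether the fresh copy actually ends up contiguous depends on the filesystem and how much free space it has, which matches the earlier observation that fragmentation stays low as long as there is enough free space.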

Hosting 3.3TB on a USB HDD (ext4), my node was suspended on the saltlake satellite.

I found I started to have a similar database-locked issue, but on a different database.

“usedserialsdb error: database is locked” with I/O stuck @ 99.99%

‘used_serial.db’ had 22883 extents
After defragmenting the file: 375 extents.

No more issues for now.

I wish I could move the databases to the system SSD.


How do you defrag the database files? Or is it defragment the drive?
Because Windows 10 defragments the drive automatically, weekly by default.

Linux.

And replying to myself: it did not last. My node is now suspended on 4 satellites.


I meant the database files only. I’m not sure whether defragmenting the whole drive is a good idea; it might be that it adds to the high I/O too. The files in the blob folder are relatively small, so I think fragmentation would not be a problem there. Also, once stored they are never modified, so it doesn’t really matter. The database files, on the other hand, are constantly in use.

I think Windows can’t defragment files which are currently in use. You can analyze the fragmentation with contig and defragment after stopping the node.


It’s still a workaround, but try vacuuming the db and then defragmenting. Hopefully that’ll last a little longer.
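To illustrate: VACUUM rewrites the entire database into a new file, compacting free pages (and, as a side effect, giving the filesystem a chance to lay the file out sequentially again). A minimal sketch with Python’s bundled sqlite3 module; the filename is made up, and on a real node you would stop the node first:

```python
import os
import sqlite3

def vacuum(db_path):
    """Run VACUUM and return (size_before, size_after) in bytes."""
    before = os.path.getsize(db_path)
    con = sqlite3.connect(db_path)
    con.execute("VACUUM")   # rewrites the whole file, reclaiming free pages
    con.close()
    return before, os.path.getsize(db_path)

# demo: build a database with dead space, then compact it
con = sqlite3.connect("demo_orders.db")
con.execute("CREATE TABLE t (payload BLOB)")
con.executemany("INSERT INTO t VALUES (?)", [(b"x" * 4096,)] * 500)
con.execute("DELETE FROM t")   # leaves free pages on the freelist
con.commit()
con.close()

before, after = vacuum("demo_orders.db")
print(before > after)  # → True: the file shrank
```

Note that VACUUM needs enough free disk space for a temporary copy of the database while it runs.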

Well, after several hours it is better than before I defragged the file, but not perfect.

I tried to vacuum it but got an error message:

That is not a good sign. I recommend you go through these steps.
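An early step in that kind of repair is checking whether the database file is malformed; that check can be scripted with SQLite’s integrity_check pragma. A sketch using Python’s bundled sqlite3 module (the path is illustrative; run it against a copy of the suspect file with the node stopped):

```python
import sqlite3

def check_db(db_path):
    """Return SQLite's verdict for db_path: 'ok' means healthy,
    anything else describes the corruption found."""
    con = sqlite3.connect(db_path)
    try:
        return con.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        con.close()

# demo on a freshly created, healthy database
con = sqlite3.connect("demo_check.db")
con.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")
con.commit()
con.close()
print(check_db("demo_check.db"))  # → ok
```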

Thank you.

I’m running an older version of sqlite3 on Ubuntu 18.04.4 LTS

sqlite3 -version
3.22.0 2018-01-22 18:45:57 0c55d179733b46d8d0ba4d88e01a25e10677046ee3da1d5b1581e86726f2alt1

I’m struggling to update it to “v3.25.2 or later” as the posted guide asks.
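Side note, in case it helps anyone stuck on the same step: Python bundles its own copy of the SQLite library, independent of the system’s sqlite3 CLI, so a short script can report which version your tooling would actually use:

```python
import sqlite3

# Version of the SQLite library linked into this Python interpreter;
# it can differ from the system's sqlite3 command-line binary.
print(sqlite3.sqlite_version)

# The guide referenced above asks for v3.25.2 or later.
print(sqlite3.sqlite_version_info >= (3, 25, 2))
```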