SSD Caching for external drives on QNAP system

Hi,

I am currently running a QNAP TS-251D 2-bay NAS with two additional 8TB external drives connected via USB 3.0. One Storj node is using the storage pool of the internal HDDs, and I have two separate nodes running on the external drives.

Following many posts in the forum, I am wondering if it would make sense to add SSD caching, as this could speed up reads and writes a lot. The NAS has a PCIe slot, so I could add an M.2 SSD…

My question is whether this only speeds up the node running on the two internal bays, or whether it also caches the data that flows to and from the external drives.

I couldn’t find anything on the internet, so I am asking here. Maybe a QNAP expert reads this :slight_smile:

Thank you

As is always the reply when it comes to buying new hardware: if you’re buying it for multiple uses and want it, then great. If you are buying an SSD cache exclusively for Storj (as it sounds in your case), then it’s probably not worth it.

Yes you are completely right :+1:

I am not planning the upgrade only for Storj, as I am also running other services on the NAS. But I would be very interested to know if it is technically possible to use the QNAP SSD cache for external drives as well, since adding an external drive is much cheaper than buying a QNAP expansion unit. This could be very useful in the future if my storage demand increases.

Cache is good for making things run smoother, but eventually the cached data still has to be flushed down to the drives. If you lack IO, another option that should also be considered is adding more drives to the array.

Of course, as an SNO, the incoming data is generally much less than what a drive can optimally handle, as long as it stays sequential reads and writes.

Just keep in mind that in some cases you can add a cache and end up with it having limited to no effect, due to how your data is being accessed.

Generally, what larger storage systems do is bunch the drives into one big array, so that the load is shared across the different drives instead of being focused on one or two drives out of many.

However, yet again, this is very dependent on the use case. As an SNO I plan on just putting all my drives into one array, which also lets me get cheaper redundancy… say you have two drives and need one to be able to fail; then you lose 50% of your capacity for that, while if you have five drives and one redundant, you only spend 20% of your capacity on buying that redundancy.
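To put rough numbers on that, here is a minimal sketch (plain Python, just illustrative arithmetic, nothing Storj-specific) of how the capacity overhead shrinks as the array grows when one drive's worth of space is reserved for redundancy:

```python
# Rough illustration: fraction of raw capacity spent on redundancy when
# one drive's worth of the array is reserved (RAID5/RAIDZ1-style math).
def redundancy_overhead(total_drives: int, redundant_drives: int = 1) -> float:
    return redundant_drives / total_drives

for n in (2, 3, 5, 8):
    print(f"{n} drives, 1 redundant: {redundancy_overhead(n):.1%} of capacity")
# 2 drives -> 50.0%, 5 drives -> 20.0%, 8 drives -> 12.5%
```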

Alas, yes, a cache can help a lot, but make sure that’s what you need… though for running many, many different things at once, high IO is the way to go, with NVMe M.2.
The Samsung EVO Pro 1TB is the king of performance… though you may make do with much, much less, but do get NVMe with low latency. Check StorageReview if you want detailed info and many choices.

Sure, you can use the built-in SSD cache in QNAP’s version of mdraid. But note that QNAP will make the SSD cache part of the filesystem, so if your cache fails, your entire RAID might fail, or at least you can lose all the write data in the cache. QNAP recommends at least two SSDs for a cache.

The other point, and we don’t know this yet, is the spread of file reads. Let’s say all your data is read evenly: that means your cache is inefficient and would not improve things. More likely it works the other way: the node has to serve the file and at the same time write it to the SSD cache, where it would never be used again.
Now I’m not saying it will be this way, but at the moment it looks like most are using the network for a second backup copy, i.e. write once, read zero.

The QNAP filesystem can be excluded from the SSD cache; at least the interface indicates that you can.

I currently run a read/write SSD cache on a few iSCSI LUNs. I’m considering doing only a read SSD cache for Storj though, because of the almost 4x difference between downloading and uploading. I have two nodes now, so I plan to run one without it and see if the HDD speed can keep up with the writes and not lose the race to get data from other nodes.

I didn’t mean the filesystem QNAP uses for itself; I meant that you need to enable the SSD cache for an entire volume. There is no way around this. You can enable it for a single iSCSI volume if you want, IIRC.

Enabling an SSD cache for iSCSI can be quite beneficial, especially if you’re using VMware, but that’s a bit off-topic.

4x

O.o

I get more like 10x+…
Is that ratio common for those using QNAP?
And what speeds are we talking about then?

I wish I could get 10x. I must have others sharing my subnet.

Since the beginning of the month I’m averaging 6 Mbps upload and 30 Mbps download. 10 Gb iSCSI with SSD cache, 1 Gb symmetric fiber, central USA. Running compute on a more powerful system.

Also, I should note that my failure rate is almost 0-1%; it’s highly successful, but it doesn’t look like it’s getting as much as others…

That is not 10x in a good way… I get 30 Mbps ingress and about 2-3 Mbps egress.
But my node is kinda new, in week 8 I think, and I’m still fine-tuning… It is kinda odd though, because my success rate says 98%+ for egress and 0.3-0.4% failed, which is about the average failure rate of IPv4 connections…

The uploads (ingress) don’t show failures though… maybe my script is bugged…
I get somewhere around a 75% success rate on ingress, a bit more when the system isn’t too busy (a rough sketch of that kind of log check is at the end of this post).
I think your numbers are pretty good, but my gear is like a decade old, apart from my consumer-grade SATA SSD.

How many cancelled do you get, for ingress and egress?
From what I can gather, the reason I’m not getting better numbers is my disk latency…
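For reference, a minimal sketch of the kind of log check I mean (plain Python; the matched phrases like "uploaded", "upload canceled" and "upload failed" are assumptions based on typical storagenode log lines, so adjust them to whatever your version actually prints):

```python
# Hypothetical sketch: count ingress (upload) outcomes in a storagenode log.
# The matched phrases are assumptions -- verify them against your own log.
import sys
from collections import Counter

counts = Counter()
with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
    for line in log:
        if "upload canceled" in line:
            counts["canceled"] += 1
        elif "upload failed" in line:
            counts["failed"] += 1
        elif "uploaded" in line:
            counts["success"] += 1

total = sum(counts.values())
if total:
    rate = counts["success"] / total
    print(f"ingress success rate: {rate:.1%} "
          f"({counts['success']} ok, {counts['canceled']} canceled, {counts['failed']} failed)")
```

Run it like `python successrate_sketch.py /path/to/storagenode.log`; the same pattern works for egress by matching the download lines instead.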

Ah, you’re right, I didn’t think your ratio would be different.

This is my first node that is almost full. I need to adjust my dashboard to look at the second one.
My "failed" calculation counts cancelled messages as well.

Disk latency might play a decent role, which is why I’m hesitant to remove my SSD cache.

Wow, the Windows Storj dashboard looks really different… though maybe it’s because I’m still on v1.1.1, or because you have something custom running.

It won’t change what is counted as failed by the script I run to check my logs, but it might affect your time to verify writes and thus cause you to lose more ingress races… For some reason I never lose download races, but maybe that’s because it’s mostly test data, or just in my case… I know it’s not like that for everybody…

I will assume your write cache is a high-end NVMe SSD?

Which brand and model are you using?

Your ratios are damn near perfect…
I’m kinda looking to replace my own SATA SSD with an NVMe one, so it’s nice to get an idea of how much I need… I think the write cache is the most important. Even though a read cache might help, it most likely wouldn’t: storing data in cache before sending it doesn’t make sense, and your system will most likely just put it in RAM before sending it, which is faster anyway…
Of course a read cache might be able to predict, but my ARC is drawing a blank on the whole prefetch thing, at least last I checked…

And keep in mind the dataset ends up being HUGE… there’s no way it can all fit in SSD or RAM anyway, not for less than thousands of dollars if you want to run a semi-big node.
Maybe when it’s not test data a read cache will be much more useful, but sadly for now it’s mostly test data.

Caching only works for data that is accessed frequently. I wouldn’t be too sure about the usage patterns in the Storj network, but my guess is that it is more or less random access, seldom hitting the same file multiple times in a short timeframe. Unless your cache SSD is as big as your whole node, you wouldn’t get any performance boost.
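As a rough back-of-the-envelope check of that, here is a minimal sketch (plain Python, with a hypothetical 0.5 TB cache and 8 TB node) of the best-case hit rate a read cache can reach when access is uniformly random, which is simply cache size divided by node size:

```python
# Back-of-the-envelope: under uniformly random reads, the expected hit rate
# of a warm read cache is roughly cache_size / dataset_size.
def uniform_random_hit_rate(cache_tb: float, node_tb: float) -> float:
    return min(cache_tb / node_tb, 1.0)

# Hypothetical example: a 0.5 TB SSD cache in front of an 8 TB node.
print(f"expected hit rate: {uniform_random_hit_rate(0.5, 8.0):.1%}")  # ~6.2%
```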

It could at least boost the databases.
And maybe the ingress, by caching the files if they are being written synchronously; if they are written asynchronously, the RAM will cache them anyway.
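To illustrate the synchronous vs. asynchronous point, here is a minimal sketch (plain Python, a generic POSIX demonstration, nothing Storj-specific): a buffered write returns as soon as the data is in the OS page cache in RAM, while an O_SYNC write has to reach the underlying device first, which is exactly where a fast write cache helps:

```python
# Generic illustration of buffered vs. synchronous writes. Timings vary wildly
# by hardware; O_SYNC is POSIX-only, so this won't run as-is on Windows.
import os
import tempfile
import time

payload = os.urandom(1024 * 1024)  # 1 MiB of random data

def timed_write(sync: bool) -> float:
    """Write 1 MiB to a temp file and return elapsed seconds."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    flags = os.O_WRONLY | (os.O_SYNC if sync else 0)
    start = time.perf_counter()
    out = os.open(path, flags)
    os.write(out, payload)
    os.close(out)
    elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed

print(f"buffered (async) write: {timed_write(sync=False):.4f}s")
print(f"synchronous      write: {timed_write(sync=True):.4f}s")
```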
