Stumbled across this thread, because I am currently planning on building a new TrueNAS system that may also host some storage for STORJ.
While that is true, how big are the chances of an unexpected shutdown and the drive dying at the same time? Not saying the risk isn't there, but I would rate the risk of a broken pipe in my house flooding the TrueNAS higher than that. iXsystems seems to agree with me on that one and sells the TrueNAS Mini with only one drive for SLOG. Way more important would be PLP, to not lose the pool. And not only for safety: drives without PLP also aren't fast SLOG drives, because without PLP they can't make use of their fast internal cache. A UPS is a good way to avoid most unexpected shutdowns.
Anyway, with the database not in the same place as the STORJ data, is there a point in sync writes? Worst case is a lost write and my audit score goes down a little? I fail to see the need for sync.
ZFS's write cache is 1/8 of total RAM, flushed every 5 seconds.
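As a side note, the actual limits are tunable; on OpenZFS under Linux (e.g. TrueNAS SCALE) they are exposed as module parameters, which you can inspect like this (paths assume OpenZFS on Linux and may differ on other platforms):

```shell
# Inspect the ZFS write-buffer limits (OpenZFS on Linux):
cat /sys/module/zfs/parameters/zfs_dirty_data_max   # dirty-data cap, in bytes
cat /sys/module/zfs/parameters/zfs_txg_timeout      # TXG flush interval, in seconds
```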
SLOG is not a write cache. SLOG helps the pool by taking over the hosting of the ZIL.
A SLOG only moves the ZIL from the pool to a separate device, and this applies only to sync writes, not async, because async writes don't use the ZIL to begin with.
Which is not a good idea for performance. The only thing you do with that is force sync. This is only good for safety in some edge cases, where wonky applications require sync semantics but don't ask for them.
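For those edge cases, forcing sync is a per-dataset property. A sketch of what that looks like (the dataset name is just an example):

```shell
# Force every write through the ZIL (and thus the SLOG), even if the
# application never requests sync -- safety, not speed.
zfs set sync=always tank/wonky-app

# Verify the current setting:
zfs get sync tank/wonky-app
```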
It only reduces fragmentation for sync writes, which we should not do to begin with for STORJ data.
I think you have false assumptions about SLOG.
To better understand this, let's first look at how ZFS works without a SLOG.
Any writes, no matter if sync or async, are aggregated in RAM into TXGs. Sync writes are at the same time written to the ZIL (the ZIL lives on your pool, so in your case on the 12 EXOS HDDs). Only when the write to the ZIL is finished does the sync call return. In normal operation, that ZIL is never read. Sync writes get written in TXGs from RAM to the pool, and after that the ZIL entries get unlinked.
What happens after crash?
During the pool import, ZFS checks the ZIL for any dirty writes. If there is a dirty write, it gets replayed (reconstructed from the ZIL and committed to the pool, since the RAM copy is gone) and unlinked afterwards. So no data is lost here. The only downsides: your pool is not a very fast ZIL destination, and you get additional fragmentation, because you basically write everything twice and then delete one copy.
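If it helps, the sync/async distinction maps onto plain POSIX semantics. A minimal sketch of the analogy (this is not ZFS internals, just the calling convention an application sees):

```python
import os
import tempfile

# Analogy in plain POSIX terms: an "async" write returns as soon as the
# data sits in the kernel's buffer cache, while a "sync" write blocks
# until the data is on stable storage -- for ZFS, that stable storage
# is the ZIL (or the SLOG, if you have one).
path = os.path.join(tempfile.mkdtemp(), "demo.bin")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)

os.write(fd, b"async: buffered, call returns immediately\n")

os.write(fd, b"sync: caller waits until data is durable\n")
os.fsync(fd)  # like a ZFS sync write: blocks until committed

os.close(fd)
```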
To solve that problem, we can add SLOG.
How does ZFS work with a SLOG?
A SLOG is basically just a separate device for the ZIL. It behaves exactly the same as the ZIL. It is never read from except after a crash.
Does SLOG help with fragmentation?
For sync writes? Yes, because there is no ZIL on the pool.
For async? No. Async writes and sync writes with a SLOG behave 100% exactly the same on the pool side. They get aggregated in RAM (default is 1/8 of total RAM, flushed every 5 seconds) and then committed to the pool.
Should I set sync=always with a fast SLOG?
You can, but it won't make your asynchronous writes go any faster.
Remember, async write calls already return immediately. You literally can't improve on that, no matter what you do. And because both are written the exact same way from RAM, the absolute best case scenario is that the sync write does not get bottlenecked by the SLOG, because the SLOG is as fast as the RAM. Even if the SLOG were faster than RAM (which is basically impossible), it still would not go any faster, because the write still has to go through RAM. So in the absolute best case, impossible scenario, a sync write would offer the SAME performance as async.
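That latency difference is easy to see with a micro-benchmark sketch in POSIX terms (absolute numbers depend entirely on the device under the filesystem; on a ZFS dataset, the fsync latency is roughly your ZIL/SLOG latency):

```python
import os
import tempfile
import time

# Compare call latency: a buffered ("async") write returns as soon as
# the kernel has the data; write+fsync ("sync") waits for durability.
path = os.path.join(tempfile.mkdtemp(), "latency.bin")
fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
payload = b"x" * 4096

t0 = time.perf_counter()
os.write(fd, payload)                 # buffered: returns immediately
t_async = time.perf_counter() - t0

t0 = time.perf_counter()
os.write(fd, payload)
os.fsync(fd)                          # sync: waits for stable storage
t_sync = time.perf_counter() - t0

os.close(fd)
print(f"buffered write: {t_async*1e6:.1f} us, write+fsync: {t_sync*1e6:.1f} us")
```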
The potential benefit to setting zfs sync=always isn't speed, it's safety.
If you've got applications that notoriously write unsafely and tend to screw themselves after a power outage, you might decide to set zfs sync=always. Again, you're not going faster, you're going safer.
ZFS default behavior is "sync=standard", which is "asynchronous unless requested otherwise". "Asynchronous unless requested otherwise" write behavior is taken for granted in modern computing, with the caveat that buffered writes are simply lost in the case of a kernel panic or power loss.
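For reference, the three values of the `sync` property in `zfs` terms (the dataset name is an example):

```shell
# sync=standard  -> honor whatever the application asks for (default)
# sync=always    -> treat every write as sync (safety, not speed)
# sync=disabled  -> treat every write as async; fast, but a crash can
#                   lose the last few seconds of "synced" data
zfs get sync tank/storj
zfs set sync=standard tank/storj
```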
That is very interesting. I think this could help a lot for STORJ, especially for the filewalker. How are you using the special vdev?