Temporary files on Node

Is it possible to avoid the temporary blob files on the node?
They are always about 4 MB.
Can i store them in RAM?
Filestore.write-buffer-size is not working; i configured it to 6 MiB, but files still keep appearing and disappearing in the temp folder.

there are certain issues with storing even temporary data in memory for extended periods; unless you have proper registered ECC RAM and a system to match, the data may get corrupted while in memory.

what are you trying to do? reduce IOPS? improve performance?

generally such things run in cache; your HDDs will store stuff in their own cache for easy access, and there is a wide range of caching solutions one can implement, usually based on SSDs or similar technology,
so that the data isn’t lost on power failures, which is yet another issue with keeping data in RAM.
your OS will also, to some degree, try to cache frequently used stuff in RAM…
but still, some data needs to be flushed to disk as fast as possible to limit the odds of corruption.
if it’s successfully written to disk it’s usually “pretty safe”.

there should be an option to store the databases and similar temp data in an alternative location, such as on an SSD, though i never tried to set that up… my system runs ZFS, so it’s very adept at caching.

ofc data being in a cache doesn’t mean it’s not written to disk, just that it’s easy to access again… frequently used data sits in the cache, and that speeds up most processes immensely.

some data demands to be written to disk, and no acknowledgement is sent back before it’s actually saved… unless one cheats :smiley: but that’s generally a bad plan… fast, but pretty bad…

I want the piece to be written in one go, not cached to temp and then moved or copied to its final location. That is more operations.

I guess that’s a valid question for SMR drives? But all other drives would also benefit from fewer operations. Is the temp folder really used for caching pieces on every upload?

that’s a write cache; the problem is holding the data until it’s written, because sync writes have to hit the disk before your system will acknowledge they are saved.

you are on Windows, right?

here is a random list of stuff that might be good

seems like the list i provided is kinda weak, as Crucial and Kingston require specific models of their SSDs for their software to work.

i would say go with PrimoCache, and either hunt for a crack or just pay for it; it’s about the price of a cup of coffee.
seems to be a one-time deal and i bet you can install it on an unlimited number of machines.

persistent SSD caching should, if the software works correctly, make it possible to do sync writes without them actually being stored on the HDD immediately… but ofc some programs might check… i dunno… it’s rather complex…

but if you imagine something being written to a cache instead of the HDD, and the program that wrote it then checks whether it was written, or wants to reload the data, then it won’t read it from the cache…

ofc that’s the kind of stuff the caching software should handle…
otherwise you are into using something like LSI CacheCade, which is proprietary software and usually requires a RAID controller and a hardware license key.

this comment is so getting the ban hammer, i can feel it coming a mile away

Storj pieces are not sync writes; only the DB uses sync writes. Therefore your post is not applicable.

they can also do async writes, but if they are async writes then they would be held in memory anyway; ofc a proper cache always makes HDDs run better… it’s just an expensive solution.
async writes are the easy ones to cache, because they are not ack’ed and thus can wait into the minutes range before being written.
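a quick way to see the difference on Linux, assuming a throwaway test file on the HDD (the path is just an example, numbers will vary a lot per drive):

```
# buffered (async) writes: the OS caches them and acknowledges right away
dd if=/dev/zero of=/mnt/hdd/ddtest bs=128k count=1000
# sync writes: every block has to hit the disk before dd continues
dd if=/dev/zero of=/mnt/hdd/ddtest bs=128k count=1000 oflag=sync
rm /mnt/hdd/ddtest
```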

on top of that, a cache helps with data that is read often, which reduces the repetitive read IO and thus leaves room for more write IO.

personally i would always use caches for performance, because they are awesome… almost anything runs better with a cache; sure, maybe there’s the 0.01% of cases where it won’t matter, but for everything else caching rules.

was just trying to say that sync writes are the more difficult writes to cache, but i suspect most caching software would find that rather simple, as it will most likely sit as a driver-type layer between the HDD and the OS.
thus even in the sync case it should be able to work effectively.

This doesn’t really have anything to do with sync or async writes. Both end up on disk and @vadim is looking to skip that step entirely. I don’t see any options to move the temp location, but you could always symlink the temp folder to a ramdisk. Make sure you have plenty of space (RAM) though, as slow transfers can really start filling up that RAM. I’m guessing this is the reason the temp folder exists in the first place. V2 could really run into RAM issues because of this.
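For what it’s worth, a minimal sketch of that idea on Linux, with example paths and completely untested, so adapt it to your own layout and stop the node first:

```
# create a 1 GiB ramdisk (tmpfs)
sudo mkdir -p /mnt/node-temp
sudo mount -t tmpfs -o size=1g tmpfs /mnt/node-temp

# swap the node's temp folder for a symlink pointing at the ramdisk
mv /storagenode/storage/temp /storagenode/storage/temp.bak
ln -s /mnt/node-temp /storagenode/storage/temp
```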

i have a minimum of 16 GB and some servers have 24 GB; the servers are dedicated to nodes only.
Almost all of my HDDs are WD Purple at 5400 rpm, so they perform slower, but have a very good lifetime.
I want to skip writing the temp files there. A ramdisk is too complicated; a simpler solution is better.
Filestore.write-buffer-size: 5 MiB should resolve it, since the pieces that occur are 4096 KB, but it doesn’t. I think maybe there is some bug.

That should usually be plenty as the folder is usually empty or close to empty. I don’t monitor this folder constantly though and I’m guessing there may be peaks. Would be best to have a fallback as a full ramdisk would result in every upload failing. I guess that’s not fatal though.
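If you want to keep an eye on those peaks, something simple like this works (example path):

```
# print the temp folder's total size and file count every 10 seconds
watch -n 10 'du -sh /storagenode/storage/temp; ls /storagenode/storage/temp | wc -l'
```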

Also, I have definitely not tested this suggestion so try at your own risk. But I think you have more than enough nodes to try this with one or two. Do report back on your experience. I’m kind of curious whether this would relieve some of the stress on the HDD. If so we might want to request a feature to make the temp path a setting (just like the db path). The temp path is kind of an ideal fit for a ramdisk, as any interruption that would cause the RAM disk to lose data would also fail the ongoing transfers and make the temp files obsolete, so it should only impact ongoing uploads at worst.

I hope someone from the Storj team can add some information. I do not want to ping them, as it is vacation time now and they definitely also need to rest from time to time.
I have seen in some other topic that setting Filestore.write-buffer-size to 4 MiB+ should resolve this, but it doesn’t seem to do anything at all. I have searched through the whole list of config options that can be changed, but haven’t found anything else about this. I think this topic would find a good place in the wiki.

Yes, i have about 10 new nodes to test with.

Are you sure it does nothing? I think that may just limit the number of writes to the temp file, but doesn’t skip the temp file step entirely. So instead of adding small amounts to the temp file, it writes the temp file at once and then moves it. May look quite similar as files still end up in that folder. They may also be preallocated, which would make it even harder to see the difference.

Someone from Storj will drop by soon to add some info, I’m sure. I think it’s an interesting thing to test though.

as i understand it, this setting should set a RAM buffer limit for every upload; anything beyond that gets buffered to temp files. But maybe my understanding is wrong.

I took it from this topic Filestore.write-buffer-size

i think the point he makes here is that since the files are smaller than 4 MB 99% of the time, and the buffer is per incoming piece, the piece will basically always fit into RAM while it’s being uploaded.

how big are your temp files? all mine seem to be about 4 MB…
and so old… most of my temp files are like 6+ months old… O.o
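i checked mine with something like this, in case anyone wants to compare (adjust the path to your node):

```
# list temp files with size and modification time, oldest first
ls -lhtr /storagenode/storage/temp
# or show only the ones older than 30 days
find /storagenode/storage/temp -type f -mtime +30 -ls
```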

then the benefit seems to be in how many times it writes to this temp file: every 128 KB, or in one single write.
So it will save some IOPS, and potentially cause less fragmentation.

Going by @CutieePie’s message (thanks for looking at the code btw, I was kind of too lazy to go there, but it seems to confirm my suspicion), it seems to preallocate the file. This means the buffer won’t have an impact on fragmentation, but it will of course limit the IOPS. For SMR disks that could save several track rewrites though. So it could be very much worth it there.

I have plenty of RAM as well, so I might just increase this setting to 4MiB myself.

i wasn’t sure if i had changed it on my main node, so i checked the temp files on some other nodes…
my temp files all seem to be 4 MB… kinda makes me wonder if it is using 4 MB by default…

kinda makes me wonder if 4 MB is the current setting on all nodes by default, even if 128 KB is the default in the code.
most HDDs run best at around 64 KB - 128 KB writes; ofc sequential is always preferred / better…
but that is the size where HDDs in general see the highest IOPS and throughput when benching more random r/w across many different drives and technologies.
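if anyone wants to verify that on their own drives, a fio run per block size is the usual way to bench it (illustrative parameters, needs the fio package and some free space on the disk):

```
# random 128k writes against a 1 GiB test file on the HDD
fio --name=bs128k --filename=/mnt/hdd/fio.test --size=1G \
    --rw=randwrite --bs=128k --direct=1 --runtime=30 --time_based
# repeat with --bs=64k, --bs=4k etc. and compare the reported IOPS / bandwidth
```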

i do have 1 file on one node that is from oct 3 and is only 3072 bytes.
seems a bit like the storagenode isn’t great at clearing its temp files…

and another one that’s from sep 19 at 1280 bytes;
otherwise they are all exactly 4 MiB.

maybe that’s why one doesn’t get any advantage… if the temp file is capped at 4 MiB, then going beyond 4 MiB might not fit, unless the temp file size is increased…

i dunno… but it seems possible imo

If it would be safe to do, we would’ve done it in the code directly.

The main issue is that the storage node must be able to recover from crashes such that they don’t lose data.

As an example, if storage node receives the piece into RAM and immediately responds with “success”, then the storage node may crash (or the disk gets full due to some other rogue process) and is unable to save it to disk. This would mean that the storage node will be eventually disqualified due to not holding the pieces it has agreed to hold.

In principle write-buffer-size was specifically designed for allowing to increase the number of bytes held in RAM during upload for slow disks. We cannot trivially figure out what the exact number should be, so it’s kept at 128KiB, which should be decent for most cases.
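For anyone who wants to try it, the setting is usually changed in config.yaml (restart the node afterwards), or appended as an extra flag on Docker nodes; treat the exact value format below as an example rather than gospel:

```
# in config.yaml:
# filestore.write-buffer-size: 4.0 MiB

# or, for Docker nodes, appended after the image name:
docker run -d <your existing options> storjlabs/storagenode:latest \
    --filestore.write-buffer-size="4MiB"
```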

With regards to the cost of creating a temp file and rename. Renaming a file from temp directory (the one in storage directory, not the system one) to the stored blobs is a cheap operation, since it only involves changing the filename and not moving the data around. The temp directory is used to ensure that there aren’t any “half-written” pieces in the actual blob store.
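Roughly speaking, the final step is equivalent to this (names and paths are illustrative), and because source and destination are on the same filesystem only the directory entry changes:

```
# the piece is fully written into the temp folder first, then renamed into blobs;
# the 4 MiB of data is not copied again
mv /storagenode/storage/temp/blob-XYZ.partial \
   /storagenode/storage/blobs/<satellite folder>/xy/<piece id>.sj1
```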


Thank you for the reply, now i understand this thing better. So if i put 4 MiB instead of 128 KiB, the temp file will be written once, and not in 128 KiB parts (a 4 MiB piece is 32 writes of 128 KiB versus a single one). This will save a lot of IOPS and take more RAM.

For systems like a Raspberry Pi the 128 KiB buffer is OK, as they have less RAM.
Today there is very little incoming traffic, so it almost doesn’t matter, but there are times when it is very heavy, and that’s when the IOPS are needed.


Thank you for the information, it also added a view from the inside and is very valuable.