Large temporary folder

Hey, my temporary folder is really bloated by the looks of it.
Here it is.


Is this going to be cleaned up automatically at some point, or should I ask the mods which files I can safely remove myself? :slight_smile:

1 Like

That’s a good question… Mine isn’t that big, but I have some pieces in there from December 2019 too, just like you…

1 Like

@Elektros I have the same, only my node is on macOS and the folder looks much the same (screenshot omitted). I was also under the impression that the files are 2.3 MB each.

If the files in temp are not open in the storagenode process (you can check with lsof on macOS or Linux), then they can be deleted. Files whose mtime is older than the start time of your storagenode process are probably safe to delete.
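For anyone who wants to apply that rule by hand, here is a minimal sketch in Go. The temp path and the start time are placeholders you'd fill in for your own setup, and it only prints candidates instead of deleting anything:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

func main() {
	// Both values are assumptions: point tempDir at your node's storage/temp
	// directory and set nodeStart to the actual start time of the process.
	tempDir := "/mnt/storagenode/storage/temp"
	nodeStart := time.Now().Add(-2 * time.Hour)

	entries, err := os.ReadDir(tempDir)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read temp dir:", err)
		os.Exit(1)
	}
	for _, e := range entries {
		info, err := e.Info()
		if err != nil || e.IsDir() {
			continue
		}
		if info.ModTime().Before(nodeStart) {
			// Dry run: only print candidates. Swap in os.Remove once you are
			// confident nothing still holds these files open (lsof can confirm).
			fmt.Println("stale temp file:", filepath.Join(tempDir, e.Name()))
		}
	}
}
```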

So, wouldn’t it be simpler to just stop the node and then delete everything in that folder?

1 Like

If you do that, you’d interrupt any ongoing piece transfers and lose credit for them. But if you don’t care about that, you could just delete everything in the temp dir without stopping the storagenode. Same effect. (Except of course on Windows, where I don’t think it would let you delete the open temp files without stopping the node.)

4 Likes

Shouldn’t deleting temp files be part of node startup? More precisely, it could be part of garbage collection.

6 Likes

Yeah, actually, I think deleting all temp files on start is a good idea. I’ll make a ticket to investigate that.

12 Likes

If you get an answer, post it here if you can :slight_smile:

1 Like

@thepaul any update on ^

Adding more stuff to the node boot doesn’t seem like a great idea imo… it already takes the better part of an hour to finish its regular boot tasks. Last time I booted my node (I was running my zfs pool in a degraded state to test some stuff), it took 80 minutes before the storagenode was done keeping utilization of the pool semi-high…

I wouldn’t want to imagine how long things might take if even more work runs at boot. On HDDs, seek time becomes one of the main factors as more processes try to access data at once; the heads go back and forth between different tasks and read/write IOPS slow to a crawl…

If one did want to add it to the boot phase, the tasks should be queued and run in sequence instead of in parallel (i.e. at the same time).
Running the tasks in sequence often greatly reduces the total time spent; of course that isn’t always true, but it’s a very good general rule of thumb…

Just like copying 5 different files at the same time takes longer than copying them one after another, as you normally would…

Anyway, just a thought.

… mine takes a few seconds, and I can see satellite queries for uploads, downloads, and audits right away. It doesn’t take more than a minute from a cold start.

You must have some other issue if it takes more than a minute to start up a node, let alone an hour…

I agree with the others above: temp folder data should be checked on startup, and if it isn’t required, complete, or in use, it should be safely deleted (or marked for deletion by Storj and left for garbage collection to clean up, etc.)…

2 Likes

Node startup takes only a few seconds, but the initial filewalker task that checks all the pieces and moves some to trash can take a few hours depending on node size and performance. However, it does not affect the node’s upload and download performance.

1 Like

Well, there are certain issues with putting everything at boot, like say…

  1. The less you reboot the node, the more work might pile up.

  2. When your node has problems and, let’s say, restarts from time to time (but otherwise works fine): in my case I see an iowait of around 40-45%, and after that initial round it drops to roughly 18% average, maybe a bit less, for anywhere from 40 to 80 minutes. My usual iowait is about 2% when the node is just running.

  3. If your system is overloaded, rebooting the node will not help; it only makes everything much, much worse, just like you wouldn’t want demanding workloads to kick in when you give the node a shutdown command… “please wait 80 minutes for your storagenode to shut down”…

  4. HDD latency scales up dramatically the more workloads you give them in parallel; in theory it can get so high that it would take 100 years to read the entire drive. So just piling on workloads without any sense of sequencing or management makes everything worse for just about everybody and benefits nobody. There is no point in my running those filewalkers, trash collection, and whatnot 10 times in a day when I’m tinkering.

Stuff like the filewalker and garbage collection should be triggered on a timer… Sure, maybe something like the filewalker is good to run at boot or shortly after. Maybe let the node dashboard show which secondary processes the node is running aside from being online, so people can understand what is happening. Wasn’t the whole reason garbage collection was moved to boot that people didn’t understand why their node suddenly ran with increased iowait, and so rebooted it and/or complained on the forum?

I’m not against having lots of features that improve the storagenode; I’m just trying to point out what a bad idea, and what a waste of SNOs’ resources, it is to run everything at boot every time…

Because I’m working on migrating my node and testing some different stuff, my system spent about 8 hours at 10x to 20x the iowait I usually have… just because I rebooted the node a few times…

There is no sensible reason why something should run the same damn process 8-10 times in a row.
And yes, it’s not normal that I reboot my node 8-10 times in a day… but I was benchmarking how long my storagenode boot takes, from when it starts until it goes back to the usual 2% iowait,
and what effect a degraded pool has, how much faster it would be when cached… and so on and so forth.

So when you say node startup, you define it as the point when it will accept and send data… That’s not how I define a boot. A boot is how long it takes for all the boot-related processes to finish…

It’s like booting into your OS: just because you can see the desktop doesn’t mean it’s done with everything in its boot sequence. It may run updates and all of that, which also takes resources and can be very distracting depending on what one is trying to do…

I don’t see why stuff needs to run 8 times in a row just because I stopped and started the node…
It doesn’t take much to record a timestamp and put a timer on how often it should run, just to limit it at least a bit… and for god’s sake, never run stuff in parallel on an HDD if one can avoid it.

There is a ticket for it, but it hasn’t been scheduled or roadmapped by project management yet. I will let them know that some people are very interested in its progress; maybe that will get it in faster.

Yes, I agree. Those are valid issues. Node start, though, is an exceptionally good time to do this task; we could schedule a tempfile cleanup task every N days, but during normal runtime it may be very hard to know whether a tempfile is still open and expected to be present in some other part of the system. When the node is starting up there is an easy guarantee that files with mtime < starttime can be thrown away.

So there are good reasons to do this at start, also. However, maybe we can make everything work without adding another on-start task. We’d need to do a quick audit of all code touching files in the temporary directory to be sure, but maybe there is some age A where all tempfiles of age >= A can be safely deleted. If that is the case, we would be better off using a regularly scheduled task instead of doing it on start.
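If the scheduled variant won out, the shape of it would be roughly this (just a sketch, assuming such a safe age A exists; none of this is the actual storagenode code):

```go
package tempclean

import (
	"os"
	"path/filepath"
	"time"
)

// cleanTempLoop is a sketch of the "regularly scheduled task" variant: every
// interval, remove temp files older than maxAge. The safe value of maxAge is
// exactly the "age A" discussed above and would need that code audit first.
func cleanTempLoop(tempDir string, maxAge, interval time.Duration, stop <-chan struct{}) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			cutoff := time.Now().Add(-maxAge)
			entries, err := os.ReadDir(tempDir)
			if err != nil {
				continue // a real implementation would log this
			}
			for _, e := range entries {
				info, err := e.Info()
				if err != nil || e.IsDir() {
					continue
				}
				if info.ModTime().Before(cutoff) {
					_ = os.Remove(filepath.Join(tempDir, e.Name()))
				}
			}
		}
	}
}
```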

No, garbage collection normally happens when a Retain request is received from a satellite. It doesn’t have anything to do with node start.

It’s not entirely clear: are you using hyperbole here, or is there really something that gets run 8 times in a row when you stop and start the node? Because yes, certainly, nothing should need to run 8 times in a row when that happens.

If you mean that you stopped and started the node 8 times in a row, and the directory traversal happened once each time, then yes, that’s a thing that happens. The reason is that the software needs to know how much space it has used on the drive (not how much space is used on the volume, but how many bytes have been used for file storage inside the node’s blobs directory). Without that, we wouldn’t be able to provide the “don’t use more than X bytes” feature.

We keep track of changes to the space-used value when writing new blobs or deleting blobs, but since the fs is of course not transactional, there is always a chance of our cached space-used count getting out of sync with what’s actually on disk. And when the service is newly started, there is always a chance that a previous invocation of the service crashed without being able to persist the space-used value to disk, so the risk of exceeding our data allowance is even higher.
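To make that concrete, here is a rough sketch of the cached-value approach (illustrative names only, and the periodic persist-to-db step is omitted):

```go
package spaceused

import (
	"io/fs"
	"path/filepath"
	"sync/atomic"
)

// Cache sketches the idea described above: a live space-used counter updated
// on every blob write/delete, plus an expensive Rebuild that walks the blobs
// directory when the cached value may have drifted (for example after an
// unclean shutdown). The names are mine, not the actual storagenode types.
type Cache struct {
	usedBytes atomic.Int64
}

func (c *Cache) AddBlob(size int64)    { c.usedBytes.Add(size) }
func (c *Cache) DeleteBlob(size int64) { c.usedBytes.Add(-size) }
func (c *Cache) Used() int64           { return c.usedBytes.Load() }

// Rebuild is the costly directory traversal the thread is talking about.
func (c *Cache) Rebuild(blobsDir string) error {
	var total int64
	err := filepath.WalkDir(blobsDir, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() {
			return walkErr
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		total += info.Size()
		return nil
	})
	if err == nil {
		c.usedBytes.Store(total)
	}
	return err
}
```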

There are some possible mitigations we could employ, though:

  1. introduce a scheduling layer for blob i/o and make the space-used update traversal be low priority, so that it would only make progress when there is no other i/o traffic going on
  2. add a config item indicating that there is no limit on the amount of space used, other than the size of the volume in which the storage directory lives. In this case, the node could use the volume’s filesystem stats to determine space used without ever doing a dir traversal (see the sketch after this list)
  3. add a config item which says explicitly “don’t do a directory traversal to update space-used; just trust the last value that you saved instead”. You could use this when tinkering to avoid incurring the extra load.
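Option 2 in particular would be fairly little code. A Unix-only sketch of the volume-stats check (my naming, and the reserve value is just an example, not a Storj default):

```go
package diskfree

import "syscall"

// VolumeFree sketches mitigation 2: instead of walking the blobs directory,
// ask the filesystem how much space is left on the volume. Unix-only, since
// syscall.Statfs does not exist on Windows.
func VolumeFree(path string) (uint64, error) {
	var stat syscall.Statfs_t
	if err := syscall.Statfs(path, &stat); err != nil {
		return 0, err
	}
	// Bavail = blocks available to unprivileged users, Bsize = block size.
	return stat.Bavail * uint64(stat.Bsize), nil
}

// enoughSpace reports whether the node can keep accepting pieces while still
// leaving reserveBytes free on the volume. reserveBytes is an assumption here.
func enoughSpace(path string, reserveBytes uint64) bool {
	free, err := VolumeFree(path)
	return err == nil && free > reserveBytes
}
```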
2 Likes

:+1:

Just delay the task by e.g. 24 hours and still use the startup time of the node, like you said:

That’s a pretty good idea too. Hopefully there aren’t many nodes restarting more frequently than 24h.
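A minimal sketch of that check (where the last-cleanup time gets persisted is left open; the 24h value is just the figure suggested above):

```go
package tempclean

import "time"

// shouldCleanOnStart sketches the "delay by 24 hours" idea: on node start,
// only run the temp cleanup if the previous run is older than minInterval.
// Where lastClean is persisted (a db row, a marker file's mtime, ...) is an
// open question here, not something the storagenode already provides.
func shouldCleanOnStart(lastClean time.Time, minInterval time.Duration) bool {
	return time.Since(lastClean) >= minInterval
}
```

On boot the node would call something like shouldCleanOnStart(lastClean, 24*time.Hour) and only run the mtime-before-start cleanup when it returns true.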

I went on a bit of a tangent, but after a few tries I did come up with a solution that should make repeated runs of the filewalker essentially obsolete. Of course one should still run it to be sure the numbers match, but there would essentially be no need to… I moved my suggestion for how to make it work to the top.

Oh, I know: we use the timestamps on files… (we already rely on the storagenode clock being in sync for things to work). Each file gets a timestamp on creation, so if we filewalk all files older than the moment the filewalk started, that gives us an exact number, and then we add the running tally of data changes since the filewalk started.

The tally is the ingress amount minus the deleted amount since the filewalk started. We keep this tally in memory; if the storagenode is shut down correctly the tally is saved, and it is reloaded on storagenode boot.

If the storagenode isn’t shut down correctly, the tally is lost and the filewalker does its thing…

Then I might add some kind of infrequent check for when the filewalker tally gets really outdated… like, say, weeks or months old… or for when one changes the capacity of the node, or whatever…

But yeah, the timestamps are the key, because then one can keep track of the exact amount of data at any time without having to run the filewalker at all… in theory.
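As a rough illustration of that scheme (my own sketch, not how the storagenode implements anything today, and it glosses over races like an old piece being deleted mid-walk):

```go
package spacetally

import (
	"io/fs"
	"path/filepath"
	"sync/atomic"
	"time"
)

// Tally sketches the timestamp idea above: a walk counts only pieces whose
// mtime predates the moment the walk started, while a live delta tracks
// ingress minus deletes from that moment on, so walked + delta stays a good
// estimate without re-walking. Names are illustrative, not storagenode code.
type Tally struct {
	walkStart time.Time
	walked    int64        // bytes counted by the last walk
	delta     atomic.Int64 // bytes added minus bytes deleted since walkStart
}

func (t *Tally) OnIngress(size int64) { t.delta.Add(size) }
func (t *Tally) OnDelete(size int64)  { t.delta.Add(-size) }
func (t *Tally) Total() int64         { return t.walked + t.delta.Load() }

func (t *Tally) Walk(blobsDir string) error {
	start := time.Now()
	t.delta.Store(0) // changes from here on are covered by the live delta
	var total int64
	err := filepath.WalkDir(blobsDir, func(path string, d fs.DirEntry, walkErr error) error {
		if walkErr != nil || d.IsDir() {
			return walkErr
		}
		info, err := d.Info()
		if err != nil {
			return err
		}
		// Skip pieces created after the walk began; OnIngress already counts them.
		if info.ModTime().Before(start) {
			total += info.Size()
		}
		return nil
	})
	if err == nil {
		t.walkStart = start
		t.walked = total
	}
	return err
}
```

The save-on-clean-shutdown / reload-on-boot part would sit on top of this, persisting walked, walkStart, and the delta.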

Rants and reasons below; I might have deleted some of it while trying to streamline it… but now this comment is just a mess, and I’ve already spent the better part of an hour coming up with this solution.

I don’t disagree that some things may be vital to booting, but if somebody is troubleshooting a node, it can end up running many, many times, which seems wasteful…

Your mitigation suggestions are pretty good, though I would say 2 and 3 are essentially the same thing; or rather, 2 can do the same job as 3 and thus makes it redundant.

Point 2 seems like an easy-to-implement solution…

Now that I think about it, though… mitigation 1:

If we imagine this was an SMR drive, which has difficulty keeping up with I/O because of node traffic… let’s say it would normally take 3 hours to run the filewalker space accounting. At low priority that might stretch to something close to days, which could be seen as an issue by the SNO,
and in the meantime the node could potentially use more space than was allotted.

I like the idea, but it might cause as many problems as it solves… I dunno.

The more I think about it, the more I think there might be a solution in looking at it inversely:
set it up so the node does the space accounting infrequently.

Then on a normal boot it could do a sequence of things:
1. check the number of files vs. the number of deletions since its last space accounting / filewalk;
then, if the number of deleted files is a good deal less than the total number of files, it would do a filewalk… on the… deleted files… Of course that would require the files to still exist, which in most cases I assume they don’t… but anyway, if they did, it would take much less time to account for them and subtract that from the older space accounting…

Actually, forget all of that… why can’t it just track it live? I assume the node could keep a tally of how much it has deleted and how much it has stored since the last filewalk / space accounting. Sure, it may not be 100% accurate, but we don’t need 100% accurate, we just need a decent estimate…
And then if its space-accounting number is more than a certain age, it would run a new one, which starts by resetting the tally of deleted and stored, because then whatever ingress there is while the filewalking / space accounting is running will… DAMNIT, that cannot work either…

Because if we have either massive deletes or massive ingress, it would deviate from the exact space or blob size… If the filewalk takes days, there could be like 1 TB of ingress,
and when it’s done counting, the number wouldn’t be correct anymore.

It sounds like what you’ve described is almost exactly what the storagenode does! It tracks changes to space used per satellite, live, anytime there are new writes or deletes, and periodically writes all space-used totals to a local db. And on some infrequent schedule, we walk the filesystem to update the space-used tallies in case our cache has drifted away from the real amount.

The problem is that the “infrequent schedule” being used right now is “when the storagenode is restarted”. This is partially because, on start, there is a greater chance of the tally being wrong (possibly something went wrong with the last invocation, where it wrote new files but couldn’t update the space-used db, or possibly all files have been moved to a different filesystem, etc), and partially because we have the assumption that people aren’t restarting the storagenode process very often.

Clearly that last assumption caused the trouble in your case.

It can also cause trouble after a server restart for people running many nodes, as the I/O will be huge. But it might be tricky to solve.

1 Like