High load after upgrade to v. 1.3.3

Hi there,

After upgrading to v1.3.3 the load is significantly higher. Yes, I do have an 8TB SMR drive, but before the upgrade to v1.3.3 the load was below 2, and iowait is much higher now.
Any advice will be appreciated. :wink:

You may want to look into the suggestion here.

I have similar issues; so far I'm dealing with it by giving the drive some time to breathe: every 8 hours the node is recreated with a lower amount of storage, which stops data intake. An hour later the change is reverted to allow uploads again. This amount of time seems to be enough for the SMR drive to clean up its cache area.
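
For anyone wanting to try the same thing on a docker-based node, the cycle looks roughly like the sketch below. The container name, wallet/address placeholders, identity and data paths, and the two STORAGE values are only examples; substitute the parameters from your own docker run command.

recreate_node() {
    # stop gracefully (give in-flight transfers up to 5 minutes), then recreate with a new allocation
    docker stop -t 300 storagenode
    docker rm storagenode
    docker run -d --restart unless-stopped --name storagenode \
        -p 28967:28967 \
        -e WALLET="0x..." -e EMAIL="me@example.com" -e ADDRESS="mynode.example.com:28967" \
        -e STORAGE="$1" \
        --mount type=bind,source=/mnt/storj/identity,destination=/app/identity \
        --mount type=bind,source=/mnt/storj/data,destination=/app/config \
        storjlabs/storagenode:latest
}

recreate_node "100GB"   # allocate less than is already used, so the node stops accepting uploads
sleep 3600              # give the SMR drive an hour to flush its persistent cache region
recreate_node "8TB"     # restore the real allocation so ingress resumes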


Hi. I have had the same problem since 1.3.3.
The blame is always put on the SMR hard drives. So I would be interested to know why the problem has only occurred for me since 1.3.3. It never happened before, and I already had a lot more traffic back then. (Now the hard drive is full.)

SMR drives might not do so great if filled to capacity, however issues like that shouldn't show up just after an update… but I suppose they could…
I have noticed that if I have downtime, my ingress often increases greatly until I'm back in sync with the network… so if your update was a bit slow, then maybe that could affect it…

I mean, I had some issues the last couple of days, and when I finally got back on the network I peaked at 110 Mbit ingress with an average of nearly 5 MB/s,
but I doubt that is it.

It seems to be a trend that the 1.3.3 upgrade has caused higher loads, which is to be expected sometimes when new code is being rolled out…

SMR drives read just fine… so if you are near max node capacity, leave a bit free so the disk doesn't get too cluttered / fragmented… remember it has to read and move blocks around just like an SSD, so you will give it a ton of extra work if you fill it to 100%.
I have no idea where the sweet spot is though… with SSDs they say 80%, but with an SMR drive I would think the number is higher… maybe 90 or even 95% is fine… you would often feel within a few weeks if it's having trouble, because it will get slower and slower at writing… though reading should be just fine.

I would just adjust the max concurrent requests in the config.yaml to something like 10-20.
It might look a bit ugly when you boot the node, but it will keep the network from flooding your node with requests your system isn't fast enough to answer anyway…

I run at 20 with 400 Mbit fiber, 48 GB RAM, a dedicated SSD for my OS, another dedicated SSD handling SLOG and L2ARC, and then 5 HDDs in raidz1.
It's a monster that eats whatever the network throws at it… and still, because my local 1 Gbit network infrastructure doesn't like the strain, and I have other people using my network, latency is a thing.
My system can keep up with the network, even if it doesn't manage to win every upload; 15% or so get cancelled, but it rejects zero… when running at max concurrent 20.
Though booting the node is f'kered, because pings and cleaning orders count as concurrent requests.

Anyway, it limits the number of concurrent actions for the computer and for the network, which keeps everything running smoothly.

And I know not everybody will agree (looks at Brightsilence), but I wouldn't run without it… and I've turned it on and off like 15 times, sometimes leaving it off for a few days of stable activity and then turning it on again… to me it makes a noticeable difference in my performance, latency and such.

Though now that I've got my system optimized to a T, I might be able to run it at 200+ again, or unlimited at 0.
But like I noted earlier, it's most likely my network which needs better gear… I'm using old 1 Gbit ethernet-capable routers converted into switches… or I assume switches, I suppose they could be hubs, maybe that's the issue… well, the server connection hops through a couple of those before hitting the fiber.

Anyway, long story short, limiting max concurrent has helped my node and infrastructure run smoothly.

Add this to the end of your config.yaml
# Maximum number of simultaneous transfers
storage2.max-concurrent-requests: 7
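
As far as I know the node only picks the setting up after a restart, e.g. for a docker node (assuming the container is named storagenode):

docker stop -t 300 storagenode    # the 300 s timeout lets in-flight transfers finish
docker start storagenode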

SGC goes back to check if he can finally run infinite with his current rig

Welcome back to the network: get spammed by 20 or so upload requests at once, plus cleanup and auth procedures. My system was at 20% iowait for the first 10 minutes… seems to be back down to near 0% now… but it isn't a gentle start-up… ridden hard and put up wet…

It's the large number of deletes coming through that is causing the high system load.

$ grep -i deleted /opt/storj/node.log |wc -l
419057
$ grep -i uploaded /opt/storj/node.log |wc -l
189679
$ grep -i downloaded /opt/storj/node.log |wc -l
147082

Since the beginning of May, my node has received more deletes than uploads and downloads combined.

I got a few days of those also… but it seems like I've gotten through them.
I only have 775 deletes logged for today, though I don't think those include the cleanup; if I were to hazard a guess, they get registered by the node when they are logged and then finally deleted some amount of time later, during the next cleaning cycle.

I did get like 500,000 deletes or more over a few days.
Here is the successrate.sh result on the log from today.
I just set my max concurrent to 0 when I made the first post,
but I can already see a drop in my upload success rate of 0.1%,
though my system seems to be getting close to being able to keep up.

========== AUDIT ==============
Critically failed:     0
Critical Fail Rate:    0.000%
Recoverable failed:    7
Recoverable Fail Rate: 1.039%
Successful:            667
Success Rate:          98.961%
========== DOWNLOAD ===========
Failed:                91
Fail Rate:             1.650%
Canceled:              7
Cancel Rate:           0.127%
Successful:            5417
Success Rate:          98.223%
========== UPLOAD =============
Rejected:              81
Acceptance Rate:       99.859%
---------- accepted -----------
Failed:                0
Fail Rate:             0.000%
Canceled:              8855
Cancel Rate:           15.406%
Successful:            48623
Success Rate:          84.594%
========== REPAIR DOWNLOAD ====
Failed:                3
Fail Rate:             7.692%
Canceled:              0
Cancel Rate:           0.000%
Successful:            36
Success Rate:          92.308%
========== REPAIR UPLOAD ======
Failed:                0
Fail Rate:             0.000%
Canceled:              154
Cancel Rate:           16.348%
Successful:            788
Success Rate:          83.652%
========== DELETE =============
Failed:                0
Fail Rate:             0.000%
Successful:            775
Success Rate:          100.000%

The 3 failures were because I changed something in the BIOS which didn't agree with my server and it started shutting down randomly… seems to be fixed now, whatever it was.
I bet you that if I leave it at max concurrent 0, my network graph will start trailing downwards and never recover.
Also note how much one gets punished when returning to the network… I would fear dealing with that on one HDD…

It also becomes a bit of a self-reinforcing cycle…
If the server falls behind, more traffic seems to pile up on the node, leading to higher load on the system; then it's attempting to take an upload, gets halfway, and it gets cancelled, and then it's on to the next one, maybe also failing that because it's trying to handle 4 other uploads that will also end up getting cancelled…

I've most often seen performance gains from running a lower max concurrent (of course, only to a point) rather than going too high, because that just lets it run into the self-reinforcing downward spiral of latency and overload death… xD

Wasting tons of bandwidth on maybe 50% cancelled ingress uploads, and maybe increasing disk latency, though I would hope uploads stay in RAM until they are complete and ready to go to disk.

About 3 hours in with max concurrent = 0 / unlimited.

Doesn't look too promising…

========== AUDIT ==============
Critically failed:     0
Critical Fail Rate:    0.000%
Recoverable failed:    8
Recoverable Fail Rate: 0.995%
Successful:            796
Success Rate:          99.005%
========== DOWNLOAD ===========
Failed:                114
Fail Rate:             1.593%
Canceled:              16
Cancel Rate:           0.224%
Successful:            7027
Success Rate:          98.184%
========== UPLOAD =============
Rejected:              81
Acceptance Rate:       99.881%
---------- accepted -----------
Failed:                0
Fail Rate:             0.000%
Canceled:              10480
Cancel Rate:           15.355%
Successful:            57770
Success Rate:          84.645%
========== REPAIR DOWNLOAD ====
Failed:                3
Fail Rate:             6.818%
Canceled:              0
Cancel Rate:           0.000%
Successful:            41
Success Rate:          93.182%
========== REPAIR UPLOAD ======
Failed:                0
Fail Rate:             0.000%
Canceled:              172
Cancel Rate:           15.665%
Successful:            926
Success Rate:          84.335%
========== DELETE =============
Failed:                0
Fail Rate:             0.000%
Successful:            889
Success Rate:          100.000%

It does seem to be holding its ground according to the success rates, but the graph tells me overall throughput is way down…

I've also gone from 16-30 ms peak latency on my HDDs to about a 20-700 ms backlog.
My SSD has gone from peaks of a few ms, and maybe 10 ms while cleaning, to an average of about 40 ms.

I'll keep it running for a bit, but I don't expect it to get any better, and I know people tell me they can run unlimited, but I sure don't have the resources for it yet… I might hook up 4 additional disks in one more vdev for the pool, and stripe the SLOG SSDs via partitions, and see if I can't squeeze out enough performance…

These egress numbers look interesting though… but they usually do this peak thing and then it just ends up being an overall lower average anyway… maybe if my machine could keep up… I could get higher…

And like I said, I have tried this many times, always with the same result long term… slowly trailing downwards over days, and when I switch it off again, performance goes up nearly immediately…

So yeah, I cannot stress enough how important the max concurrent setting is for smooth operation.
And on top of that, limiting it will also keep your other internet usage from being disrupted; in theory you should be able to game on the connection when running the right max concurrent setting.

I'm not saying your node should reject a ton of requests, just enough that it can actually keep its latency down and thus keep performance up…

Think of it like this… nah, I can't come up with an analogy that makes sense…
The fact is, the higher the max concurrent is, the slower the disks get, and it hurts overall performance.
If somebody thinks they can explain exactly why, then please enlighten me…

So even if the success rate stays around the same mark, the throughput goes down,
as clearly shown by experimentation; of course this might not be the case for everyone, but it sure is in mine.

I'm sure Storj will try to optimize it more eventually, but for now I am not aware of any other way than limiting the max concurrent requests in the config.yaml as described above and discussed in the post I linked.

You do realize that this means the setting isn't making any difference, right?

All this setting does is reject uploads when you cross a certain limit. If you see no rejections, it may as well be set to unlimited.

It's pretty rare that an update actually impacts something like this. Most of the time it's a change in traffic patterns, like a large amount of deletes. Additionally, the node can do some IO-heavy maintenance during updates. I guess that could kickstart CMR cache saturation and get your node into trouble at around the time of the update. It would be nice to test this by getting your node some down time by lowering the available space like @hoarder does.

You keep saying that… however, I see a huge change in my storage node's performance when I do change it…
I don't know why… but it's difficult to argue with the numbers; no matter how much you say, it won't change that fact.

I keep saying it because the source code doesn't lie.

I don't know what effect you're seeing, but it's not that setting.

But it's been the only thing I've changed, and because of our continued discussion about it, I've had it running for like a week with and a week without to compare… I'm not saying it's the programming per se, but it clearly has an effect on how smoothly my HDDs are running, because once in a blue moon it rejects some requests, keeping the system from getting flooded with more than it can keep up with and thus entering a decreasing spiral of continuously bad performance.

HDDs tend to stall out if they are given too many orders at one time, which I assume is why limiting max concurrent improves performance on a system that seems stressed… I'm sure that if the system could keep up, it would only be detrimental… but my current system just isn't powerful enough…

I seem to be getting kind of close now though… I just added a second SLOG SSD for even lower-latency write caching. Some of my HDDs are still having trouble though… maybe it's the one I have been having issues with; it seems to keep giving me higher latency than I get from the others…
I've been thinking of just adding an additional vdev of 4 drives in raidz1 to the pool, to take half the load off the 5 in raidz1 already in the pool.

That doesn't quite remove my issue with the one drive giving me 100-200 ms backlogs.
I also found out I had disabled the cache on that particular HDD, so that wasn't helping either, I'm sure…
I was trying to find out why it was throwing read errors… something which ZFS seems to have fixed for now… but I might be ordering a couple of extra drives soon, so I have a couple of spares ready to go… it's pretty bad to have a raidz1 and no ability to replace a drive quickly.
So I might just replace it to be safe, and find a less critical use case for it.
A pity though, it's a nice enterprise SAS drive with fairly low hours on it…

Anyway… soon you may be right… but it still seems my system benefits from having max concurrent set to something other than unlimited… I don't know why… it just does, and that's what the multiple monitoring tools I have tell me.

And if I can't keep up… I'm not surprised that people with only moderate interest and limited hardware can't keep their setups from blowing up.

I will continue to watch this. I have another guess that could solve the problem (in my case). Minimizing the storage space would also be worth a test, but it is not a "sweet" solution ^^

I found a write cache helps a lot, but if it is an SMR drive it most likely already has something like 256 MB, which is probably more than it can use anyway…
On my ZFS setup I set sync to always, so that everything goes through the now dual MLC SSD write cache (which, surprisingly enough, also helped reduce the SSDs' latency, going from one to two of them (not mirrored, obviously)).

Anyway, so everything goes to the SSD cache and then every 5 seconds it's written to the HDDs in one big sequential sweep… I found that minimized my HDD IO and improved read latency.

I might need to get a good NVMe drive, because my SSDs just cannot keep up without getting into 30-40 ms backlog peaks.

An SSD write cache that both async and sync writes go through, and that is then flushed to disk in one go, seems to be the optimal solution for optimizing random reads while sustaining reasonable writes… at least in my case… not sure if you can implement something like that.
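
In case anyone wants to replicate it, the ZFS side of that boils down to roughly the commands below; the pool and dataset names and device paths are just examples, and strictly speaking the SLOG only holds the intent log, while the actual data is flushed to the pool with each transaction group (5 seconds by default).

zpool add tank log /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B   # two unmirrored SLOG devices, writes spread across both
zfs set sync=always tank/storagenode                             # force every write through the ZIL / SLOG
cat /sys/module/zfs/parameters/zfs_txg_timeout                   # transaction group flush interval, defaults to 5 seconds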

I was responding to this part.

Now you're saying it's not 0, but once in a blue moon. Going by the literal definitions of the different types of blue moons, that's anywhere from twice a year to once every 3 years. I'm going to guess I shouldn't be taking you that literally. But if it's really as infrequent as that phrasing suggests, I'd say the difference between results in 2 different weeks is much more likely to be a simple result of slightly different traffic patterns than of the few times this setting actually rejects some traffic.

Either that, or it's actually rejecting quite a few more uploads, in which case it could actually cause problems for customers.

In the end, I trust code. Code never lies, and I can easily see how this setting is applied. If it doesn't reject, it doesn't do anything. That much I can be sure about.

It seems to happen when I've just booted up the system; otherwise it basically doesn't… unless I reboot the node, then of course it goes all crazy when at 20… I would almost bet that I could go a week without rejecting a single request, and then change it to unlimited and see my performance drop… I might sometimes reject a few… but it's so rare that I basically… wait, I've got logs… :smiley:
Bearing in mind that the machine has seen very heavy use the last couple of weeks, moving around 20 TB, maybe more, while also scrubbing a lot… during that I might reject 1 request per hour.
Otherwise, on a day when I wasn't using it and it was just minding the storage node… it would maybe reject 5 in 24 hours.
So yeah, it does reject a few… especially at boot, that looks scary lol.
And those few rejects basically gave me about 50% better performance on the storage node… which imo is a huge gain for very little waste… the success rates go up, the network transfer both up and down improves… sure, if it could keep up and have exactly 0 rejections it would without a doubt do better…

But I'm looking at it from a real-world performance perspective… you cannot tell me Storj doesn't benefit from my node giving better numbers, rather than me taking 1 extra request an hour which slowly, over days, stresses my system so that performance eventually tanks massively.

I have little doubt that's what happens to many nodes, and it doesn't help the network one bit… in fact it hurts its overall performance greatly.

But hey, maybe it's just my poorly managed system… xD

I wonder if that initial bottleneck after starting the node could have lasting effects. With more concurrent transfers, each transfer takes longer to complete, hence the number of concurrent transfers stays higher, hence random writes become more random and concurrent, leading to more bottlenecks.

And yes, your node performing better is obviously good for customers. But the slight performance increase of a single node probably does very little to the experience of the customer. However, rejecting transfers can lead to uploads failing. This is obviously much worse for customers. My main worry around this is not the few transfers you reject now when your node reboots, as those are dependent on events that are unique to your node. My worry is that if the load on the network goes up across the board, suddenly all nodes that have a limit start rejecting transfers at the same time. This would be a coordinated failure that could really impede functioning of the network on uploads.

So I would argue, based on the information I have as an outsider, that you are putting the performance benefit of your own node above a risk factor for customers. I'm not blaming you for that. If this is indeed a problem, the option simply shouldn't be there for you to use. The network can't rely on people playing nice. But I'm not buying the argument that it's for the good of Storj or the customers.


I find no fault in that argument, and yes, I do believe it's due to overloading the system short or long term… if the reject option is there and shouldn't be used, then it's pointless to allow it… but not allowing nodes to at least to some degree define what their system can manage wouldn't help the network as a whole either.

I didn't mean that my node makes a difference; clearly, out of the thousands of nodes, it's basically irrelevant, no matter how well it performs or how powerful it is… which is kind of the whole idea of it being distributed.

But I mean that if this is a problem affecting many nodes, it could greatly affect the network performance as a whole… if I can lose 33-50% of my traffic due to taking too many requests, then the overall network might be able to increase its performance by about that same number, as I doubt most nodes can keep up with my gear, even if it is mostly antiquated from a technology standpoint.

If rejections cannot work, maybe nodes should send some sort of information to a satellite at boot, and then the satellite would limit the number of requests… or something akin to that… I really have little clue about the exact details of that software / hardware / its features.
For now it might not matter, but if the network performs 33-50% better with the same hardware, that is essentially 33-50% more customers served, and thus more payout for everybody… when we get to that point… well, I don't intend to put this to rest… I'll take it up in a performance troubleshooting / ideas / suggestions vote or something, once I feel well settled with my own system and have some better ideas about possible solutions.

Breaking records now, still running at unlimited… of course it has only been on for a few hours… it is picking up speed though… lol

Does the web dashboard autoscale if I break 300 GB a day, or do I go off the chart? lol xD That would be awesome… at the current pace, and if the system keeps up, I should break 300 GB in the next graph day.

I agree with everything you just said, hence this idea: Limit node transfers through node selection
It's closed now, so I don't think you can vote for it anymore.