What is Suspension & Audit?

My bad, I am using more than 50%; I am actually using 80% right now. It's all starting to make sense now. I did a case change about 3 months ago, and one of my drives was overheating. I added a fan, which solved the issue; temps got back to normal. So I assumed it was fixed. Clearly not.

So you're running Windows, then running Windows as a VM? And using the Windows GUI?

You should use Linux if you're going to run a VM, since it is much more stable. Running a VM inside Windows is not such a good idea; better to run the node directly on Windows than in a VM.

Yes, that's correct, but I am not familiar with Linux. I know I should be, but you can't teach an old dog new tricks. :slight_smile: It was running great until the drive failure.

I know, but you're adding a lot of overhead that could be prevented just by running your node directly on Windows, instead of using Windows as a base and then running a VM that also runs Windows.
Might I add, that's a lot of extra points of failure, which makes it hard to track down the exact issue.
It also sounds like you're running a RAID, which could cause issues with SMR drives.

Sounds like you might be getting somewhere.

Are you sure you are using a mirror? A mirror setup should only provide half of the combined capacity of the two drives (i.e. 5 TB in this case). But if you're currently sitting at 7.27 TB, then something is definitely off with your setup. Are you perhaps using a spanned volume?
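To make the capacity math concrete, here's a quick sketch (assuming two 5 TB drives, as the numbers in this thread suggest):

```python
# Usable capacity for the two layouts being discussed. Drive sizes are
# assumed from the thread (two 5 TB disks), not confirmed specs.
drives_tb = [5, 5]

mirror_tb = min(drives_tb)    # two-way mirror: usable space = one drive
spanned_tb = sum(drives_tb)   # spanned/simple volume: usable space = both

print(f"mirror: {mirror_tb} TB usable, spanned: {spanned_tb} TB usable")
# 7.27 TB of stored data can't fit on a 5 TB mirror, but fits a 10 TB span.
```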

Here is my setup showing the faulty drive:
(screenshot: storage pool)

Just to give you a reply: I have taken your suggestion on board. I have read online that there are a lot of issues with Storage Spaces, so maybe I made a bad move there, as it can report incorrect storage space and stop allowing files to be added to the drives even when there is space. I just rebooted my PC and the VM to clear any issues that might be there, checked Storage Spaces, and the drive is registering as 'OK' again. Strange. This might explain why for months I wasn't getting any new files on my node; it was stuck around 1.32 TB, but recently more files have started to fill up my drive. Does anyone know if there were any recent Storj updates that addressed storage not filling up? If not, then it looks like this is the problem.

I can't say I have ever used this feature with Windows; my servers have hardware RAID, so I can't really say what is going on here. Maybe the drive failed and is having some issues. From my experience, I have not run a Windows VM inside Windows long term before; I use Proxmox to run VMs on my server. The specs of your system are unknown to me, for example how many cores and how much RAM you assigned to the VM and which hardware you're running, so you might be bottlenecking the VM. I know Windows does struggle with 4 GB of RAM and 2 cores, especially Windows 10.

But traffic on the Storj network has been low, so that could be why it's not filling up.

I wanted to show one of my VMs that I am testing a storagenode on.
(screenshot: Ubuntu VM)

This is Ubuntu running a node; it's been super stable with no issues.

This is a Windows VM:
(screenshot: Windows VM)
Notice the difference in RAM usage.


Hardware RAID is far superior to software RAID. I have an i7-4770K, so 4 cores/8 threads, with 3 allocated to the VM, and I'm only running 1 VM. I plan to upgrade to a Threadripper at some point, one day. lol.
16 GB of RAM, DDR3 only; memory is dynamic, so the VM uses what it needs… Never an issue with performance. It's my old mining rig from long ago, but it needs a serious upgrade, to be fair.

I will look into Ubuntu, but I'm completely new to it, as I've avoided it. I always liked the ease of the Windows GUI, I suppose.

Yeah, a Threadripper is what I'm using right now, but I don't really use it for running a node, just testing with it.

But Windows running a Windows VM is not such a great idea with very few cores and little RAM, on top of the host system already needing a lot of resources. Also, depending on what else is running on the system, it could be slowing everything down.

If you need help setting up Ubuntu, it's super simple. It's much better and more stable than Windows…

Nothing else is running on the VM; CPU usage is never more than around 3%, and most of the time it sits at 0-1% or thereabouts. It's the memory that keeps creeping up. Right now the node has stayed online for the longest stretch in months. I will keep an eye on the drive; I think this is the issue. Connectivity issues maybe; will check tomorrow.

This is a usual sign: if memory keeps going up, it means either the drive can't keep up and the data is piling up in RAM, or it's Windows 10 doing its thing in the background that nobody really knows about, bottlenecking the system, such as defragging in the background, unless you disabled it.

If you didn't disable defragging on these drives in both Windows installs, it can really cause some issues.

No, defragging is still on for both drives; I will disable it. It was set to weekly, but the timeouts were happening every day, all day, with no time constraints. I will see if it goes offline overnight.

Not all SMR is made equal. Some drives have much larger CMR areas and caches, which can overcome a lot of the shortcomings of SMR.

You also seem to be spreading the load across different nodes, which also makes a big difference and gives SMR drives much more room to breathe and do housekeeping.

Compared to RAID, individual nodes are much easier on HDDs. With a stripe or mirror array, every write hits both drives, while with individual nodes, every write hits only one drive. On an SMR HDD, the number of writes has a much bigger impact than the size of those writes. So yes, I would really recommend splitting this into 2 nodes.
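A back-of-the-envelope sketch of that point (the write count is an arbitrary number for illustration):

```python
# Write load per drive: mirrored array vs. two independent nodes.
incoming_writes = 1_000_000               # arbitrary illustrative number

mirror_per_drive = incoming_writes        # a mirror duplicates every write
split_per_drive = incoming_writes // 2    # two nodes each take roughly half

print(f"mirror:    {mirror_per_drive:,} writes per drive")
print(f"two nodes: {split_per_drive:,} writes per drive")
# For SMR, the number of writes matters more than their size, so halving
# the write count per drive is a big win.
```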

That was my thought process but you explained it much better.

The 50% or so decrease in HDD speed is purely down to the geometry: as you get closer and closer to the center of the platter, the tracks move more slowly underneath the heads, so the edge of the platter, where the HDD starts writing, will be significantly faster than the last 10-20% of capacity.

And it does in fact measure out to about a 50% loss of write speed when the capacity is maxed out.
This is why people would in the past under-provision HDDs to increase their speed: the drive then only writes to the faster outer part of the platters, which is quite a nice way to gain a bit of extra performance :smiley:
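A rough model of that geometry effect (the radii and outer-track speed are assumed ballpark figures for a 3.5" platter, not specs of any particular drive):

```python
import math

# Sequential throughput at constant RPM scales roughly with track radius
# (zoned recording keeps linear bit density roughly constant). LBA 0 sits
# at the outer edge, so speed drops as the drive fills.
OUTER_R, INNER_R = 46.0, 23.0   # assumed usable track radii in mm
OUTER_SPEED = 200.0             # assumed outer-track sequential speed, MB/s

for fill in (0.0, 0.5, 0.9, 1.0):
    # With constant areal density, fill fraction f maps to radius via
    # equal swept area: r = sqrt(R_out^2 - f * (R_out^2 - R_in^2))
    r = math.sqrt(OUTER_R**2 - fill * (OUTER_R**2 - INNER_R**2))
    print(f"{fill:4.0%} full -> ~{OUTER_SPEED * r / OUTER_R:3.0f} MB/s")
# Prints roughly 200, 158, 114, 100 MB/s: about a 50% drop at full
# capacity, with most of the loss concentrated in the last stretch.
```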

I'm not sure how the geometry affects write IOPS, aside from, of course, writing more slowly and, at least in theory, passing less data… Of course, if we are thinking of the IOPS limitation of SMR technology, which for sustained writes can in some cases be around the 40 mark rather than the 400 of CMR,

then the platter geometry might not be a big limitation, but it sure doesn't help…
Worst case, SMR is something like 700 kB/s to a single drive.

Reach that point and you basically stall the drive, because it will be processing data more slowly than it is coming in… and thus more and more will pile up until audits start to fail.
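A tiny sketch of that stall scenario, using the 700 kB/s worst-case figure above and an assumed ingress rate:

```python
# Backlog growth when ingress exceeds what a stalled SMR drive can commit.
ingress_kbps = 2_000   # assumed incoming data rate, kB/s (illustrative)
drive_kbps = 700       # worst-case sustained SMR write rate quoted above

backlog_kb = 0
for _ in range(3_600):                    # simulate one hour, second by second
    backlog_kb += ingress_kbps - drive_kbps
print(f"backlog after 1 h: {backlog_kb / 1_000_000:.1f} GB buffered in RAM")
# ~4.7 GB piles up in an hour; once requests queue this deep, audit
# requests start timing out too.
```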

I cannot overstate how important BrightSilence's suggestion to split up SMR RAIDs is…
SMR drives should never be run in RAID.

Some SMR technology is used in RAID, but that's not consumer-grade SMR… SMR is bad at IOPS, and RAID makes IOPS worse… in anything other than mirrors.

SMR RAID takes the worst of SMR and the worst of RAID and combines them into one big headache.

If one wants to use multiple SMR drives as "one" partition / storage solution, then I believe something along the lines of tiered storage would be a decent approach.

Tiered storage doesn't combine the worst parts of two technologies; its ability to spread data across multiple drives and balance it depending on speed and usage should augment slower drives like SMR.
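To illustrate the idea (the tier names, batch size, and policy are made up for this sketch; real tiering implementations are more involved):

```python
from collections import deque

# Minimal sketch of tiering: absorb random writes on a fast (SSD/CMR) tier,
# then flush to the SMR tier in large sequential batches, which is the
# access pattern SMR handles well.
FLUSH_BATCH = 256          # assumed number of blocks per sequential flush

fast_tier: deque = deque() # fast tier soaks up small random writes
smr_tier: list = []        # SMR tier only ever sees big sequential flushes

def write(block: bytes) -> None:
    fast_tier.append(block)
    if len(fast_tier) >= FLUSH_BATCH:
        # Drain the fast tier as one large sequential write to the SMR tier.
        smr_tier.extend(fast_tier.popleft() for _ in range(FLUSH_BATCH))
```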


Not sure if this was intended for me or deathless, but can I point out that I only have 1 node running.
Also, to add, my node seems to be operating OK. No disconnections in the last 24 hrs; memory is still high but stable.
You can see this by the increase in my online status:
(screenshot: node dashboard)
Compared to my last audit above.

I haven't used Storage Spaces myself, and I was a little confused about how it displays storage usage. It sounded like you had a striped or spanned array, so my comments were based on that.

Regardless, I think my advice would still be the same. Currently every write hits both HDDs, which quickly saturates the writes the SMR disks can handle. If you just split up that array and run 2 nodes, you can spread out the load so that every HDD has to deal with only half of it. Additionally, you would be able to share twice as much space, which is just more money in your pocket.

Now this only holds up if you don’t use this space for anything other than Storj. If there is other stuff that you want to protect against HDD failure, I can imagine not wanting to remove the mirror.


I take everything you guys are saying on board. How on earth could I start to split what's there without losing data or causing downtime? Also, backing up the data would need a third drive to store the backup, I assume?
Other than this, I could simply start a new node with a separate drive, after I purchase one of course, but that is an additional cost I don't need right now.

I would say backing up just for Storj is a waste of a hard drive… Running 2 nodes instead of one is the point of this.
Because right now you're killing your SMR drives; if you ran 1 node on each drive, it would split the load and not stress the drives so much.