Power management: Intel Xeon E5 v3 vs v4: Idle-ish power consumption

How much power savings does switching from Xeon E5 v3 to v4 (Haswell to Broadwell) provide?

I had never found an answer to this question myself, but I got an opportunity to buy a pair of E5 v4 for $10 today, so here are the results of this experiment:

Replacing

  • 2x Xeon E5-2680 V3 2.5GHz 12-Core (total 24 cores, 48 threads) with
  • 2x Xeon E5-2637 V4 3.5GHz 4-Core (total 8 cores, 16 threads):

reduced idle-ish (storagenodes, plex, various jails, etc., but no NFS/SMB transfers) power consumption from 212 W to 195 W (17 W savings). Break-even time: 2 months: 17 W × 24 h × 30 days ≈ 12 kWh/month, which at my $0.52/kWh electricity cost is about $6.40/month against the $10 pair of CPUs.

Higher max clocks and slightly higher IPC mean single-user NFS and SMB transfers are no longer bottlenecked by the CPU.

But wait, there is more. Broadwell’s killer feature is HWPM (hardware power management: the CPU manages its own P-states, without the OS in the loop).

Why don't we enable it?
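(HWPM is a BIOS option on these boards. A quick way to verify afterwards that FreeBSD actually handed frequency control over to the hardware is to check the active cpufreq driver; sysctl names below are as on FreeBSD 13, and the EPP value is only an illustrative middle-ground setting, not a recommendation:)

# with HWPM active, the frequency driver should be hwpstate_intel0 instead of est
% sysctl dev.cpufreq.0.freq_driver
# energy/performance preference exposed by hwpstate_intel: 0 = max performance, 100 = max savings
% sysctl dev.hwpstate_intel.0.epp=50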

Results:

  • FreeBSD’s powerd/powerd++ lost all control over the CPU frequencies and C-states (frequency control driver not supported: hwpstate_intel0). Not that it was doing much good before: it wasn’t seeing anything beyond C2 for some reason anyway, hence the need to dig deeper into the config.

    Oops!
    % sudo powerdxx -fv -n adaptive
    powerd++: cannot read hw.acpi.acline
    powerd++: (EDRIVER) frequency control driver not supported: hwpstate_intel0
    
  • Under varying load I now observe each CPU core’s clock frequency change independently, unlike what was achievable with powerd/powerd++:

    Nice!
    dev.cpu.15.freq: 3598
    dev.cpu.13.freq: 3598
    dev.cpu.11.freq: 1197
    dev.cpu.9.freq: 3598
    dev.cpu.14.freq: 3598
    dev.cpu.12.freq: 3598
    dev.cpu.10.freq: 1197
    dev.cpu.8.freq: 3598
    dev.cpu.7.freq: 3598
    dev.cpu.5.freq: 1197
    dev.cpu.3.freq: 3598
    dev.cpu.1.freq: 3598
    dev.cpu.6.freq: 3598
    dev.cpu.4.freq: 1197
    dev.cpu.2.freq: 1396
    dev.cpu.0.freq: 1498
    
  • Idle-ish power consumption further dropped to about 188 watts. I expected this to be slightly better – and it somewhat is.

    [screenshot: power consumption over the last day; the dip in the middle is me with a screwdriver replacing the CPUs]

    With this, the total power savings is 24 W, or about $9/month. So the break-even time is just one month, not two :partying_face:

Biggest bonus so far: copying data over SMB now goes just as fast, but at lower power. I think that’s because not all cores spin up when just one is enough.

Note, this is on FreeBSD 13. Linux has much better built-in power management, so you likely won’t see such dramatic power savings there.
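(For comparison, on the Linux side the equivalent machinery lives under sysfs; these are the standard cpufreq/intel_pstate paths, shown here just as a rough sketch of where to look:)

# which scaling driver and governor are in charge
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver
$ cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# reports whether intel_pstate is active, passive, or off
$ cat /sys/devices/system/cpu/intel_pstate/status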


Great post, thank you for sharing.

Unless you’re in dire need of the PCIe lanes, you’ll see some savings taking out one of the CPUs, and moving all memory to the other (if you’re running 1DPC at the moment).

Tuning the fan curves could also be an interesting endeavor for you. In my own testing I found that the power saved by running reduced fan curves outweighs the extra power drawn at slightly higher temperatures. I saved ~10 watts by going from 5000 RPM to 2000 RPM.
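That tracks the fan affinity laws, under the usual assumption that fan power scales roughly with the cube of RPM, so the numbers here are illustrative rather than measured:

# (2000/5000)^3 = 0.064, i.e. the fans should draw only ~6% of their full-speed power
% echo 'scale=6; (2000/5000)^3 * 100' | bc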

Winter was cool, I’m looking forward to seeing if I can stay within safe temperatures over the summer.


Oh yes, I considered yanking one CPU. Unfortunately, that would also yank out four PCIe slots…

[motherboard block diagram]

In fact, I even considered replacing the MLB+CPU+RAM with a modern Xeon E-2400 – but those are all very stingy on PCIe slot count. And I need one for the HBA, two for a pair of SSDs, and one for a LAN card. So on this MLB, I’d be missing one…

Totally on board with hotter-and-quieter being better than loud-and-cold. I’ve already ordered the quieter fans, with half the static pressure (it will have to deal with it), and I’m going to try to cram them into the Supermicro housing, or maybe 3D-print a custom one.

Which fans did you order?

I have had great luck with the Arctic S8038-7K and undervolting them. They’re 38 mm deep, and fit within my case in the same footprint that my old fans did.

Why do you need 2x E5-2680? Each one has 40 PCIe lanes, and enough power to run 20 nodes easily; it’s just additional power consumption for a CPU that does nothing.

Years ago I changed my server from 2637 v3 x2 to a single 2680v4.

Like yours, power consumption went down only a little. Maybe even less of a drop, because the new one had more cores.

Removing a CPU entirely saved maybe 20 W of power. The way my motherboard was laid out, the second CPU only drove half the DIMM slots and maybe a single PCIe slot, so it wasn’t much of a loss at all.

Now I just run the home server on a consumer i7-13700. Less power, a useful integrated GPU, and “enough” PCIe lanes to run a SAS HBA and a 10 Gb NIC.

This. If a motherboard has at least one PCIe x8 for an HBA, that’s enough. The GPU can be integrated, and you can plug a 10G NIC into an M.2 slot if you don’t have at least another x4. The boot OS can be on a SATA SSD if you have to. Simple, cool, quiet, cheap, and easy to repair.

I looked at those, but they don’t specify the noise level, and hence I assume they are noisy; I would rather just run standard Supermicro fans at lower speeds then.

The main driving force here is to:

  1. avoid manually managing fans (I’ve played with a PI controller; I don’t want to spend time tuning it)
  2. I considered the ARCTIC P8 PWM, Noctua NF-A8, NF-R8, be quiet! Pure Wings 2, Noiseblocker M8-P, and a bunch of slower Supermicro fans. Sorted by static pressure and price, I ended up with Noctua: they deliver half the static pressure of the original fans, but at almost no noise. That’s pretty good. And I can probably run them at full speed all the time, and even avoid configuring IPMI fan thresholds (sketch below).
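If the slow fans do end up tripping the BMC’s default lower thresholds anyway (the classic Supermicro full-speed ramp-up/ramp-down loop), they can be lowered from the OS with ipmitool; the sensor name and RPM values below are placeholders for whatever your board reports:

# set the lower non-recoverable / critical / non-critical thresholds for one fan sensor
% sudo ipmitool sensor thresh FAN1 lower 200 300 400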

I don’t. I did not choose them; I ended up with them. And I have now replaced them with 2x E5-2637 V4. Why 2x? Because I’m not replacing the MLB, and removing the second processor removes 4 PCIe slots.

Well, it serves PCIe slots.

Have you tried enabling HWPM (hwpstate_intel0)? The change appears to be quite dramatic (after activity died down; I forgot I’m running replication at midnight, and even though it’s very light on the CPU, the extra HDD activity contributed to the power consumption. I’ve updated the numbers in the original post).

Oh this is nice.

Yeah, consumer processors are pretty good at power management.

But it isn’t. I have two P3600 SSDs I use as a special device. I have a 2.5 Gbps NIC. Where do I stick them?

I have an onboard 10G NIC. I disabled it; it’s a waste of power. 2.5 Gbps is plenty for home. And that MLB does not have M.2 (which I consider a positive – I’d rather have a full-size PCIe slot).

Ideally, a GPU should not be there at all. It still consumes power; you can’t turn it off.

Yep, that’s what the onboard SATA controller is for.

The next biggest power consumer is the 3008 HBA (according to theartofserver’s measurements, and me touching the heatsink, it consumes 11 W at idle). I’m looking at the 3808, at 4 W idle, but I can’t find a reliable source for one. Some Chinese sellers on eBay sell it “new” suspiciously cheaply, and although from the photos it does not look obviously fake, I’m a bit wary, especially after their evasive responses when I asked directly.

The dude selling thousands of server parts does not understand the technology. But I get the hint.

The main point of replacing the HBA is not to save the measly 5 watts; it’s to let the CPU sit in low C-states for longer. I’ve read various reports that old-generation HBAs are pretty chatty and don’t let the CPU relax enough. I don’t know how to confirm that other than by replacing the HBA and looking at power consumption.
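(The symptoms are at least visible from the OS side, short of swapping hardware; on FreeBSD 13 something like this shows how deep the cores actually sleep and which devices keep interrupting them:)

# supported C-states and the percentage of time spent in each, per core
% sysctl dev.cpu.0.cx_supported dev.cpu.0.cx_usage
# per-device interrupt totals and rates; a chatty HBA would stand out here
% vmstat -i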


If you’re just using the P3600 for a special metadata device, then as I understand it, you don’t really need high throughput for these devices; you need high IOPS and low latency. That will be fine in a PCIe x1 slot (even a gen3 x1 link is roughly 1 GB/s). I run a 2x 2.5 Gbps NIC as well - that runs natively in an x1 slot, which leaves an x16 slot for your HBA of choice.

The consumer board I have my 12400 in has four PCIe x1 slots and one PCIe x16 slot - which is perfect for my needs. It all boils down to you living with having your cache-lads in x1 mode of course :slight_smile:

The cards are physically x4. So unless there is a weird motherboard that has x1 links in physical x4 ports, twice – it’s a nonstarter.

This is a very weird design choice! But I avoid consumer boards as a matter of policy. They are all of varying degrees of obnoxiousness, optimized for use cases I generally don’t care about. Like, who needs four x1 slots? What is this? A server for ants? :slight_smile:

I mean, I understand – one for WiFi/Bluetooth, one for a NIC, one for a USB card, one for some weird eBay contraption that blinks LEDs. Great for consumers. But for the purposes of serving data, none of this is needed. And what is needed does not come in x1 sizes. I mean, it’s a hobby project; why would I want to use consumer stuff when I can get the real thing and feel better about it? That’s the whole point – to feel better about it.

If there was an x4 slot with only x1 wired, it would probably have worked. I have seen a few (old-ish) HBAs that would refuse to initialize in, say, an x8 port with x4 wired. They wanted all the lanes. It was probably old and lazy design, but that happens.

Of course, I could just use an NVMe SSD and connect it to a better HBA – but I ran out of disk bays, and I don’t want to tie it down to the PSU sideways with sap and twigs :slight_smile:


My machine is on a B660 DS3H DDR4, which has five physical PCIe x16 slots, but four of them are wired for PCIe x1, with only the top one at x16.

A jack-of-all-trades board, but master of none. It ended up at my place because the previous owner wanted a graphics card, and one of those oooooooooold PCIe 2.0 x8 10Gbit cards… which obviously won’t get its full performance in a PCIe x1 slot.

I made a video about running it here: https://www.youtube.com/watch?v=2Bzg_17PEGk

As you can see, I chose an iGPU-less CPU, so every time I need to enter the BIOS, I have to use the PCIe x16 slot for a graphics card. It too could be pushed into one of the x1 slots permanently - but I’m using them.

It’s not a perfect system, and I’m not sure I’d do it this way if I could redo it, but it works for what I want it to do for now.


Those little things add up to be a PITA. A server has to have IPMI! Being able to reinstall an OS from a disk image remotely (i.e. without lifting an arse from the couch, let alone needing to haul displays, keyboards, mice, power cables, extension cords, a screwdriver, the where-did-I-put-that GPU…) is liberating and indispensable :slight_smile:. Even if the CPU had a GPU, you would still need to haul all of that equipment.

In the meantime, I figured out why my CPU spends so little time in the low C-states. I found the culprit. Guess who it is? It’s not the HBA. It’s Storj (Storj-induced activity on the network adapter, to be precise)!

% vmstat -i  | egrep '(igc0|mpr0)'
irq99: mpr0                        89895         49
irq109: igc0:rxq0                2847636       1546
irq110: igc0:rxq1                 423318        230
irq111: igc0:rxq2                 657938        357
irq112: igc0:rxq3                1178470        640

Just look at this monster interrupt rate… I’ve tried setting rx_int_delay – but it made no difference… It’s a legit interrupt stream.

Awesome video! Strong bigclivedotcom vibes in the beginning :slight_smile:


Those little things add up to be a PITA.

I wholeheartedly agree, but I can’t decide if I care in my homelab. I could not do my day job without IPMI - it’s amazing - but in my home? I have a spare monitor/keyboard/mouse in the rack anyway. I can spare the extra steps and walk all the way over there. It’s not perfect, but I’d much rather save the 1-2 watts and be smug about it :stuck_out_tongue:

A CPU without an iGPU, though - I’ll change that next time around. Disassembling the machine every time I need to see something in the BIOS is too tedious.

Good job on the C-states! It raises another question: if the machine cannot enter lower sleep states because Storj is doing something… it means Storj is doing something, which, viewed in isolation, is great :slight_smile:

Strong bigclivedotcom vibes

Thanks! I’ve not seen much of his content, but from the amount that I have seen, it’s a compliment I’ll gladly take :flexed_biceps:


I’ve been very happy with the new cheap IP-KVMs: in particular the NanoKVM PCIe in an x1 slot. They just came out with a ‘Pro’ version with extra features… like HDMI passthrough and PiKVM support.

What I hope they eventually release is an x1 model that also acts as a simple GPU (instead of looping back over HDMI). That would work very well with iGPU-less consumer CPUs… or even if you ran Threadripper/EPYC at home.

(Basically all my consumer boards had a free x1 slot: so adding a $50 card for remote access and power control was a no-brainer).


I take the opposite approach. I don’t have to do anything at home; there is no pressure – no time, resource, or completion pressure in the first place – so why not do it correctly? You are not making money on this; without the pressure to turn a profit you can afford to do things right.

Which brings us here.

I think not getting that abomination is a no-brainer. For $50 you can pretty much buy a server board with IPMI built in (along with a processor). The major differentiator is reliability. Something tells me that Supermicro’s firmware for the Aspeed chip, albeit ugly as death, will be much more likely to save the day than some overpriced Chinese pet project with 38 features, half of which don’t work (probably, in large part, because it’s a consumer product).

For that $50, instead of an enterprise solution you are getting a haphazardly-assembled toy of questionable quality and reliability, and outright shoddy construction (cables everywhere – a cable to the reset button, a cable to power sense, a cable to the video output, cable, cable, cable); it takes up a PCIe slot, and it does not (and cannot) provide some of the useful features – like sensors and fan control – or anything else that cannot be bodged onto the existing wiring.

Supermicro motherboards aren’t very common where I am. Mainly Dell, HP/Lenovo/IBM. Most are very old - and well above $50 USD.

I’ve found iLO to be a PITA on older enterprise gear. The vendor stopped maintaining it, and it requires old, outdated browsers to operate.

I love the concept of these for the home server user. Looked at them several times.
The software is, I believe, currently open source; I hope it’s actively maintained for some time to come.

How well do they work?
Is it like working at the console, or is it laggy?

Oh yeah, that one, and some older Java-based solutions, are quite annoying to use.

Supermicro, unlike others, generally sticks to standards. For example, a lot of their chassis support standard ATX/EATX/ITX motherboards, and they even offer adapters for connecting third-party motherboards in their chassis to the front panel and fans, and vice versa. That’s what makes them so popular on the secondary market and among enthusiasts.

Their motherboards also tend to be the most thought-out and coherent in terms of features, usability, and design. ASRock Rack is another popular vendor – but they are a bit more all over the place.

Check out eBay – a lot of datacenters here get upgraded or torn down, and the market is flooded with very cheap but still rather capable gear. A lot of these folks ship internationally, albeit that may add some non-trivial cost.

I’ve got one of the NanoKVMs.

For the price, they’re pretty nice. 1080p/30 is more than what I need for hardware monitoring, the input lag is low (but noticeable), and it will emulate a keyboard/mouse no problem.

It can host ISO files if you want to FTP them to the device. An on-device file browser would be nice, but FTP is no problem. The networking is only 100 Mbps though, so in my house it’s much easier to physically walk a USB drive out there.
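(The upload itself is plain FTP, so something like curl works; the hostname, credentials, and target directory below are placeholders for whatever your device is configured with:)

# push an installer ISO to the NanoKVM's image storage over FTP
% curl -T ubuntu-24.04.iso ftp://nanokvm.local/ --user root:root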

I made a video about it as well, if you’re interested: https://www.youtube.com/watch?v=PR2WzjCMo2g


I don’t notice any lag when I’m in a text console, but if you’re using the mouse/UI you can feel it a tiny bit. Apparently the newest version is even faster. And you get the choice of which compression you want to use. Lag isn’t an issue at all for me: if I’m in the KVM I’m fixing something important… and I’m just happy I don’t have to go to the system :slight_smile:

Since Ottetal reminded me: the one tiny issue I had with the current non-Pro versions (I have the Cube and PCIe ones)… was that 100M networking. You don’t notice it at all… until… you need to upload a multi-GB Windows boot ISO or something. It works fine… I’m just impatient. And they’re pretty generous with the space: I think you have 20 GB free by default (and Windows/Ubuntu installers are only around 5 GB). The Pro version comes with 1G networking and/or WiFi now.

I get what arrogantrabbit is saying about built-in remote access (from used $50 motherboards)… but I’m not interested in buying e-waste to get it. Pretty much any gaming desktop has a free x1 slot: it’s no burden to slap a card in there.

I still can’t believe there’s an entire computer in there (for the Cube version) - it’s the size of a golf ball! And the external Pro screen is certainly a gimmick: but still so cheap!