Power management: Intel Xeon E5 v3 vs v4: Idle-ish power consumption

According to pcm, at that load, each DIMM consumes under half a watt:

The number of RAM sticks does not matter; only the total amount of memory does.

Half a watt per module seems far too low. You can reliably count on DDR4 using 2-3 watts at idle and 5 watts under heavy read/write.

Am I reading your table wrong? The rightmost column says 4.17 and 3.58. Is this per CPU, per memory bank or per DIMM?

It’s per socket, joules per second. I have four 16GB modules at each socket.

It’s borderline low, but I think plausible if memory chips are not used much. Can’t access the datasheet for this specific memory model/stick, but similar modules seem to require 0.85~1W in standby. @arrogantrabbit, could you try running some memory-intensive benchmark, like sysbench memory run and show numbers from pcm-power again?

stress-ng -vm=16

sysbench memory run

Active standby current 846mA, which at 1.2V and 8 dimms yields 8W.
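For anyone who wants to redo that arithmetic with their own datasheet numbers, here is a minimal sketch. The function name standby_watts is made up for illustration; the inputs are the 846 mA, 1.2 V and 8 DIMMs quoted above, and it assumes the datasheet current applies to each DIMM as a whole.

```sh
#!/bin/sh
# standby_watts CURRENT_A VOLTAGE_V DIMM_COUNT
# Hypothetical helper: P = I * V per DIMM, multiplied by the DIMM count.
standby_watts() {
    awk -v i="$1" -v v="$2" -v n="$3" 'BEGIN { printf "%.2f\n", i * v * n }'
}

standby_watts 0.846 1.2 1   # one DIMM: 1.02
standby_watts 0.846 1.2 8   # all eight DIMMs: 8.12
```

That lands at roughly 1 W per DIMM, in line with the 0.85~1W standby estimate mentioned earlier.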


Replaced the motherboard with a single-socket one. All else being equal,

pcm
 Core (SKT) | UTIL | IPC  | CFREQ | L3MISS | L2MISS | L3HIT | L2HIT | L3MPI | L2MPI |   L3OCC |   LMB  |   RMB  | TEMP

   0    0     0.05   0.46    1.20      26 K    168 K    0.84    0.40  0.0010  0.0062      888        4        0     70
   1    0     0.04   0.79    1.20      79 K    193 K    0.59    0.33  0.0018  0.0045     1896        9        0     70
   2    0     0.04   0.45    1.39      48 K    176 K    0.73    0.38  0.0018  0.0067     2712        8        0     72
   3    0     0.05   1.23    1.60      90 K    249 K    0.64    0.43  0.0008  0.0023     2352       13        0     72
   4    0     0.05   0.70    1.29      50 K    188 K    0.73    0.39  0.0011  0.0042     1752        9        0     71
   5    0     0.04   0.76    1.25      41 K    178 K    0.76    0.40  0.0010  0.0042     1320        8        0     71
   6    0     0.05   0.64    1.22      65 K    208 K    0.69    0.42  0.0016  0.0050     2208        9        0     72
   7    0     0.04   0.65    1.23      66 K    189 K    0.65    0.34  0.0021  0.0058     2112        6        0     72
---------------------------------------------------------------------------------------------------------------
 SKT    0     0.05   0.74    1.30     468 K   1552 K    0.70    0.39  0.0013  0.0042    15240       66        0     65
---------------------------------------------------------------------------------------------------------------
 TOTAL  *     0.05   0.74    1.30     468 K   1552 K    0.70    0.39  0.0013  0.0042     N/A     N/A     N/A      N/A

 Instructions retired:  365 M ; Active cycles:  493 M ; Time (TSC): 3527 Mticks;

 Core C-state residencies: C0 (active,non-halted): 4.70 %; C1: 19.92 %; C3: 0.00 %; C6: 75.38 %; C7: 0.00 %;
 Package C-state residencies:  C0: 38.14 %; C2: 23.74 %; C3: 0.00 %; C6: 38.12 %; C7: 0.00 %;
                             ┌────────────────────────────────────────────────────────────────────────────────┐
 Core    C-state distribution│00001111111111111111666666666666666666666666666666666666666666666666666666666666│
                             └────────────────────────────────────────────────────────────────────────────────┘
                             ┌────────────────────────────────────────────────────────────────────────────────┐
 Package C-state distribution│00000000000000000000000000000002222222222222222222666666666666666666666666666666│
                             └────────────────────────────────────────────────────────────────────────────────┘
---------------------------------------------------------------------------------------------------------------

MEM (GB)->|  READ |  WRITE | LOCAL | CPU energy | DIMM energy | LLCRDMISSLAT (ns)| UncFREQ (Ghz)|
---------------------------------------------------------------------------------------------------------------
 SKT   0     0.08     0.03  100 %       9.82       8.03         203.00             0.73
---------------------------------------------------------------------------------------------------------------

I’m now getting 169 watts under the same conditions. I’ve played with HWPM and Native vs OOB modes of power state control, to ensure good idle power consumption but also peak single-threaded performance.

Turns out that if we let the CPU control power, it will stay at low clocks even if a single thread (e.g. SMB) desperately wants cycles. Asking the OS for hints (in Native mode) and using tools like powerd, which can only control the clock across the whole package, results in hints to go high on all cores, even when only one thread wants power.

At first glance that is overkill: when only one thread on the system needs compute resources, we want a single core to run at a high clock and the others at the lowest possible one.

However, here autonomous C-state control comes to the rescue: in spite of all cores being hinted to run at a high clock, there is only one core’s worth of work generated, so the other cores go to low-power C-states. Problem solved.

The numbers reported by dev.cpu.*.freq are now bogus: those are what the OS wants the clocks to be. The actual clocks read by pcm from counters are correct, but irrelevant: that’s what the CPU decides them to be. But looking at C-state residency, it all makes sense: with one CPU’s worth of workload on an 8-core CPU, we see 12% C0 residency, in spite of all clocks reading high.
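As a rough sanity check on that figure, here is a small sketch. It assumes one logical CPU is fully busy and the others are fully idle, which a real system only approximates, and the function name expected_c0_percent is invented for this example.

```sh
#!/bin/sh
# expected_c0_percent BUSY_CPUS TOTAL_CPUS
# Hypothetical helper: share of time spent in C0, averaged over all logical CPUs,
# if BUSY_CPUS are 100% active and the rest sleep the whole time.
expected_c0_percent() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.1f\n", 100 * b / t }'
}

expected_c0_percent 1 8   # 12.5
```

One busy CPU out of eight gives 12.5%, which matches the observed 12% closely enough.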

bogus frequencies, pcm and sysctl side by side

The scripts for the right pane:

#!/usr/local/bin/zsh

# Refresh every second: the per-core frequency sysctls, then one power reading.
cmdwatch -n 1 'sysctl dev.cpu | grep freq: && printf "\n" \
    && ./measure_power_once.sh'

./measure_power_once.sh

#!/usr/local/bin/zsh

# Print a timestamp and the instantaneous power reading reported over IPMI/DCMI.
printf "%s\t%s\n" "$(date '+%H:%M:%S')" \
    "$(sudo ipmitool dcmi power reading | sed -En 's/Instantaneous power reading: *(.*)/\1/p')"

I actually ended up disabling powerd – behavior did not change. OS can ask for whatever it wants – but if there is no work – core is asleep. Sleeping is even better than running on low clock.

Awesome :slight_smile:

Why don’t you want to try an SoC? In those conditions, if you also ignore the “use what you have now” rule (because I feel that you wanted to build something especially for Storj), why not?
Yes, it will be worse, but it uses LESS watts.

I need 3 PCIE slots, and good single-threaded performance. None of the SoC based boards offer that. More or less modern CPUs like Epyc 4004 and Xeon 2400 are very expensive. Older C3000 based boards are very low clock.

Which specific SoC do you have in mind?

yes,.. you are right, nothing there. Yet.
I hoped for Raspberry Pi 4.

P.S. and I still love this project: Odroid HC2: Feedback Odriod HC2
and

ok, I know, it’s not so effective as your setup.. but.. you know - it’s very simple, I love it.

Actually, a Raspberry Pi 5 with 16GB of RAM and a PCIe slot may be an interesting contender for a home server, albeit heavily overpriced and requiring additional hardware to connect it all together, at least the HBA.

Maybe ampere would be a better contender.

From a stability perspective, x64 is still the standard. And power-efficient processors do exist. Depending on which features one can forgo, it’s not impossible to arrive at sub-20W at idle: throw away the SAS backplane and HBA, stick to SATA disks, low RAM, an i3-type CPU, non-ECC unregistered unbuffered memory, a micro ITX MLB. But that’s way too much of a sacrifice.

That still seems high.

Does this include storage?
Is this measured at wall socket?
What are you using to measure power usage?

This is reported by the power supply.

sudo ipmitool dcmi power reading | sed -En 's/Instantaneous power reading: *(.*)/\1/p'

This was independently verified to match UPS reporting.
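A single instantaneous reading can be noisy, so here is a sketch for averaging several of them. The helper names parse_watts and average, and the choice of 10 samples one second apart, are assumptions for illustration; the parsing assumes the "Instantaneous power reading: N Watts" line that the sed expression above already relies on.

```sh
#!/bin/sh
# Hypothetical helper: pull the numeric watt value out of
# `ipmitool dcmi power reading` output supplied on stdin.
parse_watts() {
    sed -En 's/.*Instantaneous power reading: *([0-9]+).*/\1/p'
}

# Hypothetical helper: average numeric readings, one per line on stdin.
average() {
    awk '{ sum += $1; n++ } END { if (n) printf "%.1f\n", sum / n }'
}

# Usage sketch (commented out because it needs ipmitool and root):
# for i in $(seq 10); do
#     sudo ipmitool dcmi power reading | parse_watts
#     sleep 1
# done | average
```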

Power supply is 92% efficient at that power, so the power delivered to the system is 169W * 92% = 155W.

PSU efficiency around that power

Yes. It also includes HBA, 2x P3600 SSD, a couple of SATA SSDs and fans.

This is what I got:

| Id | Name | Source | Idle, W | Active, W | Count | Total Idle, W | Total Active, W | Subtype | Subtype Idle, W |
|---|---|---|---|---|---|---|---|---|---|
| ST20000NM007D-3D | Seagate Exos X20 20TB | datasheet | 5.4 | 9.4 | 4 | 21.6 | 37.6 | Disks | 64.4 |
| ST18000NM003D-3D | Seagate Exos X20 18TB | datasheet | 5.4 | 9.4 | 4 | 21.6 | 37.6 | | |
| ST18000NM002J-2T | Seagate Exos X18 18TB | datasheet | 5.3 | 9.4 | 4 | 21.2 | 37.6 | | |
| INTEL SSDPEDME020T4D | NVMe SSD 2TB | datasheet | 4 | 25 | 2 | 8 | 50 | SSDs | 9 |
| INTEL MEMPEK1J016GAD | Optane Memory M10 (16GB) | datasheet | 1 | 2 | 1 | 1 | 2 | | |
| INTEL SSDSC2KW128G8 | SATA SSD (128GB) | datasheet | 0 | 1.3 | 1 | 0 | 1.3 | | |
| BPN-SAS3-826EL1-N4 | SAS3 Backplane | guess | 10 | 20 | 1 | 10 | 20 | HBA/Backplane | 21 |
| AOC-S3008L-L8e | RAID/HBA Controller | artoftheserver | 11 | 13 | 1 | 11 | 13 | | |
| Intel Xeon E5-2637 | Processor 4C6T | pcm | 12 | 145 | 1 | 12 | 145 | CPU/MLB | 32 |
| Supermicro X10SRL-F | MLB | guess | 20 | 50 | 1 | 20 | 50 | | |
| 18ASF2G72PDZ-2G6E1 | DDR4 ECC REG RAM 16GB | datasheet | 1.25 | 2 | 8 | 10 | 16 | RAM | 10 |
| Fans | FAN-0126L4 | datasheet | 3 | 7.2 | 3 | 9 | 21.6 | Fans | 9 |
| Totals | | | | | | 145.4 | 431.7 | 0 | 145.4 |

So I agree, there are about 20 watts still unaccounted for (even ignoring 5V, albeit that is supposed to be barely used). Unless the MLB is much more power hungry than I anticipated.
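To make the bookkeeping reproducible, here is a minimal sketch that sums the idle subtotals from the table and compares them with a measured figure. The function names budget_total and budget_gap are hypothetical. Which measurement to compare against is an assumption: against the 169 W reported at the PSU input the gap is about 24 W, while against the roughly 155 W delivered on the DC side it is closer to 10 W.

```sh
#!/bin/sh
# Hypothetical helper: sum all numeric arguments.
# The awk program has only a BEGIN block, so the arguments are never opened as files.
budget_total() {
    awk 'BEGIN { for (i = 1; i < ARGC; i++) s += ARGV[i]; printf "%.1f\n", s }' "$@"
}

# Hypothetical helper: measured power minus the itemized total.
budget_gap() {
    awk -v m="$1" -v t="$2" 'BEGIN { printf "%.1f\n", m - t }'
}

# Idle subtotals from the table: disks, SSDs, HBA/backplane, CPU/MLB, RAM, fans.
total=$(budget_total 64.4 9 21 32 10 9)
echo "$total"                 # 145.4
budget_gap 169 "$total"       # 23.6, against the PSU input reading
budget_gap 155 "$total"       # 9.6, against the estimated DC output
```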

Makes a bit more sense with storage.

92% efficiency, means a loss of 8% in conversion, so a usage of 169W on DC side = 169W + 8% = 183W input.

P (watt) = I (current) * V (voltage)
P = AC Current * AC Voltage

(Your power supply status page gives you the figures to calculate “wall” usage).

No. Efficiency is defined as Pout / Pin. 169W I referred to is Pin. The 155W is Pout. The screenshot was taken not at idle, just as an illustration of the efficiency around that load.
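A small worked sketch of that definition, using the 169 W and 92% from this thread. The function names are made up for illustration. It also shows why adding 8% to a DC figure is not the right inverse: going from output back to input means dividing by the efficiency.

```sh
#!/bin/sh
# Efficiency = Pout / Pin, so:
#   Pout = Pin * efficiency
#   Pin  = Pout / efficiency
# Both helpers are hypothetical names for this example.
psu_output_watts() {
    awk -v pin="$1" -v eff="$2" 'BEGIN { printf "%.1f\n", pin * eff }'
}

psu_input_watts() {
    awk -v pout="$1" -v eff="$2" 'BEGIN { printf "%.1f\n", pout / eff }'
}

psu_output_watts 169 0.92   # 155.5 W delivered when 169 W is drawn at the input
psu_input_watts 169 0.92    # 183.7 W drawn if 169 W were the DC output instead
```

Note that 169 W / 0.92 is 183.7 W, slightly more than the 182.5 W that "169 W + 8%" gives.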

Also no. Power includes the power factor. AC voltage and current are complex quantities with phase. The PF is close to 1 for this and many other enterprise power supplies, thanks to additional circuitry, but that is not the case for many consumer power supplies. This mostly comes down to the difference between commercial and residential electric rates: home users are charged for the active power they use. Everyone else pays for apparent power, so they are effectively penalized for a low power factor, and rightfully so: otherwise big motors or capacitors could load the grid to the max while consuming virtually no energy.

Bottom line: you can’t just multiply AC current by voltage and call it a day.
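To illustrate with numbers, here is a sketch of apparent versus real power. The 120 V, 1.5 A and both power factors are made-up example values, not measurements from this system, and the function names are hypothetical.

```sh
#!/bin/sh
# Apparent power (VA) = Vrms * Irms
# Real power (W)      = Vrms * Irms * PF
apparent_va() {
    awk -v v="$1" -v i="$2" 'BEGIN { printf "%.1f\n", v * i }'
}

real_watts() {
    awk -v v="$1" -v i="$2" -v pf="$3" 'BEGIN { printf "%.1f\n", v * i * pf }'
}

apparent_va 120 1.5        # 180.0 VA either way
real_watts 120 1.5 0.95    # 171.0 W with good power factor correction
real_watts 120 1.5 0.6     # 108.0 W with a poor power factor
```

The same voltage and current readings correspond to quite different real power depending on the power factor.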

You do not gain in a conversion from AC to DC. There is a LOSS. As you indicated your PSU is 92% efficient, this is confirmed using its own statistics, loss of 8%.

Yes power factor is a consideration, rule of thumb is 1.0. (and the goal of PSUs with PF Correction).

Only figure that matters is the “wall plug” wattage before UPS. Because that is what it is costing you.


That was the figure I kept quoting. The 169 watts is the consumption before the power supply. That’s the starting point.

However, I disagree that it’s the only thing that matters.

To understand what actually matters you want to decompose that power budget into specific consumers, including the power supply itself.

For example, here upgrading to a titanium PSU with the right sizing would clearly shave off a few watts.

Where did I argue the opposite? How would that even make sense?

My mistake, mixed up your figures.

To avoid bumping this thread every time I find out something else, I’ve published the MLB/HWPM configuration that yields the best power/performance balance here https://blog.arrogantrabbit.com/net/freebsd/TrueNAS-Power-Management-SpeedShift/, and for my system the following would be the next steps:

  1. Replace the Xeon E5-2637 v4 again, this time with an E5-1650 v4: for a few more cores and, more importantly, the lack of QPI and other multi-socket support. It will be interesting to see how much impact this has on power.
  2. Keep hunting down the remaining power consumers. Maybe it makes sense to replace the HBA and backplane with a better HBA and a TQ backplane. This could potentially save 10-20 watts, and I’m not benefiting from the NVMe support of my backplane anyway.

What PSUs are in your system? :slight_smile:

I doubt you have true A/B power at your setup (but if you have, cool!), in which case it could make sense to go for a single-input high-efficiency PSU. I saw a handful of watts being saved going from gold rated to titanium rated on my machine. I’m making a video about it, because I like whoring out my hobby.