Re: Reducing power consumption on idle servers - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Reducing power consumption on idle servers
Msg-id 20221118005918.skhrcvn43msvdbxr@awork3.anarazel.de
In response to Re: Reducing power consumption on idle servers  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: Reducing power consumption on idle servers
List pgsql-hackers
Hi,

On 2022-11-17 13:06:23 +0530, Bharath Rupireddy wrote:
> I understand. I know it's a bit hard to measure the power savings, I'm
> wondering if there's any info, maybe not necessarily related to
> postgres, but in general how much power gets saved if a certain number
> of waits/polls/system calls are reduced.

It depends heavily on the hardware and on its settings.

On some systems you can get an *approximate* idea of the power usage even
while plugged in. On others you can only see it while on battery power.

On systems with RAPL support (most Intel and I think newer AMD CPUs) you can
query power usage with:
  powerstat -R 1 (adding -D provides a bit more detail)

But note that RAPL typically severely undercounts power usage because it
doesn't cover a lot of sources of power usage (display, sometimes memory,
all other peripherals).

Sometimes powerstat -R -D can split power usage up more granularly,
e.g. between the different CPU sockets and memory.
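
If powerstat isn't available, the RAPL energy counters can usually also be read
directly from sysfs. A rough sketch, assuming the Linux powercap interface at
/sys/class/powercap/intel-rapl:0 (the path, the RAPL_DIR variable and both
function names are my assumptions for illustration, and reading energy_uj
often requires root):

```shell
#!/bin/sh
# Assumed location of the package-0 RAPL domain; override RAPL_DIR if yours differs.
RAPL_DIR=${RAPL_DIR:-/sys/class/powercap/intel-rapl:0}

# rapl_watts START_UJ END_UJ SECONDS [MAX_RANGE_UJ]
# Average power in watts between two energy_uj readings (microjoules).
# If the counter wrapped (END < START), MAX_RANGE_UJ is added back.
rapl_watts() {
    awk -v s="$1" -v e="$2" -v t="$3" -v m="${4:-0}" 'BEGIN {
        d = e - s
        if (d < 0) d += m
        printf "%.2f\n", d / 1e6 / t
    }'
}

# rapl_sample SECONDS: read the counter twice and print the average watts.
rapl_sample() {
    start=$(cat "$RAPL_DIR/energy_uj") || return 1
    max=$(cat "$RAPL_DIR/max_energy_range_uj") || return 1
    sleep "$1"
    end=$(cat "$RAPL_DIR/energy_uj") || return 1
    rapl_watts "$start" "$end" "$1" "$max"
}

# Example (commented out because it needs a readable counter):
# rapl_sample 1
```

The same undercounting caveat applies, since this reads the same counters.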


On laptops you can often measure the discharge rate when not plugged in, with
powerstat -d 0 1. But IME the latency till the values update makes it harder
to interpret.
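
The discharge rate can typically also be read from sysfs. A minimal sketch,
assuming the battery shows up as /sys/class/power_supply/BAT0 (the name varies
between machines, and battery_watts is just a helper name I made up). Some
drivers expose power_now, others only current_now and voltage_now:

```shell
#!/bin/sh
# battery_watts [DIR]: print the current battery power draw in watts.
# DIR defaults to /sys/class/power_supply/BAT0, which is an assumption;
# the battery may be named differently on your system.
battery_watts() {
    dir=${1:-/sys/class/power_supply/BAT0}
    if [ -r "$dir/power_now" ]; then
        # power_now is reported in microwatts
        awk '{ printf "%.2f\n", $1 / 1e6 }' "$dir/power_now"
    elif [ -r "$dir/current_now" ] && [ -r "$dir/voltage_now" ]; then
        # current_now is in microamps, voltage_now in microvolts
        awk -v ua="$(cat "$dir/current_now")" -v uv="$(cat "$dir/voltage_now")" \
            'BEGIN { printf "%.2f\n", ua * uv / 1e12 }'
    else
        echo "no power reading under $dir" >&2
        return 1
    fi
}

# Example: sample once per second while on battery.
# while true; do battery_watts; sleep 1; done
```

The values here update with the same kind of latency, so this is no easier to
interpret than the powerstat output.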

On some workstations / servers you can read the power usage via ACPI. E.g. on
my workstation 'sensors' shows a power_meter-acpi-0 device.


As an example of the difference it can make, here's the output of
  powerstat -D 1
on my laptop.

  Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
16:41:55   0.1   0.0   0.1  99.8   0.0    1    403    246    1    0    1   2.62
16:41:56   0.1   0.0   0.1  99.8   0.0    1    357    196    1    0    1   2.72
16:41:57   0.1   0.0   0.1  99.6   0.2    1    510    231    4    0    4   2.64
16:41:58   0.1   0.0   0.1  99.9   0.0    2   1350    758   64   62   63   4.06
16:41:59   0.3   0.0   1.0  98.7   0.0    2   4166   2406  244  243  244   7.20
16:42:00   0.2   0.0   0.7  99.1   0.0    2   4203   2353  247  246  247   7.21
16:42:01   0.5   0.0   1.6  98.0   0.0    2   4079   2395  240  239  240   7.08
16:42:02   0.5   0.0   0.9  98.7   0.0    2   4097   2405  245  243  245   7.20
16:42:03   0.4   0.0   1.3  98.3   0.0    2   4117   2311  243  242  243   7.14
16:42:04   0.1   0.0   0.4  99.4   0.1    1   1721   1152   70   70   71   4.48
16:42:05   0.1   0.0   0.2  99.8   0.0    1    433    250    1    0    1   2.92
16:42:06   0.0   0.0   0.3  99.7   0.0    1    400    231    1    0    1   2.66

During the period of higher power usage I ran
while true;do sleep 0.001;done
and then interrupted that after a bit with ctrl-c.

Same thing on my workstation:

  Time    User  Nice   Sys  Idle    IO  Run Ctxt/s  IRQ/s Fork Exec Exit  Watts
16:43:48   1.0   0.0   0.2  98.7   0.1    1   8218   2354    0    0    0  46.43
16:43:49   1.1   0.0   0.3  98.7   0.0    1   7866   2477    0    0    0  45.99
16:43:50   1.1   0.0   0.4  98.5   0.0    2   7753   2996    0    0    0  48.93
16:43:51   0.8   0.0   1.7  97.5   0.0    1   9395   5285    0    0    0  75.48
16:43:52   0.5   0.0   1.7  97.8   0.0    1   9141   4806    0    0    0  75.30
16:43:53   1.1   0.0   1.8  97.1   0.0    2  10065   5504    0    0    0  76.27
16:43:54   1.3   0.0   1.5  97.2   0.0    1  10962   5165    0    0    0  76.33
16:43:55   0.9   0.0   0.8  98.3   0.0    1   8452   3939    0    0    0  61.99
16:43:56   0.6   0.0   0.1  99.3   0.0    2   6541   1999    0    0    0  40.92
16:43:57   0.9   0.0   0.2  98.9   0.0    2   8199   2477    0    0    0  42.91


And if I query the power supply via ACPI instead:

while true;do sensors power_meter-acpi-0|grep power1|awk '{print $2, $3}';sleep 1;done
...
163.00 W
173.00 W
173.00 W
172.00 W
203.00 W
206.00 W
206.00 W
206.00 W
209.00 W
205.00 W
211.00 W
213.00 W
203.00 W
175.00 W
166.00 W
...

As you can see the difference is quite substantial. This is solely due to a
1ms sleep loop (albeit one where we fork a process after each sleep, which
likely is a good chunk of the CPU usage).
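
To put a rough number on it for the laptop, one can average the Watts column
over the idle rows before the loop started (16:41:55 to 16:41:57) and over the
steady rows while it ran (16:41:59 to 16:42:03). The choice of rows and the
avg helper are mine, just for illustration:

```shell
#!/bin/sh
# avg: print the mean of the numbers on stdin, one per line.
avg() {
    awk '{ s += $1; n++ } END { if (n) printf "%.2f\n", s / n }'
}

# Watts values copied from the laptop table above.
idle=$(printf '%s\n' 2.62 2.72 2.64 | avg)
loop=$(printf '%s\n' 7.20 7.21 7.08 7.20 7.14 | avg)

echo "idle: $idle W, loop: $loop W"
awk -v i="$idle" -v l="$loop" 'BEGIN { printf "ratio: %.1fx\n", l / i }'
```

That comes out at roughly 2.7 W idle versus 7.2 W with the loop running, with
the RAPL undercounting caveat from above.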

However, the difference in power usage will be far smaller if the system is
already busy, because the CPU will already be running at a high frequency.

Greetings,

Andres Freund


