Thread: CPU time for pg_stat_statement

CPU time for pg_stat_statement

From
Michail Nikolaev
Date:
Hello, hackers.

Today I was doing some aggregates over pg_stat_statements in order to
find types of queries consuming most of the CPU. Aggregates were made
on two pg_state_statement snapshots within 30 sec delay.

The sum(total_time) had the biggest value for a very frequent query
with about 10ms execution. I was thinking it is the biggest CPU
consumer.

But after reducing the frequency of queries a lot I was unable to see
any significant difference in server CPU usage...

So, looks like clock_gettime is not so accurate to measure real CPU
usage for some OLTP workloads. I suppose it is caused by the wall time
vs CPU time difference (IO, thread switch, etc).

But what do you think about adding cpu_time (by calling getrusage) to
pg_stat_statements? Seems it could be very useful for CPU profiling.

I am probably able to prepare the patch, but it is always better to
get some feedback on the idea first :)

Best regards,
Michail.



Re: CPU time for pg_stat_statement

From
Tom Lane
Date:
Michail Nikolaev <michail.nikolaev@gmail.com> writes:
> So, looks like clock_gettime is not so accurate to measure real CPU
> usage for some OLTP workloads. I suppose it is caused by the wall time
> vs CPU time difference (IO, thread switch, etc).

This is a pretty broad claim to make on the basis of one undocumented
test case on one unmentioned platform.

> But what do you think about adding cpu_time (by calling getrusage) to
> pg_stat_statements? Seems it could be very useful for CPU profiling.

On what grounds do you claim getrusage will be better?  One thing we
can be pretty certain of is that it will be slower, since it has to
return many more pieces of information.  And the API for it only allows
time info to be specified to microseconds, versus nanoseconds for
clock_gettime, so it's also going to be taking a precision hit.

            regards, tom lane



Re: CPU time for pg_stat_statement

From
Thomas Munro
Date:
On Sat, May 21, 2022 at 6:50 AM Michail Nikolaev
<michail.nikolaev@gmail.com> wrote:
> But what do you think about adding cpu_time (by calling getrusage) to
> pg_stat_statements? Seems it could be very useful for CPU profiling.
>
> I am probably able to prepare the patch, but it is always better to
> get some feedback on the idea first :)

This might be interesting:

https://github.com/powa-team/pg_stat_kcache



Re: CPU time for pg_stat_statement

From
Michail Nikolaev
Date:
Hello, Thomas.

> This might be interesting:
> https://github.com/powa-team/pg_stat_kcache

Oh, nice, looks like it could help me to reduce CPU and test my
assumption (using exec_user_time and exec_system_time).

BWT, do you know why extension is not in standard contrib (looks mature)?

Best regards,
Michail.



Re: CPU time for pg_stat_statement

From
Michail Nikolaev
Date:
Hello, Tom.

> This is a pretty broad claim to make on the basis of one undocumented
> test case on one unmentioned platform.

I'll try to use pg_stat_kcache to check the difference between Wall
and CPU for my case.

> On what grounds do you claim getrusage will be better?  One thing we
> can be pretty certain of is that it will be slower, since it has to
> return many more pieces of information.  And the API for it only allows
> time info to be specified to microseconds, versus nanoseconds for
> clock_gettime, so it's also going to be taking a precision hit.

My idea was to not replace wall-clock (clock_gettime) by cpu-clock (getrusage).
I think about adding getrusage as an additional column (with flag to
enable actual measuring).
Looks like I need to be more precise in words :)

It is just two different clocks - and sometimes you need physical
time, sometimes CPU time (and sometimes, for example, amount of WAL
written).

Best regards,
Michail.



Re: CPU time for pg_stat_statement

From
Julien Rouhaud
Date:
Hi,

On Sat, May 21, 2022 at 12:21:49AM +0300, Michail Nikolaev wrote:
>
> > This might be interesting:
> > https://github.com/powa-team/pg_stat_kcache
>
> Oh, nice, looks like it could help me to reduce CPU and test my
> assumption (using exec_user_time and exec_system_time).
>
> BWT, do you know why extension is not in standard contrib (looks mature)?

Because contrib isn't meant to eventually contain all possible extensions.

There is an official postgres extension network, and also community deb/rpm
repositories that are intended to handle postgres extensibility, and this
extension is available on all of that, same as a lot of other extensions, which
are at least as mature.



Re: CPU time for pg_stat_statement

From
Michail Nikolaev
Date:
Hello, Tom.

>> This is a pretty broad claim to make on the basis of one undocumented
>> test case on one unmentioned platform.

> I'll try to use pg_stat_kcache to check the difference between Wall
> and CPU for my case.

In my case I see pretty high correlation of pg_stat_kcache and
pg_stat_statement (clock_gettime vs getrusage).
Looks like CPU usage is hidden somewhere else (planning probably, not
measured on postgres 11, but I see really high
*clauselist_selectivity* in perf).

Thanks,
Michail.