Thread: performance statistics monitoring without spamming logs

performance statistics monitoring without spamming logs

From
Patrick Hemmer
Date:
I'm looking for a way of gathering performance stats in a more usable way than turning on `log_statement_stats` (or other related modules). The problem I have with the log_*_stats family of modules is that they log every single query, which makes them unusable in production. Aside from consuming space, there's also the problem that the log system wouldn't be able to keep up with the rate.

There are a couple ideas that pop into mind that would make these stats more usable:
1. Only log when the statement would otherwise already be logged. Such as due to the `log_statement` or `log_min_duration_statement` settings.
2. Make stats available in `pg_stat_statements` (or alternate view that could be joined on). The block stats are already available here, but others like CPU usage, page faults, and context switches are not.

To answer why I want this data: I want to be able to identify queries which are consuming large amounts of CPU time so that I can either optimize the query or optimize the application making the query, and free up CPU resources on the database. The `pg_stat_statements` view provides the `total_time` metric, but many things can contribute to query time other than CPU usage, and CPU usage is my primary concern at the moment.

Do these seem like reasonable requests? And if so, what's the procedure for getting them implemented?
Any thoughts on whether they would be hard to implement? I'm unfamiliar with the PostgreSQL code base, but might be willing to attempt an implementation if it wouldn't be terribly difficult.

-Patrick

Re: performance statistics monitoring without spamming logs

From
Justin Pryzby
Date:
On Tue, Jul 10, 2018 at 01:54:12PM -0400, Patrick Hemmer wrote:
> I'm looking for a way of gathering performance stats in a more usable
> way than turning on `log_statement_stats` (or other related modules).
> The problem I have with the log_*_stats family of modules is that they
> log every single query, which makes them unusable in production. Aside
> from consuming space, there's also the problem that the log system
> wouldn't be able to keep up with the rate.
> 
> There are a couple ideas that pop into mind that would make these stats
> more usable:
> 1. Only log when the statement would otherwise already be logged. Such
> as due to the `log_statement` or `log_min_duration_statement` settings.

Did you see: (Added Adrien to Cc);
https://commitfest.postgresql.org/18/1691/

I don't think the existing patch does what you want, but perhaps all that's
needed is this:

        if (save_log_statement_stats)
+               if (log_sample_rate==1 || was_logged)
                ShowUsage("EXECUTE MESSAGE STATISTICS");

In any case, I'm thinking that your request could/should be considered by
whatever future patch implements sampling (if not implemented/included in the
patch itself).
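
For readers following along, the condition suggested above can be sketched as a standalone predicate. This is only an illustration under stated assumptions: `should_show_usage` is a hypothetical helper, not a PostgreSQL function, and its parameters merely stand in for `save_log_statement_stats`, the `log_sample_rate` setting proposed in the linked patch, and a `was_logged` flag that the real code would have to provide.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper (not PostgreSQL code): decide whether the
 * per-statement resource statistics should be emitted.
 *
 *   stats_enabled  stands in for save_log_statement_stats
 *   sample_rate    stands in for the proposed log_sample_rate (0.0 .. 1.0)
 *   was_logged     true if the statement itself was written to the log
 */
bool
should_show_usage(bool stats_enabled, double sample_rate, bool was_logged)
{
    /* A sample rate outside 0..1 would be a configuration error. */
    assert(sample_rate >= 0.0 && sample_rate <= 1.0);

    if (!stats_enabled)
        return false;

    /*
     * With no sampling (rate 1) every statement gets its stats, as today.
     * Otherwise, stats are emitted only for statements that were logged.
     */
    return sample_rate == 1.0 || was_logged;
}
```

With this shape, a statement that is skipped by sampling and not logged for any other reason produces no statistics output either, which is the behaviour asked for in point 1.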

If that doesn't do what's needed, that patch might still be a good crash course
in how to start implementing what you need (perhaps on top of that patch).

> 2. Make stats available in `pg_stat_statements` (or alternate view that
> could be joined on). The block stats are already available here, but
> others like CPU usage, page faults, and context switches are not.

pg_stat_statements is ./contrib/pg_stat_statements/pg_stat_statements.c which is 3k LOC.

getrusage stuff and log_*_stat stuff is in src/backend/tcop/postgres.c
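
As a rough illustration of the kind of data getrusage(2) exposes (CPU time, page faults, context switches), here is a minimal standalone sketch that takes a snapshot before and after a unit of work and computes the difference. It is not taken from postgres.c; the `UsageDelta` struct and the `usage_snapshot`, `timeval_diff_sec`, and `usage_delta` helpers are hypothetical names introduced for this example, and only the `getrusage` call and the `struct rusage` fields are the real POSIX interface.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Hypothetical container for the per-statement numbers of interest. */
typedef struct UsageDelta
{
    double user_cpu_sec;     /* user-mode CPU time consumed */
    double system_cpu_sec;   /* kernel-mode CPU time consumed */
    long   minor_faults;     /* page faults served without I/O */
    long   major_faults;     /* page faults that required I/O */
    long   voluntary_csw;    /* voluntary context switches */
    long   involuntary_csw;  /* involuntary context switches */
} UsageDelta;

/* Take a snapshot of the current process's resource usage; 0 on success. */
int
usage_snapshot(struct rusage *out)
{
    assert(out != NULL);
    return getrusage(RUSAGE_SELF, out);
}

/* Difference between two timevals, in seconds. */
double
timeval_diff_sec(const struct timeval *start, const struct timeval *end)
{
    return (double) (end->tv_sec - start->tv_sec) +
           (double) (end->tv_usec - start->tv_usec) / 1e6;
}

/* Compute what was consumed between two snapshots. */
UsageDelta
usage_delta(const struct rusage *start, const struct rusage *end)
{
    UsageDelta d;

    d.user_cpu_sec = timeval_diff_sec(&start->ru_utime, &end->ru_utime);
    d.system_cpu_sec = timeval_diff_sec(&start->ru_stime, &end->ru_stime);
    d.minor_faults = end->ru_minflt - start->ru_minflt;
    d.major_faults = end->ru_majflt - start->ru_majflt;
    d.voluntary_csw = end->ru_nvcsw - start->ru_nvcsw;
    d.involuntary_csw = end->ru_nivcsw - start->ru_nivcsw;
    return d;
}
```

A caller would invoke `usage_snapshot` before and after the work being measured and pass both snapshots to `usage_delta`. Accumulating such deltas per query, rather than printing them per statement, is roughly what exposing them through a statistics view would require.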

Justin


Re: performance statistics monitoring without spamming logs

From
Lukas Fittl
Date:
On Tue, Jul 10, 2018 at 11:38 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> > 2. Make stats available in `pg_stat_statements` (or alternate view that
> > could be joined on). The block stats are already available here, but
> > others like CPU usage, page faults, and context switches are not.
>
> pg_stat_statements is ./contrib/pg_stat_statements/pg_stat_statements.c which is 3k LOC.
>
> getrusage stuff and log_*_stat stuff is in src/backend/tcop/postgres.c

Before you start implementing something here, take a look at pg_stat_kcache [0], which already aims to collect a few more system statistics than what pg_stat_statements provides today, and might be a good basis to extend from.

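It might also be worth looking at pg_stat_activity wait event sampling to determine where a system spends time, see e.g. pg_wait_sampling [1] for one approach to this.

[0]: https://github.com/powa-team/pg_stat_kcache
[1]: https://github.com/postgrespro/pg_wait_sampling
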
Best,
Lukas 

--
Lukas Fittl

Re: performance statistics monitoring without spamming logs

From
Adrien NAYRAT
Date:
On 07/13/2018 12:25 AM, Lukas Fittl wrote:
> On Tue, Jul 10, 2018 at 11:38 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
> 
>     > 2. Make stats available in `pg_stat_statements` (or alternate view that
>     > could be joined on). The block stats are already available here, but
>     > others like CPU usage, page faults, and context switches are not.
> 
>     pg_stat_statements is
>     ./contrib/pg_stat_statements/pg_stat_statements.c which is 3k LOC.
> 
>     getrusage stuff and log_*_stat stuff is in src/backend/tcop/postgres.c
> 
> 
> Before you start implementing something here, take a look at 
> pg_stat_kcache [0]
> 
> Which already aims to collect a few more system statistics than what 
> pg_stat_statements provides today, and might be a good basis to extend from.
> 
> It might also be worth to look at pg_stat_activity wait event sampling 
> to determine where a system spends time, see e.g. pg_wait_sampling 
> [1] for one approach to this.
> 

Hi,

You should look at the PoWA stack:

https://github.com/powa-team/powa

PoWA can aggregate metrics from different extensions such as
pg_stat_statements, pg_stat_kcache and, recently, pg_wait_sampling:
https://rjuju.github.io/postgresql/2018/07/09/wait-events-support-for-powa.html

Regards,

> [0]: https://github.com/powa-team/pg_stat_kcache
> [1]: https://github.com/postgrespro/pg_wait_sampling
> 
> Best,
> Lukas
> 
> -- 
> Lukas Fittl



Re: performance statistics monitoring without spamming logs

From
Adrien NAYRAT
Date:
On 07/10/2018 08:38 PM, Justin Pryzby wrote:
> On Tue, Jul 10, 2018 at 01:54:12PM -0400, Patrick Hemmer wrote:
>> I'm looking for a way of gathering performance stats in a more usable
>> way than turning on `log_statement_stats` (or other related modules).
>> The problem I have with the log_*_stats family of modules is that they
>> log every single query, which makes them unusable in production. Aside
>> from consuming space, there's also the problem that the log system
>> wouldn't be able to keep up with the rate.
>>
>> There are a couple ideas that pop into mind that would make these stats
>> more usable:
>> 1. Only log when the statement would otherwise already be logged. Such
>> as due to the `log_statement` or `log_min_duration_statement` settings.
> 
> Did you see: (Added Adrien to Cc);
> https://commitfest.postgresql.org/18/1691/
> 
> I don't think the existing patch does what you want, but perhaps all that's
> needed is this:
> 
>          if (save_log_statement_stats)
> +               if (log_sample_rate==1 || was_logged)
>                  ShowUsage("EXECUTE MESSAGE STATISTICS");
> 
> In any case, I'm thinking that your request could/should be considered by
> whatever future patch implements sampling (if not implemented/included in the
> patch itself).

Hi,

Thanks for the Cc, it seems like a good idea. I will think about it ;)

> 
> If that doesn't do what's needed, that patch might still be a good crash course
> in how to start implementing what you need (perhaps on top of that patch).
> 
>> 2. Make stats available in `pg_stat_statements` (or alternate view that
>> could be joined on). The block stats are already available here, but
>> others like CPU usage, page faults, and context switches are not.
> 
> pg_stat_statements is ./contrib/pg_stat_statements/pg_stat_statements.c which is 3k LOC.
> 
> getrusage stuff and log_*_stat stuff is in src/backend/tcop/postgres.c
> 
> Justin
> 



Re: performance statistics monitoring without spamming logs

From
Julien Rouhaud
Date:
On Fri, Jul 13, 2018 at 9:23 AM, Adrien NAYRAT
<adrien.nayrat@anayrat.info> wrote:
> On 07/13/2018 12:25 AM, Lukas Fittl wrote:
>>
>> On Tue, Jul 10, 2018 at 11:38 AM, Justin Pryzby <pryzby@telsasoft.com> wrote:
>>
>>     > 2. Make stats available in `pg_stat_statements` (or alternate view that
>>     > could be joined on). The block stats are already available here, but
>>     > others like CPU usage, page faults, and context switches are not.
>>
>>     pg_stat_statements is
>>     ./contrib/pg_stat_statements/pg_stat_statements.c which is 3k LOC.
>>
>>     getrusage stuff and log_*_stat stuff is in src/backend/tcop/postgres.c
>>
>>
>> Before you start implementing something here, take a look at
>> pg_stat_kcache [0]
>>
>> Which already aims to collect a few more system statistics than what
>> pg_stat_statements provides today, and might be a good basis to extend from.
>>

Also, no one has asked for it before, but we can definitely add all the
other fields returned by getrusage(2) to pg_stat_kcache.  You can
also look at https://github.com/markwkm/pg_proctab.