Thread: Common slow query reasons - help with a special log

Common slow query reasons - help with a special log

From: Daniel Cristian Cruz
Date:
Hi all,

I'm trying to identify some common slow queries running on the server by analyzing the slow query log.

I found debug_print_parse, debug_print_rewritten and debug_print_plan, but they are too verbose and they log all queries.

I was thinking of something like a simple explain analyze, logged together with the query text, but only for queries that exceed log_min_duration_statement.

Is there a way to configure PostgreSQL to get this kind of information, or am I missing something? Would it be too hard to hack the sources and do it by hand? I have never touched the PostgreSQL sources.

I'm planning to write a paper for my postgraduate course that needs this information. The focus of my work will be the log data, not PostgreSQL itself. If I succeed, maybe it can become a tool that helps all of us.

Thank you,
--
Daniel Cristian Cruz
クルズ クリスチアン ダニエル

Re: Common slow query reasons - help with a special log

From: Andreas Kretschmer
Date:
Daniel Cristian Cruz <danielcristian@gmail.com> wrote:

> Hi all,
>
> I'm trying to identify some common slow queries running on the server by
> analyzing the slow query log.
>
> I found debug_print_parse, debug_print_rewritten and debug_print_plan, but
> they are too verbose and they log all queries.
>
> I was thinking of something like a simple explain analyze, logged together
> with the query text, but only for queries that exceed log_min_duration_statement.
>
> Is there a way to configure PostgreSQL to get this kind of information, or
> am I missing something? Would it be too hard to hack the sources and do it
> by hand? I have never touched the PostgreSQL sources.

Consider auto_explain; it's a contrib module, see
http://www.postgresql.org/docs/9.1/interactive/auto-explain.html
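
A minimal way to try it in a single (superuser) session - untested sketch,
threshold is just an example:

    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '5s';  -- log plans only for queries slower than 5s
    -- optional, but expensive: also collect actual row counts and timings
    -- SET auto_explain.log_analyze = true;

For permanent, server-wide logging you would put the module into
shared_preload_libraries in postgresql.conf instead.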



Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.                              (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.              N 51.05082°, E 13.56889°

Re: Common slow query reasons - help with a special log

From: Tomas Vondra
Date:
There's the auto_explain contrib module that does exactly what you're asking
for. Anyway, explain analyze is quite expensive - think twice before
enabling it on a production server where you already have performance
issues.

Tomas

On 10.12.2011 17:52, Daniel Cristian Cruz wrote:
> Hi all,
>
> I'm trying to identify some common slow queries running on the server by
> analyzing the slow query log.
>
> I found debug_print_parse, debug_print_rewritten and debug_print_plan, but
> they are too verbose and they log all queries.
>
> I was thinking of something like a simple explain analyze, logged together
> with the query text, but only for queries that exceed
> log_min_duration_statement.
>
> Is there a way to configure PostgreSQL to get this kind of information, or
> am I missing something? Would it be too hard to hack the sources and do it
> by hand? I have never touched the PostgreSQL sources.
>
> I'm planning to write a paper for my postgraduate course that needs this
> information. The focus of my work will be the log data, not PostgreSQL
> itself. If I succeed, maybe it can become a tool that helps all of us.
>
> Thank you,
> --
> Daniel Cristian Cruz
> クルズ クリスチアン ダニエル


Re: Common slow query reasons - help with a special log

From: Daniel Cristian Cruz
Date:
At work we have a 24-core server with a load average around 2.5.

I don't know yet whether a system that spends some unused CPU to identify bad queries early, and so reduce their load, is a bad idea or not.

In fact, I don't know if my boss would let me test this in production, but it would be good to know how things behave in "auto-pilot" mode.

2011/12/10 Tomas Vondra <tv@fuzzy.cz>
There's the auto_explain contrib module that does exactly what you're asking
for. Anyway, explain analyze is quite expensive - think twice before
enabling it on a production server where you already have performance
issues.

Tomas

On 10.12.2011 17:52, Daniel Cristian Cruz wrote:
> Hi all,
>
> I'm trying to identify some common slow queries running on the server by
> analyzing the slow query log.
>
> I found debug_print_parse, debug_print_rewritten and debug_print_plan, but
> they are too verbose and they log all queries.
>
> I was thinking of something like a simple explain analyze, logged together
> with the query text, but only for queries that exceed
> log_min_duration_statement.
>
> Is there a way to configure PostgreSQL to get this kind of information, or
> am I missing something? Would it be too hard to hack the sources and do it
> by hand? I have never touched the PostgreSQL sources.
>
> I'm planning to write a paper for my postgraduate course that needs this
> information. The focus of my work will be the log data, not PostgreSQL
> itself. If I succeed, maybe it can become a tool that helps all of us.
>
> Thank you,
> --
> Daniel Cristian Cruz
> クルズ クリスチアン ダニエル


--
Daniel Cristian Cruz
クルズ クリスチアン ダニエル

Re: Common slow query reasons - help with a special log

From: Tomas Vondra
Date:
On 10.12.2011 23:40, Daniel Cristian Cruz wrote:
> At work we have a 24-core server with a load average around 2.5.

A single query is processed by a single CPU, so even if the system is
not busy, a single query may hit a CPU bottleneck. The real issue is the
instrumentation overhead - timing etc. On some systems (with a slow
gettimeofday) this can be a significant problem, as the query hits the
CPU bottleneck sooner.
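
A crude way to get a feel for how much the timing itself costs on a given
machine (hypothetical table name, numbers will vary) is to compare a plain
run with an instrumented one:

    \timing on
    SELECT count(*) FROM some_table;                  -- plain execution
    EXPLAIN ANALYZE SELECT count(*) FROM some_table;  -- same query, with per-node timing
    -- if the second runtime is much higher, gettimeofday is expensive there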

> I don't know yet whether a system that spends some unused CPU to identify
> bad queries early, and so reduce their load, is a bad idea or not.

Not really, due to the "single query / single CPU" rule.

> In fact, I don't know if my boss would let me test this in production,
> but it would be good to know how things behave in "auto-pilot" mode.

What I was pointing out is that you probably should not enable logging
of "explain analyze" output via "auto_explain.log_analyze = true". There
are three levels of detail:

1) basic - just log_min_duration_statement

2) auto_explain without 'analyze' - just the explain plan

3) auto_explain with 'analyze' - the explain plan with actual values

Levels (1) and (2) are quite safe (unless the minimum execution time is
set too low).
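
In postgresql.conf terms the three levels would look roughly like this
(untested sketch based on the auto_explain docs; the thresholds are just
examples):

    # level 1: just log statements slower than the threshold
    log_min_duration_statement = 5000           # in ms

    # level 2: also log the plan of slow statements (no 'analyze')
    shared_preload_libraries = 'auto_explain'
    custom_variable_classes = 'auto_explain'    # needed on 9.1 to set the options below
    auto_explain.log_min_duration = 5000        # in ms
    auto_explain.log_analyze = off

    # level 3: plans with actual row counts and timings - expensive
    auto_explain.log_analyze = on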

Tomas

>
> 2011/12/10 Tomas Vondra <tv@fuzzy.cz>
>
>     There's the auto_explain contrib module that does exactly what you're
>     asking for. Anyway, explain analyze is quite expensive - think twice
>     before enabling it on a production server where you already have
>     performance issues.
>
>     Tomas
>
>     On 10.12.2011 17:52, Daniel Cristian Cruz wrote:
>     > Hi all,
>     >
>     > I'm trying to identify some common slow queries running on the
>     > server by analyzing the slow query log.
>     >
>     > I found debug_print_parse, debug_print_rewritten and
>     > debug_print_plan, but they are too verbose and they log all queries.
>     >
>     > I was thinking of something like a simple explain analyze, logged
>     > together with the query text, but only for queries that exceed
>     > log_min_duration_statement.
>     >
>     > Is there a way to configure PostgreSQL to get this kind of
>     > information, or am I missing something? Would it be too hard to hack
>     > the sources and do it by hand? I have never touched the PostgreSQL
>     > sources.
>     >
>     > I'm planning to write a paper for my postgraduate course that needs
>     > this information. The focus of my work will be the log data, not
>     > PostgreSQL itself. If I succeed, maybe it can become a tool that
>     > helps all of us.
>     >
>     > Thank you,
>     > --
>     > Daniel Cristian Cruz
>     > クルズ クリスチアン ダニエル
>
> --
> Daniel Cristian Cruz
> クルズ クリスチアン ダニエル


Re: Common slow query reasons - help with a special log

From: Daniel Cristian Cruz
Date:
2011/12/10 Tomas Vondra <tv@fuzzy.cz>
On 10.12.2011 23:40, Daniel Cristian Cruz wrote:
> At work we have a 24-core server with a load average around 2.5.

A single query is processed by a single CPU, so even if the system is
not busy, a single query may hit a CPU bottleneck. The real issue is the
instrumentation overhead - timing etc. On some systems (with a slow
gettimeofday) this can be a significant problem, as the query hits the
CPU bottleneck sooner.

Yes, I forgot it will run in the same backend (same PID). Since analyze slows down all queries, maybe the 24 cores could become overloaded.
 

> I don't know yet whether a system that spends some unused CPU to identify
> bad queries early, and so reduce their load, is a bad idea or not.

Not really, due to the "single query / single CPU" rule.

I guess it would be a nice tool to run on the validation server.
 
> In fact, I don't know if my boss would let me test this in production,
> but it would be good to know how things behave in "auto-pilot" mode.

What I was pointing out is that you probably should not enable logging
of "explain analyze" output via "auto_explain.log_analyze = true". There
are three levels of detail:

1) basic - just log_min_duration_statement

2) auto_explain without 'analyze' - just the explain plan

3) auto_explain with 'analyze' - the explain plan with actual values

Levels (1) and (2) are quite safe (unless the minimum execution time is
set too low).

I would start with 5 seconds.

Reading the manual again, I saw that enabling analyze instruments all queries, even the ones that don't take 5 seconds. And I understood that there is no way to restrict it to the slow ones, since there is no way to know a query is slow before it finishes...

I read Bruce's blog post about some features going multi-core. Could explain analyze go multi-core too?

Another thing I noticed is that I almost never look at the times in explain analyze. I always look at the divergence between estimated and actual rows, and at the scan and join methods used, when looking for ways to improve performance.
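
For example, in a plan fragment like this (hypothetical table and made-up numbers), the gap between the estimated and the actual row count is what I would look at, not the milliseconds:

    EXPLAIN ANALYZE SELECT * FROM orders o JOIN items i ON i.order_id = o.id;
    ...
      ->  Seq Scan on items i  (cost=0.00..123.00 rows=100 width=44)
                               (actual time=0.010..45.200 rows=250000 loops=1)
    ...

Here the planner expected 100 rows but got 250000, and that kind of divergence is usually what explains a bad scan or join choice.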

I had the nasty idea of putting a // before the gettimeofday calls in the explain analyze code (I guess it would be much more complicated than that).

Sure, it's ugly, but it could be an option for an explain analyze "with no timing", and in concept that's what I'm looking for.

--
Daniel Cristian Cruz
クルズ クリスチアン ダニエル

Re: Common slow query reasons - help with a special log

From: Tomas Vondra
Date:
On 11.12.2011 02:27, Daniel Cristian Cruz wrote:
> I would start with 5 seconds.
>
> Reading the manual again, I saw that enabling analyze instruments all
> queries, even the ones that don't take 5 seconds. And I understood that
> there is no way to restrict it to the slow ones, since there is no way to
> know a query is slow before it finishes...

Yes, you can't predict how long a query will run until it actually
finishes, so you have to instrument all of them. Maybe this will change
because of the "faster than light" neutrinos, but let's stick with the
current laws of physics for now.

> I read Bruce's blog post about some features going multi-core. Could
> explain analyze go multi-core too?

I don't think so. This is what Bruce mentioned as "parallel execution",
and that's the very hard part that requires rearchitecting parts of the
system.

> Another thing I noticed is that I almost never look at the times in explain
> analyze. I always look at the divergence between estimated and actual rows,
> and at the scan and join methods used, when looking for ways to improve
> performance.
>
> I had the nasty idea of putting a // before the gettimeofday calls in the
> explain analyze code (I guess it would be much more complicated than that).

I was thinking about that too, but I've never done it, so I'm not sure
what else would be needed.

Tomas