Thread: Auto-explain patch

Auto-explain patch

From
Dean Rasheed
Date:
This is the patch allows logging of the explain plan for each query run, as described here:

http://archives.postgresql.org/pgsql-performance/2008-01/msg00245.php

I hope this is useful.

Dean.

_________________________________________________________________
Telly addicts unite!
http://www.searchgamesbox.com/tvtown.shtml
Attachment

Re: Auto-explain patch

From
David Fetter
Date:
On Mon, Jan 28, 2008 at 09:21:50AM +0000, Dean Rasheed wrote:
>
> This is the patch allows logging of the explain plan for each query run, as described here:
>
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00245.php
>
> I hope this is useful.
>
> Dean.

Dean,

Maybe I missed something obvious here, but how does this patch handle
the situation where people have turned on INTEGER_DATETIMES?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Auto-explain patch

From
Dean Rasheed
Date:
> Dean,
>
> Maybe I missed something obvious here, but how does this patch handle
> the situation where people have turned on INTEGER_DATETIMES?
>
> Cheers,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> Skype: davidfetter XMPP: david.fetter@gmail.com
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate

Sorry, I don't understand. I am new to this code, but I can't see how the INTEGER_DATETIMES flag will affect this code.
Iam using the macros and functions from instrument.h and explain.c for timing, which appear to use gettimeofday() or
QueryPerformanceCounter(),coverting the result to a double to report the total time spent running the query. 

As far as I can see, the INTEGER_DATETIMES flag doesn't affect this code. Am I missing something?

Dean.

_________________________________________________________________
Free games, great prizes - get gaming at Gamesbox.
http://www.searchgamesbox.com

Re: Auto-explain patch

From
David Fetter
Date:
On Mon, Jan 28, 2008 at 07:55:53PM +0000, Dean Rasheed wrote:
>
> > Dean,
> >
> > Maybe I missed something obvious here, but how does this patch handle
> > the situation where people have turned on INTEGER_DATETIMES?
> >
> > Cheers,
> > David.
> > --
> > David Fetter  http://fetter.org/
> > Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> > Skype: davidfetter XMPP: david.fetter@gmail.com
> >
> > Remember to vote!
> > Consider donating to Postgres: http://www.postgresql.org/about/donate
>
> Sorry, I don't understand. I am new to this code, but I can't see
> how the INTEGER_DATETIMES flag will affect this code. I am using the
> macros and functions from instrument.h and explain.c for timing,
> which appear to use gettimeofday() or QueryPerformanceCounter(),
> coverting the result to a double to report the total time spent
> running the query.

It's the double part I don't quite get.  Shouldn't that be an int64 in
the case of INTEGER_DATETIMES?

Cheers,
David.

>
> As far as I can see, the INTEGER_DATETIMES flag doesn't affect this
> code. Am I missing something?

>
> Dean.
>
> _________________________________________________________________
> Free games, great prizes - get gaming at Gamesbox.
> http://www.searchgamesbox.com
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Auto-explain patch

From
Dean Rasheed
Date:
> Date: Mon, 28 Jan 2008 12:08:00 -0800
> From: david@fetter.org
> To: dean_rasheed@hotmail.com
> CC: pgsql-patches@postgresql.org
> Subject: Re: [PATCHES] Auto-explain patch
>
> On Mon, Jan 28, 2008 at 07:55:53PM +0000, Dean Rasheed wrote:
>>
>>> Dean,
>>>
>>> Maybe I missed something obvious here, but how does this patch handle
>>> the situation where people have turned on INTEGER_DATETIMES?
>>>
>>> Cheers,
>>> David.
>>> --
>>> David Fetter http://fetter.org/
>>> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
>>> Skype: davidfetter XMPP: david.fetter@gmail.com
>>>
>>> Remember to vote!
>>> Consider donating to Postgres: http://www.postgresql.org/about/donate
>>
>> Sorry, I don't understand. I am new to this code, but I can't see
>> how the INTEGER_DATETIMES flag will affect this code. I am using the
>> macros and functions from instrument.h and explain.c for timing,
>> which appear to use gettimeofday() or QueryPerformanceCounter(),
>> coverting the result to a double to report the total time spent
>> running the query.
>
> It's the double part I don't quite get. Shouldn't that be an int64 in
> the case of INTEGER_DATETIMES?
>
> Cheers,
> David.
>

All the times are 64-bit integers (or at least structures with 2 32-bit integers in them) until the end, when the
elapsedtime is converted to a double so that the query runtime can be printed out in ms ("Query runtime: %.3f ms").
Thisis the same as EXPLAIN ANALYSE, except in that case it is the total runtime ("Total runtime: %.3f ms\n") that gets
reported,including startup/shutdown trigger times. 

Dean.

_________________________________________________________________
Share what Santa brought you
https://www.mycooluncool.com

Re: Auto-explain patch

From
David Fetter
Date:
On Mon, Jan 28, 2008 at 08:49:23PM +0000, Dean Rasheed wrote:
>
> > On Mon, Jan 28, 2008 at 07:55:53PM +0000, Dean Rasheed wrote:
> >>
> >>> Dean,
> >>>
> >>> Maybe I missed something obvious here, but how does this patch handle
> >>> the situation where people have turned on INTEGER_DATETIMES?
> >>>
> >>> Cheers,
> >>> David.
> >>> --
> >>> David Fetter http://fetter.org/
> >>> Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
> >>> Skype: davidfetter XMPP: david.fetter@gmail.com
> >>>
> >>> Remember to vote!
> >>> Consider donating to Postgres: http://www.postgresql.org/about/donate
> >>
> >> Sorry, I don't understand. I am new to this code, but I can't see
> >> how the INTEGER_DATETIMES flag will affect this code. I am using the
> >> macros and functions from instrument.h and explain.c for timing,
> >> which appear to use gettimeofday() or QueryPerformanceCounter(),
> >> coverting the result to a double to report the total time spent
> >> running the query.
> >
> > It's the double part I don't quite get. Shouldn't that be an int64 in
> > the case of INTEGER_DATETIMES?
> >
> > Cheers,
> > David.
> >
>
> All the times are 64-bit integers (or at least structures with 2
> 32-bit integers in them) until the end, when the elapsed time is
> converted to a double so that the query runtime can be printed out
> in ms ("Query runtime: %.3f ms"). This is the same as EXPLAIN
> ANALYSE, except in that case it is the total runtime ("Total
> runtime: %.3f ms\n") that gets reported, including startup/shutdown
> trigger times.

Thanks for the explanation, and sorry for the noise :)

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Auto-explain patch

From
Simon Riggs
Date:
On Mon, 2008-01-28 at 09:21 +0000, Dean Rasheed wrote:
> This is the patch allows logging of the explain plan for each query run, as described here:
>
> http://archives.postgresql.org/pgsql-performance/2008-01/msg00245.php
>
> I hope this is useful.

This looks very good, though I don't think its finished yet.

We definitely don't want "----------- query plan ---------------" to be
logged each time we execute each SQL statement. I think we should put
the EXPLAIN output as the main message, not as additional detail.

I also think we should only log the EXPLAIN if we have logged the SQL
statement. It's not much use on its own anyway. This then allows this
feature to work neatly with log_statement and
log_min_duration_statement.

We need this to work effectively when using v3 protocol prepared
queries. Since we only plan a query once, printing an EXPLAIN for all
executions is probably too much information for tuning smaller queries.

So I suggest we call this "log_explain" with settings {off | plan |
execute}, default off. For simple non-prepared queries plan and execute
work identically: If log_min_duration_statement > -1 or we are logging
the statement with log_statement then we will instrument each statement
to allow us to produce EXPLAIN ANALYZE output if the statement is
logged. If it is set to "plan" then it will print out the EXPLAIN only
when the query is planned (on first bind, when replanned etc), so it is
an EXPLAIN not an EXPLAIN ANALYZE. In this mode it will *not* instrument
each execution, nor will it print the plan in those cases.

If that last paragraph sounds too much, perhaps we should just go for on
| off for the next version of the patch.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


Re: Auto-explain patch

From
Dean Rasheed
Date:
> On Mon, 2008-01-28 at 09:21 +0000, Dean Rasheed wrote:
>> This is the patch allows logging of the explain plan for each query run, as described here:
>>
>> http://archives.postgresql.org/pgsql-performance/2008-01/msg00245.php
>>
>> I hope this is useful.
>
> This looks very good, though I don't think its finished yet.
>

Thanks for your comments. They all sound like fair points, but I have
a few questions - I am still pretty new to postgresql and this is the
first time I have looked at the source code, so I may not have fully
understood everything you said.


> We definitely don't want "----------- query plan ---------------" to be
> logged each time we execute each SQL statement. I think we should put
> the EXPLAIN output as the main message, not as additional detail.
>

OK, that's easy to change. I based the message/detail formatting on
the way debug_print_plan produces its output.


> I also think we should only log the EXPLAIN if we have logged the SQL
> statement. It's not much use on its own anyway. This then allows this
> feature to work neatly with log_statement and
> log_min_duration_statement.
>

Wouldn't that mean that only superusers could use it?

Even without logging of the SQL statement, I have found it quite
useful in practice for debugging SQL executed from stored procedures,
where the logged CONTEXT is good enough to identify the SQL being
exectued. Also, from an interactive session logging of the original
SQL doesn't matter so much.


> We need this to work effectively when using v3 protocol prepared
> queries. Since we only plan a query once, printing an EXPLAIN for all
> executions is probably too much information for tuning smaller queries.
>
> So I suggest we call this "log_explain" with settings {off | plan |
> execute}, default off. For simple non-prepared queries plan and execute
> work identically: If log_min_duration_statement> -1 or we are logging
> the statement with log_statement then we will instrument each statement
> to allow us to produce EXPLAIN ANALYZE output if the statement is
> logged. If it is set to "plan" then it will print out the EXPLAIN only
> when the query is planned (on first bind, when replanned etc), so it is
> an EXPLAIN not an EXPLAIN ANALYZE. In this mode it will *not* instrument
> each execution, nor will it print the plan in those cases.
>

I originally anticipated 2 use-cases:

1). A normal user from an interactive session, debugging individual
SQL statements and SQL embedded in stored procudures/triggers.

2). An administrator checking database access (eg. from a web app),
looking for inefficient queries.

I guess that what I have so far is more suited to (1), which is mostly
what I have been using it for (debug output not going to the log file).
I originally thought of doing this as an extra option to EXPLAIN, such
as EXAPLAIN [ANALYSE] ALL, but I thought that it would be easier to
use (and implement) as a settable parameter, mirroring Oracle's
AUTOTRACE.

For (2) I agree that it really needs a way to control the verbosity of
the output, as you suggested. And I guess that this ought to go to
level LOG rather than DEBUG1, to be consistent with the other logging
parameters.

This suggests having 2 separate parameters, one for debugging and one
for logging.


> If that last paragraph sounds too much, perhaps we should just go for on
> | off for the next version of the patch.
>

When you describe logging plans for logged statements only, are you
thinking of just explaining the top-level statement, or would this
include nested statements as well? Perhaps the latter is a 4th,
extra-verbose option "log_explain=all", which would produce EXPLAIN
ANALYSE output for each logged statement, and recursively for each
statement resulting from that top-level statement.

Dean

_________________________________________________________________
The next generation of Windows Live is here
http://www.windowslive.co.uk/get-live

Re: Auto-explain patch

From
Simon Riggs
Date:
On Sat, 2008-03-29 at 08:47 +0000, Dean Rasheed wrote:

> > I also think we should only log the EXPLAIN if we have logged the SQL
> > statement. It's not much use on its own anyway. This then allows this
> > feature to work neatly with log_statement and
> > log_min_duration_statement.
> >
>
> Wouldn't that mean that only superusers could use it?

That's what has currently been agreed for the other logging options.

> I originally anticipated 2 use-cases:
>
> 1). A normal user from an interactive session, debugging individual
> SQL statements and SQL embedded in stored procudures/triggers.
>
> 2). An administrator checking database access (eg. from a web app),
> looking for inefficient queries.
>
> I guess that what I have so far is more suited to (1), which is mostly
> what I have been using it for (debug output not going to the log file).
> I originally thought of doing this as an extra option to EXPLAIN, such
> as EXAPLAIN [ANALYSE] ALL, but I thought that it would be easier to
> use (and implement) as a settable parameter, mirroring Oracle's
> AUTOTRACE.
>
> For (2) I agree that it really needs a way to control the verbosity of
> the output, as you suggested. And I guess that this ought to go to
> level LOG rather than DEBUG1, to be consistent with the other logging
> parameters.
>
> This suggests having 2 separate parameters, one for debugging and one
> for logging.

That would make sense. The Oracle facility is actually a sql*plus
option, so perhaps a psql facility for that would be appropriate for
(1), but I think (2) is the more important usage and it would be better
to concentrate on that first and then come back for (1) next/later.

> > If that last paragraph sounds too much, perhaps we should just go for on
> > | off for the next version of the patch.

> When you describe logging plans for logged statements only, are you
> thinking of just explaining the top-level statement, or would this
> include nested statements as well? Perhaps the latter is a 4th,
> extra-verbose option "log_explain=all", which would produce EXPLAIN
> ANALYSE output for each logged statement, and recursively for each
> statement resulting from that top-level statement.

Yeh, can't think of any other way.

Bottom line is probably that the patch did what it did OK, its just that
we probably don't want exactly what it did. It would be very good if you
could write up the whole feature description again and re-post to
hackers, so we can get wider agreement before working on the patch.
Believe me, I understand exactly how you feel when I suggest that.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk