Re: New EXPLAIN ANALYZE statement - Mailing list pgsql-patches

From Tom Lane
Subject Re: New EXPLAIN ANALYZE statement
Date
Msg-id 4179.995650774@sss.pgh.pa.us
Whole thread Raw
In response to New EXPLAIN ANALYZE statement  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: New EXPLAIN ANALYZE statement  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-patches
Martijn van Oosterhout <kleptog@svana.org> writes:
> It adds a few calls in a few strategic places in the executor. Each node in
> the query tree is accounted for separately. Only time processing the node is
> counted, not the time before and after. It also adds two new files.

I'm still unconvinced that this approach will yield values that can
meaningfully be compared to the planner's cost estimates.  However,
the thing I *really* object to about this patch is that the statistics-
collection overhead is paid by every query whether it's being EXPLAINed
or not.  Kindly set it up so that that's not true.  (One way to do that
is to have EXPLAIN be responsible for traversing the tree and attaching
instrumentation structs to the nodes; then at runtime, the work is done
iff the struct links are not NULL.)

> Currently only select statements are allowed. This is because I can't work
> out how to supress the actual effect of an UPDATE or DELETE query. I'm also
> assuming that SELECT queries can't be rewritten into something that actually
> changes something. Ouch! I just realised that functions can be called which
> could do anything, so I truly need something to suppress the results.

I think this is wrongheaded; one of the things that might be interesting
is to know how much of the runtime is actually spent in the executor
toplevel, doing the update/delete/trigger firing/whatever.  And if your
query is calling expensive functions, why would you not want to know
that?  The correct approach is to indeed do the query, full up, as-is.
(And you should probably include the end-to-end time, user, system and
wall-clock, in the printout.)  What's needed is to design the user
interface so that it does not surprise people that the query is
executed.  This suggests to me that it should not be called EXPLAIN,
but something else.  (No, not ANALYZE, that's taken.)


Random other comments:

> +    InstrStopNode( node->instrument );
> +
> +    if( TupIsNull(result) )
> +        InstrEndLoop( node->instrument );

I think this isn't gonna work right for Group nodes.

> +  int tuplecount;           /* Tuples so far this loop */

Call me silly, but I think the tuple and iteration counts should be
doubles, not ints, to avoid overflow issues.

> +    bzero( i, sizeof(Instrumentation) );

Don't use bzero; use the ANSI-standard routines (memset, etc).

> +InstrStartNode( Instrumentation *i )    /* When we enter the node */
> +{
> +    if( !i )
> +        return;

This is just a personal thing, perhaps, but I *never* use a name like
"i" for anything except loop counters.  A struct argument ought to have
a less anonymous name.

> +    while( i->counter.tv_usec < 0 )
> +    {
> +        i->counter.tv_usec += 1000000;
> +        i->counter.tv_sec--;
> +    }

Don't you need to also reduce tv_usec to less than 1000000?  Else you
risk overflow of the usec field over successive executions.

> +    if( !timerisset( &i->counter ) )     /* Skip if nothing has happend */

Are these "timerisset" and "timerclear" thingies portable?  I can't find
anything about them in the man pages.  Since they're not used anywhere
in Postgres at the moment, I'd just as soon not see you add a new
platform dependency.  Just write out the clearing/testing of tv_sec and
tv_usec instead --- it's not like your code will work if those fields
don't exist with those names...

A general rule when dealing with system-defined stuff is to look
around and see how similar stuff is coded elsewhere in the Postgres
backend, and do it the same way.  That way you can take advantage of
literally years' worth of portability knowledge that we have
accumulated, rather than repeating old mistakes that we have fixed.

            regards, tom lane

pgsql-patches by date:

Previous
From: Martijn van Oosterhout
Date:
Subject: New EXPLAIN ANALYZE statement
Next
From: Bruce Momjian
Date:
Subject: Re: Re: [JDBC] [PATCH] Cleanup of JDBC character encoding