Thread: machine-readable explain output v2

machine-readable explain output v2

From
Robert Haas
Date:
OK, here's the second version of my machine-readable explain output
patches.  Previous version here:

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00867.php

To apply the current version, you will first need to apply
explain_refactor-v4.patch and explain_options-v3.patch.

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00865.php
http://archives.postgresql.org/pgsql-hackers/2009-06/msg01094.php

As before, the infrastructure patch applies first and is separated
only for ease of reviewing.  Based on feedback from the list, and
especially from Peter Eisentraut, I've made a number of changes and
improvements since the previous version of the patch:

- The XML format now uses <explain xmlns="..."> rather than <pgexplain>.

- The JSON format no longer quotes numeric arguments.  I'm actually
not really that happy with this change; it seems likely that when
presented with an unquoted floating point number, most JSON parsers
will convert it to a native floating point value, throwing away the
information on how many significant digits were present in the
original.  I think we should rethink this one.  I don't know of a
programming language where there isn't an easy equivalent of
JavaScript's parseFloat() or C's atof().

- In XML or JSON formats, return the entire result as a single row,
rather than a series of rows that the caller is responsible for
concatenating.  Also, in XML format, return the result as type xml,
rather than type text.  If you want this to work even without
--with-libxml, you'll need to also apply do_tup_output_datum-v2.patch,
which you can find here:
http://archives.postgresql.org/pgsql-hackers/2009-06/msg00973.php

- Add a very simple regression test for EXPLAIN.

- Add documentation.

- Add a new GUC to contrib/auto_explain so that you can specify xml or
json output when using that module, as requested by Dave Page.

- Avoid using the "Trigger" tag to mean two different things; when
displaying trigger statistics, use "Trigger-Name" and
"Constraint-Name" instead of just "Trigger" and "Constraint", as
suggested by Andres Freund.  Along the way, fix a bug in the JSON
format where {} was being used in place of [].

- When VERBOSE is specified, schema-qualify all tables and functions
scanned and alias-qualify all variable names appearing within quals.

- I also changed the behavior with regard to displaying trigger and
constraint names.  Previously, the code displayed the trigger name
only if there was no constraint name.  Now, the XML and JSON formats
always display both, and the text format displays both if VERBOSE is
specified.

Hope you all like it.

Thanks,

...Robert

Attachment

Re: machine-readable explain output v2

From
Andres Freund
Date:
Hi Robert, Hi all,

explain_format_infrastructure-v2.patch:
Applies cleanly, code changes look sensible, no behaviour changes.

explain_format-v2.patch
Applies after trivial conflicts (updated version attached). Once more the code 
changes look sensible to me, except maybe that I don't like the codeflow at 
some parts. That's partially related due to having to generate xml/json output 
by hand...

One part where I find the code flow ugly is 'did_boilerplate' in 
report_triggers/its callsites.
I can see why it is done that way, but its not exactly obvious to read when 
you want to find out how the format looks.

Another, minor, issue is that the patch changes the FORMAT TEXT/default output 
if VERBOSE is specified (schema is added). I don't see that as a real problem 
because the format for VERBOSE is new anyway, but I thought I would mention 
it.

XML Format:
As discussed previously I would like the format to look a bit different - but 
everyone wants it to look different anyway, so I think its ok as one of 
multiple possible lowest common determinators... With the big advantage of 
already being implemented.

I think it would be nice in the future to add some sort of 
'category={planner,timing,..}' attribute, but I think that should be 
discussed/implemented separately.

Documentation:
I think it would be nice to add some more documentation about the xml format 
for application writers, but I think this should be a separate patch anyway.

I did not yet look at the contrib/autoexplain portions. Will do that on Monday 
or so.


Andres

Re: machine-readable explain output v2

From
Andres Freund
Date:
Hi Robert, Hi All,

On Sunday 19 July 2009 04:29:42 Andres Freund wrote:
> I did not yet look at the contrib/autoexplain portions. Will do that on
> Monday or so.
Early Monday:
Looks fine, except that the new "auto_explain.log_format" parameter is not 
documented.

Andres


Re: machine-readable explain output v2

From
Robert Haas
Date:
On Sun, Jul 19, 2009 at 6:54 AM, Andres Freund<andres@anarazel.de> wrote:
> On Sunday 19 July 2009 04:29:42 Andres Freund wrote:
>> I did not yet look at the contrib/autoexplain portions. Will do that on
>> Monday or so.
> Early Monday:
> Looks fine, except that the new "auto_explain.log_format" parameter is not
> documented.

Fixed in the attached version, which is also updated based on changes
to the precursor "explain options" patch.

This will probably need some further adjustment after I finish
implementing Tom's latest suggestions on that patch, but here it is
for now.  Note that these require explain-refactor v4 and
explain-options v5.

http://archives.postgresql.org/pgsql-hackers/2009-06/msg00865.php
http://archives.postgresql.org/pgsql-hackers/2009-07/msg01389.php

...Robert

Attachment

Re: machine-readable explain output v2

From
Robert Haas
Date:
On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote:
> One part where I find the code flow ugly is 'did_boilerplate' in
> report_triggers/its callsites.
> I can see why it is done that way, but its not exactly obvious to read when
> you want to find out how the format looks.

Suggestions?

> Another, minor, issue is that the patch changes the FORMAT TEXT/default output
> if VERBOSE is specified (schema is added). I don't see that as a real problem
> because the format for VERBOSE is new anyway, but I thought I would mention
> it.

Verbose isn't new, but in 8.4 all it does is displays the output list
for each node.  I think there's room for verbose to include some other
types of verbosity that don't merit their own options.

I think that the choice of what information to display shouldn't
depend on what format you use to display it.  The funny thing about me
being the one to implement XML and JSON output is that I don't
actually want to use them for anything - and especially not XML.  I
implemented them because they were a good (and popular) test case for
the options facility, but they're not actually the options I want.
And if I or others add functionality in the future to gather more
information via EXPLAIN, I don't want to have to use XML or JSON to
get at them.

> XML Format:
> As discussed previously I would like the format to look a bit different - but
> everyone wants it to look different anyway, so I think its ok as one of
> multiple possible lowest common determinators... With the big advantage of
> already being implemented.

Yes, I think there's not much advantage to changing this around.
There are lots of people with lots of ideas on this, but they're all
different and AFAICS there's no clear pattern.  I think it's good to
try to match the text output as much as possible.

> I think it would be nice in the future to add some sort of
> 'category={planner,timing,..}' attribute, but I think that should be
> discussed/implemented separately.

Agree that there are more things to be added.  But I haven't taken the
time to figure out exactly what.  One of things I would really like to
be able to get is the number of buckets and batches (expected and
actual) for a hash join.  Other things I've wished for:

- number of tuples discarded by a filter condition on a particular node
- amount of time spent constructing the output list of a node as
opposed to the rest of what the node does

> Documentation:
> I think it would be nice to add some more documentation about the xml format
> for application writers, but I think this should be a separate patch anyway.

Suggestions?

I have posted a new version of this patch on a separate thread; do you
have time to re-review?

...Robert


Re: machine-readable explain output v2

From
Andres Freund
Date:
Hi Robert,

On Friday 31 July 2009 23:13:54 Robert Haas wrote:
> On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote:
> I have posted a new version of this patch on a separate thread; do you
> have time to re-review?
Yes, I have seen it. I plan to spent some time on it tonight and/or tomorrow. 

You havent pushed out to your git mirror?

Andres


Re: machine-readable explain output v2

From
Andres Freund
Date:
On Friday 31 July 2009 23:13:54 Robert Haas wrote:
> On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote:
> > One part where I find the code flow ugly is 'did_boilerplate' in
> > report_triggers/its callsites.
> > I can see why it is done that way, but its not exactly obvious to read
> > when you want to find out how the format looks.
> Suggestions?
Will take a look when looking at the whole patch again.

> > Another, minor, issue is that the patch changes the FORMAT TEXT/default
> > output if VERBOSE is specified (schema is added). I don't see that as a
> > real problem because the format for VERBOSE is new anyway, but I thought
> > I would mention it.
> Verbose isn't new, but in 8.4 all it does is displays the output list
> for each node.  I think there's room for verbose to include some other
> types of verbosity that don't merit their own options.
Well, before 8.4 it was something entirely different... So its kinda "new".

> I think that the choice of what information to display shouldn't
> depend on what format you use to display it.  The funny thing about me
> being the one to implement XML and JSON output is that I don't
> actually want to use them for anything - and especially not XML.  I
> implemented them because they were a good (and popular) test case for
> the options facility, but they're not actually the options I want.
> And if I or others add functionality in the future to gather more
> information via EXPLAIN, I don't want to have to use XML or JSON to
> get at them.
I am not particularly interested in XML itself as well. Just as yours my main 
interest is having the possibility to add information without causing wreckage 
all over the place....
Although for some of the more complex queries (And as you know I have some 
rather ugly/complex ones...) a graphical viewer of the plans might be nice.

I am quite happy that the annoyance over a patch of mine "helped" you starting 
to work on this ;-) 
Thanks for all the work.

> > I think it would be nice in the future to add some sort of
> > 'category={planner,timing,..}' attribute, but I think that should be
> > discussed/implemented separately.
> Agree that there are more things to be added.  But I haven't taken the
> time to figure out exactly what.  One of things I would really like to
> be able to get is the number of buckets and batches (expected and
> actual) for a hash join.  Other things I've wished for:
I think after the patch is committed there should be a big collection of 
wishes so we can see what further infrastructure work is going to be needed...
Depending on the amount and kind of different options it might not be sufficient 
to simply extent struct Instrumentation/the current instrumentation 
infrastructure...


> > Documentation:
> > I think it would be nice to add some more documentation about the xml
> > format for application writers, but I think this should be a separate
> > patch anyway.
> Suggestions?
I think extending, correcting and commenting a schema like the one I provided 
sometime ago would be a good start. Anybody wanting to use the output should 
be familiar enough with that...
I can try to do some of that if somebody goes over my english afterwards...

Andres


Re: machine-readable explain output v2

From
Robert Haas
Date:
On Fri, Jul 31, 2009 at 6:04 PM, Andres Freund<andres@anarazel.de> wrote:
> I am quite happy that the annoyance over a patch of mine "helped" you starting
> to work on this ;-)
> Thanks for all the work.

You're welcome, thanks for all your reviewing.  For the record, I
wasn't annoyed BY the patch; I was annoyed by the inability of the
patch to be applied.

>> > I think it would be nice in the future to add some sort of
>> > 'category={planner,timing,..}' attribute, but I think that should be
>> > discussed/implemented separately.
>> Agree that there are more things to be added.  But I haven't taken the
>> time to figure out exactly what.  One of things I would really like to
>> be able to get is the number of buckets and batches (expected and
>> actual) for a hash join.  Other things I've wished for:
> I think after the patch is committed there should be a big collection of
> wishes so we can see what further infrastructure work is going to be needed...
> Depending on the amount and kind of different options it might not be sufficient
> to simply extent struct Instrumentation/the current instrumentation
> infrastructure...

We'll have to see.  The basic options framework is already in, but I
think a more far-ranging discussion should wait until post-CommitFest,
whether explain (format ...) ... is committed by then or not.

>> > Documentation:
>> > I think it would be nice to add some more documentation about the xml
>> > format for application writers, but I think this should be a separate
>> > patch anyway.
>> Suggestions?
> I think extending, correcting and commenting a schema like the one I provided
> sometime ago would be a good start. Anybody wanting to use the output should
> be familiar enough with that...
> I can try to do some of that if somebody goes over my english afterwards...

Happy to copy-edit.

...Robert


Re: machine-readable explain output v2

From
Robert Haas
Date:
On Fri, Jul 31, 2009 at 5:36 PM, Andres Freund<andres@anarazel.de> wrote:
> Hi Robert,
>
> On Friday 31 July 2009 23:13:54 Robert Haas wrote:
>> On Sat, Jul 18, 2009 at 10:29 PM, Andres Freund<andres@anarazel.de> wrote:
>> I have posted a new version of this patch on a separate thread; do you
>> have time to re-review?
> Yes, I have seen it. I plan to spent some time on it tonight and/or tomorrow.
>
> You havent pushed out to your git mirror?

Ah, fixed.

Sorry, forgot I made a new local branch.

...Robert