Re: machine-readable explain output - Mailing list pgsql-hackers

From Robert Haas
Subject Re: machine-readable explain output
Msg-id 603c8f070906170806h754e95ebv23c012b97f8e8610@mail.gmail.com
In response to Re: machine-readable explain output  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On Wed, Jun 17, 2009 at 10:40 AM, Peter Eisentraut<peter_e@gmx.net> wrote:
> On Tuesday 16 June 2009 20:21:21 Tom Lane wrote:
>> As a concrete example of what I'm thinking about, I'd hope that PgAdmin
>> would be able to display a graphical summary of a plan tree, and then
>> pop up measurements associated with one of the nodes when you
>> right-click on that node.  To do this, it doesn't necessarily have to
>> know all about each specific measurement that a particular backend
>> version might emit; but it needs to be able to tell which things are
>> measurements.
>
> To do this, you pack all "measurements" into a <measurement> element, and then
> tools are just told to display those.

I think this is markup for the sake of markup.  Right now, if we were
to add 10 additional options, all they'd need to do is call
ExplainPropertyText() and the right stuff would happen.  If we go this
route, we'll need to worry about getting each property into the right
subgroup, and argue about whether the assignment of properties to
subgroups is correct or whether we need to rearrange the subgroups (is
"sort method" a "measurement"?).  Our chances of not having to
change this again in the future are a lot better if we just report the
data and let third-party applications worry about categorizing it if
they want to.
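
To illustrate the last point, here is a minimal sketch of how a client
tool could do its own categorizing on top of flat output.  The key names
and the MEASUREMENT_KEYS set are hypothetical, chosen by the tool and
not emitted by the backend; nothing here reflects an actual output
format.

```python
# Hypothetical: the client tool, not the backend, decides which
# properties it wants to treat as "measurements".
MEASUREMENT_KEYS = {"Actual Rows", "Actual Loops", "Actual Total Time"}


def split_properties(node, measurement_keys=MEASUREMENT_KEYS):
    """Split a flat plan node into (measurements, everything else)."""
    measurements = {k: v for k, v in node.items() if k in measurement_keys}
    other = {k: v for k, v in node.items() if k not in measurement_keys}
    return measurements, other


# Illustrative flat node; the property names are assumptions.
node = {
    "Node Type": "Sort",
    "Sort Method": "quicksort",
    "Actual Rows": 100,
    "Actual Loops": 1,
}

measurements, other = split_properties(node)
print(measurements)
print(other)
```

If a tool later decides that "Sort Method" belongs with the
measurements, it changes its own set; the backend output stays the same.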

Possibly it would make sense to introduce groups for the portions of
the output which are added in response to particular options; for
example, we could have a section called "ANALYZE" that contains the
data that is only present when ANALYZE is used.  But this has the same
complicating effect on the code.  You'd have to get the
explain_tuplesort() stuff into the same sub-node as the analyze times
and loop counts, for example, which would require non-trivial
restructuring of the existing code for no clear benefit.  You'll
quickly get into a situation where you print the same information from
completely different parts of the code depending on whether or not the
output is text format, which is going to make maintaining this a bear.

I think the most common use case for this output format is going to be
to feed it to an XML parser and use xpath against it, or feed it into
a JSON parser and then write things like $plan->{"Actual Rows"} or
plan["Actual Rows"], depending on what language you use to process it
after you parse it.  Or you may have people who iterate over sort keys
%$plan and print all the values for which !ref $plan->{$key}.
Unnecessary levels of nesting just make the xpath expressions (or
perl/python/javascript hash/array dereferences) longer.  Changing tags
to attributes or vice versa changes which xpath expression you use to
get the data you want, but that's about it.
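
As a rough sketch of that use case in Python: parse the JSON, look up
one property directly, then walk the sorted keys and print every value
that is not itself a nested structure (the equivalent of the !ref test
in the Perl version).  The sample document and its key names are
assumptions for illustration only, not real backend output.

```python
import json

# Hypothetical flat JSON plan node; the key names are illustrative.
SAMPLE = """
{
  "Node Type": "Sort",
  "Sort Method": "quicksort",
  "Actual Rows": 100,
  "Actual Loops": 1,
  "Plans": [
    {"Node Type": "Seq Scan", "Relation Name": "t", "Actual Rows": 100}
  ]
}
"""


def scalar_properties(node):
    """Return (key, value) pairs sorted by key, skipping nested dicts/lists."""
    return [
        (key, value)
        for key, value in sorted(node.items())
        if not isinstance(value, (dict, list))
    ]


plan = json.loads(SAMPLE)

# Direct lookup, as in plan["Actual Rows"] above.
actual_rows = plan["Actual Rows"]
print(actual_rows)

# Print every scalar property of the top-level node.
props = scalar_properties(plan)
for key, value in props:
    print(f"{key}: {value}")
```

With an extra wrapper level, the lookup would become something like
plan["Measurements"]["Actual Rows"], and the loop would need to descend
into each group before it found any scalars.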

...Robert

