Re: Explain XML patch v2 - Mailing list pgsql-patches

From Simon Riggs
Subject Re: Explain XML patch v2
Date
Msg-id 1215188448.4051.225.camel@ebony.site
Whole thread Raw
In response to Explain XML patch v2  (Tom Raney <raneyt@cecs.pdx.edu>)
Responses Re: Explain XML patch v2
Re: Explain XML patch v2
List pgsql-patches
On Tue, 2008-07-01 at 21:48 -0700, Tom Raney wrote:
> This is an update to my EXPLAIN XML patch submitted a few days ago.

This is an important patch and you've done well to get it this far.

I have much detailed feedback, but these are just emergent requests, now
I can see and understand what you've done.

* If the patch was implemented as an ExplainOneQuery_hook we would be
able to use this with 8.3 also, which might be interesting. So there's
no real need for this to be a patch on core.

* If I understand, the DTD option inlines the DTD into the resulting XML
document, rather than adding a schema definition?

* The DTD should be in a separate file also, so it can be referred to.
We should give that DTD a permanent URL so we can identify it across the
internet. This is the first time the project has published an XML
DTD/Schema, so we need to think through the right way to publish it. The
DTD should have a version number in it, so when this gets updated in the
future we can validate against the correct one.

* The DTD is regrettably a long, long way from where I need it to be.
The PLAN elements are all unrelated to one another, apart from their
sequence within the XML document and their "indent". That definition can
lead to XML documents that match the DTD yet would never be valid plans,
amongst other problems. For sensible analysis of the SQL we need the DTD
to match the actual structure of nodes in the executor. This requires
major DTD changes, regrettably. The "indent" comes from the nesting of
the nodes, and is not merely a visual feature of the complete plan. IMHO
it is critically important that the XML correctly conveys the structure
of the plan and not just the formatting of the current text output.
Otherwise many doors will be closed to us and this whole project wasted.
I want to be able to write xml queries to retrieve plans that contain a
merge join where both the inner and outer are merge joins, or to easily
compare the structure of two complex queries looking for differences.

* The meaning of the two time attributes is somewhat confused. It is
startuptime and totaltime, not time_start and time_end.

* It looks to me like QUALIFIER alone is not enough, since in some cases
nodes have both an index condition and a filter, for example.

* I've made a number of suggested changes below, but the DTD really
needs to follow the structures in src/include/nodes/plannodes.h
In particular you'll see the recursive structure introduced by the DTD
changes below

  * EXPLAIN has been renamed PLAN
  * PLAN has been renamed NODE
  * COST has been renamed to ESTIMATE
  * ANALYZE has been renamed to ACTUAL
  * OUTPUT, SORT and QUALIFIER have been removed for clarity only

<!ELEMENT plan (estimate, runtime?)>

<!ELEMENT node (table?, index?, estimate, actual?, outerpath?, innerpath?, initpath)>
<!ATTLIST plan\n"
    nodetype CDATA     #REQUIRED>

<!ELEMENT outerpath (node)>
<!ELEMENT innerpath (node)>
<!ELEMENT initpath (node)>

<!ELEMENT estimate EMPTY >

<!ATTLIST estimate
    startupcost CDATA  #REQUIRED
    totalcost CDATA    #REQUIRED
    rows CDATA     #REQUIRED
    width CDATA    #REQUIRED>

<!ELEMENT actual EMPTY >
<!ATTLIST actual\n"
    startuptime CDATA #REQUIRED
    totaltime CDATA #REQUIRED
    rows CDATA #REQUIRED
    loops CDATA #REQUIRED>

* I'd much prefer to define this as a Schema, since that's a bit more
flexible in what we can do, plus we can specify datatypes.

* Based upon some of those issues, I'd suggest that further work on the
DTD/Schema should only happen when a reasonable range of plans have been
accumulated to allow full understanding of the possibilities

I'm sorry I've found so much to say about this, but don't be dissuaded.
The basic structure of the patch is there, we just need to get the
details right also, so we can take full.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


pgsql-patches by date:

Previous
From: Tom Lane
Date:
Subject: Re: Sorting writes during checkpoint
Next
From: Simon Riggs
Date:
Subject: Re: Sorting writes during checkpoint