Re: machine-readable explain output v4 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: machine-readable explain output v4
Date
Msg-id 200908100208.28769.andres@anarazel.de
In response to Re: machine-readable explain output v4  (Andrew Dunstan <andrew@dunslane.net>)
Responses Re: machine-readable explain output v4  (Andrew Dunstan <andrew@dunslane.net>)
Re: machine-readable explain output v4  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: machine-readable explain output v4  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
On Monday 10 August 2009 01:21:35 Andrew Dunstan wrote:
> Robert Haas wrote:
> > The one significant representational choice that I'm aware of having
> > made is to use nested tags rather than attributes in the XML format.
> > This seems to me to offer several advantages.  First, it's clearly
> > impossible to standardize on attributes, because attributes can only
> > be text, and it seems to me that if we're going to try to output
> > structured data, we want to take that as far as we can, and we have
> > attributes (like sort keys) that are lists rather than scalars.  Using
> > tags means that they can have substructure when needed.  Second, it
> > seems likely to me that people will want to extend explain further in
> > the future: indeed, that was the whole point of the explain-options
> > patch which was already committed.  That's pretty simple in the
> > current design - just add a few more calls to ExplainPropertyText or
> > ExplainPropertyList in the appropriate place, and you're done.  I'm
> > pretty sure that splitting things up between attributes and nested
> > tags would complicate such modifications.
> The XML Schema standard is a language for specifying the structure of a
> given XML document type, and while it is undoubtedly complex, it is also
> much more powerful than the older DTD mechanism. I think we should be
> creating (and publishing) an XML Schema specification for any XML
> documents we are producing. There are a number of members of the
> community who are equipped to help produce these.
I produced and mailed a RELAX NG schema for a slightly older version of the
format, and I plan to refresh and document it once the format seems suitably
stable. I am not sure it is yet. If it is, this should not take long...
(RELAX NG because it can easily be converted into most other XML schema
description languages.)

> There is probably a good case for using an explicit namespace with such
> docs. So we might have something like:
>
>     <pg:explain
>     xmlns:pg="http://www.postgresql.org/xmlspecs/explain/v1.xsd"> ....
>
> BTW, has anyone tried validating the XML at all? I just looked very
> briefly at the patch at
> <http://archives.postgresql.org/pgsql-hackers/2009-07/msg01944.php> and
> I noticed this which makes me suspicious:
>
> +     if (es.format == EXPLAIN_FORMAT_XML)
> +         appendStringInfoString(es.str,
> +             "<explain xmlns=\"http://www.postgresql.org/2009/explain\";>\n");
That bug is fixed. As mentioned above, I wrote a schema and validated the
output against it. So yes, the generated XML was valid, at least before the
last round of refactoring. And I have looked through the output quite a bit,
so I would be surprised if there were such a breakage.

> That ";" after the attribute is almost certainly wrong. This is a classic
> case of what I was talking about a month or two ago. Building up XML (or
> any structured doc, really, XML is not special in this regard) by ad hoc
> methods is horribly error prone. If you don't want to rely on libxml, then
> I think you need to develop a lightweight abstraction rather than just
> appending to a StringInfo.
Actually, by now a not insignificant portion of the code already "outsources"
this. Only some special cases (empty attributes, no newlines wanted, initial
element with namespace) do not.
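
To make the "lightweight abstraction" idea concrete, here is a minimal sketch of a writer that escapes text and tracks open elements, so tags always balance and stray characters cannot end up in the markup. It is not the code from the patch: XmlWriter and the xw_* functions are hypothetical names, a fixed-size buffer stands in for StringInfo, and the element names in the usage example are only illustrative.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical minimal writer; this is NOT PostgreSQL's StringInfo API. */
typedef struct XmlWriter
{
    char        buf[1024];      /* output accumulated so far */
    size_t      len;            /* bytes used in buf, excluding the NUL */
    const char *open[16];       /* stack of currently open element names */
    int         depth;          /* number of open elements */
} XmlWriter;

static void
xw_init(XmlWriter *w)
{
    w->len = 0;
    w->depth = 0;
    w->buf[0] = '\0';
}

/* Append raw bytes; asserts instead of growing, to keep the sketch short. */
static void
xw_raw(XmlWriter *w, const char *s)
{
    size_t      n = strlen(s);

    assert(w->len + n < sizeof(w->buf));
    memcpy(w->buf + w->len, s, n + 1);
    w->len += n;
}

/* Append character data with the XML special characters escaped. */
static void
xw_escaped(XmlWriter *w, const char *s)
{
    for (; *s; s++)
    {
        switch (*s)
        {
            case '<': xw_raw(w, "&lt;"); break;
            case '>': xw_raw(w, "&gt;"); break;
            case '&': xw_raw(w, "&amp;"); break;
            case '"': xw_raw(w, "&quot;"); break;
            default:
            {
                char        c[2] = {*s, '\0'};

                xw_raw(w, c);
            }
        }
    }
}

/* Open an element and remember its name so xw_close() can match it. */
static void
xw_open(XmlWriter *w, const char *name)
{
    assert(w->depth < 16);
    w->open[w->depth++] = name;
    xw_raw(w, "<");
    xw_raw(w, name);
    xw_raw(w, ">");
}

/* Close the most recently opened element. */
static void
xw_close(XmlWriter *w)
{
    assert(w->depth > 0);
    xw_raw(w, "</");
    xw_raw(w, w->open[--w->depth]);
    xw_raw(w, ">");
}

/* Convenience wrapper: <name>escaped value</name>. */
static void
xw_text_element(XmlWriter *w, const char *name, const char *value)
{
    xw_open(w, name);
    xw_escaped(w, value);
    xw_close(w);
}
```

With this, a caller would write xw_open(&w, "Plan"), then xw_text_element(&w, "Filter", "a < 1 & b > 2"), then xw_close(&w), and never touch angle brackets directly. Attributes and namespaces are left out of the sketch, and those are exactly the special cases mentioned above.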

While it would be possible to add another step in between, generating a
format-neutral tree and then producing the different formats from it, I am
not sure that this is worthwhile.
The current text format will need to stay special-cased anyway, because it is
far too inconsistent to generate from anything abstract. Also, I don't see any
completely new formats coming (as opposed to new optional parts).
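
For reference, the format-neutral tree alternative could look roughly like the sketch below: one in-memory tree and one renderer per output format. Everything here is hypothetical (the Node struct, emit, render_xml, render_json); it is not part of the patch. Escaping is omitted, and repeated keys, which JSON would need arrays for, are not handled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical format-neutral node: either a scalar or a group of children. */
typedef struct Node
{
    const char        *key;
    const char        *value;       /* scalar text, or NULL for a group */
    const struct Node *children;    /* array of nchildren nodes, if a group */
    int                nchildren;
} Node;

/* Append to a NUL-terminated fixed-size buffer; asserts rather than grows. */
static void
emit(char *out, size_t outsize, const char *s)
{
    size_t      used = strlen(out);

    assert(used + strlen(s) < outsize);
    strcpy(out + used, s);
}

/* Render a node as <key>value-or-children</key> (no escaping in this sketch). */
static void
render_xml(const Node *n, char *out, size_t outsize)
{
    int         i;

    emit(out, outsize, "<");
    emit(out, outsize, n->key);
    emit(out, outsize, ">");
    if (n->value)
        emit(out, outsize, n->value);
    for (i = 0; i < n->nchildren; i++)
        render_xml(&n->children[i], out, outsize);
    emit(out, outsize, "</");
    emit(out, outsize, n->key);
    emit(out, outsize, ">");
}

/* Render only the value part of a node: a string or an object of children. */
static void
render_json_value(const Node *n, char *out, size_t outsize)
{
    int         i;

    if (n->value)
    {
        emit(out, outsize, "\"");
        emit(out, outsize, n->value);
        emit(out, outsize, "\"");
        return;
    }
    emit(out, outsize, "{");
    for (i = 0; i < n->nchildren; i++)
    {
        if (i > 0)
            emit(out, outsize, ",");
        emit(out, outsize, "\"");
        emit(out, outsize, n->children[i].key);
        emit(out, outsize, "\":");
        render_json_value(&n->children[i], out, outsize);
    }
    emit(out, outsize, "}");
}

/* Wrap the root node as a single-member JSON object. */
static void
render_json(const Node *root, char *out, size_t outsize)
{
    emit(out, outsize, "{\"");
    emit(out, outsize, root->key);
    emit(out, outsize, "\":");
    render_json_value(root, out, outsize);
    emit(out, outsize, "}");
}
```

The structured formats fall out of such a tree easily, but the existing text format would still need its own hand-written code path, so the extra layer buys less than it seems.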

Andres

