Re: machine-readable explain output v4 - Mailing list pgsql-hackers
From | Andrew Dunstan |
---|---|
Subject | Re: machine-readable explain output v4 |
Date | |
Msg-id | 4A7F59FF.4060302@dunslane.net |
In response to | Re: machine-readable explain output v4 (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: machine-readable explain output v4
|
List | pgsql-hackers |
Robert Haas wrote:
> The one significant representational choice that I'm aware of having
> made is to use nested tags rather than attributes in the XML format.
> This seems to me to offer several advantages. First, it's clearly
> impossible to standardize on attributes, because attributes can only
> be text, and it seems to me that if we're going to try to output
> structured data, we want to take that as far as we can, and we have
> attributes (like sort keys) that are lists rather than scalars. Using
> tags means that they can have substructure when needed. Second, it
> seems likely to me that people will want to extend explain further in
> the future: indeed, that was the whole point of the explain-options
> patch which was already committed. That's pretty simple in the
> current design - just add a few more calls to ExplainPropertyText or
> ExplainPropertyList in the appropriate place, and you're done. I'm
> pretty sure that splitting things up between attributes and nested
> tags would complicate such modifications.

In general, in XML one uses an attribute for a named property of an
object that can only have one value at a time. A classic example is the
dimensions of an object - it can only have one width and one height.
Children (nested tags, particularly) are used for things it can have an
arbitrary number of, or things which in turn can have children. The HTML
<p> and <body> elements are (respectively) examples of these.

Attribute values in particular should generally be short - I recently
saw an example that had an entire image hex encoded in an XML attribute,
which struck me as just horrible. Enumerations, date and time values,
booleans, measurements - these are common types of attribute values.
Extracting a value from an attribute is no more or less difficult than
extracting it from a nested tag, using the XPath query language.

The XML Schema standard is a language for specifying the structure of a
given XML document type, and while it is undoubtedly complex, it is also
much more powerful than the older DTD mechanism. I think we should be
creating (and publishing) an XML Schema specification for any XML
documents we are producing. There are a number of members of the
community who are equipped to help produce these.

There is probably a good case for using an explicit namespace with such
docs. So we might have something like:

    <pg:explain xmlns:pg="http://www.postgresql.org/xmlspecs/explain/v1.xsd">
    ....

BTW, has anyone tried validating the XML at all? I just looked very
briefly at the patch at
<http://archives.postgresql.org/pgsql-hackers/2009-07/msg01944.php>
and I noticed this, which makes me suspicious:

    +   if (es.format == EXPLAIN_FORMAT_XML)
    +       appendStringInfoString(es.str,
    +           "<explain xmlns=\"http://www.postgresql.org/2009/explain\";>\n");

That ";" after the attribute is almost certainly wrong. This is a
classic case of what I was talking about a month or two ago. Building up
XML (or any structured doc, really, XML is not special in this regard)
by ad hoc methods is horribly error prone. If you don't want to rely on
libxml, then I think you need to develop a lightweight abstraction
rather than just appending to a StringInfo.

cheers

andrew
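
To make the "lightweight abstraction" suggestion concrete, here is a minimal sketch of what such a layer might look like. It is only an illustration under stated assumptions: every name in it (XmlWriter, xw_open, xw_attr, xw_text, xw_close, xw_demo) is hypothetical and is not part of PostgreSQL or of the patch, and it writes into a fixed-size buffer rather than a StringInfo so that it stands alone. The point is that the writer, not the caller, tracks which elements are open, closes start tags, and escapes values, so a stray character such as the ";" above cannot be typed into the markup by hand.

```c
#include <stdio.h>
#include <string.h>

#define XW_MAX_DEPTH 32
#define XW_BUF_SIZE  1024

/* Hypothetical lightweight XML writer; not PostgreSQL code. */
typedef struct XmlWriter
{
    char        buf[XW_BUF_SIZE];    /* output accumulates here */
    size_t      len;                 /* bytes used, excluding the NUL */
    const char *open[XW_MAX_DEPTH];  /* stack of open element names */
    int         depth;               /* number of open elements */
    int         in_start_tag;        /* 1 while "<name" still lacks its ">" */
    int         error;               /* set on overflow or misuse */
} XmlWriter;

static void xw_init(XmlWriter *w)
{
    memset(w, 0, sizeof(*w));
}

/* Append n raw bytes, refusing to overflow the buffer. */
static void xw_put(XmlWriter *w, const char *s, size_t n)
{
    if (w->error)
        return;
    if (w->len + n >= XW_BUF_SIZE)
    {
        w->error = 1;
        return;
    }
    memcpy(w->buf + w->len, s, n);
    w->len += n;
    w->buf[w->len] = '\0';
}

static void xw_puts(XmlWriter *w, const char *s)
{
    xw_put(w, s, strlen(s));
}

/* Append s with the XML special characters escaped. */
static void xw_escape(XmlWriter *w, const char *s)
{
    for (; *s; s++)
    {
        switch (*s)
        {
            case '&': xw_puts(w, "&amp;");  break;
            case '<': xw_puts(w, "&lt;");   break;
            case '>': xw_puts(w, "&gt;");   break;
            case '"': xw_puts(w, "&quot;"); break;
            default:  xw_put(w, s, 1);      break;
        }
    }
}

/* Emit the pending ">" of a start tag, if there is one. */
static void xw_finish_start_tag(XmlWriter *w)
{
    if (w->in_start_tag)
    {
        xw_puts(w, ">");
        w->in_start_tag = 0;
    }
}

/* Open an element. The name is not validated and must outlive the writer. */
static void xw_open(XmlWriter *w, const char *name)
{
    xw_finish_start_tag(w);
    if (w->depth >= XW_MAX_DEPTH)
    {
        w->error = 1;
        return;
    }
    w->open[w->depth++] = name;
    xw_puts(w, "<");
    xw_puts(w, name);
    w->in_start_tag = 1;
}

/* Add an attribute; only legal directly after xw_open. */
static void xw_attr(XmlWriter *w, const char *name, const char *value)
{
    if (!w->in_start_tag)
    {
        w->error = 1;
        return;
    }
    xw_puts(w, " ");
    xw_puts(w, name);
    xw_puts(w, "=\"");
    xw_escape(w, value);
    xw_puts(w, "\"");
}

/* Add escaped character data to the current element. */
static void xw_text(XmlWriter *w, const char *text)
{
    xw_finish_start_tag(w);
    xw_escape(w, text);
}

/* Close the innermost open element; empty elements become "<name/>". */
static void xw_close(XmlWriter *w)
{
    const char *name;

    if (w->depth == 0)
    {
        w->error = 1;
        return;
    }
    name = w->open[--w->depth];
    if (w->in_start_tag)
    {
        xw_puts(w, "/>");
        w->in_start_tag = 0;
    }
    else
    {
        xw_puts(w, "</");
        xw_puts(w, name);
        xw_puts(w, ">");
    }
}

/* Build a small, purely illustrative document; returns NULL on error. */
const char *xw_demo(XmlWriter *w)
{
    xw_init(w);
    xw_open(w, "explain");
    xw_attr(w, "xmlns", "http://www.postgresql.org/2009/explain");
    xw_open(w, "Plan");
    xw_open(w, "Node-Type");
    xw_text(w, "Seq Scan");
    xw_close(w);
    xw_open(w, "Filter");
    xw_text(w, "(a < 10)");
    xw_close(w);
    xw_close(w);
    xw_close(w);
    return w->error ? NULL : w->buf;
}
```

Calling xw_demo() on a caller-supplied XmlWriter yields a single well-formed document in which the "<" inside the filter text has been escaped. A real version would presumably append to a StringInfo and validate element names, but the division of labour would be the same: callers say what to emit, and the writer decides how the markup is spelled.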