Re: Native XML - Mailing list pgsql-hackers

From	Yeb Havinga
Subject	Re: Native XML
Date	March 9, 2011 15:21:20
Msg-id	4D77D31F.9060501@gmail.com
In response to	Re: Native XML (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Native XML
List	pgsql-hackers

On 2011-03-09 19:30, Robert Haas wrote: <blockquote
cite="mid:AANLkTi=E+Lamz7onQ_w1uS55a5ymGjpWMqrv8eDH1Cmb@mail.gmail.com" type="cite"><pre wrap="">On Wed, Mar 9, 2011 at
1:11 PM, Bruce Momjian <a class="moz-txt-link-rfc2396E" href="mailto:bruce@momjian.us"><bruce@momjian.us></a>
wrote:
</pre><blockquote type="cite"><pre wrap="">Robert Haas wrote:
</pre><blockquote type="cite"><pre wrap="">On Mon, Feb 28, 2011 at 10:30 AM, Tom Lane <a class="moz-txt-link-rfc2396E"
href="mailto:tgl@sss.pgh.pa.us"><tgl@sss.pgh.pa.us></a> wrote:
 
</pre><blockquote type="cite"><pre wrap="">Well, in principle we could allow them to work on both, just the same
way that (for instance) "+" is a standardized operator but works on more
than one datatype.  But I agree that the prospect of two parallel types
with essentially duplicate functionality isn't pleasing at all.
</pre></blockquote><pre wrap="">
The real issue here is whether we want to store XML as text (as we do
now) or as some predigested form which would make "output the whole
thing" slower but speed up things like xpath lookups.  We had the same
issue with JSON, and due to the uncertainty about which way to go with
it we ended up integrating nothing into core at all.  It's really not
clear that there is one way of doing this that is right for all use
cases.  If you are storing xml in an xml column just to get it
validated, and doing no processing in the DB, then you'd probably
prefer our current representation.  If you want to build functional
indexes on xpath expressions, and then run queries that extract data
using other xpath expressions, you would probably prefer the other
representation.
</pre></blockquote><pre wrap="">
Someone should measure how much overhead the indexing of xml values
might have.  If it is minor, we might be OK with only an indexed xml
type.
</pre></blockquote><pre wrap="">
I think the relevant thing to measure would be how fast the
predigested representation speeds up the evaluation of xpath
expressions.
</pre></blockquote> About a predigested representation: I hope I'm not insulting anyone's education here, but a lot of
XML database 'accelerators' seem to use the pre- and post-order ranks (see <a class="moz-txt-link-freetext"
href="http://en.wikipedia.org/wiki/Tree_traversal">http://en.wikipedia.org/wiki/Tree_traversal</a>) of the document
nodes. The following two PDFs show how these orders can be used to query for, e.g., all ancestors of a node. From slide 10
of the second PDF: for nodes x, y: x is an ancestor of y when x.pre < y.pre AND x.post > y.post.<br /><br /><a
class="moz-txt-link-abbreviated"
href="http://www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor4.pdf">www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor4.pdf</a>
about the format<br /><a class="moz-txt-link-abbreviated"
href="http://www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor10.pdf">www.cse.unsw.edu.au/~cs4317/09s1/tutorials/tutor10.pdf</a>
about querying the format<br /><br /> regards,<br /> Yeb Havinga<br />
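
P.S. To make the pre/post condition concrete, here is a minimal Python sketch. It is only an illustration, not a proposal for how a predigested xml type would be stored: the helper names `number_nodes` and `is_ancestor` and the toy document are made up for this example, and the ranks are simply 0-based counters handed out during one depth-first walk.

```python
import xml.etree.ElementTree as ET


def number_nodes(root):
    """Return {element: (pre, post)} ranks from one depth-first walk.

    Hypothetical helper for illustration. It recurses, so it is only
    suitable for documents of modest depth.
    """
    ranks = {}
    pre_counter = 0
    post_counter = 0

    def visit(node):
        nonlocal pre_counter, post_counter
        pre = pre_counter          # pre rank: assigned when the node is entered
        pre_counter += 1
        for child in node:
            visit(child)
        ranks[node] = (pre, post_counter)  # post rank: assigned when it is left
        post_counter += 1

    visit(root)
    return ranks


def is_ancestor(ranks, x, y):
    """x is an ancestor of y when x.pre < y.pre AND x.post > y.post."""
    x_pre, x_post = ranks[x]
    y_pre, y_post = ranks[y]
    return x_pre < y_pre and x_post > y_post


# Toy document: a contains b and e; b contains c and d.
doc = ET.fromstring("<a><b><c/><d/></b><e/></a>")
ranks = number_nodes(doc)
by_tag = {node.tag: node for node in doc.iter()}

# All ancestors of <d>, found purely by comparing the two ranks.
ancestors_of_d = [n.tag for n in doc.iter() if is_ancestor(ranks, n, by_tag["d"])]
print(ancestors_of_d)
```

The point of the encoding is that the ancestor test needs no tree navigation at all: it is two integer comparisons per candidate node, which is the kind of predicate an ordinary index can serve.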
