Re: Native XML - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: Native XML
Date
Msg-id 4D6CF81E.8020100@dunslane.net
In response to Re: Native XML  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers

On 03/01/2011 08:16 AM, Robert Haas wrote:
> On Mon, Feb 28, 2011 at 6:54 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> There seems to be an almost universal assumption that storing XML in its
>> native form (i.e. a text stream) is going to produce inefficient results.
>> Maybe it will, but I think it needs to be fairly convincingly demonstrated.
>> And then we would have to consider the costs. For example, unless we
>> implemented our own XPath processor to work with our own XML format (do we
>> really want to do that?), to evaluate an XPath expression for a piece of XML
>> we'd actually need to produce the text format from our internal format
>> before passing it to some external library to parse into its internal format
>> and then process the XPath expression. That means we'd actually be making
>> things worse, not better. But this is clearly the sort of processing people
>> want to do - see today's discussion upthread about xpath_table.
> Well, obviously the only point of having our own internal format is if
> we have our own xpath processor &c. to match.  One would think that
> this would be a lot faster than parsing the string with libxml2 every
> time we want to xpath it, especially for large documents.  But then
> again, I haven't seen any benchmarks.


That would be a huge body of code for us to maintain, complex and
full of subtleties which, unless we were deeply invested in the XML
standards, would bite us, I have no doubt.

Now, if someone wanted to start a project that added efficient 
serialization/de-serialization of libxml2 (or other library) objects so 
we could avoid constant parsing overhead, that would make lots more 
sense to me.
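
To make that idea a bit more concrete, here is a rough sketch of the
"parse once, store a serialized tree, reload without reparsing" pattern.
It is only an illustration under stated assumptions: it uses Python's
standard-library ElementTree as a stand-in for libxml2, pickle as a
stand-in for whatever on-disk format such a project would design, and
the helper names (to_record, from_record, serialize, deserialize) are
made up for this example. It says nothing about whether this would
actually be faster; that would still need benchmarks.

```python
import pickle
import xml.etree.ElementTree as ET


def to_record(elem):
    """Flatten an Element into plain tuples: (tag, attrib, text, tail, children)."""
    return (
        elem.tag,
        dict(elem.attrib),
        elem.text,
        elem.tail,
        [to_record(child) for child in elem],
    )


def from_record(record):
    """Rebuild an Element tree from the tuples produced by to_record()."""
    tag, attrib, text, tail, children = record
    elem = ET.Element(tag, attrib)
    elem.text = text
    elem.tail = tail
    for child in children:
        elem.append(from_record(child))
    return elem


def serialize(xml_text):
    """Parse the text once and return bytes that could be stored instead."""
    root = ET.fromstring(xml_text)
    return pickle.dumps(to_record(root), protocol=pickle.HIGHEST_PROTOCOL)


def deserialize(blob):
    """Reload the stored tree without running the XML parser again."""
    return from_record(pickle.loads(blob))


doc = (
    "<orders>"
    "<order id='1'><item>apple</item></order>"
    "<order id='2'><item>pear</item></order>"
    "</orders>"
)

blob = serialize(doc)       # done once, e.g. when the value is stored
root = deserialize(blob)    # done on each later access, no XML parsing

# Path queries run directly against the reloaded tree.
items = [e.text for e in root.findall("./order/item")]
```

The point of the sketch is only the shape of the interface: the text is
parsed at store time, and later path evaluation works on a tree rebuilt
from the stored bytes rather than on freshly parsed text.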

cheers

andrew



