Re: Re: Storing XML in PostgreSQL - Mailing list pgsql-hackers

From John Gray
Subject Re: Re: Storing XML in PostgreSQL
Date
Msg-id slumj9.c27.ln@adzuki
Whole thread Raw
In response to Re: Re: Storing XML in PostgreSQL  (Gunnar Rønning <gunnar@polygnosis.com>)
List pgsql-hackers
In article <m2elr5b2cx.fsf@smaug.polygnosis.com>, gunnar@polygnosis.com
(Gunnar Rønning) wrote:

> Do you have any documentation on your C functions ? I'm just interested
> in knowing what functions they provide.
> 

There are only two (so far). They're very basic. I have:

pgxml_parse(text) returns bool -parses the provided text and returns true or false if it is 
well-formed or not.

pgxml_xpath(text doc, text xpath, int n) returns text -parses doc and returns the cdata of the nth occurence of
the "XPath" listed. This does handle relative and absolute paths 
but nothing else at present. I have a few variants of this. 

So, given a  table docstore:
Attribute |  Type   | Modifier 
-----------+---------+----------docid     | integer | document  | text    | 

containing documents such as:

<?XML version="1.0"?>
<site provider="Foundations" sitecode="ak97" version="1">  <name>Church Farm, Ashton Keynes</name>  <invtype>watching
brief</invtype> <location scheme="osgb">SU04209424</location>
 
</site>

I can type:
select docid, 
pgxml_xpath(document,'/site/name',1) as sitename,
pgxml_xpath(document,'/site/location',1) as locationfrom docstore;
and I get:
docid |          sitename           |  location  
-------+-----------------------------+------------    1 | Church Farm, Ashton Keynes  | SU04209424    2 | Glebe Farm,
LongItchington | SP41506500
 
(2 rows)

The next thing is to use the "function as tuple source" support which is
underway in order to allow the return of a list (in the DTD I'm using
-and doubtless many others- certain elements might be repeated, and
I think it would be good to be able to join against all the data from a 
particular element.

I hope this helps give a flavour. I'll try and tidy up the functions in the
next couple of days and then I can post what I've got so far. I'm keen to
build on this, as it's part of an (unfunded, unfortunately) project we're 
doing. Expat is MIT-licensed so I don't imagine there's a problem linking
it into PostgreSQL.

One aim is to allow people to set pg functions as the handlers "direct"
from the parser -the catch is that the expat API has lots of handlers
(OK, so most of them are less commonly used), so it's a matter of 
working out  a) an efficient API for setting handlers on a particular 
parser and b) how persistent a parser instance should be (each expat
instance can only do one document). Of course, expat may not be the 
best one to use -it would be great to be parser-agnostic and use SAX
with a java parser, but I don't think we have java as a language for 
user functions yet :-)
Incidentally, I'll be changing my email address over the next couple 
of daysto jgray@azuli.co.uk -just so you can follow this thread after
I've done that....

Regards

John
Azuli IT




pgsql-hackers by date:

Previous
From: "Howard Williams"
Date:
Subject: Can Postgres handle 2-phase commits ?
Next
From: Bruce Momjian
Date:
Subject: Re: Bug in psql tab completion