Re: From TODO, XML? - Mailing list pgsql-hackers

From jgray@beansindustry.co.uk
Subject Re: From TODO, XML?
Date
Msg-id 0g8sj9.1rh.ln@adzuki
In response to From TODO, XML?  (mlw <markw@mohawksoft.com>)
List pgsql-hackers
In article <3B615336.D654E7E1@mohawksoft.com>, markw@mohawksoft.com (mlw)
wrote:
> I was looking over the todo list and saw that someone wanted to support
> XML. I have some quick and dirty stuff that could be used.
> 

I'm not clear from the TODO what that "XML support" might involve. The
reference to pg_dump suggests an XML dump format for databases. That only
makes sense if we build an XML frontend that can load XML-based pg_dump
files.

I can't see any very useful application though, unless someone has a
standard for database dumps using XML. I'd have thought that our current
"list of SQL statements" dump is fine (and useful if you speak SQL).

> OK, what should the feature look like?
> 

What's the feature for? The things I've been working on are trying to make
an XML parser available in the backend, and to build some XML document
manipulation functions/operators. This is useful for what I'm doing (using
XML documents as short packets of human and machine-readable descriptive
data) and may be useful to other people. This work hasn't progressed very
far (I only spent an afternoon or so writing it, though). It is
available at http://www.cabbage.uklinux.net/pgxml.tar.gz

One obvious (and current) topic is XQuery, and we might ask whether PG
could/should implement it. I think some thought would be needed on that
because a) it involves having a second, non-SQL parser on the front-end,
which could be quite a large undertaking, and b) there's probably
(from my initial reading) some discrepancy between the PG (and indeed
SQL) data model and the XQuery one. If we could work round that, XQuery
*might* be an attraction to people. Certainly the ability to form one XML
document out of another via a query may be good for some projects.
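
To make "forming one XML document out of another via a query" concrete,
here is a rough sketch. It is not XQuery and has nothing to do with the
pgxml code; it uses Python's standard xml.etree.ElementTree as a stand-in,
and the <library>/<book> document and the titles_since() helper are made up
purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical source document, invented for this example.
SOURCE = """<library>
  <book year="1999"><title>A</title></book>
  <book year="2001"><title>B</title></book>
</library>"""


def titles_since(doc, year):
    """Build a new <titles> document from the books published in or after `year`."""
    root = ET.fromstring(doc)
    result = ET.Element("titles")
    for book in root.findall("book"):
        # The "query": select on an attribute, project out one child element.
        if int(book.get("year")) >= year:
            item = ET.SubElement(result, "title")
            item.text = book.findtext("title")
    return ET.tostring(result, encoding="unicode")


print(titles_since(SOURCE, 2000))
```

A query language would express the selection and the construction of the
result declaratively instead of in a hand-written loop, but the input and
output are the same kind of thing: a tree in, a tree out.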

Perhaps if people interested in XML "stuff" could add here, we might flesh
out a little more of what's desired.

> Should it be grafted onto pg_dump or should a new utility pg_xml be
> created?
> 
> How strict should it be? A stricter parser is easier to write, one can
> use a library, unfortunately most xml is crap and for the utility to be
> useful, it has to be real fuzzy.
> 

I don't think you really can write a non-strict XML parser. At least, not
if you want the resulting DOM to be useful: violations of well-formedness
probably result in logical difficulties with the document structure. For
example:

<a>
<b>text
<c>more text</c>
</a>

Is <c> within <b>? Are <b> and <c> siblings? These are answerable with
well-formed XML, and they're very relevant questions to ask for many XML
processing tasks.
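
As a small illustration of the point (a sketch only, using Python's
standard xml.etree.ElementTree rather than any backend code; the
child_tags() helper and the two "repaired" variants are my own
assumptions), a strict parser refuses the fragment above, while each
well-formed repair gives a different, unambiguous answer:

```python
import xml.etree.ElementTree as ET

# The fragment from the message: <b> is never closed.
MALFORMED = "<a>\n<b>text\n<c>more text</c>\n</a>"

# Two plausible repairs, each implying a different structure.
SIBLINGS = "<a><b>text</b><c>more text</c></a>"
NESTED = "<a><b>text<c>more text</c></b></a>"


def child_tags(doc):
    """Return the tags of the root's direct children, or None if doc is not well-formed."""
    try:
        root = ET.fromstring(doc)
    except ET.ParseError:
        return None
    return [child.tag for child in root]


print(child_tags(MALFORMED))  # parser rejects it; no tree to inspect
print(child_tags(SIBLINGS))   # <b> and <c> are siblings under <a>
print(child_tags(NESTED))     # only <b> is a direct child; <c> is inside it
```

A "fuzzy" parser would have to pick one of the two repairs silently, and
any query that depends on the parent of <c> would then give results that
depend on that guess.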

> Any input would be appreciated.
> 

Likewise. I'd be very interested to know what sort of things people are
interested in, as I've found an area where I have a need which others
might share. I'd like to contribute some effort to it.

Regards

John


