Re: returning PGresult as xml - Mailing list pgsql-hackers

From Scott Lamb
Subject Re: returning PGresult as xml
Date
Msg-id 7D909592-50C8-11D8-B400-000A95891440@slamb.org
Whole thread Raw
In response to returning PGresult as xml  (Brian Moore <brianmooreca@yahoo.com>)
Responses Re: returning PGresult as xml
List pgsql-hackers
On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
> it's been said that converting a PGresult into xml is "trivial" and
> that's why it hasn't been done in the codebase as of yet. i have seen
> much code that writes xml, and many mistakes are made. most often
> improper escaping, or writing to a schema/DTD that has not been
> well-thought out. the transformation into xml is not difficult, but it
> does require attention to detail.

The escaping, at any rate, is trivial if you use a proper API. It  
sounds like your code is not using any XML API, given that you have not  
mentioned adding dependencies to libpq and that you've mentioned your  
own hashtable algorithm. It would be much easier if you did so, though  
I imagine the additional dependency would mean it would not be accepted  
into libpq.

> <PGresult num_rows='1' num_cols='2'>
>   <col_desc num='0' type='int4' format='text' name='foo' />
>   <col_desc num='1' type='int4' format='text' name='bar' />
>   <row num='0'>
>     <col num='0'>1</col>
>     <col num='1'>2</col>
>   </row>
> </PGresult>

How would you filter for a column in XSLT based on column name with  
this schema? It's certainly not trivial. I have similar code, and I  
included the column name as an attribute in each column element for  
this reason.

I also used the java.sql type names rather than PostgreSQL ones, as my  
code is not specific to PostgreSQL.

> i would expect that integration would look something like exposing
> from libpq a function that looks something like:
>   const char *PGresult_as_xml(PGresult *result, int include_dtd);

Ugh. So it returns the whole thing as one big string? That won't hold  
up well if your resultset is large.

A better way would be to pump out SAX events. This is what I did for  
three reasons:

1) The escaping becomes trivial, as mentioned above. In fact, not only  
does SAX escape things correctly, but it makes you explicitly specify  
that the string you're giving it is character data, an element name, an  
attribute name, an attribute value, etc, and handles everything  
properly based on that. So you'd really have to work to screw it up,  
unlike code that just does like
    printf("<elem foo='%s' bar='%s'>%s</elem>",  
xml_attr_escape(foo_val), xml_attr_escape(bar_val),  
xml_char_escape(elem_val));

where it would be quite easy to lose track of what needs to be escaped  
how, what variables are already escaped, etc.

2) It can stream large result sets, provided that the next stage  
supports doing so. Certainly a raw SAX serializer would, also some XSLT  
stylesheets with Xalan, and STX/Joost is designed for streaming  
transformations.

3) If the next stage is a transformation, this makes it unnecessary to  
serialize and parse the data between. So the SAX way is faster.

You're welcome to take a look at my code. I imagine it will not be  
directly useful to you, as it is written in Java, but I have a live  
example which puts this stuff to use. Designing an acceptable API and  
schema is always much easier when you see how it is put to use.

<http://www.slamb.org/projects/xmldb/> - my (so far poorly-named) xmldb  
project, which includes the org.slamb.xmldb.ResultSetProducer class to  
transform a java.sql.ResultSet to SAX events in my resultset schema.

<http://www.slamb.org/svn/repos/projects/xmldb/src/java/org/slamb/ 
xmldb/ResultSetProducer.java> - source code for said class

<http://www.slamb.org/projects/mb/> - a message board which uses this  
code and some XSLT

<https://www.slamb.org/mb/> - a live example of said message board

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/ 
resultset.xsl> - simple XSLT to take an arbitrary resultset and convert  
it to an HTML table

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/main.xsl> -  
an example XSLT file that inherits this and then provides exceptions  
for a couple columns (not displaying the id column, instead including  
it as a hyperlink in the name column).

Good luck.

Scott Lamb



pgsql-hackers by date:

Previous
From: ohp@pyrenet.fr
Date:
Subject: Most urgent
Next
From: Andrew Dunstan
Date:
Subject: Re: returning PGresult as xml