Thread: returning PGresult as xml
hello,

this note is intended to describe my work on beginning to further integrate xml into postgresql. first, i'd like to thank the contributors of contrib/xml, as their work was instrumental in helping me understand what support exists and where i wanted to go. thanks.

my first requirement is to export data from the database into a format which can be read not only by existing (postgresql) clients but by people and programs that don't know how to use a PGresult. xml is very verbose, but its popularity makes it closer to universal than anything else of which i could think. in addition, ideas like XSL/XSLT make an export to xml very attractive to me.

it's been said that converting a PGresult into xml is "trivial" and that's why it hasn't been done in the codebase as of yet. i have seen much code that writes xml, and many mistakes are made -- most often improper escaping, or writing to a schema/DTD that has not been well thought out. the transformation into xml is not difficult, but it does require attention to detail.

i feel bad that i have not been able to use any existing standards. xmlrpc, i found, was not type-rich enough, and that made it difficult or impossible to use. in particular, the only way to represent a matrix is as a struct of structs. this makes it very verbose for one to encode a PGresult. i found SOAP too difficult for compliance. so my result was to create a schema, which results in a DTD. an example of what my code generates can be found below.

the following xml is the result of the query "SELECT 1 as foo, 2 as bar":

<?xml version='1.0' encoding='ISO-8859-1'?>
<!DOCTYPE PGresult [
<!ELEMENT PGresult (col_desc*, row*)>
<!ATTLIST PGresult num_rows CDATA #REQUIRED
                   num_cols CDATA #REQUIRED>
<!ELEMENT col_desc EMPTY>
<!ATTLIST col_desc num CDATA #REQUIRED
                   format (text | binary) #REQUIRED
                   type CDATA #REQUIRED
                   name CDATA #REQUIRED>
<!ELEMENT row (col*)>
<!ATTLIST row num CDATA #REQUIRED>
<!ELEMENT col (#PCDATA)>
<!ATTLIST col num CDATA #REQUIRED>
<!ENTITY NULL ''>
]>
<PGresult num_rows='1' num_cols='2'>
<col_desc num='0' type='int4' format='text' name='foo' />
<col_desc num='1' type='int4' format='text' name='bar' />
<row num='0'>
<col num='0'>1</col>
<col num='1'>2</col>
</row>
</PGresult>

a slightly more complicated example:

template1=# select oid,typname,typlen,typtype from pg_type where oid<20;
 oid | typname | typlen | typtype
-----+---------+--------+---------
  16 | bool    |      1 | b
  17 | bytea   |     -1 | b
  18 | char    |      1 | b
  19 | name    |     32 | b
(4 rows)

<!DOCTYPE PGresult [
<!ELEMENT PGresult (col_desc*, row*)>
<!ATTLIST PGresult num_rows CDATA #REQUIRED
                   num_cols CDATA #REQUIRED>
<!ELEMENT col_desc EMPTY>
<!ATTLIST col_desc num CDATA #REQUIRED
                   format (text | binary) #REQUIRED
                   type CDATA #REQUIRED
                   name CDATA #REQUIRED>
<!ELEMENT row (col*)>
<!ATTLIST row num CDATA #REQUIRED>
<!ELEMENT col (#PCDATA)>
<!ATTLIST col num CDATA #REQUIRED>
<!ENTITY NULL ''>
]>
<PGresult num_rows='4' num_cols='4'>
<col_desc num='0' type='oid' format='text' name='oid' />
<col_desc num='1' type='name' format='text' name='typname' />
<col_desc num='2' type='int2' format='text' name='typlen' />
<col_desc num='3' type='char' format='text' name='typtype' />
<row num='0'>
<col num='0'>16</col>
<col num='1'>bool</col>
<col num='2'>1</col>
<col num='3'>b</col>
</row>
<row num='1'>
<col num='0'>17</col>
<col num='1'>bytea</col>
<col num='2'>-1</col>
<col num='3'>b</col>
</row>
<row num='2'>
<col num='0'>18</col>
<col num='1'>char</col>
<col num='2'>1</col>
<col num='3'>b</col>
</row>
<row num='3'>
<col num='0'>19</col>
<col num='1'>name</col>
<col num='2'>32</col>
<col num='3'>b</col>
</row>
</PGresult>

i have done this work for myself and my own needs, so i fully understand if this work is not interesting to the postgresql group in general. however, if there is some chance that the changes could be incorporated into the tree, i would be interested in contributing, as integration into a proper version of postgresql will make my build easier. ;)

i would expect that integration would look something like exposing from libpq a function that looks something like:

const char *PGresult_as_xml(PGresult *result, int include_dtd);

i would also expect that psql would be modified to take a \X and to call the above function. there is some strangeness now, as psql doesn't call methods defined in libpq to print; it has its own printer. i, of course, would do this work. :) i just need to know that people are interested.

also, if integration is going to happen, i will need to replace calls to my hashtables with calls to postgresql's hashtables. i saw dynamic hashtables in the backend, but not in the interfaces. i wasn't exactly sure how i should go about introducing another module to the frontend; there could be problems of which i remain blissfully unaware.

i look forward to feedback, and i hope this note finds you well,

b
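(For concreteness, here is a minimal sketch of the kind of client-side conversion being proposed. This is not the code from the post: escape_xml() is a hypothetical helper that expands the predefined XML entities, and the type attribute is left out because mapping a column's type Oid to a name like 'int4' would require a pg_type lookup.)

#include <stdio.h>
#include <libpq-fe.h>

/* hypothetical helper: writes s with &, <, >, ', " expanded to entities */
extern void escape_xml(FILE *out, const char *s);

static void
pgresult_as_xml(FILE *out, const PGresult *res)
{
    int nrows = PQntuples(res);
    int ncols = PQnfields(res);
    int r, c;

    fprintf(out, "<PGresult num_rows='%d' num_cols='%d'>\n", nrows, ncols);
    for (c = 0; c < ncols; c++)
    {
        /* the type attribute would need a pg_type lookup; omitted here */
        fprintf(out, "<col_desc num='%d' format='%s' name='",
                c, PQfformat(res, c) == 0 ? "text" : "binary");
        escape_xml(out, PQfname(res, c));
        fprintf(out, "' />\n");
    }
    for (r = 0; r < nrows; r++)
    {
        fprintf(out, "<row num='%d'>\n", r);
        for (c = 0; c < ncols; c++)
        {
            fprintf(out, "<col num='%d'>", c);
            if (PQgetisnull(res, r, c))
                fputs("&NULL;", out);  /* presumably how the DTD's NULL entity is used */
            else
                escape_xml(out, PQgetvalue(res, r, c));
            fputs("</col>\n", out);
        }
        fputs("</row>\n", out);
    }
    fputs("</PGresult>\n", out);
}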
Brian Moore wrote on Sun, 25.01.2004 at 11:07:
> hello,
>
> this note is intended to describe my work on beginning to further
> integrate xml into postgresql. first, i'd like to thank the
> contributors of contrib/xml, as their work was instrumental in helping
> me understand what support exists and where i wanted to go. thanks.

First, IMHO having unified XML support is a good thing for Postgres. I still have some questions and suggestions:

At what place do you intend to add your converter? I remember someone started abstracting out the FE/BE protocol calls in the server code with the aim of supporting multiple protocols, but stopped without making too much progress (?)

Also, I would suggest that XML Schema datatypes should be used, preferably together with either a RelaxNG schema or something from the SQL/XML spec.

-------------
Hannu
Brian Moore wrote:
> i feel bad that i have not been able to use any existing
> standards. xmlrpc, i found, was not type-rich enough, and that made
> it difficult or impossible to use. in particular, the only way to
> represent a matrix is as a struct of structs. this makes it very
> verbose for one to encode a PGresult. i found SOAP too difficult for
> compliance. so my result was to create a schema, which results in a
> DTD.

Let me point out an implementation I made the last time this subject was discussed:

http://developer.postgresql.org/~petere/xmltable.tar.bz2

This package contains server-side functions that convert a table (more generally, a query result) to an XML document and/or an XSL schema, both mimicking the SQL/XML standard. Additionally, it contains a function to convert such an XML document back to a table source. I also threw in an XSLT stylesheet to convert an SQL/XML table to an HTML table, so you can more easily view the results.

I also have some code in development that adds cursor interfaces, an XML data type, and some integration with the existing XPath functionality. I think that for processing XML in the database, and as far as following the existing standards goes, this is the direction to take.

Also, the last time this subject was discussed, I believe it was Mike Mascari who proposed and implemented another solution which is more client-side oriented. He wrote a piece of code that took a normal libpq result set and shipped it off as SQL/XML wrapped in SOAP. And it had streaming capabilities for large result sets.

These are two complementary approaches that already exist, more or less. Of course, this only covers the C API and would need sensible extensions for other programming languages. But I invite you to look at them and see whether they fit your needs (the ideas, not necessarily the state of the code).
Peter Eisentraut wrote:
> Brian Moore wrote:
>
>> i feel bad that i have not been able to use any existing
>> standards. xmlrpc, i found, was not type-rich enough, and that made
>> it difficult or impossible to use. in particular, the only way to
>> represent a matrix is as a struct of structs. this makes it very
>> verbose for one to encode a PGresult. i found SOAP too difficult for
>> compliance. so my result was to create a schema, which results in a
>> DTD.
>
> Let me point out an implementation I made the last time this subject
> was discussed:
>
> http://developer.postgresql.org/~petere/xmltable.tar.bz2
>
> Also, the last time this subject was discussed, I believe it was Mike
> Mascari who proposed and implemented another solution which is more
> client-side oriented.

I humbly confess it wasn't me. We use CORBA....

Mike Mascari
Peter Eisentraut wrote:
> Let me point out an implementation I made the last time this subject
> was discussed:
>
> http://developer.postgresql.org/~petere/xmltable.tar.bz2
>
> This package contains server-side functions that convert a table (more
> generally, a query result) to an XML document and/or an XSL schema,
> both mimicking the SQL/XML standard.
>
> Additionally, it contains a function to convert such an XML document
> back to a table source. I also threw in an XSLT stylesheet to convert
> an SQL/XML table to an HTML table, so you can more easily view the
> results.
>
> I also have some code in development that adds cursor interfaces, an
> XML data type, and some integration with the existing XPath
> functionality. I think that for processing XML in the database, and as
> far as following the existing standards goes, this is the direction to
> take.

Peter: this looks very nice. What are your intentions with this code? Put it in contrib? Also, do you intend to implement the SQL/XML functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?

cheers

andrew
On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
> it's been said that converting a PGresult into xml is "trivial" and
> that's why it hasn't been done in the codebase as of yet. i have seen
> much code that writes xml, and many mistakes are made -- most often
> improper escaping, or writing to a schema/DTD that has not been
> well thought out. the transformation into xml is not difficult, but it
> does require attention to detail.

The escaping, at any rate, is trivial if you use a proper API. It sounds like your code is not using any XML API, given that you have not mentioned adding dependencies to libpq and that you've mentioned your own hashtable algorithm. It would be much easier if you did so, though I imagine the additional dependency would mean it would not be accepted into libpq.

> <PGresult num_rows='1' num_cols='2'>
> <col_desc num='0' type='int4' format='text' name='foo' />
> <col_desc num='1' type='int4' format='text' name='bar' />
> <row num='0'>
> <col num='0'>1</col>
> <col num='1'>2</col>
> </row>
> </PGresult>

How would you filter for a column in XSLT based on column name with this schema? It's certainly not trivial. I have similar code, and I included the column name as an attribute in each column element for this reason. I also used the java.sql type names rather than PostgreSQL ones, as my code is not specific to PostgreSQL.

> i would expect that integration would look something like exposing
> from libpq a function that looks something like:
>
> const char *PGresult_as_xml(PGresult *result, int include_dtd);

Ugh. So it returns the whole thing as one big string? That won't hold up well if your result set is large. A better way would be to pump out SAX events. This is what I did, for three reasons:

1) The escaping becomes trivial, as mentioned above. In fact, not only does SAX escape things correctly, but it makes you explicitly specify that the string you're giving it is character data, an element name, an attribute name, an attribute value, etc., and handles everything properly based on that. So you'd really have to work to screw it up, unlike code that just does something like

    printf("<elem foo='%s' bar='%s'>%s</elem>",
           xml_attr_escape(foo_val),
           xml_attr_escape(bar_val),
           xml_char_escape(elem_val));

where it would be quite easy to lose track of what needs to be escaped how, which variables are already escaped, etc.

2) It can stream large result sets, provided that the next stage supports doing so. A raw SAX serializer certainly would, as would some XSLT stylesheets with Xalan, and STX/Joost is designed for streaming transformations.

3) If the next stage is a transformation, this makes it unnecessary to serialize and parse the data in between. So the SAX way is faster.

You're welcome to take a look at my code. I imagine it will not be directly useful to you, as it is written in Java, but I have a live example which puts this stuff to use. Designing an acceptable API and schema is always much easier when you see how it is put to use.

<http://www.slamb.org/projects/xmldb/> - my (so far poorly-named) xmldb project, which includes the org.slamb.xmldb.ResultSetProducer class to transform a java.sql.ResultSet to SAX events in my resultset schema.
<http://www.slamb.org/svn/repos/projects/xmldb/src/java/org/slamb/xmldb/ResultSetProducer.java> - source code for said class

<http://www.slamb.org/projects/mb/> - a message board which uses this code and some XSLT

<https://www.slamb.org/mb/> - a live example of said message board

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/resultset.xsl> - simple XSLT to take an arbitrary resultset and convert it to an HTML table

<http://www.slamb.org/svn/repos/projects/mb/src/WEB-INF/xsl/main.xsl> - an example XSLT file that inherits this and then provides exceptions for a couple of columns (not displaying the id column, instead including it as a hyperlink in the name column).

Good luck.

Scott Lamb
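(As a footnote to Scott's first point: an escaping helper like the hypothetical xml_attr_escape() in his printf example has to expand the five predefined XML entities. A minimal C sketch, returning a malloc'd string the caller must free:)

#include <stdio.h>
#include <stdlib.h>

/* expand the five predefined XML entities; caller frees the result */
static char *
xml_escape(const char *s)
{
    const char *p;
    char *out, *q;
    size_t len = 0;

    for (p = s; *p; p++)        /* first pass: measure the escaped length */
        len += (*p == '&') ? 5 :
               (*p == '<' || *p == '>') ? 4 :
               (*p == '"' || *p == '\'') ? 6 : 1;

    out = q = malloc(len + 1);
    if (out == NULL)
        return NULL;

    for (p = s; *p; p++)        /* second pass: copy, expanding entities */
        switch (*p)
        {
            case '&':  q += sprintf(q, "&amp;");  break;
            case '<':  q += sprintf(q, "&lt;");   break;
            case '>':  q += sprintf(q, "&gt;");   break;
            case '"':  q += sprintf(q, "&quot;"); break;
            case '\'': q += sprintf(q, "&apos;"); break;
            default:   *q++ = *p;                 break;
        }
    *q = '\0';
    return out;
}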
Scott Lamb wrote:
> On Jan 25, 2004, at 3:07 AM, Brian Moore wrote:
>
>> <PGresult num_rows='1' num_cols='2'>
>> <col_desc num='0' type='int4' format='text' name='foo' />
>> <col_desc num='1' type='int4' format='text' name='bar' />
>> <row num='0'>
>> <col num='0'>1</col>
>> <col num='1'>2</col>
>> </row>
>> </PGresult>
>
> How would you filter for a column in XSLT based on column name with
> this schema? It's certainly not trivial. I have similar code, and I
> included the column name as an attribute in each column element for
> this reason.

Close to trivial if you set up a key on the col_desc elements, I should think. Maybe something like:

<xsl:key name="coldesc" match="col_desc" use="@num" />
...
<xsl:for-each select="col[key('coldesc', @num)/@name = 'colname']">
...

Alternatively you can get there using the parent and preceding-sibling axes, but it's less clear.

cheers

andrew
I tried to build plperl on 7.4.1. On my system,

perl -MConfig -e 'print $Config{ccdlflags}'

returns

-rdynamic -Wl,-rpath,/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE

however the build ends up using

-rpath,$prefix/lib

Dave
--
Dave Cramer
519 939 0336
ICQ # 1467551
I'm curious what a reverse index does on a table with URL-like data, so I did the following:

create function fn_strrev(text) returns text as
  'return reverse($_[0])'
language 'plperl' with (iscachable);

create index r_url_idx on url( fn_strrev(url) );

vacuum analyze;

explain select * from url where url like fn_strrev('%beta12.html');
                        QUERY PLAN
---------------------------------------------------------
 Seq Scan on url  (cost=0.00..13281.70 rows=1 width=454)
   Filter: ((url)::text ~~ 'lmth.21ateb%'::text)

Is it possible to get the planner to use an index scan? How?

The db is using locale 'C'.

--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> create index r_url_idx on url( fn_strrev(url) );

> explain select * from url where url like fn_strrev('%beta12.html');
>                         QUERY PLAN
> ---------------------------------------------------------
>  Seq Scan on url  (cost=0.00..13281.70 rows=1 width=454)
>    Filter: ((url)::text ~~ 'lmth.21ateb%'::text)

> Is it possible to get the planner to use an index scan?

Sure, but not that way. Try "fn_strrev(url) like something". You have to compare the indexed value to something...

			regards, tom lane
same answer

davec=# show enable_seqscan;
 enable_seqscan
----------------
 off
(1 row)

davec=# explain analyze select * from url where fn_strrev(url) like '%beta12.html';
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on url  (cost=100000000.00..100013533.04 rows=503 width=454) (actual time=3851.636..3851.636 rows=0 loops=1)
   Filter: (fn_strrev((url)::text) ~~ '%beta12.html'::text)
 Total runtime: 3851.712 ms
(3 rows)

On Tue, 2004-01-27 at 12:33, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > create index r_url_idx on url( fn_strrev(url) );
>
> > explain select * from url where url like fn_strrev('%beta12.html');
> >                         QUERY PLAN
> > ---------------------------------------------------------
> >  Seq Scan on url  (cost=0.00..13281.70 rows=1 width=454)
> >    Filter: ((url)::text ~~ 'lmth.21ateb%'::text)
>
> > Is it possible to get the planner to use an index scan?
>
> Sure, but not that way. Try "fn_strrev(url) like something".
> You have to compare the indexed value to something...
>
> 			regards, tom lane

--
Dave Cramer
519 939 0336
ICQ # 1467551
On Tue, 27 Jan 2004, Dave Cramer wrote:

> same answer
>
> davec=# show enable_seqscan;
>  enable_seqscan
> ----------------
>  off
> (1 row)
>
> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

That's still an unanchored like clause; besides, I think that would get urls that begin with lmth.21ateb. I think the condition you want would be:

fn_strrev(url) like 'lmth.21ateb%'
On Tue, Jan 27, 2004 at 12:41:41PM -0500, Dave Cramer wrote:

> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

Reverse the constant too:

davec=# explain analyze select * from url where fn_strrev(url) like fn_strrev('%beta12.html');

You won't get an indexscan if you have a % in front of the string.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Some people acquire the bad habit of being unhappy" (M. A. Evans)
Dave Cramer <pg@fastcrypt.com> writes:
> davec=# explain analyze select * from url where fn_strrev(url) like
> '%beta12.html';

Don't you need the % at the right end to have an indexable plan? I suspect that both of your tries so far are actually semantically wrong, and that what you intend is

select * from url where fn_strrev(url) like fn_strrev('%beta12.html');

			regards, tom lane
Tried all the suggestions.

--dc--

davec=# explain analyze select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Seq Scan on url  (cost=100000000.00..100013533.04 rows=503 width=454) (actual time=1416.448..3817.221 rows=12 loops=1)
   Filter: (fn_strrev((url)::text) ~~ 'lmth.21ateb%'::text)
 Total runtime: 3817.315 ms
(3 rows)

davec=# explain analyze select * from url where fn_strrev(url) like 'lmth.21ateb%';
                                                        QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Seq Scan on url  (cost=100000000.00..100013533.04 rows=503 width=454) (actual time=1412.181..3843.998 rows=12 loops=1)
   Filter: (fn_strrev((url)::text) ~~ 'lmth.21ateb%'::text)
 Total runtime: 3844.106 ms
(3 rows)

davec=# explain analyze select * from url where fn_strrev(url) like '%lmth.21ateb';
                                                       QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Seq Scan on url  (cost=100000000.00..100013533.04 rows=503 width=454) (actual time=3853.501..3853.501 rows=0 loops=1)
   Filter: (fn_strrev((url)::text) ~~ '%lmth.21ateb'::text)
 Total runtime: 3853.583 ms
(3 rows)

On Tue, 2004-01-27 at 13:02, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > davec=# explain analyze select * from url where fn_strrev(url) like
> > '%beta12.html';
>
> Don't you need the % at the right end to have an indexable plan?
> I suspect that both of your tries so far are actually semantically
> wrong, and that what you intend is
>
> select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
>
> 			regards, tom lane

--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> Tried all the suggestions.

Mph. It works for me... what PG version are you using exactly, and are you certain you've selected C locale? (Do you get LIKE optimization on plain indexes?)

			regards, tom lane
I'm using 7.4.1; the db was initdb'd with --locale='C', and no, I don't get LIKE optimization on plain indexes ????

Dave

On Tue, 2004-01-27 at 13:28, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > Tried all the suggestions.
>
> Mph. It works for me... what PG version are you using exactly,
> and are you certain you've selected C locale? (Do you get LIKE
> optimization on plain indexes?)
>
> 			regards, tom lane

--
Dave Cramer
519 939 0336
ICQ # 1467551
Dave Cramer <pg@fastcrypt.com> writes:
> I'm using 7.4.1; the db was initdb'd with --locale='C', and no, I
> don't get LIKE optimization on plain indexes ????

Oh? If it's 7.4 then you can confirm the locale selection with "show lc_collate" and "show lc_ctype" (I think the first of these is what the LIKE optimization checks).

			regards, tom lane
Interesting -- it works now, and the good news is it is *WAY* faster. This might be able to speed up marc's doc search by orders of magnitude.

This is searching 100536 rows:

select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
    1.57 ms

explain select * from url where url like '%beta12.html';
    3310.38 ms

Dave

On Tue, 2004-01-27 at 13:48, Tom Lane wrote:
> Dave Cramer <pg@fastcrypt.com> writes:
> > I'm using 7.4.1; the db was initdb'd with --locale='C', and no, I
> > don't get LIKE optimization on plain indexes ????
>
> Oh? If it's 7.4 then you can confirm the locale selection with
> "show lc_collate" and "show lc_ctype" (I think the first of these
> is what the LIKE optimization checks).
>
> 			regards, tom lane

--
Dave Cramer
519 939 0336
ICQ # 1467551
Andrew Dunstan wrote:
> Peter: this looks very nice. What are your intentions with this code?

Once we figure out how to handle the on-the-wire character set recoding when faced with XML documents (see the separate thread a few weeks ago), I would like to finish it.

> Put it in contrib? Also, do you intend to implement the SQL/XML
> functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?

You have to implement these directly in the parser, which I'm not yet excited about.
Peter Eisentraut wrote on Thu, 29.01.2004 at 19:31:
> Andrew Dunstan wrote:
> > Peter: this looks very nice. What are your intentions with this code?
>
> Once we figure out how to handle the on-the-wire character set recoding
> when faced with XML documents (see the separate thread a few weeks ago),
> I would like to finish it.
>
> > Put it in contrib? Also, do you intend to implement the SQL/XML
> > functions XMLElement, XMLForest, XMLAttributes, XMLConcat and XMLAgg?
>
> You have to implement these directly in the parser, which I'm not yet
> excited about.

Why not use some standard parser? libxml2 (www.xmlsoft.org) seems nice, and it is either available as a separate dynamic library or can also be (IIRC) configured to build with just the minimal needed functionality.

---------------
Hannu
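(To illustrate Hannu's suggestion on the output side: with libxml2's xmlwriter API, the library does all entity escaping itself. A rough, untested sketch, using the element names from Brian's schema, with error checking omitted:)

#include <stdio.h>
#include <libxml/xmlwriter.h>

/* emit one row through libxml2's writer, which escapes everything itself */
static void
write_row(xmlTextWriterPtr writer, int rownum, char **values, int ncols)
{
    char buf[32];
    int c;

    xmlTextWriterStartElement(writer, BAD_CAST "row");
    snprintf(buf, sizeof(buf), "%d", rownum);
    xmlTextWriterWriteAttribute(writer, BAD_CAST "num", BAD_CAST buf);

    for (c = 0; c < ncols; c++)
    {
        xmlTextWriterStartElement(writer, BAD_CAST "col");
        snprintf(buf, sizeof(buf), "%d", c);
        xmlTextWriterWriteAttribute(writer, BAD_CAST "num", BAD_CAST buf);
        /* &, <, >, and quotes in values[c] become entities automatically */
        xmlTextWriterWriteString(writer, BAD_CAST values[c]);
        xmlTextWriterEndElement(writer);    /* </col> */
    }
    xmlTextWriterEndElement(writer);        /* </row> */
}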
Dave Cramer wrote:
> Interesting -- it works now, and the good news is it is *WAY* faster.
> This might be able to speed up marc's doc search by orders of magnitude.
>
> This is searching 100536 rows:
>
> select * from url where fn_strrev(url) like fn_strrev('%beta12.html');
>     1.57 ms
>
> explain select * from url where url like '%beta12.html';
>     3310.38 ms

The nice thing about this is that you can create your query thusly:

SELECT * FROM table WHERE column LIKE 'string'
    AND fn_strrev(column) LIKE fn_strrev('string')

and, if you have both a standard index on column and a functional index on fn_strrev(column), the query will be fast (well, as fast as the pattern in question allows) as long as 'string' is anchored on either end.

I've implemented the 'locate' utility in Perl using a PG backend instead of the standard locate database. I internally convert globs given as arguments into LIKE strings, and with a functional index like that the searches are now blazingly fast -- faster than the original 'locate' utility. It has the added advantage that you can specify a file type to further narrow the search (thus 'locate --type file "core"' will find all regular files named 'core' in the database).

I'll be happy to share my code with anyone who's interested.

--
Kevin Brown   kevin@sysexperts.com