Thread: XML output for psql

XML output for psql

From
greg@turnstep.com
Date:

Patch to add XML output to psql:

http://www.gtsm.com/xml.patch.txt

Notes and questions:

The basic output looks something like this:

<?xml version="1.0" encoding="SQL_ASCII"?>
<resultset psql_version="7.4devel" query="select * from foo;">

<columns>
  <col num="1">a</col>
  <col num="2">b</col>
  <col num="3">c</col>
  <col num="4">mucho nacho  </col>
</columns>
<row num="1">
  <a>1</a>
  <b>pizza</b>
  <c>2003-02-25 15:19:22.169797</c>
  <"mucho nacho  "></"mucho nacho  ">
</row>
<row num="2">
  <a>2</a>
  <b>mushroom</b>
  <c>2003-02-25 15:19:26.969415</c>
  <"mucho nacho  "></"mucho nacho  ">
</row>
<footer>(2 rows)</footer>
</resultset>

and with the \x option:

<?xml version="1.0" encoding="SQL_ASCII"?>
<resultset psql_version="7.4devel" query="select * from foo;">

<columns>
  <col num="1">a</col>
  <col num="2">b</col>
  <col num="3">c</col>
  <col num="4">mucho nacho  </col>
</columns>
<row num="1">
  <cell name="a">1</cell>
  <cell name="b">pizza</cell>
  <cell name="c">2003-02-25 15:19:22.169797</cell>
  <cell name="mucho nacho  "></cell>
</row>
<row num="2">
  <cell name="a">2</cell>
  <cell name="b">mushroom</cell>
  <cell name="c">2003-02-25 15:19:26.969415</cell>
  <cell name="mucho nacho  "></cell>
</row>
</resultset>


The default encoding "SQL_ASCII" is not valid for XML.
Should it be automatically changed to something else?

The flag "-X" is already taken, unfortunately, although \X is not.
I used "-L" and "\L" but they are not as memorable as "X". Anyone
see a way around this? Can we still use \X inside of psql?


It would be nice to include the string representation of the column
types in the xml output:
<col type="int8">foo</col>
...but I could not find an easy way to do this: PQftype returns the
OID only (which is close but not quite there). Is there an
existing way to get the name of the type of a column from a
PGresult?

The HTML, XML, and LaTeX modes should have better documentation;
I'll submit a separate doc patch when/if this gets finalized.


- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302261518




Re: XML output for psql

From
Hannu Krosing
Date:
greg@turnstep.com kirjutas K, 26.02.2003 kell 22:46:

>
> and with the \x option:
>
> <?xml version="1.0" encoding="SQL_ASCII"?>
> <resultset psql_version="7.4devel" query="select * from foo;">
>
> <columns>
>   <col num="1">a</col>
>   <col num="2">b</col>
>   <col num="3">c</col>
>   <col num="4">mucho nacho  </col>
> </columns>
> <row num="1">
>   <cell name="a">1</cell>
>   <cell name="b">pizza</cell>
>   <cell name="c">2003-02-25 15:19:22.169797</cell>
>   <cell name="mucho nacho  "></cell>
> </row>
> <row num="2">
>   <cell name="a">2</cell>
>   <cell name="b">mushroom</cell>
>   <cell name="c">2003-02-25 15:19:26.969415</cell>
>   <cell name="mucho nacho  "></cell>
> </row>
> </resultset>
>
>
> The default encoding "SQL_ASCII" is not valid for XML.
> Should it be automatically changed to something else?

I think you should force conversion to something standard; try using
automatic conversion to some known client encoding.

Btw, "UNICODE" is not a recognized encoding name in XML either, but
PostgreSQL uses it to mean UTF-8.

> The flag "-X" is already taken, unfortunately, although \X is not.
> I used "-L" and "\L" but they are not as memorable as "X". Anyone
> see a way around this? Can we still use \X inside of psql?
>
>
> It would be nice to include the string representation of the column
> types in the xml output:
> <col type="int8">foo</col>
> ...but I could not find an easy way to do this: PQftype returns the
> OID only (which is close but not quite there). Is there an
> existing way to get the name of the type of a column from a
> PGresult?

Run "select oid, typname from pg_type;" first when in XML mode and
store the oid/typname pairs.

You could also store the result in ~/.psql for faster access later on
and manually clear it if new types are defined.
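
A minimal sketch of the lookup described above, assuming the rows of
"select oid, typname from pg_type" are available through some callable.
The class name TypeNameCache and the stand-in function
fake_pg_type_rows are hypothetical and not part of psql or libpq; the
sample rows are only a tiny subset of what pg_type would return.

```python
from typing import Callable, Dict, Iterable, Tuple

Row = Tuple[int, str]  # (type OID, type name)


class TypeNameCache:
    """Maps type OIDs (what PQftype returns) to type names."""

    def __init__(self, fetch_rows: Callable[[], Iterable[Row]]) -> None:
        # fetch_rows stands in for running the pg_type query (an assumption).
        self._fetch_rows = fetch_rows
        self._names: Dict[int, str] = {}

    def refresh(self) -> None:
        """Reload every oid/typname pair from the source."""
        self._names = {oid: name for oid, name in self._fetch_rows()}

    def lookup(self, oid: int) -> str:
        """Return the type name, reloading once if the OID is not cached."""
        if oid not in self._names:
            self.refresh()
        return self._names.get(oid, "unknown")


def fake_pg_type_rows() -> Iterable[Row]:
    # Hypothetical stand-in for the query result.
    return [(20, "int8"), (23, "int4"), (25, "text"), (1114, "timestamp")]


cache = TypeNameCache(fake_pg_type_rows)
print(cache.lookup(20))
```

Reloading on a miss picks up newly created types, but it does not
notice a type that was dropped or renamed after the cache was filled.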

----------------
Hannu


Re: XML output for psql

From
Peter Eisentraut
Date:
greg@turnstep.com writes:

> Patch to add XML output to psql:

This would get me more excited if you do one or both of the following:

1. Look into the SQL/XML standard draft (ftp.sqlstandards.org) to find out
whether the standard addresses this sort of thing.

2. Use an established/standardized XML (or SGML) table model rather than
rolling your own.

Incidentally, the HTML table model is such an established and standardized
XML and SGML table model, so the easiest way to get the task "add XML
output to psql" done is to update the HTML output to conform to XHTML.
That way you get both the strict XML and you can look at the formatted
result with any old (er, new) browser.
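
A rough sketch of what an XHTML rendering of a result set could look
like, written with Python's standard xml.etree.ElementTree rather than
psql's actual HTML printing code. The function name xhtml_table and the
sample data are assumptions for illustration, and the output is only a
table fragment, not a complete XHTML document.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def xhtml_table(columns, rows):
    """Render column names and rows as a well-formed XHTML <table>."""
    table = ET.Element("table", {"xmlns": XHTML_NS})
    header = ET.SubElement(table, "tr")
    for name in columns:
        ET.SubElement(header, "th").text = name
    for row in rows:
        tr = ET.SubElement(table, "tr")
        for value in row:
            # NULLs become empty cells; ElementTree escapes &, < and >.
            ET.SubElement(tr, "td").text = "" if value is None else str(value)
    return ET.tostring(table, encoding="unicode")


print(xhtml_table(["a", "b"], [(1, "pizza"), (2, "mushroom")]))
```

Because the column names live in <th> text rather than in element
names, names containing spaces need no special treatment here.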

--
Peter Eisentraut   peter_e@gmx.net


Re: XML output for psql

From
greg@turnstep.com
Date:

Hannu Krosing wrote:
> I think you should force conversion to something standard, try using
> automatic conversion to some known client encoding.

I've thought about this some more, and the only thing I can think
of doing without being too heavy-handed is to change the encoding
to "US-ASCII" whenever someone enters "XML" mode while the encoding is set
to "SQL_ASCII", perhaps with a warning.
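
A small sketch of that substitution, assuming a lookup table from
PostgreSQL encoding names to names usable in an XML declaration. The
function names and the table are hypothetical; the table lists only two
entries and is not a complete mapping.

```python
import warnings

# Assumed mapping from PostgreSQL encoding names to XML encoding names.
XML_ENCODING_NAMES = {
    "UNICODE": "UTF-8",
    "LATIN1": "ISO-8859-1",
}


def xml_encoding_for(pg_encoding: str) -> str:
    """Pick the name to put in the XML declaration for a client encoding."""
    name = pg_encoding.upper()
    if name == "SQL_ASCII":
        # The fallback discussed in this thread, with a warning.
        warnings.warn("SQL_ASCII is not an XML encoding name; declaring US-ASCII")
        return "US-ASCII"
    # Names without an entry are passed through unchanged.
    return XML_ENCODING_NAMES.get(name, pg_encoding)


def xml_declaration(pg_encoding: str) -> str:
    return '<?xml version="1.0" encoding="%s"?>' % xml_encoding_for(pg_encoding)


print(xml_declaration("UNICODE"))
```

Declaring US-ASCII changes only the label; data containing bytes
outside the ASCII range would still have to be converted or rejected.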

"The character set most commonly use in the Internet and used especially in
protocol standards is US-ASCII, this is strongly encouraged."
http://www.iana.org/assignments/character-sets

On the other hand, SQLX seems to lean toward a strict unicode encoding
(see my reply to Peter Eisentraut for more on that).

> Run "select oid, typname from pg_type;" first when in XML mode and
> store the oid/typname pairs.

I realize that I could run a SQL query against pg_type to grab the info,
but I was hoping there was an internal function similar to PQftype which
would return the information.

> you could also store the result in ~/.psql for faster access
> later on and manually clear it if new types are defined

Not only does pg_type have literally hundreds of entries, but there is no
way to guarantee that a stored copy is still correct at the time the query is
run, so I don't think this is viable.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302280938




Re: XML output for psql

From
greg@turnstep.com
Date:

Peter Eisentraut wrote:
> 1. Look into the SQL/XML standard draft (ftp.sqlstandards.org) to find out
> whether the standard addresses this sort of thing.

The URL you gave leads to a site curiously content-free and full of dead links.
I've looked around a bit, but found nothing definitive. One good resource I
did find was this:

http://www.wiscorp.com/sql/SQLX_Bringing_SQL_and_XML_Together.pdf

The article mentions a lot of links on the sqlstandards.org and iso.org sites, all
of which are either dead or restricted. If anyone knows of some good links
(especially for ISO 9075), please let me know. From what I've read of the SQLX
stuff, the format in my patch should be mostly standard:

<row>
 <name>Joe Sixpack</name>
 <age>35</age>
 <state>Alabama</state>
</row>

One problem is that the recommended way to handle non-standard characters
(including spaces) is to escape them like this:

foobar baz => <foobar_x0020_baz>

This also includes escaping things like "_x*" and "xml*". We don't have
anything like that in the code yet (?), but we should probably think about
heading that way. I think wrapping names that contain whitespace in quotes
is good enough for now:

foobar baz => <"foobar baz">
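
A simplified sketch of the "_xHHHH_" escaping described above. It is
not taken from the patch or from the standard's text: it treats only
ASCII letters, digits, "_", "." and "-" as safe, escapes everything
else (including non-ASCII letters that a full implementation would
keep), and the function name escape_identifier is hypothetical.

```python
import re

# Simplification: only these ASCII characters are left unescaped.
_NAME_START = re.compile(r"[A-Za-z_]")
_NAME_CHAR = re.compile(r"[A-Za-z0-9_.\-]")


def _hex(ch: str) -> str:
    """Encode one character as _xHHHH_ using its code point."""
    return "_x%04X_" % ord(ch)


def escape_identifier(name: str) -> str:
    """Turn a column name into something usable as an XML element name."""
    out = []
    for i, ch in enumerate(name):
        if ch == "_" and name[i + 1:i + 2] == "x":
            # A literal "_x" would look like an escape, so escape the "_".
            out.append("_x005F_")
        elif i == 0 and name[:3].lower() == "xml":
            # Names starting with "xml" are reserved; escape the first letter.
            out.append(_hex(ch))
        elif i == 0 and not _NAME_START.match(ch):
            out.append(_hex(ch))
        elif not _NAME_CHAR.match(ch):
            out.append(_hex(ch))
        else:
            out.append(ch)
    return "".join(out)


print(escape_identifier("foobar baz"))  # foobar_x0020_baz
```

With this, the column "mucho nacho  " from the sample output would
become a legal element name instead of a quoted string inside a tag.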

The xsd and xsi standards are also interesting, but needlessly complicated
for psql output, IMO.

> Incidentally, the HTML table model is such an established and standardized
> XML and SGML table model, so the easiest way to get the task "add XML
> output to psql" done is to update the HTML output to conform to XHTML.
> That way you get both the strict XML and you can look at the formatted
> result with any old (er, new) browser.

I don't agree with this: XML and XHTML are two different things. We could
certainly upgrade the HTML portion, but I am pretty sure that the XML
standard calls for this format:

<columnname>data here</columnname>

...which is not valid XHTML and won't be viewable by any browser. The other
suggested XML formats are even further from XHTML than the above. The HTML
format should be "html table/layout" specific and the XML should be
"schema/data" specific.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200302280938




Re: XML output for psql

From
Bruce Momjian
Date:
Greg, do you have a newer patch to address the feedback you received, or
is this one good?


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: XML output for psql

From
greg@turnstep.com
Date:

> Greg, do you have a newer patch to address the feedback you received, or
> is this one good?

I have a newer patch, but I am not 100% sure a consensus was reached. I recall
the thread veering into talk of XML on the backend, but don't recall if anyone
still had strong objections to a quick psql wrapper. If not, I will clean up
the existing patch and resubmit tomorrow.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200303171641




Re: XML output for psql

From
Bruce Momjian
Date:
I like the idea of doing XML in psql --- it seems like a natural place
for it.


Re: XML output for psql

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I like the idea of doing XML in psql --- it seems like a natural place
> for it.

Not really; what of applications other than shell scripts that would
like to get XML-formatted output?

There was some talk in the FE/BE protocol thread of adding hooks to
support more than one output format from the backend.  Much of the
infrastructure already exists (see DestReceiver in the backend); we
just need an agreement on the protocol.  On the whole I'd rather see
it done that way than burying the logic in psql.

            regards, tom lane

Re: XML output for psql

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I like the idea of doing XML in psql --- it seems like a natural place
> > for it.
>
> Not really; what of applications other than shell scripts that would
> like to get XML-formatted output?
>
> There was some talk in the FE/BE protocol thread of adding hooks to
> support more than one output format from the backend.  Much of the
> infrastructure already exists (see DestReceiver in the backend); we
> just need an agreement on the protocol.  On the whole I'd rather see
> it done that way than burying the logic in psql.

Well, programs can run psql using popen.  It seems overkill to get the
protocol involved, especially since it is output-only.  I can't imagine
who would bother with the wire protocol messiness just to get xml.
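
For illustration, a sketch of what running psql from another program
might look like, here with Python's subprocess module instead of C's
popen. Since the XML flag from the patch is not settled, the sketch
uses psql's existing -H (HTML) output; the helper names and the
database name "test" are hypothetical.

```python
import subprocess
from typing import List


def psql_html_command(dbname: str, query: str) -> List[str]:
    """Build the argument list for a one-shot psql call with HTML output."""
    # -X: do not read psqlrc, -H: HTML output, -c: run one command and exit.
    return ["psql", "-X", "-H", "-d", dbname, "-c", query]


def run_psql_html(dbname: str, query: str) -> str:
    """Run psql as a child process and return what it printed."""
    # Requires a psql binary on PATH and permission to spawn processes.
    result = subprocess.run(
        psql_html_command(dbname, query),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


print(psql_html_command("test", "select * from foo;"))
```

This only works where the program is allowed to start child processes
and has a psql binary available.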


Re: XML output for psql

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> Not really; what of applications other than shell scripts that would
>> like to get XML-formatted output?

> Well, programs can run psql using popen.  It seems overkill to get the
> protocol involved, especially since it is output-only.  I can't imagine
> who would bother with the wire protocol messiness just to get xml.

Having to popen a psql isn't overkill?  This seems like a far messier
solution than the other.  Furthermore, it's just plain not an available
solution in many scenarios (think of a Java program running JDBC; it may
not have privileges to do popen, and may not have access to a copy of
psql anyway).

If we were not already opening up the protocol for changes, I'd be
resistant to the idea too.  But since we are, I think it should be fixed
where it's cleanest to fix it.

            regards, tom lane

Re: XML output for psql

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> Not really; what of applications other than shell scripts that would
> >> like to get XML-formatted output?
>
> > Well, programs can run psql using popen.  It seems overkill to get the
> > protocol involved, especially since it is output-only.  I can't imagine
> > who would bother with the wire protocol messiness just to get xml.
>
> Having to popen a psql isn't overkill?  This seems like a far messier
> solution than the other.  Furthermore, it's just plain not an available
> solution in many scenarios (think of a Java program running JDBC; it may
> not have privileges to do popen, and may not have access to a copy of
> psql anyway).
>
> If we were not already opening up the protocol for changes, I'd be
> resistant to the idea too.  But since we are, I think it should be fixed
> where it's cleanest to fix it.

What would be interesting would be to enable libpq to dump XML, and have
psql use that.  Why put XML capability in the backend?  Of course, that
doesn't help jdbc.  How do you propose the backend would do XML?


Re: XML output for psql

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> What would be interesting would be to enable libpq to dump XML, and have
> psql use that.

... or in the backend so libpq could use it, and thence psql.

>  Why put XML capability in the backend?

So that non-libpq-based clients could use it.

            regards, tom lane

Re: XML output for psql

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What would be interesting would be to enable libpq to dump XML, and have
> > psql use that.
>
> ... or in the backend so libpq could use it, and thence psql.
>
> >  Why put XML capability in the backend?
>
> So that non-libpq-based clients could use it.

OK, I have two ideas here.  First, can we create a function that takes a
query result and returns one big XML string?  I am not sure how to pump
a result into a function.  The other downside is that we would have to
construct the entire result string in memory.

The other idea I had was a GUC variable that returned all query results
as one big XML string.  That would prevent creating the entire string in
backend memory, and might enable cursor fetches through the XML string.
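
The memory trade-off between the two ideas can be sketched outside the
backend. Below, a generator emits the XML piece by piece, while a
wrapper joins the pieces into one big string. This is plain Python, not
backend code; the function names are hypothetical, and the <cell>
layout is borrowed from the \x output shown at the start of the thread.

```python
from typing import Iterable, Iterator, Optional, Sequence
from xml.sax.saxutils import escape, quoteattr


def xml_chunks(columns: Sequence[str],
               rows: Iterable[Sequence[Optional[object]]]) -> Iterator[str]:
    """Yield the XML one piece at a time, so no full copy is ever held."""
    yield "<resultset>\n"
    for num, row in enumerate(rows, start=1):
        yield '<row num="%d">\n' % num
        for name, value in zip(columns, row):
            # NULLs become empty cells; &, < and > in the data are escaped.
            text = "" if value is None else escape(str(value))
            yield "  <cell name=%s>%s</cell>\n" % (quoteattr(name), text)
        yield "</row>\n"
    yield "</resultset>\n"


def xml_string(columns, rows) -> str:
    """The 'one big string' variant: the whole document sits in memory."""
    return "".join(xml_chunks(columns, rows))


print(xml_string(["a", "b"], [(1, "pizza"), (2, "mushroom")]))
```

A caller that writes each chunk out as it arrives needs memory for one
row at a time, which is the property the second idea is after.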


Re: XML output for psql

From
Bruce Momjian
Date:
I assume we are not moving in the XML/psql direction, right?  We want it
in the backend, or the psql HTML converted to XHTML?


Re: XML output for psql

From
greg@turnstep.com
Date:

> I assume we are not moving in the XML/psql direction, right?  We want it
> in the backend, or the psql HTML converted to XHTML?

I don't think a consensus was ever reached. It would certainly be better if
this was done on the backend, but that seems to be a long time away, and
some have argued that it is not the job of the engine to do this anyway.

I agree that at the very least we should update the HTML output for psql: I'll
try to make a patch for that this weekend.

I still think we should at least have a rudimentary xml output option inside
of psql. It won't be perfect, but we can certainly have the flag toggle
a backend variable when/if the backend supports XML directly.


- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 200305301452



Re: XML output for psql

From
Sean Chittenden
Date:
> > I assume we are not moving in the XML/psql direction, right?  We
> > want it in the backend, or the psql HTML converted to XHTML?
>
> I don't think a consensus was ever reached. It would certainly be
> better if this was done on the backend, but that seems to be a long
> time away, and some have argued that it is not the job of the engine
> to do this anyway.

Few points for the archives regarding XML and databases (spent 9mo
working on this kinda stuff during the .com days):

*) Use libxml2.  MIT Licensed, most complete opensource XML
   implementation available, and fast.  See the XML benchmarks on
   sf.net for details.  To avoid library naming conflicts, the library
   should likely be renamed to pgxml.so and imported into the src
   tree.  Mention java in this context and risk being clubbed to death.

*) There should be two storage formats for XML data:

   a) DOM-esque storage: broken down xmlNodes.  This is necessary for
      indexing specific places in documents (ala XPath queries).
      Actual datums on the disk should be similar in structure to the
      xmlNode struct found in libxml2 (would help with the
      serialization in either direction).  In-database XSLT
      transformations are also possible with the data stored this way.

   b) SAX-esque storage: basically a single BYTEA/TEXT column.  Not
      all documents need to be indexed/searchable and SAX processing
      of data is generally more efficient if you don't know what
      you're looking for.  This format is the low hanging fruit
      though.
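
The difference in access patterns between the two styles can be shown
with Python's standard xml.dom.minidom and xml.sax modules, used here
in place of libxml2. This says nothing about on-disk layout; the
document and the helper names are made up for illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = "<row><name>Joe Sixpack</name><age>35</age><state>Alabama</state></row>"


def dom_lookup(doc: str, tag: str) -> str:
    """DOM style: build the whole tree, then jump straight to one node."""
    tree = xml.dom.minidom.parseString(doc)
    node = tree.getElementsByTagName(tag)[0]
    return node.firstChild.data


class TagCounter(xml.sax.ContentHandler):
    """SAX style: see each element once as it streams past, keep no tree."""

    def __init__(self) -> None:
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1


def sax_count(doc: str) -> int:
    handler = TagCounter()
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return handler.count


print(dom_lookup(DOC, "age"), sax_count(DOC))
```

The DOM call pays for building every node up front, which is what makes
addressing a specific place possible; the SAX pass keeps only a counter.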

-sc

--
Sean Chittenden

Re: XML output for psql

From
Hannu Krosing
Date:
Sean Chittenden kirjutas R, 30.05.2003 kell 23:20:
> > > I assume we are not moving in the XML/psql direction, right?  We
> > > want it in the backend, or the psql HTML converted to XHTML?
> >
> > I don't think a consensus was ever reached. It would certainly be
> > better if this was done on the backend, but that seems to be a long
> > time away, and some have argued that it is not the job of the engine
> > to do this anyway.
>
> Few points for the archives regarding XML and databases (spent 9mo
> working on this kinda stuff during the .com days):
>
> *) Use libxml2.  MIT Licensed, most complete opensource XML
>    implementation available, and fast.  See the XML benchmarks on
>    sf.net for details.  To avoid library naming conflicts, the library
>    should likely be renamed to pgxml.so and imported into the src
>    tree.  Mention java in this context and risk being clubbed to death.

Agree completely on all points ;)

> *) There should be two storage formats for XML data:
>
>    a) DOM-esque storage: broken down xmlNodes.  This is necessary for
>       indexing specific places in documents (ala XPath queries).
>       Actual datums on the disk should be similar in structure to the
>       xmlNode struct found in libxml2 (would help with the
>       serialization in either direction).  In-database XSLT
>       transformations are also possible with the data stored this way.
>
>    b) SAX-esque storage: basically a single BYTEA/TEXT column.  Not
>       all documents need to be indexed/searchable and SAX processing
>       of data is generally more efficient if you don't know what
>       you're looking for.  This format is the low hanging fruit
>       though.

I think that Oleg and Todor very recently proposed something that could
use b) and still provide indexed access.

The most flexible approach would be some way to define how much of a tree
is kept together: one xmlNode per tuple would probably be too much overhead
for most operations, and one whole XML file per tuple would be too, just
for other ops ;)

--------------
Hannu