Thread: Loose ends in PG XML patch

Loose ends in PG XML patch

From
Tom Lane
Date:
I cleaned up some things I didn't like in the recent XML patch, but
there are several loose ends that I lack the energy to tackle now:

* Isn't mapping XMLSERIALIZE to a cast completely wrong?  Aside from
the issue already noted in the code that it won't reverse-list
correctly, this loses the DOCUMENT-vs-CONTENT flag, which I suppose
must be important.

* Shouldn't the xml type support binary I/O?  Right now it is the only
standard datatype that doesn't.  I have no idea whether there is an
appropriate representation besides text, but if not we could define the
binary representation to be the same as text.

* Reverse-listing of XMLELEMENT and XMLPI is currently wrong because
the name string is not "de-escaped" back to a plain SQL identifier.

* It doesn't look to me like any thought has been given to localization
in xml_ereport() --- there are a ton of strings there that won't get
translated.

* I'm also quite afraid of xml_errmsg remaining non-null when the
storage it points at has been deallocated.  Since this is apparently
only intended as debug support, maybe we could compile it only in debug
builds, to reduce the probability that it will fail in production?
        regards, tom lane


Re: Loose ends in PG XML patch

From
Tom Lane
Date:
I wrote:
> * I'm also quite afraid of xml_errmsg remaining non-null when the
> storage it points at has been deallocated.  Since this is apparently
> only intended as debug support, maybe we could compile it only in debug
> builds, to reduce the probability that it will fail in production?

Actually, there's an easier and cleaner way to do the whole thing:
replace the handwritten management of xml_errmsg with a StringInfo
buffer allocated in TopMemoryContext, so that it never goes away after
having been first created.  I'll go fix that.

What I'm wondering about is why this printout is emitted as a separate
DEBUG message ... wouldn't it be better to incorporate it as the DETAIL
field of the error message?
        regards, tom lane


Re: Loose ends in PG XML patch

From
"Nikolay Samokhvalov"
Date:
On 12/24/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> What I'm wondering about is why this printout is emitted as a separate
> DEBUG message ... wouldn't it be better to incorporate it as the DETAIL
> field of the error message?

Surely, it would. But the thing is that I couldn't manage to format
libxml2's native messages properly..



-- 
Best regards,
Nikolay


Re: Loose ends in PG XML patch

From
Tom Lane
Date:
"Nikolay Samokhvalov" <samokhvalov@gmail.com> writes:
> On 12/24/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'm wondering about is why this printout is emitted as a separate
>> DEBUG message ... wouldn't it be better to incorporate it as the DETAIL
>> field of the error message?

> Surely, it would. But the thing is that I couldn't manage to format
> libxml2's native messages properly..

I'm not really seeing the problem.  Right now I'm getting results like
this:

regression=# \set VERBOSITY verbose
regression=# set client_min_messages TO debug1;
SET
regression=# select '<foo><zit'::xml;
DEBUG:  00000: Entity: line 1: parser error : Couldn't find end of Start Tag zit line 1
<foo><zit        ^
Entity: line 1: parser error : Premature end of data in tag foo line 1
<foo><zit        ^

LOCATION:  xml_ereport_by_code, xml.c:685
ERROR:  2200M: could not parse XML data
DETAIL:  Closing tag not found
LOCATION:  xml_ereport_by_code, xml.c:857
regression=# select '<?xml><foo><zit'::xml;
DEBUG:  00000: dummy.xml:1: parser error : XML declaration allowed only at the start of the document
<?xml><foo><zit    ^
dummy.xml:1: parser error : ParsePI: PI xml space expected
<?xml><foo><zit    ^
dummy.xml:1: parser error : ParsePI: PI xml never end ...
<?xml><foo><zit              ^
dummy.xml:1: parser error : Start tag expected, '<' not found
<?xml><foo><zit              ^

LOCATION:  xml_ereport, xml.c:611
ERROR:  2200M: could not parse XML data
DETAIL:  Start tag expected, '<' not found.
LOCATION:  xml_ereport, xml.c:641
regression=# 

I claim the libxml output is just fine as-is for a DETAIL message, and
we could get rid of all the pushups being done in the two versions of
xml_ereport to generate a detail message that's really insufficiently
detailed anyway.  In particular, in any respectable-size chunk of XML
I think you'd *really* want some kind of error cursor position, which
the libxml output gives you.
        regards, tom lane


Re: Loose ends in PG XML patch

From
Peter Eisentraut
Date:
Am Sonntag, 24. Dezember 2006 02:44 schrieb Tom Lane:
> * Isn't mapping XMLSERIALIZE to a cast completely wrong?  Aside from
> the issue already noted in the code that it won't reverse-list
> correctly, this loses the DOCUMENT-vs-CONTENT flag, which I suppose
> must be important.

It is important, but at the moment it's not so important as to not provide any 
standards-conforming method to obtain a character string from XML at all.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: Loose ends in PG XML patch

From
"Christopher Kings-Lynne"
Date:
> * Shouldn't the xml type support binary I/O?  Right now it is the only
> standard datatype that doesn't.  I have no idea whether there is an
> appropriate representation besides text, but if not we could define the
> binary representation to be the same as text.

There is an effort to develop a binary xml format:

http://www.w3.org/TR/xbc-characterization/