Thread: Loose ends in PG XML patch
I cleaned up some things I didn't like in the recent XML patch, but there are several loose ends that I lack the energy to tackle now: * Isn't mapping XMLSERIALIZE to a cast completely wrong? Aside from the issue already noted in the code that it won't reverse-list correctly, this loses the DOCUMENT-vs-CONTENT flag, which I suppose must be important. * Shouldn't the xml type support binary I/O? Right now it is the only standard datatype that doesn't. I have no idea whether there is an appropriate representation besides text, but if not we could define the binary representation to be the same as text. * Reverse-listing of XMLELEMENT and XMLPI is currently wrong because the name string is not "de-escaped" back to a plain SQL identifier. * It doesn't look to me like any thought has been given to localization in xml_ereport() --- there are a ton of strings there that won't get translated. * I'm also quite afraid of xml_errmsg remaining non-null when the storage it points at has been deallocated. Since this is apparently only intended as debug support, maybe we could compile it only in debug builds, to reduce the probability that it will fail in production? regards, tom lane
I wrote: > * I'm also quite afraid of xml_errmsg remaining non-null when the > storage it points at has been deallocated. Since this is apparently > only intended as debug support, maybe we could compile it only in debug > builds, to reduce the probability that it will fail in production? Actually, there's an easier and cleaner way to do the whole thing: replace the handwritten management of xml_errmsg with a StringInfo buffer allocated in TopMemoryContext, so that it never goes away after having been first created. I'll go fix that. What I'm wondering about is why this printout is emitted as a separate DEBUG message ... wouldn't it be better to incorporate it as the DETAIL field of the error message? regards, tom lane
On 12/24/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: > What I'm wondering about is why this printout is emitted as a separate > DEBUG message ... wouldn't it be better to incorporate it as the DETAIL > field of the error message? Surely, it would. But the thing is that I couldn't manage to format libxml2's native messages properly.. -- Best regards, Nikolay
"Nikolay Samokhvalov" <samokhvalov@gmail.com> writes: > On 12/24/06, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> What I'm wondering about is why this printout is emitted as a separate >> DEBUG message ... wouldn't it be better to incorporate it as the DETAIL >> field of the error message? > Surely, it would. But the thing is that I couldn't manage to format > libxml2's native messages properly.. I'm not really seeing the problem. Right now I'm getting results like this: regression=# \set VERBOSITY verbose regression=# set client_min_messages TO debug1; SET regression=# select '<foo><zit'::xml; DEBUG: 00000: Entity: line 1: parser error : Couldn't find end of Start Tag zit line 1 <foo><zit ^ Entity: line 1: parser error : Premature end of data in tag foo line 1 <foo><zit ^ LOCATION: xml_ereport_by_code, xml.c:685 ERROR: 2200M: could not parse XML data DETAIL: Closing tag not found LOCATION: xml_ereport_by_code, xml.c:857 regression=# select '<?xml><foo><zit'::xml; DEBUG: 00000: dummy.xml:1: parser error : XML declaration allowed only at the start of the document <?xml><foo><zit ^ dummy.xml:1: parser error : ParsePI: PI xml space expected <?xml><foo><zit ^ dummy.xml:1: parser error : ParsePI: PI xml never end ... <?xml><foo><zit ^ dummy.xml:1: parser error : Start tag expected, '<' not found <?xml><foo><zit ^ LOCATION: xml_ereport, xml.c:611 ERROR: 2200M: could not parse XML data DETAIL: Start tag expected, '<' not found. LOCATION: xml_ereport, xml.c:641 regression=# I claim the libxml output is just fine as-is for a DETAIL message, and we could get rid of all the pushups being done in the two versions of xml_ereport to generate a detail message that's really insufficiently detailed anyway. In particular, in any respectable-size chunk of XML I think you'd *really* want some kind of error cursor position, which the libxml output gives you. regards, tom lane
Am Sonntag, 24. Dezember 2006 02:44 schrieb Tom Lane: > * Isn't mapping XMLSERIALIZE to a cast completely wrong? Aside from > the issue already noted in the code that it won't reverse-list > correctly, this loses the DOCUMENT-vs-CONTENT flag, which I suppose > must be important. It is important, but at the moment it's not so important as to not provide any standards-conforming method to obtain a character string from XML at all. -- Peter Eisentraut http://developer.postgresql.org/~petere/
> * Shouldn't the xml type support binary I/O? Right now it is the only > standard datatype that doesn't. I have no idea whether there is an > appropriate representation besides text, but if not we could define the > binary representation to be the same as text. There is an effort to develop a binary xml format: http://www.w3.org/TR/xbc-characterization/