Re: [PATCH] Add pretty-printed XML output option - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: [PATCH] Add pretty-printed XML output option
Date	March 14, 2023 20:40:25
Msg-id	2752578.1678815625@sss.pgh.pa.us
In response to	Re: [PATCH] Add pretty-printed XML output option (Jim Jones <jim.jones@uni-muenster.de>)
Responses	Re: [PATCH] Add pretty-printed XML output option (Jim Jones <jim.jones@uni-muenster.de>)
List	pgsql-hackers

Jim Jones <jim.jones@uni-muenster.de> writes:
> [ v22-0001-Add-pretty-printed-XML-output-option.patch ]

I poked at this for a while and ran into a problem that I'm not sure
how to solve: it misbehaves for input with embedded DOCTYPE.

regression=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize 
--------------
 <!DOCTYPE a>+
 <a></a>     +
 
(1 row)

regression=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize 
--------------
 
(1 row)

The bad result for CONTENT is because xml_parse() decides to
parse_as_document, but xmlserialize_indent has no idea that happened
and tries to use the content_nodes list anyway.  I don't especially
care for the laissez-faire "maybe we'll set *content_nodes and maybe
we won't" API you adopted for xml_parse, which seems to be contributing
to the mess.  We could pass back more info so that xmlserialize_indent
knows what really happened.  However, that won't fix the bogus output
for the DOCUMENT case.  Are we perhaps passing incorrect flags to
xmlSaveToBuffer?
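
To illustrate the "pass back more info" idea, here is a minimal
self-contained sketch.  Every name in it (SketchXmlOption,
SketchParseResult, sketch_parse, sketch_serialize_path) is hypothetical
and is not the actual xml.c API, and the DOCTYPE prefix check is only a
stand-in for whatever test xml_parse() really applies.  The point is
just that the parser reports which mode it ended up using, so the
serializer does not have to guess whether the node list was filled in.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical types: an illustration only, not PostgreSQL's API. */
typedef enum { SKETCH_DOCUMENT, SKETCH_CONTENT } SketchXmlOption;

typedef struct
{
    bool parsed_as_document;   /* what the parser actually did */
    bool has_content_nodes;    /* true only when a node list was built */
} SketchParseResult;

/*
 * Stand-in for the parser's decision.  Assumption: input that starts
 * with a DOCTYPE is parsed as a document even when CONTENT was asked
 * for.  The real check is more involved than a prefix comparison.
 */
static SketchParseResult
sketch_parse(const char *input, SketchXmlOption requested)
{
    SketchParseResult res;
    bool looks_like_document;

    assert(input != NULL);

    looks_like_document = (strncmp(input, "<!DOCTYPE", 9) == 0);
    res.parsed_as_document =
        (requested == SKETCH_DOCUMENT) || looks_like_document;
    res.has_content_nodes = !res.parsed_as_document;
    return res;
}

/*
 * The caller chooses its serialization path from what the parser
 * reports, not from the option the user originally requested.
 */
static const char *
sketch_serialize_path(const char *input, SketchXmlOption requested)
{
    SketchParseResult res = sketch_parse(input, requested);

    return res.parsed_as_document ? "document" : "content-nodes";
}
```

With something of this shape, the CONTENT '<!DOCTYPE a><a/>' case above
would take the document path and not walk an empty node list.  It does
nothing for the DOCUMENT-case output, which is a separate question.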

            regards, tom lane
