Jim Jones <jim.jones@uni-muenster.de> writes:
> [ v22-0001-Add-pretty-printed-XML-output-option.patch ]
I poked at this for a while and ran into a problem that I'm not sure
how to solve: it misbehaves for input with an embedded DOCTYPE.
regression=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize
--------------
 <!DOCTYPE a>+
 <a></a>     +

(1 row)

regression=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize
--------------

(1 row)
The bad result for CONTENT is because xml_parse() decides to
parse_as_document, but xmlserialize_indent has no idea that happened
and tries to use the content_nodes list anyway. I don't especially
care for the laissez-faire "maybe we'll set *content_nodes and maybe
we won't" API you adopted for xml_parse, which seems to be contributing
to the mess. We could pass back more info so that xmlserialize_indent
knows what really happened. However, that won't fix the bogus output
for the DOCUMENT case. Are we perhaps passing incorrect flags to
xmlSaveToBuffer?
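
To make the "pass back more info" idea concrete, here is a minimal sketch of the API shape, not a proposed patch. Every name in it (ParseMode, ParseResult, toy_parse, toy_serialize_path) is hypothetical and is not xml_parse's actual signature. The DOCTYPE test is a crude prefix check standing in for whatever xml_parse really does to decide on parse_as_document.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; these are not PostgreSQL's actual types. */
typedef enum
{
	PARSE_AS_DOCUMENT,
	PARSE_AS_CONTENT
} ParseMode;

typedef struct ParseResult
{
	ParseMode	mode_used;			/* how the input was really parsed */
	bool		have_content_nodes; /* true only when parsed as CONTENT */
} ParseResult;

/*
 * Toy model of the decision described above: input requested as CONTENT
 * that begins with a DOCTYPE gets parsed as a document anyway.  The
 * result always says which path was taken, so the caller need not guess.
 */
static ParseResult
toy_parse(const char *input, ParseMode requested)
{
	ParseResult res;

	assert(input != NULL);

	if (requested == PARSE_AS_CONTENT &&
		strncmp(input, "<!DOCTYPE", 9) != 0)
	{
		res.mode_used = PARSE_AS_CONTENT;
		res.have_content_nodes = true;
	}
	else
	{
		res.mode_used = PARSE_AS_DOCUMENT;
		res.have_content_nodes = false;
	}
	return res;
}

/*
 * The caller picks its serialization path from what actually happened,
 * not from what it asked for.
 */
static const char *
toy_serialize_path(const char *input, ParseMode requested)
{
	ParseResult res = toy_parse(input, requested);

	return res.have_content_nodes ? "node-list" : "whole-document";
}
```

With something of this shape, the CONTENT call on '<!DOCTYPE a><a/>' would take the whole-document path instead of walking a node list that was never filled in. It does nothing for the DOCUMENT-case output, which is a separate question.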
regards, tom lane