Thread: Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes

From
Jim Jones
Date:

On 28.08.24 10:19, Jim Jones wrote:
> Hi,
>
> While testing a feature reported by Pavel in this thread[1] I realized
> that elements with whitespace between them won't be indented by
> XMLSERIALIZE( ... INDENT)
>

Hmm... xmlDocContentDumpOutput seems to add a trailing newline at the
end of a document by default, so serializing the same XML string with
DOCUMENT and CONTENT gives different results:

-- PostgreSQL 16

SELECT xmlserialize(CONTENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize   
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>
(1 row)

SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize   
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>         +
 
(1 row)


I do recall a discussion along these lines some time ago, but I just
can't find it now. Does anyone know if this is the expected behaviour?
Or should we in this case consider something like this in
xmltotext_with_options()?

    result = cstring_to_text_with_len((const char *) xmlBufferContent(buf),
                                      xmlBufferLength(buf) - 1);

-- 
Jim




Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes

From
Tom Lane
Date:
Jim Jones <jim.jones@uni-muenster.de> writes:
> Hmm... xmlDocContentDumpOutput seems to add a trailing newline at the
> end of a document by default, so serializing the same XML string with
> DOCUMENT and CONTENT gives different results:

Does seem a bit inconsistent.

> Or should we in this case consider something like this in
> xmltotext_with_options()?
> result = cstring_to_text_with_len((const char *) xmlBufferContent(buf),
> xmlBufferLength(buf) - 1);

I think it'd be quite foolish to assume that every extant and future
version of libxml2 will share this glitch.  Probably should use
logic more like pg_strip_crlf(), although we can't use that directly.

Would it ever be the case that trailing whitespace would be valid
data?  In a bit of testing, it seems like that could be true in
CONTENT mode but not DOCUMENT mode.
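
To illustrate the pg_strip_crlf()-style idea: rather than assuming exactly
one trailing byte, one could count back over however many trailing newline
characters are actually present. The following is only a minimal sketch; the
helper name len_without_trailing_newlines() is hypothetical and not part of
PostgreSQL or libxml2, and it is not necessarily what was committed.

```c
#include <stddef.h>

/*
 * Hypothetical helper: return the length of buf[0..len) with any trailing
 * '\n' or '\r' bytes excluded.  The buffer is only read, never modified,
 * and the result is correct whether zero, one, or several such bytes exist.
 */
static size_t
len_without_trailing_newlines(const char *buf, size_t len)
{
    while (len > 0 && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
        len--;
    return len;
}
```

In xmltotext_with_options() the reduced length could presumably be passed to
cstring_to_text_with_len() in place of "xmlBufferLength(buf) - 1". Given the
observation that trailing whitespace may be valid data in CONTENT mode, such
stripping would likely be applied only to DOCUMENT output.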

            regards, tom lane



Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes

From
Tom Lane
Date:
Jim Jones <jim.jones@uni-muenster.de> writes:
> [ xmlserialize patches ]

Pushed with minor editorialization.  Notably, I got rid of scribbling
on xmlBufferContent's buffer --- I don't know how likely that is to
upset libxml2, but it seems like a fairly bad idea given that they
declare the result as "const xmlChar*".  Casting away the const is
poor form in any case.
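
As a rough illustration of the point about not scribbling on a const buffer:
instead of casting away const and writing a terminator into the library's
memory, the trailing newlines can be dropped while copying. This is a sketch
under assumptions, not the committed code; copy_without_trailing_newlines()
is a hypothetical name, and "const unsigned char *" stands in for libxml2's
"const xmlChar *" (xmlChar is a typedef for unsigned char).

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a freshly allocated, NUL-terminated copy of
 * src[0..len) without its trailing newlines.  src is only read, so no cast
 * away from const is needed.  The caller must free() the result; NULL is
 * returned if allocation fails.
 */
static char *
copy_without_trailing_newlines(const unsigned char *src, size_t len)
{
    char *dst;

    /* Shrink the length instead of overwriting bytes in the source. */
    while (len > 0 && src[len - 1] == '\n')
        len--;

    dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

The source buffer is left byte-for-byte unchanged, so whatever libxml2 does
with it afterwards is unaffected.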

            regards, tom lane