Thread: Re: [BUG?] XMLSERIALIZE( ... INDENT) won't work with blank nodes
On 28.08.24 10:19, Jim Jones wrote:
> Hi,
>
> While testing a feature reported by Pavel in this thread[1] I realized
> that elements containing whitespaces between them won't be indented with
> XMLSERIALIZE( ... INDENT)
>

mmh... xmlDocContentDumpOutput seems to add a trailing newline in the
end of a document by default, making the serialization of the same xml
string with DOCUMENT and CONTENT different:

-- postgres v16

SELECT xmlserialize(CONTENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>
(1 row)

SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>' AS text INDENT);
  xmlserialize
-----------------
 <foo>          +
   <bar>42</bar>+
 </foo>         +

(1 row)

I do recall a discussion along these lines some time ago, but I just
can't find it now. Does anyone know if this is the expected behaviour?

Or should we in this case consider something like this in
xmltotext_with_options()?

result = cstring_to_text_with_len((const char *) xmlBufferContent(buf),
xmlBufferLength(buf) - 1);

--
Jim
Jim Jones <jim.jones@uni-muenster.de> writes:
> mmh... xmlDocContentDumpOutput seems to add a trailing newline in the
> end of a document by default, making the serialization of the same xml
> string with DOCUMENT and CONTENT different:

Does seem a bit inconsistent.

> Or should we in this case consider something like this in
> xmltotext_with_options()?
> result = cstring_to_text_with_len((const char *) xmlBufferContent(buf),
> xmlBufferLength(buf) - 1);

I think it'd be quite foolish to assume that every extant and future
version of libxml2 will share this glitch. Probably should use
logic more like pg_strip_crlf(), although we can't use that directly.

Would it ever be the case that trailing whitespace would be valid
data? In a bit of testing, it seems like that could be true in
CONTENT mode but not DOCUMENT mode.

regards, tom lane
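For illustration, the pg_strip_crlf()-like logic suggested above could look roughly like the sketch below. It is not the committed patch. The helper name length_without_trailing_newlines is hypothetical, and the sketch assumes the caller applies it only in DOCUMENT mode, since trailing whitespace may be valid data in CONTENT mode. Rather than subtracting a fixed 1, it shortens the length only while the last byte really is a newline or carriage return, so it stays correct if a libxml2 version adds no trailing newline at all.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of PostgreSQL or libxml2): return the
 * length of the first "len" bytes of "buf" once any trailing '\n' or
 * '\r' bytes are ignored.  The buffer is only read, never modified.
 */
size_t
length_without_trailing_newlines(const char *buf, size_t len)
{
    while (len > 0 && (buf[len - 1] == '\n' || buf[len - 1] == '\r'))
        len--;
    return len;
}
```

The returned length could then be passed to a length-taking constructor such as cstring_to_text_with_len() together with the unmodified buffer pointer.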
Jim Jones <jim.jones@uni-muenster.de> writes:
> [ xmlserialize patches ]

Pushed with minor editorialization. Notably, I got rid of scribbling
on xmlBufferContent's buffer --- I don't know how likely that is to
upset libxml2, but it seems like a fairly bad idea given that they
declare the result as "const xmlChar*". Casting away the const is
poor form in any case.

regards, tom lane
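As a rough sketch of the point about not scribbling on a const buffer: the trailing newline can be dropped while copying, so the source is only ever read and no cast away from const is needed. The function name copy_trimmed is hypothetical, and plain malloc() stands in for whatever allocator the real code uses; the parameter is declared as const unsigned char * because libxml2 defines xmlChar as unsigned char.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: return a freshly allocated, NUL-terminated copy
 * of the first "len" bytes of "src", with trailing '\n' and '\r' bytes
 * left out.  "src" is never written to.  Returns NULL if allocation
 * fails; the caller must free() the result.
 */
char *
copy_trimmed(const unsigned char *src, size_t len)
{
    char *dst;

    /* Shrink the length instead of writing a '\0' into the source. */
    while (len > 0 && (src[len - 1] == '\n' || src[len - 1] == '\r'))
        len--;

    dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

The only write goes to the newly allocated copy, so the const qualifier on the library's buffer is respected throughout.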