Thread: BUG #15517: JSONB_BUILD_ARRAY and JSON_BUILD_ARRAY omit XMLDeclaration (if present) from XML colums

The following bug has been logged on the website:

Bug reference:      15517
Logged by:          Mark Drake
Email address:      mark.drake@golden-hind.com
PostgreSQL version: 11.1
Operating system:   Windows
Description:

postgres=# create table foo(x xml);
CREATE TABLE
postgres=# insert into foo values ('<XYZ>XXX</XYZ>');
INSERT 0 1
postgres=# insert into foo values ('<?xml
version="1.0"?><ABC>123</ABC>'::XML);
INSERT 0 1
postgres=# select x, x::text, JSONB_BUILD_ARRAY(x,x::text) from foo;
       x        |                  x                  |
jsonb_build_array
----------------+-------------------------------------+-------------------------------------------------------------
 <XYZ>XXX</XYZ> | <XYZ>XXX</XYZ>                      | ["<XYZ>XXX</XYZ>",
"<XYZ>XXX</XYZ>"]
 <ABC>123</ABC> | <?xml version="1.0"?><ABC>123</ABC> | ["<ABC>123</ABC>",
"<?xml version=\"1.0\"?><ABC>123</ABC>"]
(2 rows)

As can be seen the JSONB_BUILD_ARRAY of x has omitted the declaration. I
would have expected the declaration to be included in the output for column
X.

As can be seen a simple workaround is to generate based on the casting to
text. Not sure how much overhead (if any) this adds.


=?utf-8?q?PG_Bug_reporting_form?= <noreply@postgresql.org> writes:
> postgres=# select x, x::text, JSONB_BUILD_ARRAY(x,x::text) from foo;
>        x        |                  x                  |                     jsonb_build_array
> ----------------+-------------------------------------+-------------------------------------------------------------
>  <XYZ>XXX</XYZ> | <XYZ>XXX</XYZ>                      | ["<XYZ>XXX</XYZ>", "<XYZ>XXX</XYZ>"]
>  <ABC>123</ABC> | <?xml version="1.0"?><ABC>123</ABC> | ["<ABC>123</ABC>", "<?xml version=\"1.0\"?><ABC>123</ABC>"]
> (2 rows)

> As can be seen the JSONB_BUILD_ARRAY of x has omitted the declaration. I
> would have expected the declaration to be included in the output for column
> X.

I don't see anything particularly wrong here, at least not with
jsonb_build_array(): it's producing a JSON string equivalent to
the displayable value of each expression.

Now it is fair to wonder why the displayed form of the second XML value is
different from what you get from a cast-to-text.  The reason seems to be
that xml_out() tries to suppress uninteresting XML declarations, while
cast-to-text is just a binary coercion that suppresses nothing.  I don't
personally use XML enough to knowledgeably attack or defend either of
those choices ... but I do note that they've been like that for long
enough that it seems unlikely we'd consider changing them.

(If the whole idea of cast-to-text producing a different result from the
datatype output function upsets you, you're not alone, but there's lots
of precedent for that too.  Type boolean is probably the poster child
there.)

            regards, tom lane