Re: XMLDocument (SQL/XML X030) - Mailing list pgsql-hackers

From Jim Jones
Subject Re: XMLDocument (SQL/XML X030)
Date
Msg-id 82cbac18-7bc6-42ec-be2f-4e66358bd43f@uni-muenster.de
Whole thread Raw
In response to Re: XMLDocument (SQL/XML X030)  (Robert Treat <rob@xzilla.net>)
Responses Re: XMLDocument (SQL/XML X030)
List pgsql-hackers
Hi Robert

On 28.01.25 05:54, Robert Treat wrote:
> Was playing around with the patch and was thinking about this wording:
> "The xmldocument function returns the input argument
>  unchanged... and is provided for compatibility."
>
> When I run an example similar to the db2 example you gave earlier:
>
> pagila=# SELECT xmldocument(xmlforest(10 as X, 20 as Y));
>     xmldocument
> --------------------
>  <x>10</x><y>20</y>
>
> In the db2 case, this is preserved as UPPER (which is to say, db2 case
> folds UPPER, and the input happens to match that), but we case fold
> lower, because we do; presumably you'd get the opposite effect in db2
> running the input with unquoted lower field names(?). 


Yes.

SELECT 42 AS foo FROM SYSIBM.SYSDUMMY1


FOO        
-----------
         42

  1 record(s) selected.


> In any case (no
> pun intended), SQL folks probably don't care much about that
> discrepancy, but given xml is case sensitive, maybe xml people do?


That's a good point. DB2 converts unquoted identifiers to uppercase by
default, which, if memory serves, aligns with the SQL standard. In the
case of this xmlforest example, my guess is that DB2 treats the elements
as identifiers and normalizes them to uppercase as well, as DB2 does not
handle XML documents as text like we do. To preserve case, you'd need to
explicitly quote the identifiers:

SELECT xmlforest(10 AS "x", 20 AS "y") FROM SYSIBM.SYSDUMMY1

--------------------
<x>10</x><y>20</y>

  1 record(s) selected.


Things look different when constructing the xml document directly from a
string:


SELECT xmlparse(DOCUMENT '<root><foo>bar</foo></root>') FROM
SYSIBM.SYSDUMMY1

----------------------------
<root><foo>bar</foo></root>

  1 record(s) selected.


I'd say the difference is due to how the two systems handle the XML data
type and unquoted identifiers in general, rather than a difference in
the behaviour of the function itself. Sticking to quoted identifiers in
both systems helps:


SELECT xmlforest(42 AS "foo", 73 AS "bar");
         xmlforest          
----------------------------

 <foo>42</foo><bar>73</bar>


Probably that's why most DB2 examples in their documentation use quoted
identifiers :)


Best regards, Jim





pgsql-hackers by date:

Previous
From: Keisuke Kuroda
Date:
Subject: Re: [PATCH] Improve code coverage of network address functions
Next
From: Andrei Lepikhov
Date:
Subject: Re: POC, WIP: OR-clause support for indexes