Thread: Re: Add XMLNamespaces to XMLElement

Re: Add XMLNamespaces to XMLElement

From
Pavel Stehule
Date:
Hi

so 21. 12. 2024 v 0:51 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
Hi,

I'd like to propose the implementation of the XMLNamespaces option for
XMLElement.

XMLNAMESPACES(nsuri AS nsprefix)
XMLNAMESPACES(DEFAULT default-nsuri)
XMLNAMESPACES(NO DEFAULT)

* nsprefix:              Namespace's prefix.
* nsuri:                 Namespace's URI.
* DEFAULT default-nsuri: Specifies the DEFAULT namespace to use within
the scope of a namespace declaration.
* NO DEFAULT:            Specifies that NO DEFAULT namespace is to be
used within the scope of a namespace declaration.

This basically works pretty much like XMLAttributes, but with a few more
restrictions (see SQL/XML:2023, 11.2 <XML lexically scoped options>):

* XML namespace declaration shall contain at most one DEFAULT namespace
declaration item.
* No namespace prefix shall be equivalent to xml or xmlns.
* No namespace URI shall be identical to http://www.w3.org/2000/xmlns/
or to http://www.w3.org/XML/1998/namespace.
* The value of a namespace URI contained in an regular namespace
declaration item (no DEFAULT) shall not be a zero-length string.

Examples:

SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar));
          xmlelement           
-------------------------------
 <foo xmlns:bar="http://x.y"/>

SELECT xmlelement(NAME "foo", xmlnamespaces(DEFAULT 'http://x.y'));
        xmlelement         
---------------------------
 <foo xmlns="http://x.y"/>

SELECT xmlelement(NAME "foo", xmlnamespaces(NO DEFAULT));
   xmlelement    
-----------------
 <foo xmlns=""/>

In transformXmlExpr() it seemed convenient to use the same parameters to
store the prefixes and URIs as in XMLAttributes (arg_names and
named_args), but I am still not so sure it is the right approach. Is
there perhaps a better way?

Any thoughts? Feedback welcome!

+1

Pavel

Best, Jim

Re: Add XMLNamespaces to XMLElement

From
Umar Hayat
Date:
Hi,
+1 for the enhancement.

I haven't compiled and reviewed the full patch yet, please see a few
comments from my side based on static analysis.

1. Though changes are targeted for XMLNAMESPACES for XMLElement but in
my opinion it will affect XMLTABLE as well because the
'xml_namespace_list' rule is shared now.
Adding 'NO DEFAULT' in xml_namespace_list will allow users to use it
with XMLTABLE XMLNAMESPACES as well.PostgreSQL grammar allow to
specify DEFAULT in NAMESPACE but resulting in following error:
"ERROR:  DEFAULT namespace is not supported"
What would be behavior with this change for XMLTABLE, should this be
allowed and the error messages need to be updated (may be this will
not be an error at all) or we need to restrict users to not use 'NO
DEFAULT' with XMLTable.

2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the
definition is the same.

3. In this patch 'NO DEFAULT' behavior is like DEFAULT '<blank>'
(empty uri) , should not it be more like 'DEFAULT NULL' to result in
the following ?
SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
    xmlelement
------------------
<root/>

instead of

SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
    xmlelement
------------------
<root xmlns=""/>

Regards
Umar Hayat

On Sat, 21 Dec 2024 at 14:57, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> Hi
>
> so 21. 12. 2024 v 0:51 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
>>
>> Hi,
>>
>> I'd like to propose the implementation of the XMLNamespaces option for
>> XMLElement.
>>
>> XMLNAMESPACES(nsuri AS nsprefix)
>> XMLNAMESPACES(DEFAULT default-nsuri)
>> XMLNAMESPACES(NO DEFAULT)
>>
>> * nsprefix:              Namespace's prefix.
>> * nsuri:                 Namespace's URI.
>> * DEFAULT default-nsuri: Specifies the DEFAULT namespace to use within
>> the scope of a namespace declaration.
>> * NO DEFAULT:            Specifies that NO DEFAULT namespace is to be
>> used within the scope of a namespace declaration.
>>
>> This basically works pretty much like XMLAttributes, but with a few more
>> restrictions (see SQL/XML:2023, 11.2 <XML lexically scoped options>):
>>
>> * XML namespace declaration shall contain at most one DEFAULT namespace
>> declaration item.
>> * No namespace prefix shall be equivalent to xml or xmlns.
>> * No namespace URI shall be identical to http://www.w3.org/2000/xmlns/
>> or to http://www.w3.org/XML/1998/namespace.
>> * The value of a namespace URI contained in an regular namespace
>> declaration item (no DEFAULT) shall not be a zero-length string.
>>
>> Examples:
>>
>> SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar));
>>           xmlelement
>> -------------------------------
>>  <foo xmlns:bar="http://x.y"/>
>>
>> SELECT xmlelement(NAME "foo", xmlnamespaces(DEFAULT 'http://x.y'));
>>         xmlelement
>> ---------------------------
>>  <foo xmlns="http://x.y"/>
>>
>> SELECT xmlelement(NAME "foo", xmlnamespaces(NO DEFAULT));
>>    xmlelement
>> -----------------
>>  <foo xmlns=""/>
>>
>> In transformXmlExpr() it seemed convenient to use the same parameters to
>> store the prefixes and URIs as in XMLAttributes (arg_names and
>> named_args), but I am still not so sure it is the right approach. Is
>> there perhaps a better way?
>>
>> Any thoughts? Feedback welcome!
>
>
> +1
>
> Pavel
>>
>>
>> Best, Jim



--
Umar Hayat
Bitnine (https://bitnine.net/)



Re: Add XMLNamespaces to XMLElement

From
Pavel Stehule
Date:


čt 26. 12. 2024 v 14:46 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
Hi Umar, hi Pavel,

Thanks for taking a look at this patch!

On 26.12.24 05:15, Umar Hayat wrote:
> Hi,
> +1 for the enhancement.
>
> I haven't compiled and reviewed the full patch yet, please see a few
> comments from my side based on static analysis.
>
> 1. Though changes are targeted for XMLNAMESPACES for XMLElement but in
> my opinion it will affect XMLTABLE as well because the
> 'xml_namespace_list' rule is shared now.
> Adding 'NO DEFAULT' in xml_namespace_list will allow users to use it
> with XMLTABLE XMLNAMESPACES as well.PostgreSQL grammar allow to
> specify DEFAULT in NAMESPACE but resulting in following error:
> "ERROR:  DEFAULT namespace is not supported"

I also considered creating a new rule to avoid any conflict with
XMLTable, but as it didn't break any regression test and the result
would be pretty much the same as with "DEFAULT 'str'", I thought that
extending the existing rule would be the way to go.

SELECT * FROM XMLTABLE(XMLNAMESPACES(NO DEFAULT),
                      '/rows/row'
                      PASSING '<rows
xmlns="http://x.y"><row><a>10</a></row></rows>'
                      COLUMNS a int PATH 'a');
ERROR:  DEFAULT namespace is not supported

What do you think?


> What would be behavior with this change for XMLTABLE, should this be
> allowed and the error messages need to be updated (may be this will
> not be an error at all) or we need to restrict users to not use 'NO
> DEFAULT' with XMLTable.


Perhaps updating the error message would suffice?


>
> 2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the
> definition is the same.


That would be good. I'm just afraid it would deviate a bit from the
scope of this patch - here I mean touching other function. Would you
suggest to add it to a patch series?

> 3. In this patch 'NO DEFAULT' behavior is like DEFAULT '<blank>'
> (empty uri) , should not it be more like 'DEFAULT NULL' to result in
> the following ?
> SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
>     xmlelement
> ------------------
> <root/>
>
> instead of
>
> SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
>     xmlelement
> ------------------
> <root xmlns=""/>
>
The idea of NO DEFAULT is pretty much to free an element (and its
children) from a previous DEFAULT in the same scope.

SELECT
  xmlserialize(DOCUMENT
    xmlelement(NAME "root",
      xmlnamespaces(DEFAULT 'http:/x.y/ns1'),
      xmlelement(NAME "foo",
        xmlnamespaces(NO DEFAULT))
  ) AS text INDENT);

         xmlserialize         
------------------------------
 <root xmlns="http:/x.y/ns1">+
   <foo xmlns=""/>           +
 </root>
(1 row)


I believe this behaviour might be confusing if NO DEFAULT is used in the
root element, as it there is no previously declared namespace. Perhaps
making NO DEFAULT behave like DEFAULT NULL only in the root element
would make things clearer? The SQL/XML spec doesn't say anything
specific about it, but DB2 had the same thought[1]. For reference, here
are the regress tests[2] of this patch tested with the DB2 implementation.

You can check Oracle too. 
 

> On Sat, 21 Dec 2024 at 14:57, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> +1
>>
>> Pavel

rebase in v2 attached - due to changes in gram.y


I checked this patch

The parser part looks a little bit dirty - it multiplies numbers of XMLELEMENT rules. Maybe xmlattributes and xml_namespaces can be processed elsewhere like list of xml_element_options?

Regards

Pavel

 
Thanks a lot

Best, Jim

1 - https://dbfiddle.uk/0QsWlfZR
2 - https://dbfiddle.uk/SyiDfXod

Re: Add XMLNamespaces to XMLElement

From
Jim Jones
Date:
On 21.01.25 11:48, Umar Hayat wrote:
> For XMLAttributes attribute it should have ColumnRef/Expr because
> that's the data/content we want to generate. But namespaces and xml
> tags, IMO they should be considered as part of the structure/schema of
> XML. Allowing namespaces (default or otherwise) to be generated
> arbitrarily for each record does not seem correct to me, it's like
> generating arbitrary XML using print string which does not require XML
> functions.


I'm not sure I completely get your 'print string' argument. Namespaces
are added to the element using libxml2's xmlTextWriterWriteAttributeNS
function, so we’re letting libxml2 handle whether a namespace
declaration is valid or not. Of course, there are still some extra
checks required by the SQL/XML standard.


>
> - DB2 allows XMLElements namespace but it does not allow Expr/ColumnRef.
> - Oracle Allow namespace in only XMLTable, and it does not allow Expr/ColumnRef.
>
> - Having SConst/String or numeric can limit the error handling at
> parsing stage which can validate the schema instead of expression
> evaluation per record, which leads to following problem at runtime:
>
> CREATE TABLE xmltab (uri TEXT);
> INSERT INTO xmltab VALUES ('good'), ('');
> SELECT XMLElement(NAME "root", XMLNamespaces(uri AS zz)) from xmltab;
> ERROR:  invalid XML namespace URI for "zz"
> DETAIL:  a regular XML namespace cannot be a zero-length string
>
> Imagine there millions of records and in the middle it fails.


I don't think discarding a feature just because the input data might
raise an exception in a long transaction is a strong argument here. For
your specific case, the user can always use a WHERE clause to filter out
any URIs that aren’t valid for XMLNamespace. Additionally, the
documentation already mentions this specific limitation:

"The value of a <replaceable>regular-nsuri</replaceable> cannot be a
zero-length string."

So it shouldn’t really catch anyone off guard :)


Thanks for the review!

Best, Jim




Re: Add XMLNamespaces to XMLElement

From
Umar Hayat
Date:
Pavel and Jim,
If that's the case, it looks good to me.
Just wanted to highlight potential issues and implementation
differences compared to other databases.

Regards
Umar Hayat



Re: Add XMLNamespaces to XMLElement

From
Pavel Stehule
Date:


po 27. 1. 2025 v 14:57 odesílatel Umar Hayat <postgresql.wizard@gmail.com> napsal:
Pavel and Jim,
If that's the case, it looks good to me.
Just wanted to highlight potential issues and implementation
differences compared to other databases.

It is correct.

Every Time there will be some differences  - DB2 has an absolutely different stack for XML processing, Oracle has handy written parser, and allows some syntaxes that Postgres does not, but for some cases it is unfriendly, strict and restrictive.

And we are searching for some good compromise - between consistency with self, consistency with standard, and good usability and good portability.

Regards

Pavel


Regards
Umar Hayat