Thread: Re: Add XMLNamespaces to XMLElement
Hi
so 21. 12. 2024 v 0:51 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
Hi,
I'd like to propose the implementation of the XMLNamespaces option for
XMLElement.
XMLNAMESPACES(nsuri AS nsprefix)
XMLNAMESPACES(DEFAULT default-nsuri)
XMLNAMESPACES(NO DEFAULT)
* nsprefix: Namespace's prefix.
* nsuri: Namespace's URI.
* DEFAULT default-nsuri: Specifies the DEFAULT namespace to use within
the scope of a namespace declaration.
* NO DEFAULT: Specifies that NO DEFAULT namespace is to be
used within the scope of a namespace declaration.
This basically works pretty much like XMLAttributes, but with a few more
restrictions (see SQL/XML:2023, 11.2 <XML lexically scoped options>):
* XML namespace declaration shall contain at most one DEFAULT namespace
declaration item.
* No namespace prefix shall be equivalent to xml or xmlns.
* No namespace URI shall be identical to http://www.w3.org/2000/xmlns/
or to http://www.w3.org/XML/1998/namespace.
* The value of a namespace URI contained in an regular namespace
declaration item (no DEFAULT) shall not be a zero-length string.
Examples:
SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar));
xmlelement
-------------------------------
<foo xmlns:bar="http://x.y"/>
SELECT xmlelement(NAME "foo", xmlnamespaces(DEFAULT 'http://x.y'));
xmlelement
---------------------------
<foo xmlns="http://x.y"/>
SELECT xmlelement(NAME "foo", xmlnamespaces(NO DEFAULT));
xmlelement
-----------------
<foo xmlns=""/>
In transformXmlExpr() it seemed convenient to use the same parameters to
store the prefixes and URIs as in XMLAttributes (arg_names and
named_args), but I am still not so sure it is the right approach. Is
there perhaps a better way?
Any thoughts? Feedback welcome!
+1
Pavel
Best, Jim
Hi, +1 for the enhancement. I haven't compiled and reviewed the full patch yet, please see a few comments from my side based on static analysis. 1. Though changes are targeted for XMLNAMESPACES for XMLElement but in my opinion it will affect XMLTABLE as well because the 'xml_namespace_list' rule is shared now. Adding 'NO DEFAULT' in xml_namespace_list will allow users to use it with XMLTABLE XMLNAMESPACES as well.PostgreSQL grammar allow to specify DEFAULT in NAMESPACE but resulting in following error: "ERROR: DEFAULT namespace is not supported" What would be behavior with this change for XMLTABLE, should this be allowed and the error messages need to be updated (may be this will not be an error at all) or we need to restrict users to not use 'NO DEFAULT' with XMLTable. 2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the definition is the same. 3. In this patch 'NO DEFAULT' behavior is like DEFAULT '<blank>' (empty uri) , should not it be more like 'DEFAULT NULL' to result in the following ? SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT)); xmlelement ------------------ <root/> instead of SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT)); xmlelement ------------------ <root xmlns=""/> Regards Umar Hayat On Sat, 21 Dec 2024 at 14:57, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > Hi > > so 21. 12. 2024 v 0:51 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal: >> >> Hi, >> >> I'd like to propose the implementation of the XMLNamespaces option for >> XMLElement. >> >> XMLNAMESPACES(nsuri AS nsprefix) >> XMLNAMESPACES(DEFAULT default-nsuri) >> XMLNAMESPACES(NO DEFAULT) >> >> * nsprefix: Namespace's prefix. >> * nsuri: Namespace's URI. >> * DEFAULT default-nsuri: Specifies the DEFAULT namespace to use within >> the scope of a namespace declaration. >> * NO DEFAULT: Specifies that NO DEFAULT namespace is to be >> used within the scope of a namespace declaration. >> >> This basically works pretty much like XMLAttributes, but with a few more >> restrictions (see SQL/XML:2023, 11.2 <XML lexically scoped options>): >> >> * XML namespace declaration shall contain at most one DEFAULT namespace >> declaration item. >> * No namespace prefix shall be equivalent to xml or xmlns. >> * No namespace URI shall be identical to http://www.w3.org/2000/xmlns/ >> or to http://www.w3.org/XML/1998/namespace. >> * The value of a namespace URI contained in an regular namespace >> declaration item (no DEFAULT) shall not be a zero-length string. >> >> Examples: >> >> SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar)); >> xmlelement >> ------------------------------- >> <foo xmlns:bar="http://x.y"/> >> >> SELECT xmlelement(NAME "foo", xmlnamespaces(DEFAULT 'http://x.y')); >> xmlelement >> --------------------------- >> <foo xmlns="http://x.y"/> >> >> SELECT xmlelement(NAME "foo", xmlnamespaces(NO DEFAULT)); >> xmlelement >> ----------------- >> <foo xmlns=""/> >> >> In transformXmlExpr() it seemed convenient to use the same parameters to >> store the prefixes and URIs as in XMLAttributes (arg_names and >> named_args), but I am still not so sure it is the right approach. Is >> there perhaps a better way? >> >> Any thoughts? Feedback welcome! > > > +1 > > Pavel >> >> >> Best, Jim -- Umar Hayat Bitnine (https://bitnine.net/)
čt 26. 12. 2024 v 14:46 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
Hi Umar, hi Pavel,
Thanks for taking a look at this patch!
On 26.12.24 05:15, Umar Hayat wrote:
> Hi,
> +1 for the enhancement.
>
> I haven't compiled and reviewed the full patch yet, please see a few
> comments from my side based on static analysis.
>
> 1. Though changes are targeted for XMLNAMESPACES for XMLElement but in
> my opinion it will affect XMLTABLE as well because the
> 'xml_namespace_list' rule is shared now.
> Adding 'NO DEFAULT' in xml_namespace_list will allow users to use it
> with XMLTABLE XMLNAMESPACES as well.PostgreSQL grammar allow to
> specify DEFAULT in NAMESPACE but resulting in following error:
> "ERROR: DEFAULT namespace is not supported"
I also considered creating a new rule to avoid any conflict with
XMLTable, but as it didn't break any regression test and the result
would be pretty much the same as with "DEFAULT 'str'", I thought that
extending the existing rule would be the way to go.
SELECT * FROM XMLTABLE(XMLNAMESPACES(NO DEFAULT),
'/rows/row'
PASSING '<rows
xmlns="http://x.y"><row><a>10</a></row></rows>'
COLUMNS a int PATH 'a');
ERROR: DEFAULT namespace is not supported
What do you think?
> What would be behavior with this change for XMLTABLE, should this be
> allowed and the error messages need to be updated (may be this will
> not be an error at all) or we need to restrict users to not use 'NO
> DEFAULT' with XMLTable.
Perhaps updating the error message would suffice?
>
> 2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the
> definition is the same.
That would be good. I'm just afraid it would deviate a bit from the
scope of this patch - here I mean touching other function. Would you
suggest to add it to a patch series?
> 3. In this patch 'NO DEFAULT' behavior is like DEFAULT '<blank>'
> (empty uri) , should not it be more like 'DEFAULT NULL' to result in
> the following ?
> SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
> xmlelement
> ------------------
> <root/>
>
> instead of
>
> SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
> xmlelement
> ------------------
> <root xmlns=""/>
>
The idea of NO DEFAULT is pretty much to free an element (and its
children) from a previous DEFAULT in the same scope.
SELECT
xmlserialize(DOCUMENT
xmlelement(NAME "root",
xmlnamespaces(DEFAULT 'http:/x.y/ns1'),
xmlelement(NAME "foo",
xmlnamespaces(NO DEFAULT))
) AS text INDENT);
xmlserialize
------------------------------
<root xmlns="http:/x.y/ns1">+
<foo xmlns=""/> +
</root>
(1 row)
I believe this behaviour might be confusing if NO DEFAULT is used in the
root element, as it there is no previously declared namespace. Perhaps
making NO DEFAULT behave like DEFAULT NULL only in the root element
would make things clearer? The SQL/XML spec doesn't say anything
specific about it, but DB2 had the same thought[1]. For reference, here
are the regress tests[2] of this patch tested with the DB2 implementation.
You can check Oracle too.
> On Sat, 21 Dec 2024 at 14:57, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> +1
>>
>> Pavel
rebase in v2 attached - due to changes in gram.y
I checked this patch
The parser part looks a little bit dirty - it multiplies numbers of XMLELEMENT rules. Maybe xmlattributes and xml_namespaces can be processed elsewhere like list of xml_element_options?
Regards
Pavel
Thanks a lot
Best, Jim
1 - https://dbfiddle.uk/0QsWlfZR
2 - https://dbfiddle.uk/SyiDfXod
On 21.01.25 11:48, Umar Hayat wrote: > For XMLAttributes attribute it should have ColumnRef/Expr because > that's the data/content we want to generate. But namespaces and xml > tags, IMO they should be considered as part of the structure/schema of > XML. Allowing namespaces (default or otherwise) to be generated > arbitrarily for each record does not seem correct to me, it's like > generating arbitrary XML using print string which does not require XML > functions. I'm not sure I completely get your 'print string' argument. Namespaces are added to the element using libxml2's xmlTextWriterWriteAttributeNS function, so we’re letting libxml2 handle whether a namespace declaration is valid or not. Of course, there are still some extra checks required by the SQL/XML standard. > > - DB2 allows XMLElements namespace but it does not allow Expr/ColumnRef. > - Oracle Allow namespace in only XMLTable, and it does not allow Expr/ColumnRef. > > - Having SConst/String or numeric can limit the error handling at > parsing stage which can validate the schema instead of expression > evaluation per record, which leads to following problem at runtime: > > CREATE TABLE xmltab (uri TEXT); > INSERT INTO xmltab VALUES ('good'), (''); > SELECT XMLElement(NAME "root", XMLNamespaces(uri AS zz)) from xmltab; > ERROR: invalid XML namespace URI for "zz" > DETAIL: a regular XML namespace cannot be a zero-length string > > Imagine there millions of records and in the middle it fails. I don't think discarding a feature just because the input data might raise an exception in a long transaction is a strong argument here. For your specific case, the user can always use a WHERE clause to filter out any URIs that aren’t valid for XMLNamespace. Additionally, the documentation already mentions this specific limitation: "The value of a <replaceable>regular-nsuri</replaceable> cannot be a zero-length string." So it shouldn’t really catch anyone off guard :) Thanks for the review! Best, Jim
Pavel and Jim, If that's the case, it looks good to me. Just wanted to highlight potential issues and implementation differences compared to other databases. Regards Umar Hayat
po 27. 1. 2025 v 14:57 odesílatel Umar Hayat <postgresql.wizard@gmail.com> napsal:
Pavel and Jim,
If that's the case, it looks good to me.
Just wanted to highlight potential issues and implementation
differences compared to other databases.
It is correct.
Every Time there will be some differences - DB2 has an absolutely different stack for XML processing, Oracle has handy written parser, and allows some syntaxes that Postgres does not, but for some cases it is unfriendly, strict and restrictive.
And we are searching for some good compromise - between consistency with self, consistency with standard, and good usability and good portability.
Regards
Pavel
Regards
Umar Hayat