Thread: [PATCH] Add XMLText function (SQL/XML X038)
Hi, This small patch proposes the implementation of the standard SQL/XML function XMLText (X038). It basically converts a text parameter into an xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to escape possible predefined entities. This patch also contains documentation and regression tests. Any thoughts? Best, Jim 1 - https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-entities.html#xmlEncodeSpecialChars
Attachment
so 25. 3. 2023 v 12:49 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:
Hi,
This small patch proposes the implementation of the standard SQL/XML
function XMLText (X038). It basically converts a text parameter into an
xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to
escape possible predefined entities.
This patch also contains documentation and regression tests.
Any thoughts?
+1
Pavel
Best, Jim
1 -
https://gnome.pages.gitlab.gnome.org/libxml2/devhelp/libxml2-entities.html#xmlEncodeSpecialChars
On 25.03.23 12:53, Pavel Stehule wrote:
so 25. 3. 2023 v 12:49 odesílatel Jim Jones <jim.jones@uni-muenster.de> napsal:Hi,
This small patch proposes the implementation of the standard SQL/XML
function XMLText (X038). It basically converts a text parameter into an
xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to
escape possible predefined entities.
This patch also contains documentation and regression tests.
Any thoughts?+1Pavel
Thanks!
I just realized that I forgot to add a few examples to my last message :D
postgres=# SELECT xmltext('foo ´/[({bar?})]\`');
xmltext
--------------------
foo ´/[({bar?})]\`
(1 row)
postgres=# SELECT xmltext('foo & <bar>');
xmltext
-----------------------
foo & <bar>
(1 row)
On 25.03.23 13:25, I wrote:
I just realized that I forgot to add a few examples to my last message :Dpostgres=# SELECT xmltext('foo ´/[({bar?})]\`');
xmltext
--------------------
foo ´/[({bar?})]\`
(1 row)
postgres=# SELECT xmltext('foo & <bar>');
xmltext
-----------------------
foo & <bar>
(1 row)
It seems that an encoding issue appears in the regression tests on Debian + Meson, 32 bit.
´ > ´ ° > °
v2 attached updates the regression tests to fix it.
Jim
Attachment
On 3/25/23 12:49, Jim Jones wrote: > Hi, > > This small patch proposes the implementation of the standard SQL/XML > function XMLText (X038). It basically converts a text parameter into an > xml text node. It uses the libxml2 function xmlEncodeSpecialChars[1] to > escape possible predefined entities. > > This patch also contains documentation and regression tests. > > Any thoughts? I am replying to this email, but my comments are based on the v2 patch. Thank you for working on this, and I think this is a valuable addition. However, I have two issues with it. 1) There seems to be several spurious blank lines added that I do not think are warranted. 2) This patch does nothing to address the <XML returning clause> so we can't claim to implement X038 without a disclaimer. Upon further review, the same is true of XMLCOMMENT() so maybe that is okay for this patch, and a more comprehensive patch for our xml features is necessary. -- Vik Fearing
Hi Vik Thanks for reviewing my patch! On 25.08.23 12:05, Vik Fearing wrote: > I am replying to this email, but my comments are based on the v2 patch. > > Thank you for working on this, and I think this is a valuable > addition. However, I have two issues with it. > > 1) There seems to be several spurious blank lines added that I do not > think are warranted. I tried to copy the aesthetics of other functions, but it seems I failed :) I removed a few blank lines. I hope it's fine now. Is there any tool like pgindent to take care of it automatically? > > 2) This patch does nothing to address the <XML returning clause> so we > can't claim to implement X038 without a disclaimer. Upon further > review, the same is true of XMLCOMMENT() so maybe that is okay for > this patch, and a more comprehensive patch for our xml features is > necessary. If we decide to not address this point here, I can take a look at it and work in a separated patch. v3 attached. Thanks Jim
Attachment
> On 25 Aug 2023, at 14:42, Jim Jones <jim.jones@uni-muenster.de> wrote: > Is there any tool like pgindent to take care of it automatically? No, pgindent doesn't address whitespace, only indentation of non-whitespace. -- Daniel Gustafsson
On 8/25/23 14:42, Jim Jones wrote: > Hi Vik > > Thanks for reviewing my patch! Thank you for writing it! > On 25.08.23 12:05, Vik Fearing wrote: >> I am replying to this email, but my comments are based on the v2 patch. >> >> Thank you for working on this, and I think this is a valuable >> addition. However, I have two issues with it. >> >> 1) There seems to be several spurious blank lines added that I do not >> think are warranted. > > I tried to copy the aesthetics of other functions, but it seems I failed > :) I removed a few blank lines. I hope it's fine now. I am talking specifically about this: @@ -505,6 +506,10 @@ xmlcomment(PG_FUNCTION_ARGS) appendStringInfoText(&buf, arg); appendStringInfoString(&buf, "-->"); + + + + PG_RETURN_XML_P(stringinfo_to_xmltype(&buf)); #else NO_XML_SUPPORT(); >> 2) This patch does nothing to address the <XML returning clause> so we >> can't claim to implement X038 without a disclaimer. Upon further >> review, the same is true of XMLCOMMENT() so maybe that is okay for >> this patch, and a more comprehensive patch for our xml features is >> necessary. > > If we decide to not address this point here, I can take a look at it and > work in a separated patch. I do not think this should be addressed in this patch because there are quite a lot of functions that need to handle this. -- Vik Fearing
On 25.08.23 16:49, Vik Fearing wrote: > > I am talking specifically about this: > > @@ -505,6 +506,10 @@ xmlcomment(PG_FUNCTION_ARGS) > appendStringInfoText(&buf, arg); > appendStringInfoString(&buf, "-->"); > > + > + > + > + > PG_RETURN_XML_P(stringinfo_to_xmltype(&buf)); > #else > NO_XML_SUPPORT(); I have no idea how xmlcomment() got changed in this patch :D nice catch! > > I do not think this should be addressed in this patch because there > are quite a lot of functions that need to handle this. v4 attached. Jim
Attachment
On 2023-08-25 10:49, Vik Fearing wrote: > I do not think this should be addressed in this patch because > there are quite a lot of functions that need to handle this. Indeed, as described in [0], we still largely provide the SQL/XML:2003 notion of a single XML datatype, not the distinguishable XML(DOCUMENT), XML(CONTENT), XML(SEQUENCE) types from :2006 and later, which has a number of adverse consequences for developers[1], and that wiki page proposed a couple possible ways forward[2]. Regards, -Chap [0] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards [1] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Obstacles_to_improving_conformance [2] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#Possible_ways_forward
On 8/25/23 17:56, Chapman Flack wrote: > [0] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards I was not aware of this page. What a wealth of information! -- Vik Fearing
On 2023-Aug-25, Chapman Flack wrote: > On 2023-08-25 10:49, Vik Fearing wrote: > > I do not think this should be addressed in this patch because > > there are quite a lot of functions that need to handle this. > > Indeed, as described in [0], we still largely provide the SQL/XML:2003 > notion of a single XML datatype, not the distinguishable XML(DOCUMENT), > XML(CONTENT), XML(SEQUENCE) types from :2006 and later, which has a > number of adverse consequences for developers[1], and that wiki page > proposed a couple possible ways forward[2]. Sadly, all the projects seem to have been pretty much abandoned in the meantime. Zorba has been dead for 9 years, xqilla for 6. Even XQC, the API they claim to implement, is dead. It sounds unlikely that there is *any* way forward here. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte" (Ijon Tichy en Viajes, Stanislaw Lem)
On 2023-08-26 13:02, Alvaro Herrera wrote: > Sadly, all the projects seem to have been pretty much abandoned in the > meantime. Zorba has been dead for 9 years, xqilla for 6. Even XQC, > the > API they claim to implement, is dead. Sounds like bad news for the "XQC as integration point" proposal, anyway. Saxon 11.6 came out two days ago[0], supporting XPath/XQuery 3.1 etc. (12.3 came out last month, but 12 isn't considered the 'stable' release yet. It's working toward XSLT/XPath/XQuery 4.0.) Regards, -Chap [0] https://blog.saxonica.com/announcements/2023/08/saxon-11.6.html
Hi
so 26. 8. 2023 v 21:23 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 2023-08-26 13:02, Alvaro Herrera wrote:
> Sadly, all the projects seem to have been pretty much abandoned in the
> meantime. Zorba has been dead for 9 years, xqilla for 6. Even XQC,
> the
> API they claim to implement, is dead.
Sounds like bad news for the "XQC as integration point" proposal,
anyway.
Saxon 11.6 came out two days ago[0], supporting XPath/XQuery 3.1 etc.
(12.3 came out last month, but 12 isn't considered the 'stable'
release yet. It's working toward XSLT/XPath/XQuery 4.0.)
Saxon can be an interesting library, but nobody knows if integration with Postgres is possible. Their C implementation is Java compiled/executed by GraalV.
Regards
Pavel
Regards,
-Chap
[0] https://blog.saxonica.com/announcements/2023/08/saxon-11.6.html
On 2023-08-26 16:00, Pavel Stehule wrote: > Saxon can be an interesting library, but nobody knows if integration > with > Postgres is possible. Their C implementation is Java compiled/executed > by GraalV. Indeed, such an integration would probably not be in core. Of the two possible-ways-forward described on that wiki page, the one that didn't rely on the defunct XQC was one involving query rewriting. Have the parser understand the SQL/XML customized syntax, and define a set of ordinary functions it will be rewritten into. (This idea is bolstered somewhat by the fact that many things in SQL/XML, XMLTABLE for example, are /defined in the standard/ in terms of query rewriting into calls on simpler functions.) Then let there be an extension, or ideally someday a choice of extensions, supplying those functions. As to whether running Saxon in a Postgres extension is possible, that's been an example that ships with PL/Java since 1.5.1 five years ago. It's too bad the other projects have stalled; it's good to have more than one ready option. But Saxon shows no sign of going away. Perhaps the act of devising a standardized rewriting of queries onto a standardized set of loadable functions could be of interest to other DBMS projects as well. It's hard to imagine another DBMS not being in the same boat (if it isn't from a rich commercial firm that happens to have a modern XQuery implementation in-house). Maybe having that set of functions specified, with the prospect that more than one DBMS might be interested in a project implementing them, even inspires someone to go look at the xqilla or zorba repos to see how far they got, and pick up the baton, and then there could be more than one option. Regards, -Chap
so 26. 8. 2023 v 22:47 odesílatel Chapman Flack <chap@anastigmatix.net> napsal:
On 2023-08-26 16:00, Pavel Stehule wrote:
> Saxon can be an interesting library, but nobody knows if integration
> with
> Postgres is possible. Their C implementation is Java compiled/executed
> by GraalV.
Indeed, such an integration would probably not be in core.
Of the two possible-ways-forward described on that wiki page, the one
that didn't rely on the defunct XQC was one involving query rewriting.
Have the parser understand the SQL/XML customized syntax, and define
a set of ordinary functions it will be rewritten into. (This idea is
bolstered somewhat by the fact that many things in SQL/XML, XMLTABLE
for example, are /defined in the standard/ in terms of query rewriting
into calls on simpler functions.)
Then let there be an extension, or ideally someday a choice of
extensions, supplying those functions.
As to whether running Saxon in a Postgres extension is possible, that's
been an example that ships with PL/Java since 1.5.1 five years ago.
The most simple "solution" can be the introduction of some new hooks there. Then you can write an extension that will call PL/Java functions
It's too bad the other projects have stalled; it's good to have more
than one ready option. But Saxon shows no sign of going away.
Perhaps the act of devising a standardized rewriting of queries
onto a standardized set of loadable functions could be of interest
to other DBMS projects as well. It's hard to imagine another DBMS
not being in the same boat (if it isn't from a rich commercial firm
that happens to have a modern XQuery implementation in-house).
Maybe having that set of functions specified, with the prospect
that more than one DBMS might be interested in a project
implementing them, even inspires someone to go look at the
xqilla or zorba repos to see how far they got, and pick up
the baton, and then there could be more than one option.
Another possibility is revitalization of libxml2.
There was an extension http://www.explain.com.au/libx/ But the code is not available to download too, but extending libxml2 is feasible.
I am not sure how valuable this work can be. Probably whoever really needs it uses some Java based solution already.
Regards
Pavel
Regards,
-Chap
> On 25 Aug 2023, at 17:40, Jim Jones <jim.jones@uni-muenster.de> wrote: > On 25.08.23 16:49, Vik Fearing wrote: >> I do not think this should be addressed in this patch because there are quite a lot of functions that need to handle this. > > v4 attached. I had a look at v4 of this patch and apart from pgindenting and moving the function within xml.c next to xmlcomment() I think this is ready. Just like Vik says upthread we can't really claim X038 conformance without a disclaimer, so I've added a 0002 which adds this to the XML spec conformance page in the docs. The attached v5 contains the above mentioned changes. I've marked this ready for committer in the CF app as well. -- Daniel Gustafsson
Attachment
On 11/3/23 16:30, Daniel Gustafsson wrote: >> On 25 Aug 2023, at 17:40, Jim Jones <jim.jones@uni-muenster.de> wrote: > > Just like Vik says upthread we can't really claim X038 conformance without a > disclaimer, so I've added a 0002 which adds this to the XML spec conformance > page in the docs. We should put a short version of the disclaimer in sql_features.txt as well. -- Vik Fearing
Hi Daniel, hi Vik, Thanks a lot for the review! On 03.11.23 16:45, Vik Fearing wrote: > We should put a short version of the disclaimer in sql_features.txt as > well. You mean to add a disclaimer in the X038 entry? Something along these lines perhaps? X038 XMLText YES It does not address the <literal><XML returning clause></literal>, as it is not supported in <productname>PostgreSQL</productname>. Jim
On 11/3/23 17:14, Jim Jones wrote: > Hi Daniel, hi Vik, > > Thanks a lot for the review! > > On 03.11.23 16:45, Vik Fearing wrote: >> We should put a short version of the disclaimer in sql_features.txt as >> well. > You mean to add a disclaimer in the X038 entry? Something along these > lines perhaps? > > X038 XMLText YES It does not address the <literal><XML > returning clause></literal>, as it is not supported in > <productname>PostgreSQL</productname>. I was thinking of something much shorter than that. Such as X038 XMLText YES supported except for RETURNING -- Vik Fearing
On 03.11.23 19:05, Vik Fearing wrote: > I was thinking of something much shorter than that. Such as > > X038 XMLText YES supported except for RETURNING v6 attached includes this change and the doc addition from Daniel. Thanks! -- Jim
Attachment
On 11/3/23 21:28, Jim Jones wrote: > > On 03.11.23 19:05, Vik Fearing wrote: >> I was thinking of something much shorter than that. Such as >> >> X038 XMLText YES supported except for RETURNING > > v6 attached includes this change and the doc addition from Daniel. There are some typos in the commit message, but otherwise this looks commitable to me. -- Vik Fearing
> On 4 Nov 2023, at 15:01, Vik Fearing <vik@postgresfriends.org> wrote: > > On 11/3/23 21:28, Jim Jones wrote: >> On 03.11.23 19:05, Vik Fearing wrote: >>> I was thinking of something much shorter than that. Such as >>> >>> X038 XMLText YES supported except for RETURNING >> v6 attached includes this change and the doc addition from Daniel. > > There are some typos in the commit message, but otherwise this looks commitable to me. I took another look at this today, fixes the above mentioned typos and some tiny cosmetic things and pushed it. -- Daniel Gustafsson
On 06.11.23 11:49, Daniel Gustafsson wrote: > I took another look at this today, fixes the above mentioned typos and some > tiny cosmetic things and pushed it. > > -- > Daniel Gustafsson > Awesome! Thanks Daniel and Vik for reviewing and pushing this patch :) -- Jim