Thread: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
[PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
"Robin Haberkorn"
Date:
Dear PG hackers, here is a patch that improves the xslt_process() error handling. Error handling was one of the points criticized earlier [1] and as far as I understand it is one of the reasons that xslt_process() was never adopted into core and contrib/xml2 is considered deprecated since PG 8. However the point is also still on the TODO list ("Report errors returned by the XSLT library") and xslt_process() has been asked for several times on the mailing list. Memory handling bugs that have been reported earlier [2] have obviously long been fixed. The point raised in [1] was that xslt_process() would not report missing stylesheet parameters at all. This hasn't been the case at the current HEAD. However, xslt_process() still wouldn't report any details about errors related to XSLT processing. This is what is fixed by this patch. It now installs an XSLT error handler. See the patch comment for details. Since the goal is to eventually get xslt_process() adopted into core, I also got rid of PG_XML_STRICTNESS_LEGACY. Perhaps you can tell me what else is preventing adoption into core. I believe that xslt_process() should also accept the `xml` type as an alternative to strings. Strings should be kept for backwards compatibility, though. Also, the stylesheet parameter passing is very crude and limited. I suggest, that the current method of passing them as strings (see parse_params()) is deprecated and we don't try to tackle its limitations, but will instead accept hstores. If you agree, I will work on these two tasks next. Yours sincerely, Robin Haberkorn [1]: https://www.postgresql.org/message-id/4C5ECEF9.3030806%40mlfowler.com [2]: https://www.postgresql.org/message-id/flat/4A6A276A.6090405%40dunslane.net -- Robin Haberkorn Senior Software Engineer B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
"Robin Haberkorn"
Date:
I forgot the patch of course... -- Robin Haberkorn Senior Software Engineer B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
Attachment
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
Tom Lane
Date:
"Robin Haberkorn" <haberkorn@b1-systems.de> writes: > I forgot the patch of course... Hi Robin, thank you for the patch! This has arrived too late to be considered for PG v18, but please add the thread to the open commitfest https://commitfest.postgresql.org/53/ so that we don't forget about it for v19, and so that it will receive automated testing from our cfbot. Just from a quick eyeballing of the patch, it seems generally sane, but I wonder why you have xml_generic_error_handler creating and then destroying an errorBuf instead of just appending the results directly to xmlerrcxt->err_buf. The extra buffer doesn't seem to be doing much except increasing our odds of an OOM failure right there. I'm also a little suspicious of - xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_LEGACY); + xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_ALL); as that looks like it is probably changing the behavior of the module in ways bigger than just providing more error details. This code has been in legacy status for so long that I feel a little hesitant about doing that. If we are willing to do it, should we go further and clean out the use of PG_XML_STRICTNESS_LEGACY in xml2/xpath.c as well? regards, tom lane
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
"Robin Haberkorn"
Date:
Hello Tom! On Tue Apr 22, 2025 at 18:29:02 GMT +03, Tom Lane wrote: > Hi Robin, thank you for the patch! This has arrived too late to > be considered for PG v18, but please add the thread to the open > commitfest I will do that. But what about the other changes I proposed (xml data type and hstore support)? Should they all be discussed in this thread or should I create separate threads and consequently separate patch entries for the Commitfest? And should they all be based on the master branch or can they be based upon each other? Do you think they will help to bring xslt_process() closer to adoption into core? > Just from a quick eyeballing of the patch, it seems generally > sane, but I wonder why you have xml_generic_error_handler > creating and then destroying an errorBuf instead of just > appending the results directly to xmlerrcxt->err_buf. The > extra buffer doesn't seem to be doing much except increasing > our odds of an OOM failure right there. You are right. It was a leftover from thinning out xml_errorHandler(). I attached a new version of the patch where this is fixed. I also ran all touched files through pgindent. > I'm also a little suspicious of > > - xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_LEGACY); > + xmlerrcxt = pgxml_parser_init(PG_XML_STRICTNESS_ALL); > > as that looks like it is probably changing the behavior of > the module in ways bigger than just providing more error > details. This code has been in legacy status for so long > that I feel a little hesitant about doing that. You wouldn't actually need that change to get more error details as the details are provided by xml_generic_error_handler() which ignores the strictness level. The reason is that libxml generic error handlers cannot discern between warnings and errors. The only thing that the strictness level does is affecting how xml_errorHandler() works: With STRICTNESS_ALL, it can actually log warnings without appending anything to the err_buf and set the err_occurred flag which is apparently useful because some libxml functions can result in error-level messages without the function calls signalling failure. At least, otherwise the code doesn't make sense to me. In other words, using the STRICTNESS_ALL should allow you to see warnings, you wouldn't otherwise see and catch errors, you wouldn't otherwise catch -- at least if they occur in libxml itself. As far as I understand from the code comment xml_errorHandler(), STRICTNESS_LEGACY was introduced, just so you don't have to touch the existing contrib/xml2 code that did not check err_occurred: /* * Legacy error handling mode. err_occurred is never set, we just add the * message to err_buf. This mode exists because the xml2 contrib module * uses our error-handling infrastructure, but we don't want to change its * behaviour since it's deprecated anyway. This is also why we don't * distinguish between notices, warnings and errors here --- the old-style * generic error handler wouldn't have done that either. */ Since we now check that flag in xslt_proc.c, we should be fine. But I might be overlooking something. > If we are willing to do it, should we go further and clean out the use > of PG_XML_STRICTNESS_LEGACY in xml2/xpath.c as well? I think this is a separate construction site. We don't necessarily have to touch xpath.c and I didn't plan to do so in the near future. We should first determine whether the XPath support is actually worth keeping (should be adopted into core) or rather stay deprecated. Best regards, Robin Haberkorn -- Robin Haberkorn Senior Software Engineer B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
Attachment
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
"Robin Haberkorn"
Date:
On Wed Apr 23, 2025 at 00:42:06 GMT +03, Robin Haberkorn wrote: > /* > * Legacy error handling mode. err_occurred is never set, we just add the > * message to err_buf. This mode exists because the xml2 contrib module > * uses our error-handling infrastructure, but we don't want to change its > * behaviour since it's deprecated anyway. This is also why we don't > * distinguish between notices, warnings and errors here --- the old-style > * generic error handler wouldn't have done that either. > */ btw. you wrote that comment in cacd42d62cb2ddf32135b151f627780a5509780f back in 2011. The commit message also contains some background information. -- Robin Haberkorn Senior Software Engineer B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
Tom Lane
Date:
"Robin Haberkorn" <haberkorn@b1-systems.de> writes: > btw. you wrote that comment in cacd42d62cb2ddf32135b151f627780a5509780f > back in 2011. > The commit message also contains some background information. Hm. I wonder if we couldn't get rid of the HAVE_XMLSTRUCTUREDERRORCONTEXT conditionalization. If the libxml2 versions that lacked that were "old" in 2011, surely they are extinct in the wild by now. I'm thinking that we really don't want to be using the generic handler at all, given the commit log's comment that it can't distinguish between error, warning, and notice cases. regards, tom lane
Re: [PATCH] contrib/xml2: xslt_process() should report XSLT-related error details
From
"Robin Haberkorn"
Date:
On Thu Apr 24, 2025 at 01:59:10 GMT +03, Tom Lane wrote: > Hm. I wonder if we couldn't get rid of the > HAVE_XMLSTRUCTUREDERRORCONTEXT conditionalization. If the > libxml2 versions that lacked that were "old" in 2011, surely > they are extinct in the wild by now. I'm thinking that we > really don't want to be using the generic handler at all, given > the commit log's comment that it can't distinguish between > error, warning, and notice cases. libxml 2.7.4 was released 15 years ago, so yes these #ifdefs can probably be removed already. What's the oldest distribution/OS you want to support PG on? Even Ubuntu 14.04 from 2014 already had libxml 2.9.1. If you like, I will prepare another patch to remove HAVE_XMLSTRUCTUREDERRORCONTEXT in a separate thread. PS: I added the libxslt error handling patch to the next commitfest: https://commitfest.postgresql.org/patch/5718/ -- Robin Haberkorn Senior Software Engineer B1 Systems GmbH Osterfeldstraße 7 / 85088 Vohburg / https://www.b1-systems.de GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt, HRB 3537