Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11) - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)
Date
Msg-id CAFj8pRAeDNoqOVZG9-5nFxev-ZpTeOP8qsxZN-i4dQ48E67NaQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] proposal - Default namespaces for XPath expressions(PostgreSQL 11)  (Dmitry Dolgov <9erthalion6@gmail.com>)
List pgsql-hackers
Hi

út 18. 9. 2018 v 17:33 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> [ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch.  I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser.  That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it.  (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

I understand, and I would be much more happy if xmllib2 supports this feature. But the development of new feature of this lib was frozen - and there are not any possibility.

On second hand the parser is very simple - tokenizer detects identifiers, strings, numbers, and parser try to find unqualified names and prints default namespace identifier before. It doesn't do more.

I renamed function   _transformXPath to transform_xpath_recurse and I descibed params


* I think the business with "pgdefnamespace.pgsqlxml.internal" is just
plain awful.  It's a wart, and I don't think it's even saving you any
code once you account for all the places where you have to throw errors
for somebody trying to use that as a regular name.  This should be done
with out-of-band signaling if possible.  The existing convention above
this code is that a NULL pointer means a default namespace; can't that
be continued throughout?  If that's not practical, can you pick a string
that simply can't syntactically be a namespace name?  (In the SQL world,
for example, an empty string is an illegal identifier so that that could
be used for the purpose.  But I don't know if that applies to XML.)
Or perhaps you can build a random name that is chosen just to make it
different from any of the listed namespaces?  If none of those work,
I think passing around an explicit "isdefault" boolean alongside the name
would be better than having a reserved word.

The string used like default namespace name should be valid XML namespace name, because it is injected to XPath expression and it is passed to libxml2 as one namespace name. libxml2 requires not null valid value.

I changed it and now the namespace name is generated.
 

* _transformXPath recurses, so doesn't it need check_stack_depth()?

fixed

* I'm not especially in love with using function names that start
with an underscore.  I do not think that adds anything, and it's
unlike the style in most places in PG.

renamed


* This is a completely unhelpful error message:
+       if (!parser->buffer_is_empty)
+               elog(ERROR, "internal error");
If you can't be bothered to write a message that says something
useful, either drop the test or turn it into an Assert.  I see some
other internal errors that aren't up to project norms either.

use assert
 

* Either get rid of the "elog(DEBUG1)"s, or greatly lower their
message priority.  They might've been appropriate for developing
this patch, but they are not okay to commit that way.

removed


* Try to reduce the amount of random whitespace changes in the patch.

The formatting was really strange, fixed
 

* .h files should never #include "postgres.h", per project policy.

I moved the code to xml.c like you propose
 

* I'm not sure I'd bother with putting the new code into a separate
file rather than cramming it into xml.c.  The main reason why not
is that you're going to get "empty translation unit" warnings from
some compilers in builds without USE_LIBXML.

* Documentation, comments, and error messages could all use some
copy-editing by a native English speaker (you knew that of course).

I hope so some native speakers looks there.

Thank you for comments

Attached updated patch

Regards

Pavel


                        regards, tom lane
Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: Re: Problem while updating a foreign table pointing to a partitionedtable on foreign server
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: prefix function