Re: [HACKERS] patch: function xmltable - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: [HACKERS] patch: function xmltable
Date
Msg-id CAFj8pRAPau_HzD1Yo_B64t6X4bv-XXXEND5aKRo-A0w47BK6KQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] patch: function xmltable  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: [HACKERS] patch: function xmltable  (Pavel Stehule <pavel.stehule@gmail.com>)
Re: [HACKERS] patch: function xmltable  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers


2017-03-03 19:15 GMT+01:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Pavel Stehule wrote:

> attached update with fixed tests

Heh, I noticed that you removed the libxml "context" lines that
differentiate xml.out from xml_2.out when doing this.  My implementation
emits those lines, so it was failing for me.  I restored them.

I also changed a few things to avoid copying into TableFuncScanState
things that come from the TableFunc itself, since the executor state
node can grab them from the plan node.  Let's do that.  So instead of
"evalcols" the code now checks that the column list is empty; and also,
read the ordinality column number from the plan node.

I have to bounce this back to you one more time, hopefully the last one
I hope.  Two things:

1. Please verify that pg_stat_statements behaves correctly.  The patch
doesn't have changes to contrib/ so without testing I'm guessing that it
doesn't work.  I think something very simple should do.

2. As I've complained many times, I find the way we manage an empty
COLUMNS clause pretty bad.  The standard doesn't require that syntax
(COLUMNS is required), and I don't like the implementation, so why not
provide the feature in a different way?  My proposal is to change the
column options in gram.y to be something like this:

The clause COLUMNS is optional on Oracle and DB2
 
So I prefer a Oracle, DB2 design. If you are strongly against it, then we can remove it to be ANSI/SQL only.

I am don't see an good idea to introduce third syntax.


xmltable_column_option_el:
                        IDENT b_expr
                                { $$ = makeDefElem($1, $2, @1); }
                        | DEFAULT b_expr
                                { $$ = makeDefElem("default", $2, @1); }
                        | FULL VALUE_P
                                { $$ = makeDefElem("full_value", NULL, @1); }
                        | NOT NULL_P
                                { $$ = makeDefElem("is_not_null", (Node *) makeInteger(true), @1); }
                        | NULL_P
                                { $$ = makeDefElem("is_not_null", (Node *) makeInteger(false), @1); }
                ;

Note the FULL VALUE.  Then we can process it like

                        else if (strcmp(defel->defname, "full_value") == 0)
                        {
                                if (fc->colexpr != NULL)
                                        ereport(ERROR,
                                                        (errcode(ERRCODE_SYNTAX_ERROR),
                                                         errmsg("FULL ROW may not be specified together with PATH"),
                                                         parser_errposition(defel->location)));
                                fc->full_row = true;
                        }

So if you want the full XML value of the row, you have to specify it,

 .. XMLTABLE ( ... COLUMNS ..., whole_row xml FULL VALUE, ... )

This has the extra feature that you can add, say, an ORDINALITY column
together with the XML value, something that you cannot do with the
current implementation.

It doesn't have to be FULL VALUE, but I couldn't think of anything
better.  (I didn't want to add any new keywords for this.)  If you have
a better idea, let's discuss.

I don't see a introduction own syntax as necessary solution here - use Oracle, DB2 compatible syntax, or ANSI.

It is partially corner case - the benefit of this case is almost bigger compatibility with mentioned databases.
 

Code-wise, this completely removes the "else" block in transformRangeTableFunc
which I marked with an XXX comment.  That's a good thing -- let's get
rid of that.  Also, it should remove the need for the separate "if
!columns" case in tfuncLoadRows.  All those cases would become part of
the normal code path instead of special cases.  I think
XmlTableSetColumnFilter doesn't need any change (we just don't call if
for the FULL VALUE row); and XmlTableGetValue needs a special case that
if the column filter is NULL (i.e. SetColumnFilter wasn't called for
that column) then return the whole row.


Of course, this opens an implementation issue: how do you annotate
things from parse analysis till execution?  The current TableFunc
structure doesn't help, because there are only lists of column names and
expressions; and we can't use the case of a NULL colexpr, because that
case is already used by the column filter being the column name (a
feature required by the standard).  A simple way would be to have a new
"colno" struct member, to store a column number for the column marked
FULL VALUE (just like ordinalitycol).  This means you can't have more
than one of those FULL VALUE columns, but that seems okay.


(Of course, this means that the two cases that have no COLUMNS in the
"xmltable" production in gram.y should go away).

You are commiter, and you should to decide - as first I prefer current state, as second a remove this part - it should be good for you too, because code that you don't like will be left.

I dislike introduce new syntax - this case is not too important for this.

Regards

Pavel


--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] Logical replication failing when foreign key present
Next
From: Adam Brightwell
Date:
Subject: Re: [HACKERS] RADIUS fallback servers