Re: UNNEST with multiple args, and TABLE with multiple funcs - Mailing list pgsql-hackers

From Boszormenyi Zoltan
Subject Re: UNNEST with multiple args, and TABLE with multiple funcs
Date
Msg-id 52124690.8050309@cybertec.at
Whole thread Raw
In response to UNNEST with multiple args, and TABLE with multiple funcs  (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses Re: UNNEST with multiple args, and TABLE with multiple funcs  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Hi,

2013-08-13 15:54 keltezéssel, Andrew Gierth írta:
> Summary:
>
> This patch implements a method for expanding multiple SRFs in parallel
> that does not have the surprising LCM behaviour of SRFs-in-select-list.
> (Functions returning fewer rows are padded with nulls instead.)
>
> It then uses this method combined with a parse-time hack to implement
> the (intended to be) spec-conforming behaviour of UNNEST with multiple
> parameters, including flattening of composite results.
>
> The upshot is that given a table like this:
>
> postgres=# select * from t1;
>         a       |         b         |                      c
> ---------------+-------------------+----------------------------------------------
>   {11,12,13}    | {wombat}          |
>   {5,10}        | {foo,bar}         | {"(123,xyzzy)","(456,plugh)","(789,plover)"}
>   {21,31,41,51} | {fred,jim,sheila} | {"(111,xyzzy)","(222,plugh)"}
> (3 rows)
>
> (where column "c" is an array of a composite type with 2 cols, "x" and "y")
>
> You can do this:
>
> postgres=# select u.* from t1, unnest(a,b,c) with ordinality as u;
>   ?column? | ?column? |  x  |   y    | ordinality
> ----------+----------+-----+--------+------------
>         11 | wombat   |     |        |          1
>         12 |          |     |        |          2
>         13 |          |     |        |          3
>          5 | foo      | 123 | xyzzy  |          1
>         10 | bar      | 456 | plugh  |          2
>            |          | 789 | plover |          3
>         21 | fred     | 111 | xyzzy  |          1
>         31 | jim      | 222 | plugh  |          2
>         41 | sheila   |     |        |          3
>         51 |          |     |        |          4
> (10 rows)
>
> Or for an example of general combination of functions:
>
> postgres=# select * from table(generate_series(10,20,5), unnest(array['fred','jim']));
>   ?column? | ?column?
> ----------+----------
>         10 | fred
>         15 | jim
>         20 |
> (3 rows)
>
> Implementation Details:
>
> The spec syntax for table function calls, <table function derived table>
> in <table reference>, looks like TABLE(func(args...)) AS ...
>
> This patch implements that, plus an extension: it allows multiple
> functions, TABLE(func1(...), func2(...), func3(...)) [WITH ORDINALITY]
> and defines this as meaning that the functions are to be evaluated in
> parallel.
>
> This is implemented by changing RangeFunction, function RTEs, and the
> FunctionScan node to take lists of function calls rather than a single
> function. The calling convention for SRFs is completely unchanged; each
> function returns its own rows (or a tuplestore in materialize mode) just
> as before, and FunctionScan combines the results into a single output
> tuple (keeping track of which functions are exhausted in order to
> correctly fill in nulls on a backwards scan).
>
> Then, a hack in the parser converts unnest(...) appearing as a
> func_table (and only there) into a list of unnest() calls, one for each
> parameter.  So
>
>     select ... from unnest(a,b,c)
>
> is converted to
>
>     select ... from TABLE(unnest(a),unnest(b),unnest(c))
>
> and if unnest appears as part of an existing list inside TABLE(), it's
> expanded to multiple entries there too.
>
> This parser hackery is of course somewhat ugly. But given the objective
> of implementing the spec's unnest syntax, it seems to be the least ugly
> of the possible approaches. (The hard part of doing it any other way
> would be generating the description of the result type; composite array
> parameters expand into multiple result columns.)

Harder maybe but it may still be cleaner in the long run.

> Overall, it's my intention here to remove as many as feasible of the old
> reasons why one might use an SRF in the select list.

Indeed, it's a big nail in the coffin for SRFs-in-targetlist. Having
WITH ORDINALITY and this feature, I would vote for removing
SRF-in-targetlist and call the release PostgreSQL 10.0.

>   This should also
> address the points that Josh brought up in discussion of ORDINALITY
> regarding use of SRF-in-select to unnest multiple arrays.
>
> (As a side issue, this patch also sets up pathkeys for ordinality along
> the lines of a patch I suggested to Greg a while back in response to
> his.)
>
> Current patch status:
>
> This is a first working cut: no docs, no tests, not enough comments, the
> deparse logic probably needs more work (it deparses correctly but the
> formatting may be suboptimal). However all the functionality is believed
> to be in place.

With this last paragraph in mind, I am trying a little review.

* Is the patch in a patch format which has context? (eg: context diff format)

Yes.

* Does it apply cleanly to the current git master?

Applies with some offsets on a few files but without fuzz.

* Does it include reasonable tests, necessary doc patches, etc?

No, as told by the patch author.

* Does the patch actually implement what it's supposed to do?

Yes.

* Do we want that?

Yes.

* Do we already have it?

No.

* Does it follow SQL spec, or the community-agreed behavior?

The SQL spec says these:

In 7.6 <table reference>

<table primary> ::=
...
| <table function derived table> [ AS ] <correlation name> [ <left paren> <derived column 
list> <right paren> ]

also later in the same section:

<table function derived table> ::=        TABLE <left paren> <collection value expression> <right paren>

In 6.26 <value expression>

<collection value expression> ::=        <array value expression> | <multiset value expression>

In 6.36 <array value expression>

<array value expression> ::= <array concatenation> | <array primary>
<array concatenation> ::= <array value expression 1> <concatenation operator> <array primary>
<array value expression 1> ::= <array value expression>
<array primary> ::= <array value function> | <value expression primary>

6.3 <value expression primary>

<value expression primary> ::=        <parenthesized value expression>        | <nonparenthesized value expression
primary>

<parenthesized value expression> ::= <left paren> <value expression> <right paren>

<nonparenthesized value expression primary> ::=        <unsigned value specification>        | <column reference>
| <set function specification>        | <window function>        | <nested window function>        | <scalar subquery>
     | <case expression>        | <cast specification>        | <field reference>        | <subtype treatment>        |
<methodinvocation>        | <static method invocation>        | <new specification>        | <attribute or method
reference>       | <reference resolution>        | <collection value constructor>        | <array element reference>
   | <multiset element reference>        | <next value expression>        | <routine invocation>
 

collection value constructor> ::=        | <array value constructor>        | <multiset value constructor>

So, the FROM TABLE(...) AS (...) syntax is a big can of worms and
I haven't even quoted <multiset value expression>.

As far as I can tell, these should also be allowed but isn't:

zozo=# select * from table('a'::text) as x;
ERROR:  syntax error at or near "'a'"
LINE 1: select * from table('a'::text) as x;                            ^
zozo=# select x.* from t1, table(t1.a) as x;
ERROR:  syntax error at or near ")"
LINE 1: select x.* from t1, table(t1.a) as x;                                      ^
zozo=# select x.* from table((6)) as x(a int4);
ERROR:  syntax error at or near "("
LINE 1: select x.* from table((6)) as x(a int4);                              ^
zozo=# select x.* from table(values (6)) as x(a int4);
ERROR:  syntax error at or near "("
LINE 1: select x.* from table(values (6)) as x(a int4);                                     ^
zozo=# select x.* from table(values(6)) as x(a int4);
ERROR:  syntax error at or near "("
LINE 1: select x.* from table(values(6)) as x(a int4);                                    ^

What the patch implements is only the last choice for
<nonparenthesized value expression primary>: <routine invocation>

When you add documentation, it would be nice to mention it.

Also, the grammar extension is a start for adding all the other
standard choices for TABLE().

* Does it include pg_dump support (if applicable)?

n/a

* Are there dangers?

I can't see any. 8-)

* Have all the bases been covered?

My previous comments about the TABLE() syntax says it all.
You can interpret it either way. :-)

* Does the feature work as advertised?

Yes.

* Are there corner cases the author has failed to consider?

I don't know.

* Are there any assertion failures or crashes?

No.

* Does the patch slow down simple tests?

No.

* If it claims to improve performance, does it?

It certainly improves writing queries, as functions inside
unnest() get processed in one scan.

* Does it slow down other things?

I don't think so.

* Does it follow the project coding guidelines?

Yes.

* Are there portability issues?

No.

* Will it work on Windows/BSD etc?

It should, the code uses standard internal PostgreSQL APIs
and extends them. No new system call.

* Are the comments sufficient and accurate?

According to the author, no.

* Does it do what it says, correctly?

Yes.

* Does it produce compiler warnings?

No.

* Can you make it crash?

No.

* Is everything done in a way that fits together coherently with other features/modules?

I think so

* Are there interdependencies that can cause problems?

I don't know.

Best regards,
Zoltán Böszörményi

-- 
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/




pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Feature Request on Extensions
Next
From: 'Bruce Momjian'
Date:
Subject: Re: 9.3 release notes suggestions