Re: Changed SRF in targetlist handling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Changed SRF in targetlist handling
Date
Msg-id 29253.1465239221@sss.pgh.pa.us
Whole thread Raw
In response to Re: Changed SRF in targetlist handling  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: Changed SRF in targetlist handling  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Robert Haas wrote:
>> On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> No, because then you get the cross-product of multiple SRFs, not the
>>> run-in-lockstep behavior.

>> Oh.  I assumed that was the expected behavior.  But, ah, what do I know?

> Lots, I assume -- but in this case, probably next to nothing, just like
> most of us, because what sane person or application would be really
> relying on the wacko historical behavior, in order to generate some
> collective knowledge?  However, I think that it is possible that
> someone, somewhere has two SRFs-in-targetlist that return the same
> number of rows and that the current implementation works fine for them;

Yes.  Run-in-lockstep is an extremely useful behavior, so much so that
we made a LATERAL variant for it.  I do not see a reason to break such
cases in the targetlist.

> My vote is to raise an error in the case of more than one SRF in targetlist.

Note that that risks breaking cases that the user does not think are "more
than one SRF".  Consider this example using a regression-test table:

regression=# create function foo() returns setof int8_tbl as
regression-# 'select * from int8_tbl' language sql;
CREATE FUNCTION
regression=# select foo();                foo                  

--------------------------------------(123,456)(123,4567890123456789)(4567890123456789,123)(4567890123456789,4567890123456789)(4567890123456789,-4567890123456789)
(5 rows)

regression=# explain verbose select foo();                 QUERY PLAN                  
----------------------------------------------Result  (cost=0.00..5.25 rows=1000 width=32)  Output: foo()
(2 rows)

regression=# select (foo()).*;       q1        |        q2         
------------------+-------------------             123 |               456             123 |
45678901234567894567890123456789|               1234567890123456789 |  45678901234567894567890123456789 |
-4567890123456789
(5 rows)

regression=# explain verbose select (foo()).*;                 QUERY PLAN                  
----------------------------------------------Result  (cost=0.00..5.50 rows=1000 width=16)  Output: (foo()).q1,
(foo()).q2
(2 rows)

The reason we can get away with this simplistic treatment of
composite-returning SRFs is precisely the run-in-lockstep behavior.
Otherwise the second query would have returned 25 rows.

Now, if we decide to try to rewrite tlist SRFs as LATERAL, it would likely
behoove us to do that rewrite before expanding * not after, so that we can
eliminate the multiple evaluation of foo() that happens currently.  (That
makes it a parser problem not a planner problem.)  And maybe we should
rewrite non-SRF composite-returning functions this way too, because people
have definitely complained about the extra evaluations in that context.
But my point here is that lockstep evaluation does have practical use
when the SRFs are iterating over matching collections of generated rows.
And that seems like a pretty common use-case.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [sqlsmith] Failed assertions on parallel worker shutdown
Next
From: Robert Haas
Date:
Subject: Re: Changed SRF in targetlist handling