Re: Changed SRF in targetlist handling - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: Changed SRF in targetlist handling
Msg-id CAKFQuwbs-hUru-cifwNJ18cKrLbriSrXM9kWm=ZbAcya8jDgug@mail.gmail.com
In response to Re: Changed SRF in targetlist handling  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Changed SRF in targetlist handling  (Vik Fearing <vik@2ndquadrant.fr>)
Re: Changed SRF in targetlist handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Mon, Jun 6, 2016 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, May 23, 2016 at 4:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> 2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...).  This would
>> have the same behavior as before if the SRFs all return the same number
>> of rows, and otherwise would behave differently.

> I thought the idea was to rewrite it as LATERAL ROWS FROM (srf1()),
> LATERAL ROWS FROM (srf2()), ...

No, because then you get the cross-product of multiple SRFs, not the
run-in-lockstep behavior.

> The rewrite you propose here seems to NULL-pad rows after the first
> SRF is exhausted:

Yes.  That's why I said it's not compatible if the SRFs don't all return
the same number of rows.  It seems like a reasonable definition to me
though, certainly much more reasonable than the current run-until-LCM
behavior.

IOW, this is why this mode query has to fail.
 

> The latter is how I'd expect SRF-in-targetlist to work.

That's not even close to how it works now.  It would break *every*
existing application that has multiple SRFs in the tlist, not just
the ones whose SRFs return different numbers of rows.  And I'm not
convinced that it's a more useful behavior.

To clarify, the present behavior is basically a combination of both of Robert's results.

If the SRFs return the same number of rows, the first (zippered) result is returned without any NULL padding.

If the SRFs return different numbers of rows, the LCM behavior kicks in and you get Robert's second result.

SELECT generate_series(1, 4), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM ( generate_series(1, 4), generate_series(1, 4) );

BUT

SELECT generate_series(1, 3), generate_series(1, 4) ORDER BY 1, 2;
is the same as
SELECT * FROM ROWS FROM (generate_series(1, 3)) a, LATERAL ROWS FROM (generate_series(1, 4)) b;
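
To make the two equivalences above easier to check, here is a rough model of the three behaviors in Python. This is only a sketch, not how the executor is written: plain lists stand in for SRF output, None stands in for NULL, and the names legacy_tlist, rows_from, cross_join and series are made up for illustration.

```python
from itertools import cycle, islice, product, zip_longest
from math import gcd


def series(lo, hi):
    """Stand-in for generate_series(lo, hi): an inclusive integer list."""
    return list(range(lo, hi + 1))


def legacy_tlist(a, b):
    """Model of the current SRF-in-targetlist behavior: restart each SRF
    when it runs out and stop once both finish on the same row, which
    happens after LCM(len(a), len(b)) rows."""
    if not a or not b:
        # Guard for empty inputs (also avoids dividing by gcd(0, 0) == 0).
        return []
    n_rows = len(a) * len(b) // gcd(len(a), len(b))
    return list(islice(zip(cycle(a), cycle(b)), n_rows))


def rows_from(a, b):
    """Model of ROWS FROM (srf1(), srf2()): run in lockstep and pad the
    shorter result with None (NULL) once it is exhausted."""
    return list(zip_longest(a, b, fillvalue=None))


def cross_join(a, b):
    """Model of ROWS FROM (srf1()) a, LATERAL ROWS FROM (srf2()) b."""
    return list(product(a, b))
```

With equal row counts, sorted(legacy_tlist(series(1, 4), series(1, 4))) matches rows_from(series(1, 4), series(1, 4)). With 3 and 4 rows, the sorted legacy result matches the sorted cross_join result, and both have 12 rows. Note that in this model the second match relies on 3 and 4 being coprime: with 2 and 4 rows the legacy result has only LCM(2, 4) = 4 rows, which is neither the zippered result nor the 8-row cross product.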


Tom's 2.5 proposal basically says we make the former equivalence succeed and have the latter one fail.

The rewrite would be unaware of the cardinality of the SRF and so it cannot conditionally rewrite the query.  One of the two must be chosen and the incompatible behavior turned into an error.
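
As a rough illustration of that last point, the following Python sketch runs its inputs in lockstep and raises an error only once it discovers, at execution time, that they did not end on the same row. It is an assumption about what "turned into an error" could look like, not a description of any actual patch; the name lockstep_or_error and the error text are hypothetical, and Python iterables stand in for SRFs.

```python
def lockstep_or_error(*srfs):
    """Yield one row per step from all inputs together. Raise ValueError
    if some inputs are exhausted while others still have rows."""
    iterators = [iter(s) for s in srfs]
    done = object()  # sentinel marking an exhausted input
    while True:
        row = [next(it, done) for it in iterators]
        finished = [value is done for value in row]
        if all(finished):
            return  # every input ended on the same row: success
        if any(finished):
            # The mismatch is only visible here, during execution,
            # because the row counts are not known in advance.
            raise ValueError(
                "set-returning functions returned different numbers of rows"
            )
        yield tuple(row)
```

Here list(lockstep_or_error(range(1, 5), range(1, 5))) returns the four zippered rows, while the same call with range(1, 4) and range(1, 5) yields three rows and then raises ValueError.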

David J.