Re: Changed SRF in targetlist handling - Mailing list pgsql-hackers
From | David G. Johnston |
---|---|
Subject | Re: Changed SRF in targetlist handling |
Date | |
Msg-id | CAKFQuwb7UQg5qztjx=0NkND4qWTycMcWFQohx--gbTzZE4nd5g@mail.gmail.com Whole thread Raw |
In response to | Re: Changed SRF in targetlist handling (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Merlin Moncure <mmoncure@gmail.com> writes:
> On Mon, May 23, 2016 at 2:13 PM, David Fetter <david@fetter.org> wrote:
>> On Mon, May 23, 2016 at 01:28:11PM -0500, Merlin Moncure wrote:
>>> +1 on removing LCM.
>> As a green field project, that would make total sense. As a thing
>> decades in, it's not clear to me that that would break less stuff or
>> break it worse than simply disallowing SRFs in the target list, which
>> has been rejected on bugward-compatibility grounds. I suspect it
>> would be even worse because disallowing SRFs in target lists would at
>> least be obvious and localized when it broke code.
> If I'm reading this correctly, it sounds to me like you are making the
> case that removing target list SRF completely would somehow cause less
> breakage than say, rewriting it to a LATERAL based implementation for
> example. With more than a little forbearance, let's just say I don't
> agree.
We should consider single and multiple SRFs in a targetlist as distinct
use-cases; only the latter has got weird properties.
There are several things we could potentially do with multiple SRFs in
the same targetlist. In increasing order of backwards compatibility and
effort required:
1. Throw error if there's more than one SRF.
2. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...). This would
have the same behavior as before if the SRFs all return the same number
of rows, and otherwise would behave differently.
3. Rewrite into some other construct that still ends up as a FunctionScan
RTE node, but has the old LCM behavior if the SRFs produce different
numbers of rows. (Perhaps we would not need to expose this construct
as something directly SQL-visible.)
It's certainly arguable that the common use-cases for SRF-in-tlist
don't have more than one SRF per tlist, and thus that implementing #1
would be an appropriate amount of effort. It's worth noting also that
the LCM behavior has been repeatedly reported as a bug, and therefore
that if we do #3 we'll be expending very substantial effort to be
literally bug-compatible with ancient behavior that no one in the
current development group thinks is well-designed. As far as #2 goes,
it would have the advantage that code depending on the same-number-of-
rows case would continue to work as before. David has a point that it
would silently break application code that's actually depending on the
LCM behavior, but how much of that is there likely to be, really?
[ reflects a bit... ] I guess there is room for an option 2-and-a-half:
2.5. Rewrite into LATERAL ROWS FROM (srf1(), srf2(), ...), but decorate
the FunctionScan RTE to tell the executor to throw an error if the SRFs
don't all return the same number of rows, rather than silently
null-padding. This would have the same behavior as before for the sane
case, and would be very not-silent about cases where apps actually invoked
the LCM behavior. Again, we wouldn't necessarily have to expose such an
option at the SQL level. (Though it strikes me that such a restriction
could have value in its own right, analogous to the STRICT options that
we've invented in some other places to allow insisting on the expected
numbers of rows being returned. ROWS FROM STRICT (srf1(), srf2(), ...),
anybody?)
I'd let the engineers decide between 1, 2.5, and 3 - but if we accomplish our goals while implementing #3 I'd say that would be the best outcome for the community as whole.
We don't have the luxury of providing a safe-usage mode where people writing new queries get the error but pre-existing queries are considered OK. We will have to rely upon education and deal with the occasional bug report but our long-time customers, even if only a minority would be affected, will appreciate the effort taken to not break code that has been working for a long time.
The minority is likely small enough to at least make options 1 and 2.5 viable though I'd think we make an effort to avoid #1.
David J.
pgsql-hackers by date: