Thread: Re: [HACKERS] PostgreSQL 10 changes in exclusion constraints - did something change? CASE WHEN behavior oddity

> Did something change with how exclusion constraints are handled?  I'm
trying to troubleshoot a regression we are having with PostGIS raster
support.

> As best I can guess, it's because exclusion constraints that used to work
in past versions are failing in PostgreSQL 10 with an error something like
> this:

> ERROR:  conflicting key value violates exclusion constraint
"enforce_spatially_unique_test_raster_columns_rast"
> ERROR:  new row for relation "test_raster_columns" violates check
constraint "enforce_coverage_tile_rast"

> Unfortunately I don't know how long this has been an issue since we had an
earlier test failing preventing the raster ones  from being tested.

> Thanks,
> Regina


I figured out the culprit was the change in CASE WHEN behavior with set
returning functions

Had a criteria something of the form:

CASE WHEN some_condition_dependent_on_sometable_that_resolves_to_false THEN
(regexp_matches(...))[1] ELSE ...  END
FROM sometable;


One thing that seems a little odd to me is why these return a record


SELECT CASE WHEN strpos('ABC', 'd') > 1 THEN (regexp_matches('a (b) c',
'd'))[1] ELSE 'a' END;

SELECT CASE WHEN false THEN (regexp_matches('a (b) c', 'd'))[1] ELSE 'a' END
FROM pg_tables;



And this doesn't - I'm guessing it has to do with this being a function of
the value of table, but it seems unintuitive 
From a user perspective.

SELECT CASE WHEN  strpos(f.tablename, 'ANY (ARRAY[') > 1 THEN
(regexp_matches('a (b) c', 'd'))[1] ELSE 'a' END
FROM pg_tables AS f;


Pre-PostgreSQL 10 this would return a row for each record in pg_tables



Thanks,
Regina




"Regina Obe" <lr@pcorp.us> writes:
> I figured out the culprit was the change in CASE WHEN behavior with set
> returning functions
> Had a criteria something of the form:
> CASE WHEN some_condition_dependent_on_sometable_that_resolves_to_false THEN
> (regexp_matches(...))[1] ELSE ...  END
> FROM sometable;

You might want to consider changing such usages to use regexp_match()
instead of regexp_matches().
        regards, tom lane



> "Regina Obe" <lr@pcorp.us> writes:
>> I figured out the culprit was the change in CASE WHEN behavior with 
>> set returning functions Had a criteria something of the form:
>> CASE WHEN some_condition_dependent_on_sometable_that_resolves_to_false 
>>  THEN (regexp_matches(...))[1] ELSE ...  END FROM sometable;

> You might want to consider changing such usages to use regexp_match()
instead of regexp_matches().

>            regards, tom lane

Thanks.  I ended up swapping out with substring which was a bit shorter than
regexp_match()[].

But I've got similar problems with PostGIS topology logic and the easiest
change to make was take advantage
of the fact that you guys are treating CASE constant ... THEN  SRF ...

Differently 

Than 

CASE not_constant_based_on_table_value THEN  SRF ..


So I switched those to constant checks.  This feels a little dirty and
fragile to me though.

Is this behavior going to stay or change?

It seems inconsistent from a user perspective that

CASE constant .... == short-circuit skipping over SRFs that may  otherwise
fail

While 

CASE not_constant_table_dependent  doesn't short-circuit.

I can understand the motive behind it, it just feels a little inconsistent
from an end-user POV.


Thanks,
Regina




"Regina Obe" <lr@pcorp.us> writes:
> Is this behavior going to stay or change?
> It seems inconsistent from a user perspective that
> CASE constant .... == short-circuit skipping over SRFs that may  otherwise
> fail
> While 
> CASE not_constant_table_dependent  doesn't short-circuit.
> I can understand the motive behind it, it just feels a little inconsistent
> from an end-user POV.

After thinking about this for awhile, I agree that what we've got now
isn't too satisfactory.  We can make an analogy between SRFs and
aggregate functions: both of them look like simple function calls
syntactically, but they have global effects on the semantics of the
query, particularly on how many rows are returned.

In the case of aggregates, there is long-standing precedent that we
can optimize away individual aggregate calls but the query semantics
do not change, ie you get one output row (or one per GROUP BY group)
even if every last aggregate call disappears due to CASE simplification.
The same was true for deletion of SRFs by CASE-simplification before
v10, but now we've broken that, which seems like a clear bug.

I think it would be possible to teach eval_const_expressions that
it must not discard CASE/COALESCE subexpressions that contain SRFs,
which would preserve the rule that expression simplification doesn't
change the query semantics.

Another possibility is to say that we've broken this situation
irretrievably and we should start throwing errors for SRFs in
places where they'd be conditionally evaluated.  That's not real
nice perhaps, but it's better than the way things are right now.

Thoughts?
        regards, tom lane



Hi,

On 2017-05-28 14:03:26 -0400, Tom Lane wrote:
> I think it would be possible to teach eval_const_expressions that
> it must not discard CASE/COALESCE subexpressions that contain SRFs,
> which would preserve the rule that expression simplification doesn't
> change the query semantics.

That sounds like a good idea.  Do you want to write up a patch, or
should I?  I can, but I'd want to finish the walsender panic and other
signal handling stuff first (mostly waiting for review for now).


> Another possibility is to say that we've broken this situation
> irretrievably and we should start throwing errors for SRFs in
> places where they'd be conditionally evaluated.  That's not real
> nice perhaps, but it's better than the way things are right now.

I'd be ok with that too, but I don't really see a strong need so far.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2017-05-28 14:03:26 -0400, Tom Lane wrote:
>> I think it would be possible to teach eval_const_expressions that
>> it must not discard CASE/COALESCE subexpressions that contain SRFs,
>> which would preserve the rule that expression simplification doesn't
>> change the query semantics.

> That sounds like a good idea.  Do you want to write up a patch, or
> should I?  I can, but I'd want to finish the walsender panic and other
> signal handling stuff first (mostly waiting for review for now).

I think you've got enough on your plate.  I can take care of whatever
we decide to do here.  What I was looking for was opinions on which
way to address it.

>> Another possibility is to say that we've broken this situation
>> irretrievably and we should start throwing errors for SRFs in
>> places where they'd be conditionally evaluated.  That's not real
>> nice perhaps, but it's better than the way things are right now.

> I'd be ok with that too, but I don't really see a strong need so far.

The argument for this way is basically that it's better to break
apps visibly than silently.  The behavior for SRF-inside-CASE is
not going to be the same as before even if we implement the fix
I suggest above, and it's arguable that this new behavior is not
at all intuitive.

I'm not really sure which way to jump, which is why I was hoping
for some discussion here.
        regards, tom lane



Hi,


On 2017-06-02 22:53:00 -0400, Tom Lane wrote:
> I think you've got enough on your plate.  I can take care of whatever
> we decide to do here.

Thanks.


> Andres Freund <andres@anarazel.de> writes:
> >> Another possibility is to say that we've broken this situation
> >> irretrievably and we should start throwing errors for SRFs in
> >> places where they'd be conditionally evaluated.  That's not real
> >> nice perhaps, but it's better than the way things are right now.
> 
> > I'd be ok with that too, but I don't really see a strong need so far.
> 
> The argument for this way is basically that it's better to break
> apps visibly than silently.

Right, I got that.


> The behavior for SRF-inside-CASE is
> not going to be the same as before even if we implement the fix
> I suggest above, and it's arguable that this new behavior is not
> at all intuitive.

Yea, I'm not a big fan of the either the pre v10 or the v10 behaviour of
SRFs inside coalesce/case.  Neither is really resonable imo - I'm not
sure a reasonable behaviour even exists.  IIRC I'd argued in the
original SRF thread that we should just throw an error, and I think we'd
concluded that we'd not do so for now.


> I'm not really sure which way to jump, which is why I was hoping
> for some discussion here.

There not really being an intuitive behaviour seems to be a bit of a
reason to disallow. Another argument that I can see is that it'll be
easier to allow it again later, than to do the reverse.  But I think the
new behaviour can also be useful, and I suspect not that many people
will hit this...

- Andres



> On Jun 2, 2017, at 8:11 PM, Andres Freund <andres@anarazel.de> wrote:
> 
> Hi,
> 
> 
> On 2017-06-02 22:53:00 -0400, Tom Lane wrote:
>> I think you've got enough on your plate.  I can take care of whatever
>> we decide to do here.
> 
> Thanks.
> 
> 
>> Andres Freund <andres@anarazel.de> writes:
>>>> Another possibility is to say that we've broken this situation
>>>> irretrievably and we should start throwing errors for SRFs in
>>>> places where they'd be conditionally evaluated.  That's not real
>>>> nice perhaps, but it's better than the way things are right now.
>> 
>>> I'd be ok with that too, but I don't really see a strong need so far.
>> 
>> The argument for this way is basically that it's better to break
>> apps visibly than silently.
> 
> Right, I got that.
> 
> 
>> The behavior for SRF-inside-CASE is
>> not going to be the same as before even if we implement the fix
>> I suggest above, and it's arguable that this new behavior is not
>> at all intuitive.
> 
> Yea, I'm not a big fan of the either the pre v10 or the v10 behaviour of
> SRFs inside coalesce/case.  Neither is really resonable imo - I'm not
> sure a reasonable behaviour even exists.  IIRC I'd argued in the
> original SRF thread that we should just throw an error, and I think we'd
> concluded that we'd not do so for now.

I am trying to get my head around the type of query you and Tom
are discussing.  When you say you are unsure a reasonable behavior
even exists, are you saying such queries have no intuitive meaning?

Can you give an example of such a query which has no intuitive
meaning?  Perhaps I am not thinking about the right kind of queries.
I have been thinking about examples like:

SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 ENDFROM table_with_columns_x_and_y_and_z;

Which to me gives 'z' output rows per table row where y is true, and
one output row per table row where y is false.  That could be changed
with an aggregate function such as:

SELECT x, CASE WHEN y THEN SUM(generate_series(1,z)) ELSE 5 ENDFROM table_with_columns_x_and_y;

Which to me gives one output row per table row regardless of whether y
is true.


Thanks, and my apologies if I am merely lacking sufficient imagination to
think of a proper example.

Mark Dilger



> On Jun 4, 2017, at 11:55 AM, Mark Dilger <hornschnorter@gmail.com> wrote:
> 
> SELECT x, CASE WHEN y THEN SUM(generate_series(1,z)) ELSE 5 END
>     FROM table_with_columns_x_and_y;

Sorry, this table is supposed to be the same as the previous one,

table_with_columns_x_and_y_and_z




Hi Mark,

On 2017-06-04 11:55:03 -0700, Mark Dilger wrote:
> > Yea, I'm not a big fan of the either the pre v10 or the v10 behaviour of
> > SRFs inside coalesce/case.  Neither is really resonable imo - I'm not
> > sure a reasonable behaviour even exists.  IIRC I'd argued in the
> > original SRF thread that we should just throw an error, and I think we'd
> > concluded that we'd not do so for now.
> 
> I am trying to get my head around the type of query you and Tom
> are discussing.  When you say you are unsure a reasonable behavior
> even exists, are you saying such queries have no intuitive meaning?

I'm not saying that there aren't some cases where it's intuitive, but
there's definitely lots that don't have intuitive meaning.  Especially
when there are multiple SRFs in the same targetlist.

Do I understand correctly that you're essentially advocating for the <
v10 behaviour?  It'd be nontrivial to implement that, without loosing
the significant benefits (Significantly slower, higher code complexity,
weird behaviour around multiple SRFs) gained by removing the previous
implementation.   I'd really like to see some examples of when all of
this is useful - I've yet to see a realistic one that's not just as
easily written differently


> Can you give an example of such a query which has no intuitive
> meaning?  Perhaps I am not thinking about the right kind of queries.
> I have been thinking about examples like:
> 
> SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 END
>     FROM table_with_columns_x_and_y_and_z;
> 
> Which to me gives 'z' output rows per table row where y is true, and
> one output row per table row where y is false.

Try any query that has one SRF outside of the CASE, and one inside.  In
the old behaviour that'll make the total number of rows returned nearly
undeterministic because of the least-common-multiple behaviour.


> That could be changed with an aggregate function such as:
> SELECT x, CASE WHEN y THEN SUM(generate_series(1,z)) ELSE 5 END
>     FROM table_with_columns_x_and_y;

That query doesn't work.  First off, aggregates don't take set arguments
(neither in old nor new releases), secondly aggregates are evaluated
independently of CASE/COALESCE statements, thirdly you're missing group
bys.  Those all are independent of the v10 changes.


> Thanks, and my apologies if I am merely lacking sufficient imagination to
> think of a proper example.

Might be worthwhile to reread the thread about the SRF reimplementation.

https://www.postgresql.org/message-id/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de

- Andres



> On Jun 4, 2017, at 12:35 PM, Andres Freund <andres@anarazel.de> wrote:
>
> Hi Mark,
>
> On 2017-06-04 11:55:03 -0700, Mark Dilger wrote:
>>> Yea, I'm not a big fan of the either the pre v10 or the v10 behaviour of
>>> SRFs inside coalesce/case.  Neither is really resonable imo - I'm not
>>> sure a reasonable behaviour even exists.  IIRC I'd argued in the
>>> original SRF thread that we should just throw an error, and I think we'd
>>> concluded that we'd not do so for now.
>>
>> I am trying to get my head around the type of query you and Tom
>> are discussing.  When you say you are unsure a reasonable behavior
>> even exists, are you saying such queries have no intuitive meaning?
>
> I'm not saying that there aren't some cases where it's intuitive, but
> there's definitely lots that don't have intuitive meaning.  Especially
> when there are multiple SRFs in the same targetlist.
>
> Do I understand correctly that you're essentially advocating for the <
> v10 behaviour?

No, I'm not advocating either way.  I merely wanted to know which queries
you and Tom were talking about.

>  It'd be nontrivial to implement that, without loosing
> the significant benefits (Significantly slower, higher code complexity,
> weird behaviour around multiple SRFs) gained by removing the previous
> implementation.   I'd really like to see some examples of when all of
> this is useful - I've yet to see a realistic one that's not just as
> easily written differently
>
>
>> Can you give an example of such a query which has no intuitive
>> meaning?  Perhaps I am not thinking about the right kind of queries.
>> I have been thinking about examples like:
>>
>> SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 END
>>     FROM table_with_columns_x_and_y_and_z;
>>
>> Which to me gives 'z' output rows per table row where y is true, and
>> one output row per table row where y is false.
>
> Try any query that has one SRF outside of the CASE, and one inside.  In
> the old behaviour that'll make the total number of rows returned nearly
> undeterministic because of the least-common-multiple behaviour.
>
>> That could be changed with an aggregate function such as:
>> SELECT x, CASE WHEN y THEN SUM(generate_series(1,z)) ELSE 5 END
>>     FROM table_with_columns_x_and_y;
>
> That query doesn't work.  First off, aggregates don't take set arguments
> (neither in old nor new releases), secondly aggregates are evaluated
> independently of CASE/COALESCE statements, thirdly you're missing group
> bys.  Those all are independent of the v10 changes.

Sorry, I was not clear.  What I meant to get at was that if you remove from the
executor all support for SRFs inside case statements, you might foreclose the option
of extending the syntax at some later date to allow aggregates over SRFs.  I'm
not saying that this works currently, but in principle if you allowed that SUM() that
I put up there, you'd get back exactly one row from it, same as you get from the
ELSE clause.  That would seem to solve the problem without going so far as
completely disallowing the SRF altogether.  The reasons for not putting a GROUP BY
clause in the example are (a) I don't know where it would go, syntactically, and (b)
it would defeat the purpose, because once you are grouping, you again have the
possibility of getting more than one row.

I'm not advocating implementing this; I'm just speculating about possible future
features in which SRFs inside a case statement might be allowed.

>
>> Thanks, and my apologies if I am merely lacking sufficient imagination to
>> think of a proper example.
>
> Might be worthwhile to reread the thread about the SRF reimplementation.
>
> https://www.postgresql.org/message-id/20160822214023.aaxz5l4igypowyri@alap3.anarazel.de

This helps, thanks!

Mark Dilger




On 2017-06-04 14:16:14 -0700, Mark Dilger wrote:
> Sorry, I was not clear.  What I meant to get at was that if you remove from the
> executor all support for SRFs inside case statements, you might foreclose the option
> of extending the syntax at some later date to allow aggregates over
> SRFs.

Seems very unlikely that we'd ever want to do that.  The right way to do
this is to simply move the SRF into the from list.  Having the executor
support arbitrary sources of tuples would just complicate and slow down
already complicated and slow code...


> I'm
> not saying that this works currently, but in principle if you allowed that SUM() that
> I put up there, you'd get back exactly one row from it, same as you get from the
> ELSE clause.  That would seem to solve the problem without going so far as
> completely disallowing the SRF altogether.

But what would the benefit be?


Greetings,

Andres Freund



> On Jun 4, 2017, at 2:19 PM, Andres Freund <andres@anarazel.de> wrote:
>
> On 2017-06-04 14:16:14 -0700, Mark Dilger wrote:
>> Sorry, I was not clear.  What I meant to get at was that if you remove from the
>> executor all support for SRFs inside case statements, you might foreclose the option
>> of extending the syntax at some later date to allow aggregates over
>> SRFs.
>
> Seems very unlikely that we'd ever want to do that.  The right way to do
> this is to simply move the SRF into the from list.  Having the executor
> support arbitrary sources of tuples would just complicate and slow down
> already complicated and slow code...
>
>
>> I'm
>> not saying that this works currently, but in principle if you allowed that SUM() that
>> I put up there, you'd get back exactly one row from it, same as you get from the
>> ELSE clause.  That would seem to solve the problem without going so far as
>> completely disallowing the SRF altogether.
>
> But what would the benefit be?

In my example, the aggregate function is taking a column from the table as
an argument, so the output of the aggregate function needs to be computed per row,
not just once.  And if the function is expensive, or has side-effects, you might
only want it to execute for those rows where the CASE statement is true, rather
than for all of them.  You may get that same behavior using lateral or some such,
I'm uncertain, but in a complicated CASE statement, it be more straightforward
to write something like:

SELECTCASE    WHEN t.x = 'foo' THEN expensive_aggfunc1(srf1(t.y,t.z))    WHEN t.x = 'bar' THEN
expensive_aggfunc2(srf2(t.y,t.z))   WHEN t.x = 'baz' THEN expensive_aggfunc3(srf3(t.y,t.z))    ....    WHEN t.x = 'zzz'
THENexpensive_aggfuncN(srfN(t.y,t.z))    ELSE 5END 
FROM mytable t;

Than to try to write it any other way.


I'm not advocating anything here, even though it may sound that way to you.
I'm just thinking this thing through, given that you may be committing a removal
of functionality that we want back at some later time.

Out of curiosity, how would you rewrite what I have above such that the
aggregate function is not inside the case statement, and the expensive_aggfuncs
are only called for those (t.y,t.z) that are actually appropriate?


Mark Dilger


Mark Dilger <hornschnorter@gmail.com> writes:
>> On Jun 4, 2017, at 2:19 PM, Andres Freund <andres@anarazel.de> wrote:
>> Seems very unlikely that we'd ever want to do that.  The right way to do
>> this is to simply move the SRF into the from list.  Having the executor
>> support arbitrary sources of tuples would just complicate and slow down
>> already complicated and slow code...

> In my example, the aggregate function is taking a column from the table as
> an argument, so the output of the aggregate function needs to be computed per row,
> not just once.  And if the function is expensive, or has side-effects, you might
> only want it to execute for those rows where the CASE statement is true, rather
> than for all of them.  You may get that same behavior using lateral or some such,
> I'm uncertain, but in a complicated CASE statement, it be more straightforward
> to write something like:

> SELECT
>     CASE
>         WHEN t.x = 'foo' THEN expensive_aggfunc1(srf1(t.y,t.z))
>         WHEN t.x = 'bar' THEN expensive_aggfunc2(srf2(t.y,t.z))
>         WHEN t.x = 'baz' THEN expensive_aggfunc3(srf3(t.y,t.z))
>         ....
>         WHEN t.x = 'zzz' THEN expensive_aggfuncN(srfN(t.y,t.z))
>         ELSE 5
>     END
> FROM mytable t;

I think the correct way to do that already exists, namely to use a
correlated sub-select to wrap each SRF+aggregate:
   ...   WHEN t.x = 'foo' THEN (SELECT expensive_aggfunc1(s) FROM srf1(t.y,t.z) s)   ...

I don't really feel a need to invent some other notation for that.

After chewing on this for awhile, I'm starting to come to the conclusion
that we'd be best off to throw an error for SRF-inside-CASE (or
COALESCE).  Mark is correct that the simplest case of
SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 ENDFROM table_with_columns_x_and_y_and_z;

behaves just intuitively enough that people might be using it.  The new
implementation method cannot reasonably duplicate the old semantics for
that, which means that if we let it stand as-is we will be silently
breaking queries, even if we fix up some of the weirder corner cases like
what happens when the CASE can be const-simplified.  So I think we'd be
better off to make this throw an error, and force any affected users to
rewrite in a way that will work in both v10 and older releases.

As to *how* to throw an error, I think it should be possible to teach
parse analysis to detect such cases, with something like the
ParseExprKind mechanism that could be checked to see if we're inside
a subexpression that restricts what's allowed.  There are some other
checks like no-nested-aggregates that perhaps could be folded in as
well.  Checking at parse analysis ought to be sufficient because
rule rewriting could not introduce such a case where it wasn't before,
and planner subquery flattening won't introduce one either because we
don't flatten subqueries with SRFs in their tlists.

If people are on board with throwing an error, I'll go see about
writing a patch.
        regards, tom lane



On Wed, Jun 7, 2017 at 11:57 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
If people are on board with throwing an error, I'll go see about
writing a patch.

+1 from me.

David J.​
> After chewing on this for awhile, I'm starting to come to the conclusion
that we'd be best off to throw an error for SRF-inside-CASE (or COALESCE).
Mark is correct that the simplest case of

>    SELECT x, CASE WHEN y THEN generate_series(1,z) ELSE 5 END
>    FROM table_with_columns_x_and_y_and_z;

> behaves just intuitively enough that people might be using it.  The new
implementation method cannot reasonably duplicate the old semantics for
that, which means that if we let it stand as-is we will be 

> silently breaking queries, even if we fix up some of the weirder corner
cases like what happens when the CASE can be const-simplified.  So I think
we'd be better off to make this throw an error, and force any 
> affected users to rewrite in a way that will work in both v10 and older
releases.

> As to *how* to throw an error, I think it should be possible to teach
parse analysis to detect such cases, with something like the ParseExprKind
mechanism that could be checked to see if we're inside a 
> subexpression that restricts what's allowed.  There are some other checks
like no-nested-aggregates that perhaps could be folded in as well.  Checking
at parse analysis ought to be sufficient because 
> rule rewriting could not introduce such a case where it wasn't before, and
planner subquery flattening won't introduce one either because we don't
flatten subqueries with SRFs in their tlists.

> If people are on board with throwing an error, I'll go see about writing a
patch.

>            regards, tom lane

+1

I'm not a fan of either solution, but I think what Tom proposes of throwing
an error sounds like least invasive and confusing.

I'd much prefer an error thrown than silent behavior change. Given that we
ran into this in 3 places in PostGIS code, I'm not convinced the issue is
all that rare.

Make sure to point out the breaking change in the release notes though and
syntax to remedy it.

Thanks,
Regina 




"Regina Obe" <lr@pcorp.us> writes:
> I'm not a fan of either solution, but I think what Tom proposes of throwing
> an error sounds like least invasive and confusing.

> I'd much prefer an error thrown than silent behavior change. Given that we
> ran into this in 3 places in PostGIS code, I'm not convinced the issue is
> all that rare.

> Make sure to point out the breaking change in the release notes though and
> syntax to remedy it.

As far as that goes, the best fix I could think of after a few minutes is
to integrate your conditional logic into a custom set-returning function.
For example,
   select x, case when y > 0 then generate_series(1, z) else 5 end from tt;

could become
   create function mysrf(cond bool, start int, fin int, els int)     returns setof int as $$   begin     if cond then
   return query select generate_series(start, fin);     else       return query select els;     end if;   end$$
languageplpgsql;
 
   select x, mysrf(y > 0, 1, z, 5) from tt;

(adjust the amount of parameterization to taste, of course)

Now, the fact that a fairly mechanical conversion like this is possible
suggests that we *could* solve the problem if we had to, at least for
simple cases like this one.  But it'd be a lot of work, not least because
we'd presumably not want core-defined syntax to depend on an extension
like plpgsql --- and I don't see a way to do this with straight SQL
functions.  So my feeling is that we should not expend that effort.
If it turns out that a lot more people are affected than I currently
think will be the case, maybe we'll have to revisit that choice.

But this line of thinking does strengthen my feeling that throwing an
error is the right thing to do for the moment.  If we allow v10 to accept
such cases but do something different from what we used to, that will
greatly complicate any future attempt to try to restore the old behavior.
        regards, tom lane



> But this line of thinking does strengthen my feeling that throwing an
error is the right thing to do for the moment.  If we allow v10 to accept
such cases but do something different from what we used to, that 
> will greatly complicate any future attempt to try to restore the old
behavior.

>            regards, tom lane

Agreed.  The other side benefit of throwing an error instead of just doing
something different is you'll find out how rampant the old behavior is :).

People are more likely to know to complain when their apps break than they
are if it just silently starts doing something different.

My main concern in these cases is the short-circuiting not happening.
Because in these cases, the code goes into areas that it shouldn't which is
likely to mess up some logic in hard to troubleshoot ways.
I think erroring out is the best compromise.

Thanks,
Regina




On 2017-06-08 11:57:49 -0400, Regina Obe wrote:
> My main concern in these cases is the short-circuiting not happening.

Note there's also no short-circuiting e.g. for aggregates inside case
either.

- Andres



"'Andres Freund'" <andres@anarazel.de> writes:
> On 2017-06-08 11:57:49 -0400, Regina Obe wrote:
>> My main concern in these cases is the short-circuiting not happening.

> Note there's also no short-circuiting e.g. for aggregates inside case
> either.

Well, depends.  If const-folding manages to get rid of the aggregate
call altogether, it won't be computed.
        regards, tom lane



I wrote:
> As to *how* to throw an error, I think it should be possible to teach
> parse analysis to detect such cases, with something like the
> ParseExprKind mechanism that could be checked to see if we're inside
> a subexpression that restricts what's allowed.  There are some other
> checks like no-nested-aggregates that perhaps could be folded in as
> well.

I spent some time experimenting with this, and immediately found out
that information_schema.user_mapping_options contains an instance of the
problematic usage :-(.  However, that view also turns out to be a poster
child for why our old SRF semantics are a bad idea, because it's on the
hairy edge of failing completely.  It's got two SRF invocations in its
tlist, which it relies on to execute in lockstep and produce the same
number of rows --- but then it puts a CASE around one of them, with an
ELSE NULL.  So in old versions, that only works because if the CASE
doesn't return a set result at runtime then we just run the tlist the
number of times suggested by the other SRF.  And in HEAD, it only works
because we run the two SRFs in lockstep behind the scenes and then the
CASE is discarding individual results not the whole execution of the
second SRF.  If you don't have a headache at this point, re-read the above
until you do.  Or if you fully understand that and still think it's fine,
consider what will happen if the CASE's test expression generates
different results from one call to the next --- which I think is actually
possible here, because pg_has_role() operates on CatalogSnapshot time and
might possibly change its mind partway through the query.  Pre-v10, that
would have generated completely messed-up output, with a hard-to-predict
number of rows and unexpected combinations of the two SRFs' outputs.

Rewriting this view to use a LATERAL SRF call is just enormously cleaner.

Anyway, to the subject at hand.  My thought when I wrote the previous
message was to implement a "top down" mechanism whereby contexts like
CASE and COALESCE would record their presence in the ParseState while
recursively analyzing their arguments, and then SRF calls would be
responsible for throwing an error by inspecting that context.  The first
attached patch does it that way, and it seems nice and clean, but I ran
into a complete dead end while trying to extend it to handle related cases
such as disallowing SRF-inside-aggregate or nested SRFs in FROM.  The
trouble is that when we are parsing the arguments of a SRF or aggregate,
we don't actually know yet whether it's a SRF or aggregate or plain
function.  We can't find that out until we look up the pg_proc entry,
which we can't do without knowing the argument types, so we have to do
parse analysis of the arguments first.

I then considered a "bottom up" approach where the idea is for each SRF
to report its presence in the ParseState, and then the outer construct
complains if any SRF reported its presence within the outer construct's
arguments.  This isn't hard to implement, just keep a "p_last_srf" pointer
in the ParseState, as in the second attached patch.  This feels more ugly
than the first approach; for one thing, if we want to extend it to
multiple cases of "X cannot contain a Y" then we need a ParseState field
for each kind of Y we care about.  Also, it will tend to complain about
the last Y within a given X construct, not the first one; but only a true
geek would ever even notice that, let alone feel that it's strange.

I didn't actually fix the "bottom up" approach to complain about nested
SRFs.  It's clear to me how to do so, but because we analyze function
arguments before calling ParseFuncOrColumn, we'd have to have the callers
of that function remember the appropriate p_last_srf value (ie, from just
before they analyze the arguments) and then pass it to ParseFuncOrColumn.
That would add a bunch of uninteresting clutter to the patch so I didn't
do it here.

I'm inclined to go with the "bottom up" approach, but I wonder if anyone
has any comments or better ideas?

            regards, tom lane

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index cbcd6cf..98bcfa0 100644
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
*************** CREATE VIEW user_mapping_options AS
*** 2936,2947 ****
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN
(pg_options_to_table(um.umoptions)).option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

--- 2936,2949 ----
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST(opts.option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user)
!                      THEN opts.option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um,
!          pg_options_to_table(um.umoptions) opts;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 27dd49d..889338f 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformRangeSubselect(ParseState *psta
*** 485,490 ****
--- 485,491 ----
       */
      Assert(pstate->p_expr_kind == EXPR_KIND_NONE);
      pstate->p_expr_kind = EXPR_KIND_FROM_SUBSELECT;
+     Assert(pstate->p_subexpr_kind == NIL);

      /*
       * If the subselect is LATERAL, make lateral_only names of this level
*************** transformRangeSubselect(ParseState *psta
*** 504,509 ****
--- 505,511 ----
      /* Restore state */
      pstate->p_lateral_active = false;
      pstate->p_expr_kind = EXPR_KIND_NONE;
+     Assert(pstate->p_subexpr_kind == NIL);

      /*
       * Check that we got a SELECT.  Anything else should be impossible given
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 92101c9..bbefb99 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** transformCaseExpr(ParseState *pstate, Ca
*** 1630,1635 ****
--- 1630,1638 ----

      newc = makeNode(CaseExpr);

+     /* show that subexpressions are within CASE */
+     PushSubExprKind(pstate, SUBEXPR_KIND_CASE);
+
      /* transform the test expression, if any */
      arg = transformExprRecurse(pstate, (Node *) c->arg);

*************** transformCaseExpr(ParseState *pstate, Ca
*** 1741,1746 ****
--- 1744,1751 ----
                                    "CASE/WHEN");
      }

+     PopSubExprKind(pstate, SUBEXPR_KIND_CASE);
+
      newc->location = c->location;

      return (Node *) newc;
*************** transformCoalesceExpr(ParseState *pstate
*** 2181,2186 ****
--- 2186,2194 ----
      List       *newcoercedargs = NIL;
      ListCell   *args;

+     /* show that subexpressions are within COALESCE */
+     PushSubExprKind(pstate, SUBEXPR_KIND_COALESCE);
+
      foreach(args, c->args)
      {
          Node       *e = (Node *) lfirst(args);
*************** transformCoalesceExpr(ParseState *pstate
*** 2205,2212 ****
--- 2213,2223 ----
          newcoercedargs = lappend(newcoercedargs, newe);
      }

+     PopSubExprKind(pstate, SUBEXPR_KIND_COALESCE);
+
      newc->args = newcoercedargs;
      newc->location = c->location;
+
      return (Node *) newc;
  }

diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 55853c2..5dd65d0 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*************** check_srf_call_placement(ParseState *pst
*** 2089,2094 ****
--- 2089,2095 ----
  {
      const char *err;
      bool        errkind;
+     ParseSubExprKind subexprkind;

      /*
       * Check to see if the set-returning function is in an invalid place
*************** check_srf_call_placement(ParseState *pst
*** 2225,2228 ****
--- 2226,2252 ----
                   errmsg("set-returning functions are not allowed in %s",
                          ParseExprKindName(pstate->p_expr_kind)),
                   parser_errposition(pstate, location)));
+
+     /*
+      * Okay, the overall expression type allows a SRF, but what about
+      * subexpressions?    (We don't currently have enough variants of this to
+      * justify a lot of work to merge code; but we do avoid generating extra
+      * translatable strings.)
+      */
+     subexprkind = inSubExprOfKind(pstate,
+                                   SUBEXPR_KIND_CASE | SUBEXPR_KIND_COALESCE);
+     if (subexprkind == SUBEXPR_KIND_CASE)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "CASE"),
+                  parser_errposition(pstate, location)));
+     else if (subexprkind == SUBEXPR_KIND_COALESCE)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "COALESCE"),
+                  parser_errposition(pstate, location)));
  }
diff --git a/src/backend/parser/parse_node.c b/src/backend/parser/parse_node.c
index fb3d117..58ca0a8 100644
*** a/src/backend/parser/parse_node.c
--- b/src/backend/parser/parse_node.c
*************** pcb_error_callback(void *arg)
*** 182,187 ****
--- 182,211 ----


  /*
+  * inSubExprOfKind
+  *        Detect whether we're parsing a subexpression of specified kind(s)
+  *
+  * sexprkinds is an OR'd mask of one or more ParseSubExprKind values.
+  * If any of these appear in the p_subexpr_kind stack, returns the most
+  * closely nested such kind.  If none of them do, returns SUBEXPR_KIND_NONE.
+  */
+ ParseSubExprKind
+ inSubExprOfKind(ParseState *pstate, int sexprkinds)
+ {
+     ListCell   *lc;
+
+     foreach(lc, pstate->p_subexpr_kind)
+     {
+         ParseSubExprKind skind = (ParseSubExprKind) lfirst_int(lc);
+
+         if (skind & sexprkinds)
+             return skind;
+     }
+     return SUBEXPR_KIND_NONE;
+ }
+
+
+ /*
   * make_var
   *        Build a Var node for an attribute identified by RTE and attrno
   */
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0b54840..3651be8 100644
*** a/src/include/parser/parse_node.h
--- b/src/include/parser/parse_node.h
*************** typedef enum ParseExprKind
*** 70,75 ****
--- 70,93 ----
      EXPR_KIND_PARTITION_EXPRESSION        /* PARTITION BY expression */
  } ParseExprKind;

+ /*
+  * While ParseExprKind identifies an expression's position within a SQL query,
+  * and thus can't be nested within a given query level, we also have interest
+  * in recognizing certain nestable subexpression contexts.  These are handled
+  * by creating a List of ParseSubExprKind values, with the front of the list
+  * corresponding to the most closely nested context. The list (p_subexpr_kind)
+  * is empty when we start parsing an expression.
+  *
+  * The enum values are assigned so that we can OR them together to form a
+  * mask identifying several subexpr kinds.
+  */
+ typedef enum ParseSubExprKind
+ {
+     SUBEXPR_KIND_NONE = 0,        /* not any subexpr kind */
+     SUBEXPR_KIND_CASE = 0x0001, /* subexpressions of a CASE */
+     SUBEXPR_KIND_COALESCE = 0x0002        /* COALESCE() arguments */
+ } ParseSubExprKind;
+

  /*
   * Function signatures for parser hooks
*************** typedef Node *(*CoerceParamHook) (ParseS
*** 142,147 ****
--- 160,168 ----
   * p_expr_kind: kind of expression we're currently parsing, as per enum above;
   * EXPR_KIND_NONE when not in an expression.
   *
+  * p_subexpr_kind: integer List of identifiers of nested subexpression types,
+  * as per enum above; NIL when not in a subexpression of interest.
+  *
   * p_next_resno: next TargetEntry.resno to assign, starting from 1.
   *
   * p_multiassign_exprs: partially-processed MultiAssignRef source expressions.
*************** struct ParseState
*** 181,186 ****
--- 202,208 ----
      bool        p_is_insert;    /* process assignment like INSERT not UPDATE */
      List       *p_windowdefs;    /* raw representations of window clauses */
      ParseExprKind p_expr_kind;    /* what kind of expression we're parsing */
+     List       *p_subexpr_kind; /* kind(s) of subexpressions we're parsing */
      int            p_next_resno;    /* next targetlist resno to assign */
      List       *p_multiassign_exprs;    /* junk tlist entries for multiassign */
      List       *p_locking_clause;        /* raw FOR UPDATE/FOR SHARE info */
*************** typedef struct ParseCallbackState
*** 254,259 ****
--- 276,288 ----
      ErrorContextCallback errcallback;
  } ParseCallbackState;

+ /* Macros for manipulating p_subexpr_kind list */
+ #define PushSubExprKind(pstate, sexprkind) \
+     ((pstate)->p_subexpr_kind = lcons_int(sexprkind, (pstate)->p_subexpr_kind))
+ #define PopSubExprKind(pstate, sexprkind) \
+     (AssertMacro(linitial_int((pstate)->p_subexpr_kind) == (sexprkind)), \
+      (pstate)->p_subexpr_kind = list_delete_first((pstate)->p_subexpr_kind))
+

  extern ParseState *make_parsestate(ParseState *parentParseState);
  extern void free_parsestate(ParseState *pstate);
*************** extern void setup_parser_errposition_cal
*** 263,268 ****
--- 292,299 ----
                                    ParseState *pstate, int location);
  extern void cancel_parser_errposition_callback(ParseCallbackState *pcbstate);

+ extern ParseSubExprKind inSubExprOfKind(ParseState *pstate, int sexprkinds);
+
  extern Var *make_var(ParseState *pstate, RangeTblEntry *rte, int attrno,
           int location);
  extern Oid    transformArrayType(Oid *arrayType, int32 *arrayTypmod);
diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out
index 4b6170d..5c82614 100644
*** a/src/test/regress/expected/rangefuncs.out
--- b/src/test/regress/expected/rangefuncs.out
*************** select * from foobar();  -- fail
*** 1969,1986 ****
  ERROR:  function return row and query-specified return row do not match
  DETAIL:  Returned row contains 3 attributes, but query expects 2.
  drop function foobar();
- -- check behavior when a function's input sometimes returns a set (bug #8228)
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-  id |      str      | lower
- ----+---------------+-------
-   2 | 0000000049404 | 49404
- (1 row)
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)
  create function extractq2(t int8_tbl) returns int8 as $$
    select t.q2
--- 1969,1974 ----
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index c8ae361..ca242f7 100644
*** a/src/test/regress/expected/tsrf.out
--- b/src/test/regress/expected/tsrf.out
*************** SELECT generate_series(1, generate_serie
*** 41,46 ****
--- 41,51 ----
                 3
  (6 rows)

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+ ERROR:  set-valued function called in context that cannot accept a set
+ LINE 1: SELECT * FROM generate_series(1, generate_series(1, 3));
+                                          ^
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));
   generate_series
*************** SELECT few.dataa, count(*) FROM few WHER
*** 190,195 ****
--- 195,209 ----
   a     |     4
  (2 rows)

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in CASE
+ LINE 1: SELECT q1, case when q1 > 0 then generate_series(1,3) else 0...
+                                          ^
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in COALESCE
+ LINE 1: SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+                             ^
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
  ERROR:  set-valued function called in context that cannot accept a set
diff --git a/src/test/regress/sql/rangefuncs.sql b/src/test/regress/sql/rangefuncs.sql
index 4ed84b1..442d397 100644
*** a/src/test/regress/sql/rangefuncs.sql
--- b/src/test/regress/sql/rangefuncs.sql
*************** select * from foobar();  -- fail
*** 600,614 ****

  drop function foobar();

- -- check behavior when a function's input sometimes returns a set (bug #8228)
-
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)

  create function extractq2(t int8_tbl) returns int8 as $$
--- 600,605 ----
diff --git a/src/test/regress/sql/tsrf.sql b/src/test/regress/sql/tsrf.sql
index 417e78c..45de099 100644
*** a/src/test/regress/sql/tsrf.sql
--- b/src/test/regress/sql/tsrf.sql
*************** SELECT generate_series(1, 2), generate_s
*** 14,19 ****
--- 14,22 ----
  -- srf, with SRF argument
  SELECT generate_series(1, generate_series(1, 3));

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));

*************** SELECT dataa, generate_series(1,1), coun
*** 51,56 ****
--- 54,63 ----
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2;
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2;

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index cbcd6cf..98bcfa0 100644
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
*************** CREATE VIEW user_mapping_options AS
*** 2936,2947 ****
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN
(pg_options_to_table(um.umoptions)).option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

--- 2936,2949 ----
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST(opts.option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user)
!                      THEN opts.option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um,
!          pg_options_to_table(um.umoptions) opts;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index efe1c37..5241fd2 100644
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
*************** check_agg_arguments_walker(Node *node,
*** 705,710 ****
--- 705,717 ----
          }
          /* Continue and descend into subtree */
      }
+     /* We can throw error on sight for a set-returning function */
+     if ((IsA(node, FuncExpr) &&((FuncExpr *) node)->funcretset) ||
+         (IsA(node, OpExpr) &&((OpExpr *) node)->opretset))
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("aggregate function calls cannot contain set-returning function calls"),
+                  parser_errposition(context->pstate, exprLocation(node))));
      /* We can throw error on sight for a window function */
      if (IsA(node, WindowFunc))
          ereport(ERROR,
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 92101c9..102da58 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** transformMultiAssignRef(ParseState *psta
*** 1619,1625 ****
  static Node *
  transformCaseExpr(ParseState *pstate, CaseExpr *c)
  {
!     CaseExpr   *newc;
      Node       *arg;
      CaseTestExpr *placeholder;
      List       *newargs;
--- 1619,1626 ----
  static Node *
  transformCaseExpr(ParseState *pstate, CaseExpr *c)
  {
!     CaseExpr   *newc = makeNode(CaseExpr);
!     Node       *last_srf = pstate->p_last_srf;
      Node       *arg;
      CaseTestExpr *placeholder;
      List       *newargs;
*************** transformCaseExpr(ParseState *pstate, Ca
*** 1628,1635 ****
      Node       *defresult;
      Oid            ptype;

-     newc = makeNode(CaseExpr);
-
      /* transform the test expression, if any */
      arg = transformExprRecurse(pstate, (Node *) c->arg);

--- 1629,1634 ----
*************** transformCaseExpr(ParseState *pstate, Ca
*** 1741,1746 ****
--- 1740,1755 ----
                                    "CASE/WHEN");
      }

+     /* if any subexpression contained a SRF, complain */
+     if (pstate->p_last_srf != last_srf)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "CASE"),
+                  parser_errposition(pstate,
+                                     exprLocation(pstate->p_last_srf))));
+
      newc->location = c->location;

      return (Node *) newc;
*************** static Node *
*** 2177,2182 ****
--- 2186,2192 ----
  transformCoalesceExpr(ParseState *pstate, CoalesceExpr *c)
  {
      CoalesceExpr *newc = makeNode(CoalesceExpr);
+     Node       *last_srf = pstate->p_last_srf;
      List       *newargs = NIL;
      List       *newcoercedargs = NIL;
      ListCell   *args;
*************** transformCoalesceExpr(ParseState *pstate
*** 2205,2210 ****
--- 2215,2230 ----
          newcoercedargs = lappend(newcoercedargs, newe);
      }

+     /* if any subexpression contained a SRF, complain */
+     if (pstate->p_last_srf != last_srf)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "COALESCE"),
+                  parser_errposition(pstate,
+                                     exprLocation(pstate->p_last_srf))));
+
      newc->args = newcoercedargs;
      newc->location = c->location;
      return (Node *) newc;
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 55853c2..9c761c4 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*************** ParseFuncOrColumn(ParseState *pstate, Li
*** 771,776 ****
--- 771,780 ----
          retval = (Node *) wfunc;
      }

+     /* if it returns a set, remember it for error checks at higher levels */
+     if (retset)
+         pstate->p_last_srf = retval;
+
      return retval;
  }

diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index e40b10d..b2f9a9c 100644
*** a/src/backend/parser/parse_oper.c
--- b/src/backend/parser/parse_oper.c
*************** make_op(ParseState *pstate, List *opname
*** 843,849 ****
--- 843,853 ----

      /* if it returns a set, check that's OK */
      if (result->opretset)
+     {
          check_srf_call_placement(pstate, location);
+         /* ... and remember it for error checks at higher levels */
+         pstate->p_last_srf = (Node *) result;
+     }

      ReleaseSysCache(tup);

diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0b54840..6a3507f 100644
*** a/src/include/parser/parse_node.h
--- b/src/include/parser/parse_node.h
*************** typedef Node *(*CoerceParamHook) (ParseS
*** 157,162 ****
--- 157,165 ----
   * p_hasAggs, p_hasWindowFuncs, etc: true if we've found any of the indicated
   * constructs in the query.
   *
+  * p_last_srf: the set-returning FuncExpr or OpExpr most recently found in
+  * the query, or NULL if none.
+  *
   * p_pre_columnref_hook, etc: optional parser hook functions for modifying the
   * interpretation of ColumnRefs and ParamRefs.
   *
*************** struct ParseState
*** 199,204 ****
--- 202,209 ----
      bool        p_hasSubLinks;
      bool        p_hasModifyingCTE;

+     Node       *p_last_srf;        /* most recent set-returning func/op found */
+
      /*
       * Optional hook functions for parser callbacks.  These are null unless
       * set up by the caller of make_parsestate.
diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out
index 4b6170d..5c82614 100644
*** a/src/test/regress/expected/rangefuncs.out
--- b/src/test/regress/expected/rangefuncs.out
*************** select * from foobar();  -- fail
*** 1969,1986 ****
  ERROR:  function return row and query-specified return row do not match
  DETAIL:  Returned row contains 3 attributes, but query expects 2.
  drop function foobar();
- -- check behavior when a function's input sometimes returns a set (bug #8228)
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-  id |      str      | lower
- ----+---------------+-------
-   2 | 0000000049404 | 49404
- (1 row)
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)
  create function extractq2(t int8_tbl) returns int8 as $$
    select t.q2
--- 1969,1974 ----
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index c8ae361..6e91ede 100644
*** a/src/test/regress/expected/tsrf.out
--- b/src/test/regress/expected/tsrf.out
*************** SELECT generate_series(1, generate_serie
*** 41,46 ****
--- 41,51 ----
                 3
  (6 rows)

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+ ERROR:  set-valued function called in context that cannot accept a set
+ LINE 1: SELECT * FROM generate_series(1, generate_series(1, 3));
+                                          ^
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));
   generate_series
*************** SELECT few.dataa, count(*) FROM few WHER
*** 190,198 ****
   a     |     4
  (2 rows)

  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
! ERROR:  set-valued function called in context that cannot accept a set
  LINE 1: SELECT min(generate_series(1, 3)) FROM few;
                     ^
  -- SRFs are not allowed in window function arguments, either
--- 195,212 ----
   a     |     4
  (2 rows)

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in CASE
+ LINE 1: SELECT q1, case when q1 > 0 then generate_series(1,3) else 0...
+                                          ^
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in COALESCE
+ LINE 1: SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+                             ^
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
! ERROR:  aggregate function calls cannot contain set-returning function calls
  LINE 1: SELECT min(generate_series(1, 3)) FROM few;
                     ^
  -- SRFs are not allowed in window function arguments, either
diff --git a/src/test/regress/sql/rangefuncs.sql b/src/test/regress/sql/rangefuncs.sql
index 4ed84b1..442d397 100644
*** a/src/test/regress/sql/rangefuncs.sql
--- b/src/test/regress/sql/rangefuncs.sql
*************** select * from foobar();  -- fail
*** 600,614 ****

  drop function foobar();

- -- check behavior when a function's input sometimes returns a set (bug #8228)
-
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)

  create function extractq2(t int8_tbl) returns int8 as $$
--- 600,605 ----
diff --git a/src/test/regress/sql/tsrf.sql b/src/test/regress/sql/tsrf.sql
index 417e78c..45de099 100644
*** a/src/test/regress/sql/tsrf.sql
--- b/src/test/regress/sql/tsrf.sql
*************** SELECT generate_series(1, 2), generate_s
*** 14,19 ****
--- 14,22 ----
  -- srf, with SRF argument
  SELECT generate_series(1, generate_series(1, 3));

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));

*************** SELECT dataa, generate_series(1,1), coun
*** 51,56 ****
--- 54,63 ----
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2;
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2;

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Hi,

On 2017-06-08 23:05:53 -0400, Tom Lane wrote:
> I spent some time experimenting with this, and immediately found out
> that information_schema.user_mapping_options contains an instance of the
> problematic usage :-(.  However, that view also turns out to be a poster
> child for why our old SRF semantics are a bad idea, because it's on the
> hairy edge of failing completely. [...]

Yuck.


> Anyway, to the subject at hand.  My thought when I wrote the previous
> message was to implement a "top down" mechanism whereby contexts like
> CASE and COALESCE would record their presence in the ParseState while
> recursively analyzing their arguments, and then SRF calls would be
> responsible for throwing an error by inspecting that context.  The first
> attached patch does it that way, and it seems nice and clean, but I ran
> into a complete dead end while trying to extend it to handle related cases
> such as disallowing SRF-inside-aggregate or nested SRFs in FROM.

But, do we really need to handle those?  IOW, isn't handling
CASE/COALESCE, which we can recognize early in parse analysis,
sufficient to error out here?  It'd be a lot nicer if we could error
about SRF arguments to aggregates et al. during analysis rather than
execution, but there's not a comparably huge need to improve there.


> I'm inclined to go with the "bottom up" approach, but I wonder if anyone
> has any comments or better ideas?

I'll have a look at the patches tomorrow, but I'm too tired to
meaningfully read code tonight.

- Andres



Andres Freund <andres@anarazel.de> writes:
> On 2017-06-08 23:05:53 -0400, Tom Lane wrote:
>> ... The first
>> attached patch does it that way, and it seems nice and clean, but I ran
>> into a complete dead end while trying to extend it to handle related cases
>> such as disallowing SRF-inside-aggregate or nested SRFs in FROM.

> But, do we really need to handle those?  IOW, isn't handling
> CASE/COALESCE, which we can recognize early in parse analysis,
> sufficient to error out here?  It'd be a lot nicer if we could error
> about SRF arguments to aggregates et al. during analysis rather than
> execution, but there's not a comparably huge need to improve there.

Yes, we already have guards for those cases, but they return fairly opaque
error messages to the tune of "set-valued function called in context that
cannot accept a set", because the executor hasn't enough context to do
better.  I'd like the messages to be more specific, like "set-valued
function cannot appear within CASE" and so on.  Anyway the point here is
to evaluate different approaches to changing the parser on the basis of
whether they *could* be extended to handle related cases.  Whether we
actually do that right now is less important.
        regards, tom lane



I wrote:
> Yes, we already have guards for those cases, but they return fairly opaque
> error messages to the tune of "set-valued function called in context that
> cannot accept a set", because the executor hasn't enough context to do
> better.  I'd like the messages to be more specific, like "set-valued
> function cannot appear within CASE" and so on.

Here's an expanded version of the "bottom up" patch that adjusts some
parser APIs to allow these additional messages to be thrown.  This changes
all occurrences of "set-valued function called in context that cannot
accept a set" in the core regression tests into something more specific.
There are probably some remaining cases that this doesn't cover, but the
existing execution-time checks will still catch those.

In passing, I added explicit checks that the operator doesn't return
set to transformAExprNullIf and make_distinct_op.  We used to be a bit
laissez-faire about that, expecting that the executor would throw
errors for such cases.  But now there are hard-wired assumptions that
only FuncExpr and OpExpr nodes could return sets, so we've got to make
sure that NullIfExpr and DistinctExpr nodes don't.

(In the other direction, I suspect that expression_returns_set() is now
too optimistic about believing that it needn't descend into the arguments
of nodes whose execution code used to reject SRF results.  But that's a
matter for a separate patch.)

            regards, tom lane

diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index cbcd6cf..98bcfa0 100644
*** a/src/backend/catalog/information_schema.sql
--- b/src/backend/catalog/information_schema.sql
*************** CREATE VIEW user_mapping_options AS
*** 2936,2947 ****
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN
(pg_options_to_table(um.umoptions)).option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

--- 2936,2949 ----
      SELECT authorization_identifier,
             foreign_server_catalog,
             foreign_server_name,
!            CAST(opts.option_name AS sql_identifier) AS option_name,
             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
!                        OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user)
!                      THEN opts.option_value
                       ELSE NULL END AS character_data) AS option_value
!     FROM _pg_user_mappings um,
!          pg_options_to_table(um.umoptions) opts;

  GRANT SELECT ON user_mapping_options TO PUBLIC;

diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index a35ba32..89aea2f 100644
*** a/src/backend/executor/functions.c
--- b/src/backend/executor/functions.c
*************** sql_fn_post_column_ref(ParseState *pstat
*** 388,393 ****
--- 388,394 ----
          param = ParseFuncOrColumn(pstate,
                                    list_make1(subfield),
                                    list_make1(param),
+                                   pstate->p_last_srf,
                                    NULL,
                                    cref->location);
      }
diff --git a/src/backend/parser/parse_agg.c b/src/backend/parser/parse_agg.c
index efe1c37..5241fd2 100644
*** a/src/backend/parser/parse_agg.c
--- b/src/backend/parser/parse_agg.c
*************** check_agg_arguments_walker(Node *node,
*** 705,710 ****
--- 705,717 ----
          }
          /* Continue and descend into subtree */
      }
+     /* We can throw error on sight for a set-returning function */
+     if ((IsA(node, FuncExpr) &&((FuncExpr *) node)->funcretset) ||
+         (IsA(node, OpExpr) &&((OpExpr *) node)->opretset))
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                  errmsg("aggregate function calls cannot contain set-returning function calls"),
+                  parser_errposition(context->pstate, exprLocation(node))));
      /* We can throw error on sight for a window function */
      if (IsA(node, WindowFunc))
          ereport(ERROR,
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 27dd49d..3d5b208 100644
*** a/src/backend/parser/parse_clause.c
--- b/src/backend/parser/parse_clause.c
*************** transformRangeFunction(ParseState *pstat
*** 572,577 ****
--- 572,579 ----
          List       *pair = (List *) lfirst(lc);
          Node       *fexpr;
          List       *coldeflist;
+         Node       *newfexpr;
+         Node       *last_srf;

          /* Disassemble the function-call/column-def-list pairs */
          Assert(list_length(pair) == 2);
*************** transformRangeFunction(ParseState *pstat
*** 618,630 ****
                      Node       *arg = (Node *) lfirst(lc);
                      FuncCall   *newfc;

                      newfc = makeFuncCall(SystemFuncName("unnest"),
                                           list_make1(arg),
                                           fc->location);

!                     funcexprs = lappend(funcexprs,
!                                         transformExpr(pstate, (Node *) newfc,
!                                                    EXPR_KIND_FROM_FUNCTION));

                      funcnames = lappend(funcnames,
                                          FigureColname((Node *) newfc));
--- 620,644 ----
                      Node       *arg = (Node *) lfirst(lc);
                      FuncCall   *newfc;

+                     last_srf = pstate->p_last_srf;
+
                      newfc = makeFuncCall(SystemFuncName("unnest"),
                                           list_make1(arg),
                                           fc->location);

!                     newfexpr = transformExpr(pstate, (Node *) newfc,
!                                              EXPR_KIND_FROM_FUNCTION);
!
!                     /* nodeFunctionscan.c requires SRFs to be at top level */
!                     if (pstate->p_last_srf != last_srf &&
!                         pstate->p_last_srf != newfexpr)
!                         ereport(ERROR,
!                                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                                  errmsg("set-returning functions must appear at top level of FROM"),
!                                  parser_errposition(pstate,
!                                          exprLocation(pstate->p_last_srf))));
!
!                     funcexprs = lappend(funcexprs, newfexpr);

                      funcnames = lappend(funcnames,
                                          FigureColname((Node *) newfc));
*************** transformRangeFunction(ParseState *pstat
*** 638,646 ****
          }

          /* normal case ... */
!         funcexprs = lappend(funcexprs,
!                             transformExpr(pstate, fexpr,
!                                           EXPR_KIND_FROM_FUNCTION));

          funcnames = lappend(funcnames,
                              FigureColname(fexpr));
--- 652,672 ----
          }

          /* normal case ... */
!         last_srf = pstate->p_last_srf;
!
!         newfexpr = transformExpr(pstate, fexpr,
!                                  EXPR_KIND_FROM_FUNCTION);
!
!         /* nodeFunctionscan.c requires SRFs to be at top level */
!         if (pstate->p_last_srf != last_srf &&
!             pstate->p_last_srf != newfexpr)
!             ereport(ERROR,
!                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                      errmsg("set-returning functions must appear at top level of FROM"),
!                      parser_errposition(pstate,
!                                         exprLocation(pstate->p_last_srf))));
!
!         funcexprs = lappend(funcexprs, newfexpr);

          funcnames = lappend(funcnames,
                              FigureColname(fexpr));
diff --git a/src/backend/parser/parse_expr.c b/src/backend/parser/parse_expr.c
index 92101c9..da5fb7f 100644
*** a/src/backend/parser/parse_expr.c
--- b/src/backend/parser/parse_expr.c
*************** static Node *transformCurrentOfExpr(Pars
*** 118,125 ****
  static Node *transformColumnRef(ParseState *pstate, ColumnRef *cref);
  static Node *transformWholeRowRef(ParseState *pstate, RangeTblEntry *rte,
                       int location);
! static Node *transformIndirection(ParseState *pstate, Node *basenode,
!                      List *indirection);
  static Node *transformTypeCast(ParseState *pstate, TypeCast *tc);
  static Node *transformCollateClause(ParseState *pstate, CollateClause *c);
  static Node *make_row_comparison_op(ParseState *pstate, List *opname,
--- 118,124 ----
  static Node *transformColumnRef(ParseState *pstate, ColumnRef *cref);
  static Node *transformWholeRowRef(ParseState *pstate, RangeTblEntry *rte,
                       int location);
! static Node *transformIndirection(ParseState *pstate, A_Indirection *ind);
  static Node *transformTypeCast(ParseState *pstate, TypeCast *tc);
  static Node *transformCollateClause(ParseState *pstate, CollateClause *c);
  static Node *make_row_comparison_op(ParseState *pstate, List *opname,
*************** transformExprRecurse(ParseState *pstate,
*** 192,205 ****
              }

          case T_A_Indirection:
!             {
!                 A_Indirection *ind = (A_Indirection *) expr;
!
!                 result = transformExprRecurse(pstate, ind->arg);
!                 result = transformIndirection(pstate, result,
!                                               ind->indirection);
!                 break;
!             }

          case T_A_ArrayExpr:
              result = transformArrayExpr(pstate, (A_ArrayExpr *) expr,
--- 191,198 ----
              }

          case T_A_Indirection:
!             result = transformIndirection(pstate, (A_Indirection *) expr);
!             break;

          case T_A_ArrayExpr:
              result = transformArrayExpr(pstate, (A_ArrayExpr *) expr,
*************** unknown_attribute(ParseState *pstate, No
*** 439,449 ****
  }

  static Node *
! transformIndirection(ParseState *pstate, Node *basenode, List *indirection)
  {
!     Node       *result = basenode;
      List       *subscripts = NIL;
!     int            location = exprLocation(basenode);
      ListCell   *i;

      /*
--- 432,443 ----
  }

  static Node *
! transformIndirection(ParseState *pstate, A_Indirection *ind)
  {
!     Node       *last_srf = pstate->p_last_srf;
!     Node       *result = transformExprRecurse(pstate, ind->arg);
      List       *subscripts = NIL;
!     int            location = exprLocation(result);
      ListCell   *i;

      /*
*************** transformIndirection(ParseState *pstate,
*** 451,457 ****
       * subscripting.  Adjacent A_Indices nodes have to be treated as a single
       * multidimensional subscript operation.
       */
!     foreach(i, indirection)
      {
          Node       *n = lfirst(i);

--- 445,451 ----
       * subscripting.  Adjacent A_Indices nodes have to be treated as a single
       * multidimensional subscript operation.
       */
!     foreach(i, ind->indirection)
      {
          Node       *n = lfirst(i);

*************** transformIndirection(ParseState *pstate,
*** 484,489 ****
--- 478,484 ----
              newresult = ParseFuncOrColumn(pstate,
                                            list_make1(n),
                                            list_make1(result),
+                                           last_srf,
                                            NULL,
                                            location);
              if (newresult == NULL)
*************** transformColumnRef(ParseState *pstate, C
*** 632,637 ****
--- 627,633 ----
                      node = ParseFuncOrColumn(pstate,
                                               list_make1(makeString(colname)),
                                               list_make1(node),
+                                              pstate->p_last_srf,
                                               NULL,
                                               cref->location);
                  }
*************** transformColumnRef(ParseState *pstate, C
*** 678,683 ****
--- 674,680 ----
                      node = ParseFuncOrColumn(pstate,
                                               list_make1(makeString(colname)),
                                               list_make1(node),
+                                              pstate->p_last_srf,
                                               NULL,
                                               cref->location);
                  }
*************** transformColumnRef(ParseState *pstate, C
*** 737,742 ****
--- 734,740 ----
                      node = ParseFuncOrColumn(pstate,
                                               list_make1(makeString(colname)),
                                               list_make1(node),
+                                              pstate->p_last_srf,
                                               NULL,
                                               cref->location);
                  }
*************** transformAExprOp(ParseState *pstate, A_E
*** 927,932 ****
--- 925,932 ----
      else
      {
          /* Ordinary scalar operator */
+         Node       *last_srf = pstate->p_last_srf;
+
          lexpr = transformExprRecurse(pstate, lexpr);
          rexpr = transformExprRecurse(pstate, rexpr);

*************** transformAExprOp(ParseState *pstate, A_E
*** 934,939 ****
--- 934,940 ----
                                    a->name,
                                    lexpr,
                                    rexpr,
+                                   last_srf,
                                    a->location);
      }

*************** transformAExprNullIf(ParseState *pstate,
*** 1053,1058 ****
--- 1054,1060 ----
                                  a->name,
                                  lexpr,
                                  rexpr,
+                                 pstate->p_last_srf,
                                  a->location);

      /*
*************** transformAExprNullIf(ParseState *pstate,
*** 1063,1068 ****
--- 1065,1075 ----
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
                   errmsg("NULLIF requires = operator to yield boolean"),
                   parser_errposition(pstate, a->location)));
+     if (result->opretset)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("NULLIF operator must not return a set"),
+                  parser_errposition(pstate, a->location)));

      /*
       * ... but the NullIfExpr will yield the first operand's type.
*************** transformAExprIn(ParseState *pstate, A_E
*** 1266,1271 ****
--- 1273,1279 ----
                                     a->name,
                                     copyObject(lexpr),
                                     rexpr,
+                                    pstate->p_last_srf,
                                     a->location);
          }

*************** transformBoolExpr(ParseState *pstate, Bo
*** 1430,1435 ****
--- 1438,1444 ----
  static Node *
  transformFuncCall(ParseState *pstate, FuncCall *fn)
  {
+     Node       *last_srf = pstate->p_last_srf;
      List       *targs;
      ListCell   *args;

*************** transformFuncCall(ParseState *pstate, Fu
*** 1465,1470 ****
--- 1474,1480 ----
      return ParseFuncOrColumn(pstate,
                               fn->funcname,
                               targs,
+                              last_srf,
                               fn,
                               fn->location);
  }
*************** transformMultiAssignRef(ParseState *psta
*** 1619,1625 ****
  static Node *
  transformCaseExpr(ParseState *pstate, CaseExpr *c)
  {
!     CaseExpr   *newc;
      Node       *arg;
      CaseTestExpr *placeholder;
      List       *newargs;
--- 1629,1636 ----
  static Node *
  transformCaseExpr(ParseState *pstate, CaseExpr *c)
  {
!     CaseExpr   *newc = makeNode(CaseExpr);
!     Node       *last_srf = pstate->p_last_srf;
      Node       *arg;
      CaseTestExpr *placeholder;
      List       *newargs;
*************** transformCaseExpr(ParseState *pstate, Ca
*** 1628,1635 ****
      Node       *defresult;
      Oid            ptype;

-     newc = makeNode(CaseExpr);
-
      /* transform the test expression, if any */
      arg = transformExprRecurse(pstate, (Node *) c->arg);

--- 1639,1644 ----
*************** transformCaseExpr(ParseState *pstate, Ca
*** 1741,1746 ****
--- 1750,1765 ----
                                    "CASE/WHEN");
      }

+     /* if any subexpression contained a SRF, complain */
+     if (pstate->p_last_srf != last_srf)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "CASE"),
+                  parser_errposition(pstate,
+                                     exprLocation(pstate->p_last_srf))));
+
      newc->location = c->location;

      return (Node *) newc;
*************** static Node *
*** 2177,2182 ****
--- 2196,2202 ----
  transformCoalesceExpr(ParseState *pstate, CoalesceExpr *c)
  {
      CoalesceExpr *newc = makeNode(CoalesceExpr);
+     Node       *last_srf = pstate->p_last_srf;
      List       *newargs = NIL;
      List       *newcoercedargs = NIL;
      ListCell   *args;
*************** transformCoalesceExpr(ParseState *pstate
*** 2205,2210 ****
--- 2225,2240 ----
          newcoercedargs = lappend(newcoercedargs, newe);
      }

+     /* if any subexpression contained a SRF, complain */
+     if (pstate->p_last_srf != last_srf)
+         ereport(ERROR,
+                 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+         /* translator: %s is name of a SQL construct, eg GROUP BY */
+                  errmsg("set-returning functions are not allowed in %s",
+                         "COALESCE"),
+                  parser_errposition(pstate,
+                                     exprLocation(pstate->p_last_srf))));
+
      newc->args = newcoercedargs;
      newc->location = c->location;
      return (Node *) newc;
*************** make_row_comparison_op(ParseState *pstat
*** 2793,2799 ****
          Node       *rarg = (Node *) lfirst(r);
          OpExpr       *cmp;

!         cmp = castNode(OpExpr, make_op(pstate, opname, larg, rarg, location));

          /*
           * We don't use coerce_to_boolean here because we insist on the
--- 2823,2830 ----
          Node       *rarg = (Node *) lfirst(r);
          OpExpr       *cmp;

!         cmp = castNode(OpExpr, make_op(pstate, opname, larg, rarg,
!                                        pstate->p_last_srf, location));

          /*
           * We don't use coerce_to_boolean here because we insist on the
*************** make_distinct_op(ParseState *pstate, Lis
*** 3000,3011 ****
  {
      Expr       *result;

!     result = make_op(pstate, opname, ltree, rtree, location);
      if (((OpExpr *) result)->opresulttype != BOOLOID)
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
               errmsg("IS DISTINCT FROM requires = operator to yield boolean"),
                   parser_errposition(pstate, location)));

      /*
       * We rely on DistinctExpr and OpExpr being same struct
--- 3031,3048 ----
  {
      Expr       *result;

!     result = make_op(pstate, opname, ltree, rtree,
!                      pstate->p_last_srf, location);
      if (((OpExpr *) result)->opresulttype != BOOLOID)
          ereport(ERROR,
                  (errcode(ERRCODE_DATATYPE_MISMATCH),
               errmsg("IS DISTINCT FROM requires = operator to yield boolean"),
                   parser_errposition(pstate, location)));
+     if (((OpExpr *) result)->opretset)
+         ereport(ERROR,
+                 (errcode(ERRCODE_DATATYPE_MISMATCH),
+                  errmsg("IS DISTINCT FROM operator must not return a set"),
+                  parser_errposition(pstate, location)));

      /*
       * We rely on DistinctExpr and OpExpr being same struct
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 55853c2..bb290c5 100644
*** a/src/backend/parser/parse_func.c
--- b/src/backend/parser/parse_func.c
*************** static Node *ParseComplexProjection(Pars
*** 64,73 ****
   *
   *    The argument expressions (in fargs) must have been transformed
   *    already.  However, nothing in *fn has been transformed.
   */
  Node *
  ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
!                   FuncCall *fn, int location)
  {
      bool        is_column = (fn == NULL);
      List       *agg_order = (fn ? fn->agg_order : NIL);
--- 64,77 ----
   *
   *    The argument expressions (in fargs) must have been transformed
   *    already.  However, nothing in *fn has been transformed.
+  *
+  *    last_srf should be a copy of pstate->p_last_srf from just before we
+  *    started transforming fargs.  If the caller knows that fargs couldn't
+  *    contain any SRF calls, last_srf can just be pstate->p_last_srf.
   */
  Node *
  ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
!                   Node *last_srf, FuncCall *fn, int location)
  {
      bool        is_column = (fn == NULL);
      List       *agg_order = (fn ? fn->agg_order : NIL);
*************** ParseFuncOrColumn(ParseState *pstate, Li
*** 628,634 ****

      /* if it returns a set, check that's OK */
      if (retset)
!         check_srf_call_placement(pstate, location);

      /* build the appropriate output structure */
      if (fdresult == FUNCDETAIL_NORMAL)
--- 632,638 ----

      /* if it returns a set, check that's OK */
      if (retset)
!         check_srf_call_placement(pstate, last_srf, location);

      /* build the appropriate output structure */
      if (fdresult == FUNCDETAIL_NORMAL)
*************** ParseFuncOrColumn(ParseState *pstate, Li
*** 759,764 ****
--- 763,778 ----
                       errmsg("FILTER is not implemented for non-aggregate window functions"),
                       parser_errposition(pstate, location)));

+         /*
+          * Window functions can't either take or return sets
+          */
+         if (pstate->p_last_srf != last_srf)
+             ereport(ERROR,
+                     (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                      errmsg("window function calls cannot contain set-returning function calls"),
+                      parser_errposition(pstate,
+                                         exprLocation(pstate->p_last_srf))));
+
          if (retset)
              ereport(ERROR,
                      (errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
*************** ParseFuncOrColumn(ParseState *pstate, Li
*** 771,776 ****
--- 785,794 ----
          retval = (Node *) wfunc;
      }

+     /* if it returns a set, remember it for error checks at higher levels */
+     if (retset)
+         pstate->p_last_srf = retval;
+
      return retval;
  }

*************** LookupAggWithArgs(ObjectWithArgs *agg, b
*** 2083,2091 ****
   *        and throw a nice error if not.
   *
   * A side-effect is to set pstate->p_hasTargetSRFs true if appropriate.
   */
  void
! check_srf_call_placement(ParseState *pstate, int location)
  {
      const char *err;
      bool        errkind;
--- 2101,2113 ----
   *        and throw a nice error if not.
   *
   * A side-effect is to set pstate->p_hasTargetSRFs true if appropriate.
+  *
+  * last_srf should be a copy of pstate->p_last_srf from just before we
+  * started transforming the function's arguments.  This allows detection
+  * of whether the SRF's arguments contain any SRFs.
   */
  void
! check_srf_call_placement(ParseState *pstate, Node *last_srf, int location)
  {
      const char *err;
      bool        errkind;
*************** check_srf_call_placement(ParseState *pst
*** 2121,2127 ****
              errkind = true;
              break;
          case EXPR_KIND_FROM_FUNCTION:
!             /* okay ... but we can't check nesting here */
              break;
          case EXPR_KIND_WHERE:
              errkind = true;
--- 2143,2157 ----
              errkind = true;
              break;
          case EXPR_KIND_FROM_FUNCTION:
!             /* okay, but we don't allow nested SRFs here */
!             /* errmsg is chosen to match transformRangeFunction() */
!             /* errposition should point to the inner SRF */
!             if (pstate->p_last_srf != last_srf)
!                 ereport(ERROR,
!                         (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                          errmsg("set-returning functions must appear at top level of FROM"),
!                          parser_errposition(pstate,
!                                          exprLocation(pstate->p_last_srf))));
              break;
          case EXPR_KIND_WHERE:
              errkind = true;
diff --git a/src/backend/parser/parse_oper.c b/src/backend/parser/parse_oper.c
index e40b10d..4b1db76 100644
*** a/src/backend/parser/parse_oper.c
--- b/src/backend/parser/parse_oper.c
*************** op_error(ParseState *pstate, List *op, c
*** 735,746 ****
   * Transform operator expression ensuring type compatibility.
   * This is where some type conversion happens.
   *
!  * As with coerce_type, pstate may be NULL if no special unknown-Param
!  * processing is wanted.
   */
  Expr *
  make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
!         int location)
  {
      Oid            ltypeId,
                  rtypeId;
--- 735,748 ----
   * Transform operator expression ensuring type compatibility.
   * This is where some type conversion happens.
   *
!  * last_srf should be a copy of pstate->p_last_srf from just before we
!  * started transforming the operator's arguments; this is used for nested-SRF
!  * detection.  If the caller will throw an error anyway for a set-returning
!  * expression, it's okay to cheat and just pass pstate->p_last_srf.
   */
  Expr *
  make_op(ParseState *pstate, List *opname, Node *ltree, Node *rtree,
!         Node *last_srf, int location)
  {
      Oid            ltypeId,
                  rtypeId;
*************** make_op(ParseState *pstate, List *opname
*** 843,849 ****

      /* if it returns a set, check that's OK */
      if (result->opretset)
!         check_srf_call_placement(pstate, location);

      ReleaseSysCache(tup);

--- 845,855 ----

      /* if it returns a set, check that's OK */
      if (result->opretset)
!     {
!         check_srf_call_placement(pstate, last_srf, location);
!         /* ... and remember it for error checks at higher levels */
!         pstate->p_last_srf = (Node *) result;
!     }

      ReleaseSysCache(tup);

diff --git a/src/include/parser/parse_func.h b/src/include/parser/parse_func.h
index 5be1812..68572e9 100644
*** a/src/include/parser/parse_func.h
--- b/src/include/parser/parse_func.h
*************** typedef enum
*** 31,37 ****


  extern Node *ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
!                   FuncCall *fn, int location);

  extern FuncDetailCode func_get_detail(List *funcname,
                  List *fargs, List *fargnames,
--- 31,37 ----


  extern Node *ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
!                   Node *last_srf, FuncCall *fn, int location);

  extern FuncDetailCode func_get_detail(List *funcname,
                  List *fargs, List *fargnames,
*************** extern Oid LookupFuncWithArgs(ObjectWith
*** 67,72 ****
  extern Oid LookupAggWithArgs(ObjectWithArgs *agg,
                    bool noError);

! extern void check_srf_call_placement(ParseState *pstate, int location);

  #endif   /* PARSE_FUNC_H */
--- 67,73 ----
  extern Oid LookupAggWithArgs(ObjectWithArgs *agg,
                    bool noError);

! extern void check_srf_call_placement(ParseState *pstate, Node *last_srf,
!                          int location);

  #endif   /* PARSE_FUNC_H */
diff --git a/src/include/parser/parse_node.h b/src/include/parser/parse_node.h
index 0b54840..6a3507f 100644
*** a/src/include/parser/parse_node.h
--- b/src/include/parser/parse_node.h
*************** typedef Node *(*CoerceParamHook) (ParseS
*** 157,162 ****
--- 157,165 ----
   * p_hasAggs, p_hasWindowFuncs, etc: true if we've found any of the indicated
   * constructs in the query.
   *
+  * p_last_srf: the set-returning FuncExpr or OpExpr most recently found in
+  * the query, or NULL if none.
+  *
   * p_pre_columnref_hook, etc: optional parser hook functions for modifying the
   * interpretation of ColumnRefs and ParamRefs.
   *
*************** struct ParseState
*** 199,204 ****
--- 202,209 ----
      bool        p_hasSubLinks;
      bool        p_hasModifyingCTE;

+     Node       *p_last_srf;        /* most recent set-returning func/op found */
+
      /*
       * Optional hook functions for parser callbacks.  These are null unless
       * set up by the caller of make_parsestate.
diff --git a/src/include/parser/parse_oper.h b/src/include/parser/parse_oper.h
index d783b37..ab3c4aa 100644
*** a/src/include/parser/parse_oper.h
--- b/src/include/parser/parse_oper.h
*************** extern Oid    oprfuncid(Operator op);
*** 59,65 ****

  /* Build expression tree for an operator invocation */
  extern Expr *make_op(ParseState *pstate, List *opname,
!         Node *ltree, Node *rtree, int location);
  extern Expr *make_scalar_array_op(ParseState *pstate, List *opname,
                       bool useOr,
                       Node *ltree, Node *rtree, int location);
--- 59,65 ----

  /* Build expression tree for an operator invocation */
  extern Expr *make_op(ParseState *pstate, List *opname,
!         Node *ltree, Node *rtree, Node *last_srf, int location);
  extern Expr *make_scalar_array_op(ParseState *pstate, List *opname,
                       bool useOr,
                       Node *ltree, Node *rtree, int location);
diff --git a/src/test/regress/expected/rangefuncs.out b/src/test/regress/expected/rangefuncs.out
index 4b6170d..5c82614 100644
*** a/src/test/regress/expected/rangefuncs.out
--- b/src/test/regress/expected/rangefuncs.out
*************** select * from foobar();  -- fail
*** 1969,1986 ****
  ERROR:  function return row and query-specified return row do not match
  DETAIL:  Returned row contains 3 attributes, but query expects 2.
  drop function foobar();
- -- check behavior when a function's input sometimes returns a set (bug #8228)
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-  id |      str      | lower
- ----+---------------+-------
-   2 | 0000000049404 | 49404
- (1 row)
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)
  create function extractq2(t int8_tbl) returns int8 as $$
    select t.q2
--- 1969,1974 ----
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index c8ae361..10cacc5 100644
*** a/src/test/regress/expected/tsrf.out
--- b/src/test/regress/expected/tsrf.out
*************** SELECT generate_series(1, generate_serie
*** 41,46 ****
--- 41,51 ----
                 3
  (6 rows)

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+ ERROR:  set-returning functions must appear at top level of FROM
+ LINE 1: SELECT * FROM generate_series(1, generate_series(1, 3));
+                                          ^
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));
   generate_series
*************** SELECT few.dataa, count(*) FROM few WHER
*** 190,203 ****
   a     |     4
  (2 rows)

  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
! ERROR:  set-valued function called in context that cannot accept a set
  LINE 1: SELECT min(generate_series(1, 3)) FROM few;
                     ^
  -- SRFs are not allowed in window function arguments, either
  SELECT min(generate_series(1, 3)) OVER() FROM few;
! ERROR:  set-valued function called in context that cannot accept a set
  LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few;
                     ^
  -- SRFs are normally computed after window functions
--- 195,217 ----
   a     |     4
  (2 rows)

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in CASE
+ LINE 1: SELECT q1, case when q1 > 0 then generate_series(1,3) else 0...
+                                          ^
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+ ERROR:  set-returning functions are not allowed in COALESCE
+ LINE 1: SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+                             ^
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
! ERROR:  aggregate function calls cannot contain set-returning function calls
  LINE 1: SELECT min(generate_series(1, 3)) FROM few;
                     ^
  -- SRFs are not allowed in window function arguments, either
  SELECT min(generate_series(1, 3)) OVER() FROM few;
! ERROR:  window function calls cannot contain set-returning function calls
  LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few;
                     ^
  -- SRFs are normally computed after window functions
*************** SELECT int4mul(generate_series(1,2), 10)
*** 427,433 ****

  -- but SRFs in function RTEs must be at top level (annoying restriction)
  SELECT * FROM int4mul(generate_series(1,2), 10);
! ERROR:  set-valued function called in context that cannot accept a set
  LINE 1: SELECT * FROM int4mul(generate_series(1,2), 10);
                                ^
  -- DISTINCT ON is evaluated before tSRF evaluation if SRF is not
--- 441,447 ----

  -- but SRFs in function RTEs must be at top level (annoying restriction)
  SELECT * FROM int4mul(generate_series(1,2), 10);
! ERROR:  set-returning functions must appear at top level of FROM
  LINE 1: SELECT * FROM int4mul(generate_series(1,2), 10);
                                ^
  -- DISTINCT ON is evaluated before tSRF evaluation if SRF is not
diff --git a/src/test/regress/sql/rangefuncs.sql b/src/test/regress/sql/rangefuncs.sql
index 4ed84b1..442d397 100644
*** a/src/test/regress/sql/rangefuncs.sql
--- b/src/test/regress/sql/rangefuncs.sql
*************** select * from foobar();  -- fail
*** 600,614 ****

  drop function foobar();

- -- check behavior when a function's input sometimes returns a set (bug #8228)
-
- SELECT *,
-   lower(CASE WHEN id = 2 THEN (regexp_matches(str, '^0*([1-9]\d+)$'))[1]
-         ELSE str
-         END)
- FROM
-   (VALUES (1,''), (2,'0000000049404'), (3,'FROM 10000000876')) v(id, str);
-
  -- check whole-row-Var handling in nested lateral functions (bug #11703)

  create function extractq2(t int8_tbl) returns int8 as $$
--- 600,605 ----
diff --git a/src/test/regress/sql/tsrf.sql b/src/test/regress/sql/tsrf.sql
index 417e78c..45de099 100644
*** a/src/test/regress/sql/tsrf.sql
--- b/src/test/regress/sql/tsrf.sql
*************** SELECT generate_series(1, 2), generate_s
*** 14,19 ****
--- 14,22 ----
  -- srf, with SRF argument
  SELECT generate_series(1, generate_series(1, 3));

+ -- but we've traditionally rejected the same in FROM
+ SELECT * FROM generate_series(1, generate_series(1, 3));
+
  -- srf, with two SRF arguments
  SELECT generate_series(generate_series(1,3), generate_series(2, 4));

*************** SELECT dataa, generate_series(1,1), coun
*** 51,56 ****
--- 54,63 ----
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa ORDER BY 2;
  SELECT few.dataa, count(*) FROM few WHERE dataa = 'a' GROUP BY few.dataa, unnest('{1,1,3}'::int[]) ORDER BY 2;

+ -- SRFs are not allowed if they'd need to be conditionally executed
+ SELECT q1, case when q1 > 0 then generate_series(1,3) else 0 end FROM int8_tbl;
+ SELECT q1, coalesce(generate_series(1,3), 0) FROM int8_tbl;
+
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane wrote:
> I wrote:
> > Yes, we already have guards for those cases, but they return fairly opaque
> > error messages to the tune of "set-valued function called in context that
> > cannot accept a set", because the executor hasn't enough context to do
> > better.  I'd like the messages to be more specific, like "set-valued
> > function cannot appear within CASE" and so on.
> 
> Here's an expanded version of the "bottom up" patch that adjusts some
> parser APIs to allow these additional messages to be thrown.  This changes
> all occurrences of "set-valued function called in context that cannot
> accept a set" in the core regression tests into something more specific.
> There are probably some remaining cases that this doesn't cover, but the
> existing execution-time checks will still catch those.

Interesting stuff.  Here's a small recommendation for a couple of those
new messages.



-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Interesting stuff.  Here's a small recommendation for a couple of those
> new messages.

Hm.  I don't object to folding those two messages into one, but now that
I look at it, the text needs some more work anyway, perhaps.  What we're
actually checking is not so much whether the IS DISTINCT FROM construct
returns a set as whether the underlying equality operator does.  If we
want to be pedantic about it, we'd end up writing something like
"equality operator used by %s must not return a set"

But perhaps it's okay to fuzz the distinction and just write
"%s must not return a set"

You could justify that on the reasoning that if we were to allow this
then an underlying "=" that returned a set would presumably cause
IS DISTINCT FROM or NULLIF to also return a set.

I'm kind of thinking that the second wording is preferable, but there's
room to argue that the first is more precise.  Opinions?
        regards, tom lane



On 2017-06-09 17:33:45 -0400, Tom Lane wrote:
> diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
> index cbcd6cf..98bcfa0 100644
> --- a/src/backend/catalog/information_schema.sql
> +++ b/src/backend/catalog/information_schema.sql
> @@ -2936,12 +2936,14 @@ CREATE VIEW user_mapping_options AS
>      SELECT authorization_identifier,
>             foreign_server_catalog,
>             foreign_server_name,
> -           CAST((pg_options_to_table(um.umoptions)).option_name AS sql_identifier) AS option_name,
> +           CAST(opts.option_name AS sql_identifier) AS option_name,
>             CAST(CASE WHEN (umuser <> 0 AND authorization_identifier = current_user)
>                         OR (umuser = 0 AND pg_has_role(srvowner, 'USAGE'))
> -                       OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user) THEN
(pg_options_to_table(um.umoptions)).option_value
> +                       OR (SELECT rolsuper FROM pg_authid WHERE rolname = current_user)
> +                     THEN opts.option_value
>                       ELSE NULL END AS character_data) AS option_value
> -    FROM _pg_user_mappings um;
> +    FROM _pg_user_mappings um,
> +         pg_options_to_table(um.umoptions) opts;

This really is a lot better...


>  GRANT SELECT ON user_mapping_options TO PUBLIC;
>  
> diff --git a/src/backend/executor/functions.c b/sindex a35ba32..89aea2f 100644
> --- a/src/backend/executor/functions.c
> +++ b/src/backend/executor/functions.c
> @@ -388,6 +388,7 @@ sql_fn_post_column_ref(ParseState *pstat
>          param = ParseFuncOrColumn(pstate,
>                                    list_make1(subfield),
>                                    list_make1(param),
> +                                  pstate->p_last_srf,
>                                    NULL,
>                                    cref->location);
>      }
> diff --git a/src/backend/parser/parse_aindex efe1c37..5241fd2 100644
> --- a/src/backend/parser/parse_agg.c
> +++ b/src/backend/parser/parse_agg.c
> @@ -705,6 +705,13 @@ check_agg_arguments_walker(Node *node,
>          }
>          /* Continue and descend into subtree */
>      }
> +    /* We can throw error on sight for a set-returning function */
> +    if ((IsA(node, FuncExpr) &&((FuncExpr *) node)->funcretset) ||
> +        (IsA(node, OpExpr) &&((OpExpr *) node)->opretset))
> +        ereport(ERROR,
> +                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> +                 errmsg("aggregate function calls cannot contain set-returning function calls"),
> +                 parser_errposition(context->pstate, exprLocation(node))));

Possibly too hard to be precise enough in a hint, but a number of these
could benefit from one suggesting moving things into FROM, using
LATERAL.

I'm kinda positively surprised at how non-invasive this turned out, I'd
afraid there'd be a lot more verbosity to it.  I think the improved
error messages (message & location), are quite worthwhile an their own.

Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> Possibly too hard to be precise enough in a hint, but a number of these
> could benefit from one suggesting moving things into FROM, using
> LATERAL.

Maybe "You might be able to move the set-returning function into a
LATERAL FROM item."?

> I'm kinda positively surprised at how non-invasive this turned out, I'd
> afraid there'd be a lot more verbosity to it.  I think the improved
> error messages (message & location), are quite worthwhile an their own.

Yeah, me too --- I'm pretty pleased with the result.
        regards, tom lane



On 2017-06-12 21:03:31 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Possibly too hard to be precise enough in a hint, but a number of these
> > could benefit from one suggesting moving things into FROM, using
> > LATERAL.
> 
> Maybe "You might be able to move the set-returning function into a
> LATERAL FROM item."?

WFM, seems at least better than the current and current-as-proposed
state.


- Andres