Thread: [HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

[HACKERS] Query fails when SRFs are part of FROM clause (Commit id: 69f4b9c85f)

From
Rushabh Lathia
Date:
Consider the below test;

CREATE TABLE tab ( a int primary key);

SELECT  *
FROM pg_constraint pc,
CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1, array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;

Above query is failing with "set-valued function called in context that cannot
accept a set". But if I remove the CASE from the query then it working just good.

Like:

SELECT  * 
FROM pg_constraint pc,
CAST(generate_series(1, array_upper(pc.conkey, 1)) AS int) AS position;

This started failing with 69f4b9c85f168ae006929eec44fc44d569e846b9. It seems
check_srf_call_placement() sets the hasTargetSRFs flag and but when the SRFs
at the rtable ofcourse this flag doesn't get set. It seems like missing something
their, but I might be completely wrong as not quire aware of this area.




regards,
Rushabh Lathia

Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commitid: 69f4b9c85f)

From
"David G. Johnston"
Date:
On Fri, Jan 27, 2017 at 5:28 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Consider the below test;

CREATE TABLE tab ( a int primary key);

SELECT  *
FROM pg_constraint pc,
CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1, array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;

Above query is failing with "set-valued function called in context that cannot
accept a set". But if I remove the CASE from the query then it working just good.

Like:

SELECT  * 
FROM pg_constraint pc,
CAST(generate_series(1, array_upper(pc.conkey, 1)) AS int) AS position;

This started failing with 69f4b9c85f168ae006929eec44fc44d569e846b9. It seems
check_srf_call_placement() sets the hasTargetSRFs flag and but when the SRFs
at the rtable ofcourse this flag doesn't get set. It seems like missing something
their, but I might be completely wrong as not quire aware of this area.


I'm a bit surprised that your query actually works...and without delving into source code its hard to explain why it should/shouldn't or whether the recent SRF work was intended to impact it.

In any case the more idiomatic way of writing your query these days (since 9.4 came out) is:

SELECT *
FROM pg_constraint pc
LEFT JOIN LATERAL generate_series(1, case when contype in ('f','p','u') then array_upper(pc.conkey, 1) else 0 end) gs ON true;

generate_series is smart enough to return an empty set (instead of erroring out) when provided with (1,0) as arguments.

David J.

Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commitid: 69f4b9c85f)

From
"David G. Johnston"
Date:
On Fri, Jan 27, 2017 at 3:13 PM, David G. Johnston <david.g.johnston@gmail.com> wrote:
In any case the more idiomatic way of writing your query these days (since 9.4 came out) is:

SELECT *
FROM pg_constraint pc
LEFT JOIN LATERAL generate_series(1, case when contype in ('f','p','u') then array_upper(pc.conkey, 1) else 0 end) gs ON true;


Supposedly ​should work back to 9.3​, mis-remembered when LATERAL was released.

David J.

Re: [HACKERS] Query fails when SRFs are part of FROM clause (Commitid: 69f4b9c85f)

From
Rushabh Lathia
Date:


On Sat, Jan 28, 2017 at 3:43 AM, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Fri, Jan 27, 2017 at 5:28 AM, Rushabh Lathia <rushabh.lathia@gmail.com> wrote:
Consider the below test;

CREATE TABLE tab ( a int primary key);

SELECT  *
FROM pg_constraint pc,
CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1, array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;

Above query is failing with "set-valued function called in context that cannot
accept a set". But if I remove the CASE from the query then it working just good.

Like:

SELECT  * 
FROM pg_constraint pc,
CAST(generate_series(1, array_upper(pc.conkey, 1)) AS int) AS position;

This started failing with 69f4b9c85f168ae006929eec44fc44d569e846b9. It seems
check_srf_call_placement() sets the hasTargetSRFs flag and but when the SRFs
at the rtable ofcourse this flag doesn't get set. It seems like missing something
their, but I might be completely wrong as not quire aware of this area.


I'm a bit surprised that your query actually works...and without delving into source code its hard to explain why it should/shouldn't or whether the recent SRF work was intended to impact it.

In any case the more idiomatic way of writing your query these days (since 9.4 came out) is:

SELECT *
FROM pg_constraint pc
LEFT JOIN LATERAL generate_series(1, case when contype in ('f','p','u') then array_upper(pc.conkey, 1) else 0 end) gs ON true;

generate_series is smart enough to return an empty set (instead of erroring out) when provided with (1,0) as arguments.


Thanks for the providing work-around query and I also understood your point.

At the same time reason to raise this issue was, because this was working before
69f4b9c85f168ae006929eec44fc44d569e846b9 commit and now its throwing
an error. So whether its intended or query started failing because of some
bug introduced with the commit.

Issues is reproducible when query re-written with LEFT JOIN LATERAL and I
continue to use CASE statement. 

SELECT *
FROM pg_constraint pc
LEFT JOIN LATERAL CAST((CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1, array_upper(pc.conkey, 1)) ELSE NULL END) AS int) gs ON true;
ERROR:  set-valued function called in context that cannot accept a set



David J.




--
Rushabh Lathia

Hi,

On 2017-01-27 17:58:04 +0530, Rushabh Lathia wrote:
> Consider the below test;
> 
> CREATE TABLE tab ( a int primary key);
> 
> SELECT  *
> FROM pg_constraint pc,
> CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1,
> array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;
> 
> Above query is failing with "set-valued function called in context that
> cannot
> accept a set".

I think that's correct. Functions in FROM are essentially a shorthand
for ROWS FROM(). And ROWS FROM doesn't allow arbitrary expressions.  It
works if you remove the CASE because then it's a valid ROWS FROM
content.


If, I didn't check, that worked previously, I think that was more
accident than intent.

> But if I remove the CASE from the query then it working just
> good.
> 
> Like:
> 
> SELECT  *
> FROM pg_constraint pc,
> CAST(generate_series(1, array_upper(pc.conkey, 1)) AS int) AS position;

This IMO shouldn't work either due to the CAST. But indeed it does.


> This started failing with 69f4b9c85f168ae006929eec44fc44d569e846b9. It seems
> check_srf_call_placement() sets the hasTargetSRFs flag and but when the SRFs
> at the rtable ofcourse this flag doesn't get set. It seems like missing
> something
> their, but I might be completely wrong as not quire aware of this area.

That's right, because it's not in the targetlist.

Regards,

Andres



Andres Freund <andres@anarazel.de> writes:
> On 2017-01-27 17:58:04 +0530, Rushabh Lathia wrote:
>> SELECT  *
>> FROM pg_constraint pc,
>> CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1,
>> array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;
>> 
>> Above query is failing with "set-valued function called in context that
>> cannot accept a set".

> I think that's correct. Functions in FROM are essentially a shorthand
> for ROWS FROM(). And ROWS FROM doesn't allow arbitrary expressions.

No, but it allows whatever looks syntactically like a function, including
casts.  IIRC, we made func_expr work that way ages ago to deflect
complaints that it wasn't very clear why some things-that-look-like-
functions were allowed in CREATE INDEX and others not.

> If, I didn't check, that worked previously, I think that was more
> accident than intent.

Yeah, probably.  But are we prepared to break working queries?
As I understood it, the agreement on this whole tlist-SRF change
was that we would not change any behavior that wasn't ill-defined.

We could probably fix this with the modification that was discussed
previously, to allow FunctionScan nodes to project a scalar tlist
from the outputs of their SRFs.
        regards, tom lane



Hi,

On 2017-01-30 16:55:56 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-01-27 17:58:04 +0530, Rushabh Lathia wrote:
> >> SELECT  *
> >> FROM pg_constraint pc,
> >> CAST(CASE WHEN pc.contype IN ('f','u','p') THEN generate_series(1,
> >> array_upper(pc.conkey, 1)) ELSE NULL END AS int) AS position;
> >> 
> >> Above query is failing with "set-valued function called in context that
> >> cannot accept a set".
> 
> > I think that's correct. Functions in FROM are essentially a shorthand
> > for ROWS FROM(). And ROWS FROM doesn't allow arbitrary expressions.
> 
> No, but it allows whatever looks syntactically like a function, including
> casts.  IIRC, we made func_expr work that way ages ago to deflect
> complaints that it wasn't very clear why some things-that-look-like-
> functions were allowed in CREATE INDEX and others not.

But given e.g. the above example that's just about no limitation at all,
because you can nest nearly arbitrarily complex things within the
expression.


> > If, I didn't check, that worked previously, I think that was more
> > accident than intent.
> 
> Yeah, probably.

Really looks that way. I think it only works that way because we hit the
recovery branch for:/* * Normally the passed expression tree will be a FuncExprState, since the * grammar only allows a
functioncall at the top level of a table * function reference.  However, if the function doesn't return set then * the
plannermight have replaced the function call via constant-folding * or inlining.  So if we see any other kind of
expressionnode, execute * it via the general ExecEvalExpr() code; the only difference is that we * don't get a chance
topass a special ReturnSetInfo to any functions * buried in the expression. */
 
which does a normal ExecEvalExpr() whenever the expression to be
evaluated isn't a FuncExpr.

At the very least I think we need to amend that paragraph explaining
that there's a bunch of other cases it can be hit. And add tests for it.


> But are we prepared to break working queries?

Within some limits, we imo should be.


> As I understood it, the agreement on this whole tlist-SRF change
> was that we would not change any behavior that wasn't ill-defined.

I'd argue that behaviour that only worked through some edge case is
kinda ill defined ;) (and no, I'm not that serious)


> We could probably fix this with the modification that was discussed
> previously, to allow FunctionScan nodes to project a scalar tlist
> from the outputs of their SRFs.

Hm. I'm not quite following. Could you expand?


Greetings,

Andres Freund



Andres Freund <andres@anarazel.de> writes:
> On 2017-01-30 16:55:56 -0500, Tom Lane wrote:
>> No, but it allows whatever looks syntactically like a function, including
>> casts.  IIRC, we made func_expr work that way ages ago to deflect
>> complaints that it wasn't very clear why some things-that-look-like-
>> functions were allowed in CREATE INDEX and others not.

> But given e.g. the above example that's just about no limitation at all,
> because you can nest nearly arbitrarily complex things within the
> expression.

Yeah, exactly.  But that's true anyway because even if it was
syntactically a plain function, it might've been a SQL function that the
planner chooses to inline.

>> But are we prepared to break working queries?

> Within some limits, we imo should be.

In this case I think the argument for rejecting is pretty darn weak;
it's an arbitrary implementation restriction, not anything with much
principle to it.

>> We could probably fix this with the modification that was discussed
>> previously, to allow FunctionScan nodes to project a scalar tlist
>> from the outputs of their SRFs.

> Hm. I'm not quite following. Could you expand?

Make it work like Agg and WindowFunc.  To wit, dump the actually special
function calls (the set-returning functions) into a list that's internal
to the FunctionScan node, and then anything above those goes into scalar
expressions in the node's tlist, which refer to the SRF outputs using
Vars or things morally equivalent to Vars.

This would not support nested SRFs in FROM, but that case has always
failed and I've heard no field requests to make it work, so I don't feel
bad about keeping that restriction.
        regards, tom lane



On 2017-01-30 17:24:31 -0500, Tom Lane wrote:
> Make it work like Agg and WindowFunc.  To wit, dump the actually special
> function calls (the set-returning functions) into a list that's internal
> to the FunctionScan node, and then anything above those goes into scalar
> expressions in the node's tlist, which refer to the SRF outputs using
> Vars or things morally equivalent to Vars.

Hm. That should be fairly doable.  (I'd advocate very strongly against
building that list via ExecInitExpr, but that's an implementation
detail).  We'd evaluate SRFs early, but that's just consistent with
targetlist SRFs.

Wonder if we there's an argument to be made for implementing this
roughly similarly to split_pathtarget_at_srf - instead of injecting a
ProjectSet node we'd add a FunctionScan node below a Result node.

- Andres



Andres Freund <andres@anarazel.de> writes:
> Wonder if we there's an argument to be made for implementing this
> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> ProjectSet node we'd add a FunctionScan node below a Result node.

Yeah, possibly.  That would have the advantage of avoiding an ExecProject
step when the SRFs aren't buried, which would certainly be the expected
case.

If you don't want to make ExecInitExpr responsible, then the planner would
have to do something like split_pathtarget_at_srf anyway to decompose the
expressions, no matter which executor representation we use.
        regards, tom lane



On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> Wonder if we there's an argument to be made for implementing this
>> roughly similarly to split_pathtarget_at_srf - instead of injecting a
>> ProjectSet node we'd add a FunctionScan node below a Result node.
>
> Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> step when the SRFs aren't buried, which would certainly be the expected
> case.
>
> If you don't want to make ExecInitExpr responsible, then the planner would
> have to do something like split_pathtarget_at_srf anyway to decompose the
> expressions, no matter which executor representation we use.

Did we do anything about this?  Are we going to?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> If you don't want to make ExecInitExpr responsible, then the planner would
>> have to do something like split_pathtarget_at_srf anyway to decompose the
>> expressions, no matter which executor representation we use.

> Did we do anything about this?  Are we going to?

No, and I think we should.  Is it on the v10 open items list?
        regards, tom lane



Tom, all,

* Tom Lane (tgl@sss.pgh.pa.us) wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> If you don't want to make ExecInitExpr responsible, then the planner would
> >> have to do something like split_pathtarget_at_srf anyway to decompose the
> >> expressions, no matter which executor representation we use.
>
> > Did we do anything about this?  Are we going to?
>
> No, and I think we should.  Is it on the v10 open items list?

Wasn't, I've added it now:

https://wiki.postgresql.org/wiki/PostgreSQL_10_Open_Items

Thanks!

Stephen

On 2017-03-09 13:34:22 -0500, Robert Haas wrote:
> On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Andres Freund <andres@anarazel.de> writes:
> >> Wonder if we there's an argument to be made for implementing this
> >> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> >> ProjectSet node we'd add a FunctionScan node below a Result node.
> >
> > Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> > step when the SRFs aren't buried, which would certainly be the expected
> > case.
> >
> > If you don't want to make ExecInitExpr responsible, then the planner would
> > have to do something like split_pathtarget_at_srf anyway to decompose the
> > expressions, no matter which executor representation we use.
> 
> Did we do anything about this?  Are we going to?

Working on a patch.

- Andres



Re: Query fails when SRFs are part of FROM clause (Commit id:69f4b9c85f)

From
Noah Misch
Date:
On Fri, Mar 10, 2017 at 11:49:46AM -0800, Andres Freund wrote:
> On 2017-03-09 13:34:22 -0500, Robert Haas wrote:
> > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > Andres Freund <andres@anarazel.de> writes:
> > >> Wonder if we there's an argument to be made for implementing this
> > >> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> > >> ProjectSet node we'd add a FunctionScan node below a Result node.
> > >
> > > Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> > > step when the SRFs aren't buried, which would certainly be the expected
> > > case.
> > >
> > > If you don't want to make ExecInitExpr responsible, then the planner would
> > > have to do something like split_pathtarget_at_srf anyway to decompose the
> > > expressions, no matter which executor representation we use.
> > 
> > Did we do anything about this?  Are we going to?
> 
> Working on a patch.

[Action required within three days.  This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item.  Andres,
since you committed the patch believed to have created it, you own this open
item.  If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know.  Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message.  Include a date for your subsequent status update.  Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10.  Consequently, I will appreciate your efforts
toward speedy resolution.  Thanks.

[1] https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com



Re: Re: Query fails when SRFs are part of FROM clause(Commit id: 69f4b9c85f)

From
Andres Freund
Date:
On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
> On Fri, Mar 10, 2017 at 11:49:46AM -0800, Andres Freund wrote:
> > On 2017-03-09 13:34:22 -0500, Robert Haas wrote:
> > > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > Andres Freund <andres@anarazel.de> writes:
> > > >> Wonder if we there's an argument to be made for implementing this
> > > >> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> > > >> ProjectSet node we'd add a FunctionScan node below a Result node.
> > > >
> > > > Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> > > > step when the SRFs aren't buried, which would certainly be the expected
> > > > case.
> > > >
> > > > If you don't want to make ExecInitExpr responsible, then the planner would
> > > > have to do something like split_pathtarget_at_srf anyway to decompose the
> > > > expressions, no matter which executor representation we use.
> > > 
> > > Did we do anything about this?  Are we going to?
> > 
> > Working on a patch.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Andres,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.

I've a very preliminary patch.  I'd like to only start polishing it up
once the code freeze is over, so I can work on getting some patches in -
note that I myself have no pending patches.  Once frozen I'll polish it
up and send that within a few days.

Ok?

- Andres



Andres Freund <andres@anarazel.de> writes:
> On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
>> [Action required within three days.  This is a generic notification.]

> I've a very preliminary patch.  I'd like to only start polishing it up
> once the code freeze is over, so I can work on getting some patches in -
> note that I myself have no pending patches.  Once frozen I'll polish it
> up and send that within a few days.

FWIW, I'm willing to help out on this.
        regards, tom lane



On Wed, Apr 05, 2017 at 12:16:25AM -0700, Andres Freund wrote:
> On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
> > On Fri, Mar 10, 2017 at 11:49:46AM -0800, Andres Freund wrote:
> > > On 2017-03-09 13:34:22 -0500, Robert Haas wrote:
> > > > On Mon, Jan 30, 2017 at 6:54 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > > > > Andres Freund <andres@anarazel.de> writes:
> > > > >> Wonder if we there's an argument to be made for implementing this
> > > > >> roughly similarly to split_pathtarget_at_srf - instead of injecting a
> > > > >> ProjectSet node we'd add a FunctionScan node below a Result node.
> > > > >
> > > > > Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> > > > > step when the SRFs aren't buried, which would certainly be the expected
> > > > > case.
> > > > >
> > > > > If you don't want to make ExecInitExpr responsible, then the planner would
> > > > > have to do something like split_pathtarget_at_srf anyway to decompose the
> > > > > expressions, no matter which executor representation we use.
> > > > 
> > > > Did we do anything about this?  Are we going to?
> > > 
> > > Working on a patch.
> > 
> > [Action required within three days.  This is a generic notification.]
> > 
> > The above-described topic is currently a PostgreSQL 10 open item.  Andres,
> > since you committed the patch believed to have created it, you own this open
> > item.  If some other commit is more relevant or if this does not belong as a
> > v10 open item, please let us know.  Otherwise, please observe the policy on
> > open item ownership[1] and send a status update within three calendar days of
> > this message.  Include a date for your subsequent status update.  Testers may
> > discover new open items at any time, and I want to plan to get them all fixed
> > well in advance of shipping v10.  Consequently, I will appreciate your efforts
> > toward speedy resolution.  Thanks.
> 
> I've a very preliminary patch.  I'd like to only start polishing it up
> once the code freeze is over, so I can work on getting some patches in -
> note that I myself have no pending patches.  Once frozen I'll polish it
> up and send that within a few days.
> 
> Ok?

Okay; using my simplistic translator of "a few", I'll look for your next
status update on or before 2017-04-11.  As always, feel free to set a
different date.



On 2017-01-30 18:54:50 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Wonder if we there's an argument to be made for implementing this
> > roughly similarly to split_pathtarget_at_srf - instead of injecting a
> > ProjectSet node we'd add a FunctionScan node below a Result node.
>
> Yeah, possibly.  That would have the advantage of avoiding an ExecProject
> step when the SRFs aren't buried, which would certainly be the expected
> case.
>
> If you don't want to make ExecInitExpr responsible, then the planner would
> have to do something like split_pathtarget_at_srf anyway to decompose the
> expressions, no matter which executor representation we use.

Working on this I came across a few things:

Splitting things away from the FunctionScan node doesn't work entirely
naturally, due to ORDINALITY.  Consider e.g. cases where there's no
function to evaluate anymore, because it got inlined, but we want to
ORDINALITY.  We could obviously support FunctionScan nodes without any
associated (or empty) RangeTblFunctions, but that seems awkward.

Secondly, doing non-function stuff gets interesting when volatility is
involved: Consider something like FROM CAST(srf() * volatile_func() AS
whatnot); when, and how often, does volatile_func() get evaluated?  If
we put it somewhere around the projection path it'll only get evaluated
if the rows are actually retrieved and will get re-evaluated upon
projection.  If we put it somewhere below the tuplestores that
nodeFunctionscan./execSRF.c generate for each RangeTblFunction, it'll
get called evaluated regardless of being retrieved.  The latter is
noticeably more complicated, because of SRFM_Materialize SRFs.   Our
policy around when it's ok to re-evaluate volatile functions isn't clear
enough to me, to say which one is right and which one is wrong.

For now there's simply another RangeTblFunctions field with a separate
non-FuncExpr expression, that can reference to the SRF output via scan
Vars.  That's evaluated in FunctionNext's !simple branch, where we
conveniently have a separate slot already.  The separation currently
happens create_functionscan_plan().

Tom, do you have any opinion on the volatility stuff?

- Andres



Andres Freund <andres@anarazel.de> writes:
> Tom, do you have any opinion on the volatility stuff?

What was the previous behavior for such cases?  If it was reasonably
sane, we probably have to preserve it.  If it was unpredictable or
completely wacko, maybe we don't.
        regards, tom lane



On 2017-04-11 17:25:52 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Tom, do you have any opinion on the volatility stuff?
> 
> What was the previous behavior for such cases?  If it was reasonably
> sane, we probably have to preserve it.  If it was unpredictable or
> completely wacko, maybe we don't.

Previously we'd stash the result in a new tuplestore, because it
happened inside ExecMakeTableFunctionResult()'s fallback path.  The
inner tuplestore (from the proper SRF) would get evaluated via the the
isDone mechanism.

That'd imo be a fair amount of work to emulate, because we'd have to
manually go over the tuplesttore.

But given that we do *not* have similar semantics for volatiles in the
targetlist, I'm quite unconvinced that that's necessary.  Consider
e.g. my previous example of SELECT * FROM CAST(srf() * volatile_func() AS whatnot)
rewritten into a saner version as SELECT srf * volatile_func() FROM srf() AS srf;
here volatile_func() would before and now get re-evaluated if there's a
rewind, and would only be invoked if the row is actually evaluated.

- Andres



Andres Freund <andres@anarazel.de> writes:
> On 2017-04-11 17:25:52 -0400, Tom Lane wrote:
>> What was the previous behavior for such cases?  If it was reasonably
>> sane, we probably have to preserve it.  If it was unpredictable or
>> completely wacko, maybe we don't.

> Previously we'd stash the result in a new tuplestore, because it
> happened inside ExecMakeTableFunctionResult()'s fallback path.  The
> inner tuplestore (from the proper SRF) would get evaluated via the the
> isDone mechanism.

> That'd imo be a fair amount of work to emulate, because we'd have to
> manually go over the tuplesttore.

Yeah.  I don't have a big problem with saying that things that aren't
themselves SRFs are evaluated as though in a projection step atop the
SRF calculation.  We've already crossed that bridge with respect to
expressions around SRFs in the tlist --- for instance this:

regression=# select f1, case when f1>0 then generate_series(1,2) else null end as c from int4_tbl;    f1      | c
-------------+---          0 |            0 |       123456 | 1     123456 | 2    -123456 |      -123456 |   2147483647
|1 2147483647 | 2-2147483647 |  -2147483647 |   
(10 rows)

gives different results than it used to:

regression96=# select f1, case when f1>0 then generate_series(1,2) else null end as c from int4_tbl;    f1      | c
-------------+---          0 |       123456 | 1     123456 | 2    -123456 |   2147483647 | 1 2147483647 | 2-2147483647
|  
(7 rows)

Now, that old behavior matches what you got in the RangeFunction case:

regression96=# select * from int4_tbl, cast(case when f1>0 then generate_series(1,2) else null end as int);    f1
|int4  
-------------+------          0 |          123456 |    1     123456 |    2    -123456 |      2147483647 |    1
2147483647|    2-2147483647 |      
(7 rows)

So it would make sense to me for our new behavior to still match the
targetlist case.

Not sure if that's exactly the same as what you're saying or not.
        regards, tom lane



On 2017-04-11 17:42:42 -0400, Tom Lane wrote:
> Now, that old behavior matches what you got in the RangeFunction case:
> 
> regression96=# select * from int4_tbl, cast(case when f1>0 then generate_series(1,2) else null end as int);
>      f1      | int4 
> -------------+------
>            0 |     
>       123456 |    1
>       123456 |    2
>      -123456 |     
>   2147483647 |    1
>   2147483647 |    2
>  -2147483647 |     
> (7 rows)
> 
> So it would make sense to me for our new behavior to still match the
> targetlist case.
> 
> Not sure if that's exactly the same as what you're saying or not.

The patch now indeed returns

regression[20994][1]=# select * from int4_tbl, cast(case when f1>0 then generate_series(1,2) else null end as int);
WARNING:  01000: replacing
LOCATION:  frobble_rtefunc, createplan.c:3102
(as you can see, this ain't quite ready)
┌─────────────┬────────┐
│     f1      │  int4  │
├─────────────┼────────┤
│           0 │ (null) │
│           0 │ (null) │
│      123456 │      1 │
│      123456 │      2 │
│     -123456 │ (null) │
│     -123456 │ (null) │
│  2147483647 │      1 │
│  2147483647 │      2 │
│ -2147483647 │ (null) │
│ -2147483647 │ (null) │
└─────────────┴────────┘
(10 rows)

The basic approach seems quite workable.  It's not super extensible to
allow SRFs deeper inside generic ROWS FROM arguments however - I'm not
sure there's any need to work towards that however, I've not heard
demands so far.   Any arguments against that?

One other thing where it'd currently affect behaviour is something like:
SELECT * FROM CAST(generate_series(1,0) * 5 as int);

which, in < v10 would return 1 row, but with my patch returns no rows.
That makes a lot more sense in my opinion, given that a plain FROM
generate_series(1,0) doesn't return any rows in either version.

Right now I'm mopping up corner cases where it'd *expand* the set of
currently valid commands in an inconsistent manner. Namely FROM
int4mul(generate_series(..), 5) works, but FROM
composite_returning(somesrf()) wouldn't without additional work.  I plan
to continue to error out in either...

Greetings,

Andres Freund



On 2017-04-05 09:39:37 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2017-04-05 02:47:55 -0400, Noah Misch wrote:
> >> [Action required within three days.  This is a generic notification.]
>
> > I've a very preliminary patch.  I'd like to only start polishing it up
> > once the code freeze is over, so I can work on getting some patches in -
> > note that I myself have no pending patches.  Once frozen I'll polish it
> > up and send that within a few days.
>
> FWIW, I'm willing to help out on this.

Help would be appreciated.  I've pondered, and partially implemented,
several approaches so far, and my conclusion is that we should just do
nothing.  The amount of corner cases is just too big, and the utility of
the feature too small.

To recap, the issue is that in previous versions it was, by accident
(there's no test, code comments "supporting" the feature talk about a
different corner case, and the behaviour isn't correct in some cases)
allowed to do something like:
SELECT * FROM CAST(generate_series(1,3) * 5 AS int);
while
SELECT * FROM generate_series(1,3) * 5;
is not allowed.

The reason that that works from the gram.y perspective is that CAST etc
types of func_expr's.  The reason that it worked from a code perspective
is that execQual.c:ExecMakeTableFunctionResult() has the following:/* * Normally the passed expression tree will be a
FuncExprState,since the * grammar only allows a function call at the top level of a table * function reference.
However,if the function doesn't return set then * the planner might have replaced the function call via
constant-folding* or inlining.  So if we see any other kind of expression node, execute * it via the general
ExecEvalExpr()code; the only difference is that we * don't get a chance to pass a special ReturnSetInfo to any
functions* buried in the expression. */if (funcexpr && IsA(funcexpr, FuncExprState) &&    IsA(funcexpr->expr,
FuncExpr))
and back then ExecEvalExpr() was allowed to return sets.  It also
depends on some default initializations (e.g. rsinfo.returnMode =
SFRM_ValuePerCall).

This yields plenty weird behaviour in < v10. E.g. the following is
disallowed: SELECT * FROM int4mul(generate_series(1,3), 1); ERROR:  0A000: set-valued function called in context that
cannotaccept a set
 
as is SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS bigint); ERROR:  0A000: set-valued function called in
contextthat cannot accept a set
 
because the cast is implemented as int8(expr) which avoids the fallback
path as it's a FuncExpr, but SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS text);
works because the cast is implemented as a io coercion, which is not a
funcexpr. Therefore it uses the fallback ExecEvalExpr().

The mismatch between ExecEvalExpr() and nodeFunctionscan.c SRF behaviour
back then also yields odd behaviour, e.g.: SELECT * FROM generate_series(1,0);
returns zero rows, but SELECT * FROM CAST(generate_series(1,0) * 5 AS INT);
returns one NULL row.


In v10, as it stands, these all error out, because the SRFs are now only
to be evaluated via either nodeFunctionscan.c (FROM) or via
nodeProjectSet.c (targetlist), not ExecEvalExpr() anymore.


I've basically pondered three different methods of implementing
something akin to the old behaviour:

1) Move the non-SRF part into nodeFunctionscan.c's targetlist, and let  it be evaluated there.  E.g. if the expression
is CAST(generate_series(1,5) AS text), evaluate the generate_series(1,5)  using nodeFunctionscan's
FunctionScanPerFuncStatemachinery, but  implement the CAST as CAST(Var(whatever, 1) AS Text).
 
  That doesn't work well for composite/record returning rows, because  RangeTblFunction's returning composites are
expandedinto multiple  columns. E.g.  SELECT * FROM CAST(CAST(twocol_noset_outline(generate_series(1,3), 'text') AS
text)AS twocol);  returns all the columns of twocol, not a single wholerow datum.
 
  There's also some issues around what to do with cases involving  volatile functions when the output is not
referenced,or referenced  multiple times e.g.    SELECT f, f FROM CAST(generate_series(1,3) * nextval(...)) AS f;
wouldevaluate nextval() twice with such an approach...
 

2) During planning, split of the SRF bit from the !SRF bit and have  nodeFunctionscan.c evaluate both separately, like
1). That allows to  avoid the volatility issue, because the targetlist is still projected  separately.
 
  I've prototyped this to a reasonable point, by having  create_functionscan_plan() process each RangeTblFunction and
split the expression into SRF and non-SRF parts, and then have  FunctionNext() evaluate the non-SRF part.
 
  At the current state of my prototype this happens to allow simple SRF  in arguments cases like SELECT * FROM
int4mul(generate_series(1,3),1)  which are disallowed in < v10.
 

  This is reasonably-ish in complexity for scalar values, but gets  quite complicated for composite/record datums.
Suddenlythere's two  different types of values that we need to deal with, the type of  datum returned by the SRF, and
thetype of datum returned by the  final expression - they might be the same, they might not.  Especially for
record-returningfunctions, where ROWS FROM( AS...)  determines the column types, and ExecMakeTableFunctionResult() does
a tupledesc_match() to verify the SRF returned something reasonable,  it gets wild: What would we even be matching the
typesagainst?
 


3) Implement nested FROM SRFs as pipelined executor nodes, similar to  the way targetlist SRFs are now handled.  E.g.
somethinglike  SELECT * FROM CAST(generate_series(1,10) * 5 AS int);
 
  would be implemented as one nodeFunctionscan.c for the  generate_series(), and then something like nodeProjectSet.c
(ora  nodeFunctionscan.c branch that'd make a sub-node available) would  evaluate the CAST(Var() * 5 AS int).
 
  This approach has the advantage that it'd allow us, potentially in  the future, to get rid of the restriction that
normalROWS FROM  doesn't allow SRFs in arguments.
 
  I think this again runs into trouble with row-expansion in the return  value from SRFs, which isn't actually desired
tillthe outermost  "level" of the expression.  I guess we could add a mode to  nodeFunctionscan.c that forces
composite/recordtypes to be returned  as wholerow vars instead of being split up.
 


My conclusion here is that it's way too complicated to properly
implement a feature that only seems to exist by accident and has plenty
of weird behaviour.  Currently we continue to accept all the !SRF
expressions that were previously accepted, but I'd even consider
insisting that the expression needs to be a proper FuncExpr at
parse-analysis time (before inlining/const evaluation did its thing).

Unless somebody has a radically better idea how to implement this?


- Andres



On Wed, Apr 12, 2017 at 6:58 PM, Andres Freund <andres@anarazel.de> wrote:
> This yields plenty weird behaviour in < v10. E.g. the following is
> disallowed:
>   SELECT * FROM int4mul(generate_series(1,3), 1);
>   ERROR:  0A000: set-valued function called in context that cannot accept a set
> as is
>   SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS bigint);
>   ERROR:  0A000: set-valued function called in context that cannot accept a set
> because the cast is implemented as int8(expr) which avoids the fallback
> path as it's a FuncExpr, but
>   SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS text);
> works because the cast is implemented as a io coercion, which is not a
> funcexpr. Therefore it uses the fallback ExecEvalExpr().

I don't think it's remotely reasonable to try to reproduce this kind
of behavior exactly.  I think the question is: if we do nothing here,
will users be pissed?  The answer is not clear to me.  Rushabh's
original report cast this as a possible bug, not a query he actually
needed to work for any particular real-world purpose.  On the other
hand, I don't quite understand why any of these examples should fail.
If you can select from generate_series() as if it were a table, it
seems like you ought to be able to also apply one or more functions to
the result and select from the result.  On the third hand, if this
only sort of half-worked in v9.6, it's hard to say it's a must-have
for v10.

So I'm not sure what the right thing to do here is.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



On 2017-04-13 12:53:27 -0400, Robert Haas wrote:
> On Wed, Apr 12, 2017 at 6:58 PM, Andres Freund <andres@anarazel.de> wrote:
> > This yields plenty weird behaviour in < v10. E.g. the following is
> > disallowed:
> >   SELECT * FROM int4mul(generate_series(1,3), 1);
> >   ERROR:  0A000: set-valued function called in context that cannot accept a set
> > as is
> >   SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS bigint);
> >   ERROR:  0A000: set-valued function called in context that cannot accept a set
> > because the cast is implemented as int8(expr) which avoids the fallback
> > path as it's a FuncExpr, but
> >   SELECT * FROM CAST(int4mul(generate_series(1,3), 1) AS text);
> > works because the cast is implemented as a io coercion, which is not a
> > funcexpr. Therefore it uses the fallback ExecEvalExpr().
> 
> I don't think it's remotely reasonable to try to reproduce this kind
> of behavior exactly.  I think the question is: if we do nothing here,
> will users be pissed?  The answer is not clear to me.  Rushabh's
> original report cast this as a possible bug, not a query he actually
> needed to work for any particular real-world purpose.  On the other
> hand, I don't quite understand why any of these examples should fail.
> If you can select from generate_series() as if it were a table, it
> seems like you ought to be able to also apply one or more functions to
> the result and select from the result.  On the third hand, if this
> only sort of half-worked in v9.6, it's hard to say it's a must-have
> for v10.

I'd say it really didn't work at v9.6, and it wasn't intended to,
there's always[TM] been explicit code to check for that:
    /* We don't allow sets in the arguments of the table function */    if (argDone != ExprSingleResult)
ereport(ERROR,               (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),                 errmsg("set-valued function
calledin context that cannot accept a set")));
 

it's just that it ended up not being entirely reliably checked...


I think there's some argument to be made that SRF in arguments ought to
work for reasons of consistency, but the required complications and the
lack of field demand make me skeptical about it being worth it.


Greetings,

Andres Freund



Robert Haas <robertmhaas@gmail.com> writes:
> So I'm not sure what the right thing to do here is.

Andres' points about composite vs noncomposite function result types
seem pretty compelling: we could make the behavior better for scalar
results, but if it then diverges from what happens for composite
results, I don't think that's a step forward.

In the end, no matter how much work we put in, there are going to be some
corner cases that act differently from before.  Considering that none of
the previous corner-case behavior here was particularly carefully thought
through, that's not necessarily disastrous.  We should also consider that
contorting the logic to be bug-compatible with prior behavior may preclude
additional optimization work in future.

I'm a bit inclined to agree with the idea of explicitly requiring SRFs
in FROM to appear only at the top level of the expression.  That was
certainly the only case that the old code was really designed to support,
and I doubt that the documentation claims that it works otherwise.  Also,
to the extent that that ensures we give a clear error rather than possibly
giving results that differ from pre-v10, it's probably going to be less
of a foot-gun for users.

In any case we'd have some documentation work to do here.  Maybe we should
first look at what, if anything, the docs currently say in this area, and
how they'd need to be adjusted if we stick with the currently-implemented
behavior.  As Andres noted, some of the code comments need work too.
        regards, tom lane



I wrote:
> I'm a bit inclined to agree with the idea of explicitly requiring SRFs
> in FROM to appear only at the top level of the expression.

If we are going to go down this road, I think it would be a good idea
to try to provide a cursor position for the "can't accept a set" error
message, because otherwise it will be really unclear what's wrong with
something like

SELECT * FROM ROWS FROM (generate_series(1,10), abs(generate_series(1,10)));

I looked into what it would take to do that, and was pleasantly surprised
to find out that most of the infrastructure is already there since
commit 4c728f38.  Basically all we have to do is the attached.

Now that this infrastructure exists, anything that has access to a
PlanState or ExprContext, plus a parse node containing a location, is able
to report an error cursor.  It took a considerable effort of will not to
start plastering error position reports on a lot of other executor errors.
I think we should do that, but it smells new-feature-y, and hence
something to tackle for v11 not now.  But if v10 is moving the goalposts
on where you can put an SRF call, I think we should do this much in v10.

Naming note: a name like ExecErrPosition() would have been more consistent
with the other stuff that execUtils.c exports.  But since this is meant
to be used in ereport() calls, whose support functions generally aren't
camel-cased, I thought executor_errposition() was a better choice.
I'm not particularly set on that though.

            regards, tom lane

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 15d693f..5a34a46 100644
*** a/src/backend/executor/execExpr.c
--- b/src/backend/executor/execExpr.c
*************** ExecInitFunc(ExprEvalStep *scratch, Expr
*** 2103,2109 ****
      if (flinfo->fn_retset)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("set-valued function called in context that cannot accept a set")));

      /* Build code to evaluate arguments directly into the fcinfo struct */
      argno = 0;
--- 2103,2111 ----
      if (flinfo->fn_retset)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("set-valued function called in context that cannot accept a set"),
!                  parent ? executor_errposition(parent->state,
!                                           exprLocation((Node *) node)) : 0));

      /* Build code to evaluate arguments directly into the fcinfo struct */
      argno = 0;
diff --git a/src/backend/executor/execSRF.c b/src/backend/executor/execSRF.c
index 4badd5c..077ac20 100644
*** a/src/backend/executor/execSRF.c
--- b/src/backend/executor/execSRF.c
***************
*** 34,40 ****


  /* static function decls */
! static void init_sexpr(Oid foid, Oid input_collation, SetExprState *sexpr,
             MemoryContext sexprCxt, bool allowSRF, bool needDescForSRF);
  static void ShutdownSetExpr(Datum arg);
  static void ExecEvalFuncArgs(FunctionCallInfo fcinfo,
--- 34,41 ----


  /* static function decls */
! static void init_sexpr(Oid foid, Oid input_collation, Expr *node,
!            SetExprState *sexpr, PlanState *parent,
             MemoryContext sexprCxt, bool allowSRF, bool needDescForSRF);
  static void ShutdownSetExpr(Datum arg);
  static void ExecEvalFuncArgs(FunctionCallInfo fcinfo,
*************** ExecInitTableFunctionResult(Expr *expr,
*** 77,83 ****
          state->funcReturnsSet = func->funcretset;
          state->args = ExecInitExprList(func->args, parent);

!         init_sexpr(func->funcid, func->inputcollid, state,
                     econtext->ecxt_per_query_memory, func->funcretset, false);
      }
      else
--- 78,84 ----
          state->funcReturnsSet = func->funcretset;
          state->args = ExecInitExprList(func->args, parent);

!         init_sexpr(func->funcid, func->inputcollid, expr, state, parent,
                     econtext->ecxt_per_query_memory, func->funcretset, false);
      }
      else
*************** ExecInitFunctionResultSet(Expr *expr,
*** 438,444 ****
          FuncExpr   *func = (FuncExpr *) expr;

          state->args = ExecInitExprList(func->args, parent);
!         init_sexpr(func->funcid, func->inputcollid, state,
                     econtext->ecxt_per_query_memory, true, true);
      }
      else if (IsA(expr, OpExpr))
--- 439,445 ----
          FuncExpr   *func = (FuncExpr *) expr;

          state->args = ExecInitExprList(func->args, parent);
!         init_sexpr(func->funcid, func->inputcollid, expr, state, parent,
                     econtext->ecxt_per_query_memory, true, true);
      }
      else if (IsA(expr, OpExpr))
*************** ExecInitFunctionResultSet(Expr *expr,
*** 446,452 ****
          OpExpr       *op = (OpExpr *) expr;

          state->args = ExecInitExprList(op->args, parent);
!         init_sexpr(op->opfuncid, op->inputcollid, state,
                     econtext->ecxt_per_query_memory, true, true);
      }
      else
--- 447,453 ----
          OpExpr       *op = (OpExpr *) expr;

          state->args = ExecInitExprList(op->args, parent);
!         init_sexpr(op->opfuncid, op->inputcollid, expr, state, parent,
                     econtext->ecxt_per_query_memory, true, true);
      }
      else
*************** restart:
*** 645,651 ****
   * init_sexpr - initialize a SetExprState node during first use
   */
  static void
! init_sexpr(Oid foid, Oid input_collation, SetExprState *sexpr,
             MemoryContext sexprCxt, bool allowSRF, bool needDescForSRF)
  {
      AclResult    aclresult;
--- 646,653 ----
   * init_sexpr - initialize a SetExprState node during first use
   */
  static void
! init_sexpr(Oid foid, Oid input_collation, Expr *node,
!            SetExprState *sexpr, PlanState *parent,
             MemoryContext sexprCxt, bool allowSRF, bool needDescForSRF)
  {
      AclResult    aclresult;
*************** init_sexpr(Oid foid, Oid input_collation
*** 683,689 ****
      if (sexpr->func.fn_retset && !allowSRF)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("set-valued function called in context that cannot accept a set")));

      /* Otherwise, caller should have marked the sexpr correctly */
      Assert(sexpr->func.fn_retset == sexpr->funcReturnsSet);
--- 685,693 ----
      if (sexpr->func.fn_retset && !allowSRF)
          ereport(ERROR,
                  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
!                  errmsg("set-valued function called in context that cannot accept a set"),
!                  parent ? executor_errposition(parent->state,
!                                           exprLocation((Node *) node)) : 0));

      /* Otherwise, caller should have marked the sexpr correctly */
      Assert(sexpr->func.fn_retset == sexpr->funcReturnsSet);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index df3d650..08229bd 100644
*** a/src/backend/executor/execUtils.c
--- b/src/backend/executor/execUtils.c
***************
*** 28,33 ****
--- 28,35 ----
   *        ExecOpenScanRelation    Common code for scan node init routines.
   *        ExecCloseScanRelation
   *
+  *        executor_errposition    Report syntactic position of an error.
+  *
   *        RegisterExprContextCallback    Register function shutdown callback
   *        UnregisterExprContextCallback  Deregister function shutdown callback
   *
***************
*** 44,49 ****
--- 46,52 ----
  #include "access/relscan.h"
  #include "access/transam.h"
  #include "executor/executor.h"
+ #include "mb/pg_wchar.h"
  #include "nodes/nodeFuncs.h"
  #include "parser/parsetree.h"
  #include "storage/lmgr.h"
*************** UpdateChangedParamSet(PlanState *node, B
*** 686,691 ****
--- 689,724 ----
  }

  /*
+  * executor_errposition
+  *        Report an execution-time cursor position, if possible.
+  *
+  * This is expected to be used within an ereport() call.  The return value
+  * is a dummy (always 0, in fact).
+  *
+  * The locations stored in parsetrees are byte offsets into the source string.
+  * We have to convert them to 1-based character indexes for reporting to
+  * clients.  (We do things this way to avoid unnecessary overhead in the
+  * normal non-error case: computing character indexes would be much more
+  * expensive than storing token offsets.)
+  */
+ int
+ executor_errposition(EState *estate, int location)
+ {
+     int            pos;
+
+     /* No-op if location was not provided */
+     if (location < 0)
+         return 0;
+     /* Can't do anything if source text is not available */
+     if (estate == NULL || estate->es_sourceText == NULL)
+         return 0;
+     /* Convert offset to character number */
+     pos = pg_mbstrlen_with_len(estate->es_sourceText, location) + 1;
+     /* And pass it to the ereport mechanism */
+     return errposition(pos);
+ }
+
+ /*
   * Register a shutdown callback in an ExprContext.
   *
   * Shutdown callbacks will be called (in reverse order of registration)
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index f7f3189..3107cf5 100644
*** a/src/include/executor/executor.h
--- b/src/include/executor/executor.h
*************** extern bool ExecRelationIsTargetRelation
*** 482,487 ****
--- 482,489 ----
  extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
  extern void ExecCloseScanRelation(Relation scanrel);

+ extern int    executor_errposition(EState *estate, int location);
+
  extern void RegisterExprContextCallback(ExprContext *econtext,
                              ExprContextCallbackFunction function,
                              Datum arg);
diff --git a/src/test/regress/expected/tsrf.out b/src/test/regress/expected/tsrf.out
index 33f370b..c8ae361 100644
*** a/src/test/regress/expected/tsrf.out
--- b/src/test/regress/expected/tsrf.out
*************** SELECT few.dataa, count(*) FROM few WHER
*** 193,201 ****
--- 193,205 ----
  -- SRFs are not allowed in aggregate arguments
  SELECT min(generate_series(1, 3)) FROM few;
  ERROR:  set-valued function called in context that cannot accept a set
+ LINE 1: SELECT min(generate_series(1, 3)) FROM few;
+                    ^
  -- SRFs are not allowed in window function arguments, either
  SELECT min(generate_series(1, 3)) OVER() FROM few;
  ERROR:  set-valued function called in context that cannot accept a set
+ LINE 1: SELECT min(generate_series(1, 3)) OVER() FROM few;
+                    ^
  -- SRFs are normally computed after window functions
  SELECT id,lag(id) OVER(), count(*) OVER(), generate_series(1,3) FROM few;
   id | lag | count | generate_series
*************** SELECT int4mul(generate_series(1,2), 10)
*** 424,429 ****
--- 428,435 ----
  -- but SRFs in function RTEs must be at top level (annoying restriction)
  SELECT * FROM int4mul(generate_series(1,2), 10);
  ERROR:  set-valued function called in context that cannot accept a set
+ LINE 1: SELECT * FROM int4mul(generate_series(1,2), 10);
+                               ^
  -- DISTINCT ON is evaluated before tSRF evaluation if SRF is not
  -- referenced either in ORDER BY or in the DISTINCT ON list. The ORDER
  -- BY reference can be implicitly generated, if there's no other ORDER BY.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

On 2017-04-17 19:26:11 -0400, Tom Lane wrote:
> I wrote:
> > I'm a bit inclined to agree with the idea of explicitly requiring SRFs
> > in FROM to appear only at the top level of the expression.
> 
> If we are going to go down this road, I think it would be a good idea
> to try to provide a cursor position for the "can't accept a set" error
> message, because otherwise it will be really unclear what's wrong with
> something like

That sounds like a good plan.


> Now that this infrastructure exists, anything that has access to a
> PlanState or ExprContext, plus a parse node containing a location, is able
> to report an error cursor.

I've wished for that ability for a while...


> It took a considerable effort of will not to
> start plastering error position reports on a lot of other executor errors.
> I think we should do that, but it smells new-feature-y, and hence
> something to tackle for v11 not now.  But if v10 is moving the goalposts
> on where you can put an SRF call, I think we should do this much in v10.

FWIW, I'd be ok with relaxing things a bit around this, but let's get
the minimal thing in first.


> Naming note: a name like ExecErrPosition() would have been more consistent
> with the other stuff that execUtils.c exports.  But since this is meant
> to be used in ereport() calls, whose support functions generally aren't
> camel-cased, I thought executor_errposition() was a better choice.
> I'm not particularly set on that though.

Seems slightly better this way.


Looks good to me.


I'm working on a patch that adds a few more tests around the current
behaviour and that updates the code to match what we now know.

I couldn't find any place in the docs that actually documents our SRF
behaviour in any sort of detail ([1], [2]), particularly not documenting
where SRFs are legal, and how nested SRFs are supposed to behave.  I'm
not sure in how much detail we want to go?  If we want do document that,
where?  The closest seems to be "4.2.14. Expression Evaluation Rules"
[3].

Greetings,

Andres Freund

[1] https://www.postgresql.org/docs/devel/static/sql-select.html#sql-from
[2] https://www.postgresql.org/docs/devel/static/queries-table-expressions.html#queries-from
[3] https://www.postgresql.org/docs/devel/static/sql-expressions.html#syntax-express-eval



Andres Freund <andres@anarazel.de> writes:
> On 2017-04-17 19:26:11 -0400, Tom Lane wrote:
>> If we are going to go down this road, I think it would be a good idea
>> to try to provide a cursor position for the "can't accept a set" error
>> message, because otherwise it will be really unclear what's wrong with
>> something like

> Looks good to me.

Thanks for reviewing.

I did some further testing and noted that plperl and pltcl fail to pass
through the error cursor, although plpgsql and plpython seem OK.  That's
a pre-existing issue though --- they don't pass through plain syntax
error cursors either.  I'm fine with leaving that for later.  Otherwise,
it seemed to work, so pushed.

> I couldn't find any place in the docs that actually documents our SRF
> behaviour in any sort of detail ([1], [2]), particularly not documenting
> where SRFs are legal, and how nested SRFs are supposed to behave.  I'm
> not sure in how much detail we want to go?  If we want do document that,
> where?  The closest seems to be "4.2.14. Expression Evaluation Rules"
> [3].

Looks like you didn't notice xfunc.sgml?  There's a large amount of info
there, particularly section 37.4.8:

https://www.postgresql.org/docs/devel/static/xfunc-sql.html#xfunc-sql-functions-returning-set

I've never been totally happy with this presentation, since (a) it's
buried pretty far in the back of the manual, and (b) structurally it
looks like it applies only to SQL-language functions, despite the
disclaimer that says it applies to all languages.  Still, right now
is probably not the time to do massive docs surgery, and in any case
I'm not sure that bringing all that detail forward into 4.2 would
represent an improvement.  Maybe a link or two from chapter
4 would be the ticket.
        regards, tom lane