Thread: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andres Freund
Date:
Hi,

In my understanding FunctionCallInfoData is basically per-call data,
whereas FmgrInfo is information about the function.  It makes some sense
that ->context is in FunctionCallInfoData, after all it's used for
per-row data like the trigger context.  But we don't really change the
collation of function invocations per-call.  Thus I don't quite get why
FunctionCallInfoData contains information about it rather than FmgrInfo.

Now I get that there's the practical reason that we want to pass the
collation to function invocations made with DirectionFunctionCall*Coll,
and they currently don't set up a FmgrInfo.  But that doesn't really
convince me this is a good idea.

The reason I'm asking is that I'm fixing the JIT code generation for
expressions to be more efficient. And for functions that do not get
inlined I currently need to set up a separate *per-call*
FunctionCallInfoData to allow the optimizer to be free enough to do it's
thing.  And the less metadata has to be copied around, the better.  I'm
not sure it's realistic to change this, I mostly want to understand the
reasoning.

Greetings,

Andres Freund


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Chapman Flack
Date:
On 06/05/2018 04:57 PM, Andres Freund wrote:
> But we don't really change the
> collation of function invocations per-call. 

Is that true? (Not a rhetorical question; I'm unsure.)

Is your argument that the expression's collation is determined once
for each call /site/, and thereafter doesn't change over repeated
invocations via any one call site, and a unique FmgrInfo is
populated at each call site?

This is at the edges of my current understanding of how those gears
turn, so it could very well be true, but it seems to me that's how
the question would have to be put.

-Chap


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> In my understanding FunctionCallInfoData is basically per-call data,
> whereas FmgrInfo is information about the function.  It makes some sense
> that ->context is in FunctionCallInfoData, after all it's used for
> per-row data like the trigger context.  But we don't really change the
> collation of function invocations per-call.  Thus I don't quite get why
> FunctionCallInfoData contains information about it rather than FmgrInfo.

[squint]  I would say that the call collation is an argument, not a
property of the function, and therefore is correctly located in
FunctionCallInfoData.

It's true that we often abuse fn_extra to hold data that's essentially
call-site-dependent, but I don't think that's a good reason to push
collation into FmgrInfo.

            regards, tom lane


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andres Freund
Date:
On 2018-06-05 17:11:17 -0400, Chapman Flack wrote:
> On 06/05/2018 04:57 PM, Andres Freund wrote:
> > But we don't really change the
> > collation of function invocations per-call. 
> 
> Is that true? (Not a rhetorical question; I'm unsure.)

Yes, it is at the moment.


> Is your argument that the expression's collation is determined once
> for each call /site/, and thereafter doesn't change over repeated
> invocations via any one call site, and a unique FmgrInfo is
> populated at each call site?

Yes, that's pretty much how it works.

Greetings,

Andres Freund


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andres Freund
Date:
Hi,

On 2018-06-06 01:01:49 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > In my understanding FunctionCallInfoData is basically per-call data,
> > whereas FmgrInfo is information about the function.  It makes some sense
> > that ->context is in FunctionCallInfoData, after all it's used for
> > per-row data like the trigger context.  But we don't really change the
> > collation of function invocations per-call.  Thus I don't quite get why
> > FunctionCallInfoData contains information about it rather than FmgrInfo.
> 
> [squint]  I would say that the call collation is an argument, not a
> property of the function, and therefore is correctly located in
> FunctionCallInfoData.

I think it's not unreasonable to think of it that way, but it's really
not how it is used today.  In pretty much all cases the collation is
known and determined at the time fmgr_info() is called (we also commonly
reuse FunctionCallInfoData structs).

And I think that's good, because we'd almost certainly be in trouble if
we didn't - if collation is used for input in data cached into fn_extra,
the caching the code largely doesn't go to the trouble to rebuild that
for a different collation.


> It's true that we often abuse fn_extra to hold data that's essentially
> call-site-dependent, but I don't think that's a good reason to push
> collation into FmgrInfo.

FmgrInfo really *is* call-site dependent, no? fn_extra, fn_mcxt, fn_expr
all are.  I think it's more useful to view the FmgrInfo /
FunctionCallInfo data split as the separation between per-callsite and
per-call data.  And I think it'd make much more sense to officially
treat collation as the former.

Greetings,

Andres Freund


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Peter Eisentraut
Date:
On 6/6/18 09:06, Andres Freund wrote:
>> It's true that we often abuse fn_extra to hold data that's essentially
>> call-site-dependent, but I don't think that's a good reason to push
>> collation into FmgrInfo.
> FmgrInfo really *is* call-site dependent, no? fn_extra, fn_mcxt, fn_expr
> all are.  I think it's more useful to view the FmgrInfo /
> FunctionCallInfo data split as the separation between per-callsite and
> per-call data.  And I think it'd make much more sense to officially
> treat collation as the former.

I think there are really three sets of information: catalog lookup
information, per query/expression information (such as collation), and
per-call information.  But we only have two places to put things, so it
might look a bit odd.

Nothing wrong with considering changes, of course.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Tom Lane
Date:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 6/6/18 09:06, Andres Freund wrote:
>> FmgrInfo really *is* call-site dependent, no? fn_extra, fn_mcxt, fn_expr
>> all are.  I think it's more useful to view the FmgrInfo /
>> FunctionCallInfo data split as the separation between per-callsite and
>> per-call data.  And I think it'd make much more sense to officially
>> treat collation as the former.

> I think there are really three sets of information: catalog lookup
> information, per query/expression information (such as collation), and
> per-call information.  But we only have two places to put things, so it
> might look a bit odd.

> Nothing wrong with considering changes, of course.

Well, I still say this is an ad-hoc, unprincipled semantics change.
If we're going to buy into FmgrInfo being call site dependent,
we should go all the way.  Why not move nargs there (or get rid of it,
isn't it redundant with fn_nargs?), and possibly the context pointer?

            regards, tom lane


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andres Freund
Date:
Hi,

On 2018-06-06 11:17:33 -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > On 6/6/18 09:06, Andres Freund wrote:
> >> FmgrInfo really *is* call-site dependent, no? fn_extra, fn_mcxt, fn_expr
> >> all are.  I think it's more useful to view the FmgrInfo /
> >> FunctionCallInfo data split as the separation between per-callsite and
> >> per-call data.  And I think it'd make much more sense to officially
> >> treat collation as the former.
> 
> > I think there are really three sets of information: catalog lookup
> > information, per query/expression information (such as collation), and
> > per-call information.  But we only have two places to put things, so it
> > might look a bit odd.
> 
> > Nothing wrong with considering changes, of course.
> 
> Well, I still say this is an ad-hoc, unprincipled semantics change.

> If we're going to buy into FmgrInfo being call site dependent,
> we should go all the way.  Why not move nargs there (or get rid of it,
> isn't it redundant with fn_nargs?), and possibly the context pointer?

Currently fn_nargs and nargs aren't quite the same unfortunately. The
latter is different for functions with variable arguments. But it's
still per-callsite rather than per-call.  I agree that we, if we do
anything, make a bigger redesign. I just had a handle on nargs, but not
really on fncollation...

I don't think context necessarily is per-callsite, we use it for things
like the trigger context which changes per-row.  I think there's still a
fair argument to be made to change it to move it to FmgrInfo, but it's
not quite as clear semantically.

One way to address the performance issue of having to set all these up
on every call would be to introduce a 'flags' field to FmgrInfo. Unless
FMGR_{CONTEXT,RESULTINFO,...} bits are set in it, the corresponding
FunctionCallInfoData fields would have undefined values and not be set.
I'm doubtful that's worth it with just context, resultinfo getting that
treatment, but if we ever were to extend that...

Greetings,

Andres Freund


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andrew Gierth
Date:
>>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:

 Andres> I think it's not unreasonable to think of it that way, but it's
 Andres> really not how it is used today. In pretty much all cases the
 Andres> collation is known and determined at the time fmgr_info() is
 Andres> called (we also commonly reuse FunctionCallInfoData structs).

The obvious case which is not one of those "pretty much all cases" is
where DirectFunctionCallN[Coll] is used - which turns out to be not all
that unusual.

-- 
Andrew (irc:RhodiumToad)


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Andres Freund
Date:
Hi,

On 2018-06-06 21:25:14 +0100, Andrew Gierth wrote:
> >>>>> "Andres" == Andres Freund <andres@anarazel.de> writes:
> 
>  Andres> I think it's not unreasonable to think of it that way, but it's
>  Andres> really not how it is used today. In pretty much all cases the
>  Andres> collation is known and determined at the time fmgr_info() is
>  Andres> called (we also commonly reuse FunctionCallInfoData structs).
> 
> The obvious case which is not one of those "pretty much all cases" is
> where DirectFunctionCallN[Coll] is used - which turns out to be not all
> that unusual.

There the callsite just lives for just one call, I don't really see that
being an exception?

Greetings,

Andres Freund


Re: Why is fncollation in FunctionCallInfoData rather than fmgr_info?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2018-06-06 21:25:14 +0100, Andrew Gierth wrote:
>> The obvious case which is not one of those "pretty much all cases" is
>> where DirectFunctionCallN[Coll] is used - which turns out to be not all
>> that unusual.

> There the callsite just lives for just one call, I don't really see that
> being an exception?

The problem with your proposal, for that case, is that there's no FmgrInfo
to put the collation into.

            regards, tom lane