Thread: Re: [COMMITTERS] pgsql: Add notion of a "transform function" that can simplify function

Robert Haas <rhaas@postgresql.org> writes:
> Add notion of a "transform function" that can simplify function calls.

Why exactly was this thought to be a good idea:

> * A NULL original expression disables use of transform functions while
> * retaining all other behaviors.

AFAICT that buys nothing except to greatly complicate the API
specification for simplify_function, something that is now proving
problematic for Marti's requested refactoring [1].  If it's
inappropriate for a transform function to modify a CoerceViaIO call,
surely the transform function can be expected to know that.
        regards, tom lane

[1] http://archives.postgresql.org/pgsql-hackers/2012-03/msg00694.php


On Fri, Mar 23, 2012 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <rhaas@postgresql.org> writes:
>> Add notion of a "transform function" that can simplify function calls.
>
> Why exactly was this thought to be a good idea:
>
>> * A NULL original expression disables use of transform functions while
>> * retaining all other behaviors.
>
> AFAICT that buys nothing except to greatly complicate the API
> specification for simplify_function, something that is now proving
> problematic for Marti's requested refactoring [1].  If it's
> inappropriate for a transform function to modify a CoerceViaIO call,
> surely the transform function can be expected to know that.

I assumed that we were merely trying to avoid forcing the caller to
provide the expression tree if they didn't have it handy, and that the
comment was merely making allowance for the fact that someone might
want to do such a thing.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


On Fri, Mar 23, 2012 at 10:55:52AM -0400, Tom Lane wrote:
> Robert Haas <rhaas@postgresql.org> writes:
> > Add notion of a "transform function" that can simplify function calls.
> 
> Why exactly was this thought to be a good idea:
> 
> > * A NULL original expression disables use of transform functions while
> > * retaining all other behaviors.

We last spoke of that idea here, albeit in minimal detail:
http://archives.postgresql.org/pgsql-hackers/2011-06/msg00918.php

> AFAICT that buys nothing except to greatly complicate the API
> specification for simplify_function, something that is now proving
> problematic for Marti's requested refactoring [1].  If it's
> inappropriate for a transform function to modify a CoerceViaIO call,
> surely the transform function can be expected to know that.

I did it that way because it looked wrong to pass the same CoerceViaIO node to
transforms of both the input and output functions.  Thinking about it again
now, doing so imposes no fundamental problems.  Feel welcome to change it.


Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, Mar 23, 2012 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Why exactly was this thought to be a good idea:
>> 
>>> * A NULL original expression disables use of transform functions while
>>> * retaining all other behaviors.

> I assumed that we were merely trying to avoid forcing the caller to
> provide the expression tree if they didn't have it handy, and that the
> comment was merely making allowance for the fact that someone might
> want to do such a thing.

How would they not have the original expression tree handy?

But now that I'm looking at this ... the API specification for transform
functions seems rather thoroughly broken anyway.  Why are we passing the
original expression and nothing else?  This would appear to require the
transform function to repeat all the input-normalization and
simplification work done up to this point.  It would seem to me to be
more useful to pass the fully-processed argument list.  I've not looked
yet at the existing transform functions, but why would they want to know
about the original node at all?
        regards, tom lane


Noah Misch <noah@leadboat.com> writes:
> On Fri, Mar 23, 2012 at 10:55:52AM -0400, Tom Lane wrote:
>> Why exactly was this thought to be a good idea:
>> 
>>> * A NULL original expression disables use of transform functions while
>>> * retaining all other behaviors.

> I did it that way because it looked wrong to pass the same CoerceViaIO node to
> transforms of both the input and output functions.  Thinking about it again
> now, doing so imposes no fundamental problems.  Feel welcome to change it.

Oh, I see your point --- it's not obvious whether the current transform
is meant for the input or the output function.  Which is a very good
point.  In principle the transform function could figure out which end
of that it must be, but it would be ugly.

However, see my response to Robert: why are we passing the original node
to the transform function at all?  It would be more useful and easier to
work with to pass the function's fully-processed argument list, I believe.
        regards, tom lane


On Fri, Mar 23, 2012 at 11:31:54AM -0400, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Fri, Mar 23, 2012 at 10:55 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Why exactly was this thought to be a good idea:
> >> 
> >>> * A NULL original expression disables use of transform functions while
> >>> * retaining all other behaviors.
> 
> > I assumed that we were merely trying to avoid forcing the caller to
> > provide the expression tree if they didn't have it handy, and that the
> > comment was merely making allowance for the fact that someone might
> > want to do such a thing.
> 
> How would they not have the original expression tree handy?
> 
> But now that I'm looking at this ... the API specification for transform
> functions seems rather thoroughly broken anyway.  Why are we passing the
> original expression and nothing else?  This would appear to require the
> transform function to repeat all the input-normalization and
> simplification work done up to this point.  It would seem to me to be
> more useful to pass the fully-processed argument list.  I've not looked
> yet at the existing transform functions, but why would they want to know
> about the original node at all?

You suggested[1] passing an Expr instead of an argument list, and your reasons
still seem good to me.  That said, perhaps we should send both the original
Expr and the simplified argument list.  That will help if we ever want to
fully simplify x - y * 0.  (Then again, the feature is undocumented and we
could change it when that day comes.)

[1] http://archives.postgresql.org/pgsql-hackers/2011-06/msg00915.php

The existing transform functions are trivial and could survive on nearly any
API we might consider.  See varchar_transform().


I wrote:
> However, see my response to Robert: why are we passing the original node
> to the transform function at all?  It would be more useful and easier to
> work with to pass the function's fully-processed argument list, I believe.

After a bit of looking around, I realize that the current implementation
of transform functions is flat-out wrong, because whenever a transform
actually fires, it proceeds to throw away all the work that
eval_const_expressions has done on the input, and instead return some
lightly-modified version of the original node tree.  Thus for example
in the regression database:

regression=# create function foo(x float8, y int) returns numeric as
regression-# 'select ($1 + $2)::numeric' language sql;
CREATE FUNCTION

regression=# select "numeric"(foo(y := 1, x := f1), -1) from float8_tbl;
ERROR:  unrecognized node type: 310

since the adjustment of foo's named arguments is thrown away.

So this patch is going to need some work.  I continue to not see any
particular reason why the transform function should need the original
node tree.  I think what it *should* be getting is the OID of the
function (currently, it's impossible for one transform to serve more
than one function, which seems like it might be useful); the input
collation (currently, transforms are basically unusable for any
collation-sensitive function), and the pre-simplified argument list.
        regards, tom lane


Noah Misch <noah@leadboat.com> writes:
> On Fri, Mar 23, 2012 at 11:31:54AM -0400, Tom Lane wrote:
>> ... I've not looked
>> yet at the existing transform functions, but why would they want to know
>> about the original node at all?

> You suggested[1] passing an Expr instead of an argument list, and your reasons
> still seem good to me.  That said, perhaps we should send both the original
> Expr and the simplified argument list.  That will help if we ever want to
> fully simplify x - y * 0.  (Then again, the feature is undocumented and we
> could change it when that day comes.)

I believe what I had in mind back then was that we'd build a new FuncExpr
containing the simplified argument list.  On reflection that's probably
the most future-proof way to do it, since otherwise anytime we change
the contents of FuncExpr, we'll be faced with possibly having to change
the signature of protransform functions.

Will go see what I can do with that.
        regards, tom lane