Re: Lambda expressions (was Re: BUG #15471) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Lambda expressions (was Re: BUG #15471)
Date
Msg-id 20181030193720.e6z5wvqaargywi76@alap3.anarazel.de
Whole thread Raw
In response to Lambda expressions (was Re: BUG #15471)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Lambda expressions (was Re: BUG #15471)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2018-10-30 15:04:55 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2018-10-30 09:30:54 -0400, Tom Lane wrote:
> >> This is all some more fuel for the idea that we need a less messy
> >> substitute for CaseTestExpr.  As it happens, I was just fooling with
> >> that yesterday, and hope to have something to post soon.  But it'll
> >> be too invasive to back-patch.  I'll try to fix this particular
> >> problem more locally.
> 
> > Yea, I agree we need something better there.  I happened to also play
> > around with that (more from the angle that we'd want similar
> > infrastructure for the SQL function inlining case we talked about) on
> > the long flights back from .eu yesterday, but didn't get that far.
> 
> Yeah, that conversation was what spurred me to do something.  The patch
> as I have it right now only fixes (most of) the inlining-failure cases.
> I've been thinking about how to reuse the mechanism to get rid of
> CaseTestExpr, but am running into some issues.
> 
> The core idea that I'm working on is to invent a new node type
> LambdaExpr that evaluates an expression and substitutes it as a
> Param into another expression, notationally sort of like
> 
>     LET($n := expression1 IN ... result expression using $n ...)

Right, that's what I was thinking about as well (except I called it a
WithValues node).


> and then the different values can be distinguished by having
> different Param numbers.  These Params have a different ParamKind
> from PARAM_EXTERNAL or PARAM_EXEC, because there's too many
> assumptions about the behavior of those ParamKinds, but otherwise
> they're not that much different.

Right, that makes sense.

How did you deal with the fact that we might extract subset of the LET()
into e.g. a RestrictionInfo (and then e.g. an IndexPath), but another
part would be e.g. evaluated as part of a qual?  I got to the point
where I allocated a bunch of memory for a set of Param nodes (e.g. from
a single inlined function) and abandoned having something like the
WithValues node - and that memory was directly referenced by the new
Param nodes.  But I didn't really make that work outside of the simplest
cases.


> The stumbling block that I'm seeing in terms of making parser output
> use this is that if, say, each CaseExpr has a distinct Param number
> for what had been its CaseTestExpr, then textually and logically
> equivalent CASE constructs might not compare equal() because of
> different Param numbers.  And that's really bad news for a bunch
> of reasons, see e.g. recent bugs with DDL on partition tables.
> But we can't have equal() ignore the Param numbers without
> re-introducing all the same issues we have with CaseTestExpr.

I never got this far.  I think it'd be worthwhile to try to have a
better parse time representation too, we do a number of somewhat ugly
contortions to make that work.

I wonder a bit if we should have Param nodes of the LET kind refer to
their "parent" LET node, and have something paramlevelsup or letlevelsup
(similar to varlevelsup) type logic, where each LET node has a link to
its parent. The numbering for each level would start at 0, solving the
issue of breaking equal() in most cases. But that seems awfully
complicated.

Alternatively I wonder if there's really a need to have a Param number
for this new kind of Param. All it really needs is a way to 1) refer to
the expression that determines it value 2) a way to avoid repeatedly
evaluate the expression 3) a way to have the cached values reset between
expression evaluations.  Neither of these really require a LET() node,
or Param numbers, right?  It makes sense to have Param numbers out of
symmetry, but perhaps that symmetry isn't actually that strong?


> So what I'm thinking about at the moment is to keep using CaseTestExpr
> in parser output trees, but have the planner replace them with
> (uniquely numbered) Params during eval_const_expressions.  As long
> as we integrate that correctly with function inlining, it should be
> safe and ensure that CASEs coming in from functions can't be confused
> with ones present in the original query (because we'd renumber anything
> in the inlined tree to be distinct from what we had already).  But that
> seems a bit ugly, and it might interfere with late-stage optimizations.

Better than nothing, but I'm not a huge fan...  Could be a large step
towards a larger improvement tho.

Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Lambda expressions (was Re: BUG #15471)
Next
From: Tom Lane
Date:
Subject: Re: Super PathKeys (Allowing sort order through precision loss functions)