Re: Improving planner variable handling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improving planner variable handling
Msg-id 16583.1208982861@sss.pgh.pa.us
In response to Re: Improving planner variable handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> After further thought about this, and about the new "ForceToNull"
> expression node that I was suggesting, I have a more radical proposal
> in mind: let's get rid of join alias variables, instead of expanding
> their use.

I've been studying this idea more, and it seems workable and useful, but
as always there are a few rough edges.

One tricky area arises when we rearrange outer joins using the
third outer-join identity:

    (A leftjoin B on (Pab)) leftjoin C on (Pbc)
is equivalent to
    A leftjoin (B leftjoin C on (Pbc)) on (Pab)
if Pbc is strict for at least one column of B
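
For readers who want to see the identity in action, here is a minimal sketch that checks it on toy data. Everything in it is an assumption made for illustration: the tables A, B, C, their column names, the helper left_join, and the two sample predicates are hypothetical and have nothing to do with the planner's actual code. A predicate counts as matching only when it returns True, which mimics SQL's treatment of NULL results.

```python
def sql_eq(x, y):
    """SQL-style equality: the result is unknown (None) if either side is NULL."""
    return None if x is None or y is None else x == y

def left_join(left, right, pred, right_cols):
    """Left outer join over lists of dict rows (toy helper, not planner code)."""
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if pred({**l, **r}) is True]
        # An unmatched left row is kept, null-extended over the right-hand columns.
        out.extend(matches or [{**l, **dict.fromkeys(right_cols)}])
    return out

# Hypothetical sample data.
A = [{"a_id": 1}, {"a_id": 2}]
B = [{"b_id": 10, "b_aid": 1}]
C = [{"c_bid": 10}, {"c_bid": None}]
B_COLS, C_COLS = ["b_id", "b_aid"], ["c_bid"]

def p_ab(row):
    return sql_eq(row["a_id"], row["b_aid"])

def p_bc_strict(row):
    # Strict in b_id: cannot be true when b_id is NULL.
    return sql_eq(row["b_id"], row["c_bid"])

def p_bc_loose(row):
    # Not strict in any column of B: true when b_id is NULL.
    return row["b_id"] is None or sql_eq(row["b_id"], row["c_bid"])

def form1(p_bc):
    return left_join(left_join(A, B, p_ab, B_COLS), C, p_bc, C_COLS)

def form2(p_bc):
    return left_join(A, left_join(B, C, p_bc, C_COLS), p_ab, B_COLS + C_COLS)

def canon(rows):
    """Order-independent representation of a result set."""
    return sorted((tuple(sorted(r.items())) for r in rows), key=repr)

print(canon(form1(p_bc_strict)) == canon(form2(p_bc_strict)))
print(canon(form1(p_bc_loose)) == canon(form2(p_bc_loose)))
```

With the strict predicate the two forms return the same rows. With the non-strict one, form 1 lets the null-extended B row for a_id = 2 match every C row, while form 2 never sees that row when evaluating Pbc, so the results differ.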
 

The problem here is that in form 2, Pbc's references to columns of B
would presumably be plain Vars, since there can be no need to force
them to null before Pbc is evaluated.  But in form 1 there had better
be ForceToNull nodes referencing the A/B join.  Conversely, if form 1
was what was originally entered, the parser would emit ForceToNull
nodes atop the B Vars in Pbc, but these are unnecessary if we implement
it as form 2.  I'm not too worried about wasting a few cycles to fall
through a useless ForceToNull node at runtime, but there's a bigger
problem here: if expressions that are really semantically equivalent
might or might not contain ForceToNull nodes, that's likely to get in
the way of planner optimization activities.  One of the things I was
hoping to get out of this was that Var-plus-ForceToNull trees would
be equal() if and only if semantically equivalent, and that property
seems to be slipping away.
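
To make the equal() concern concrete, here is a rough sketch using toy expression trees. The classes Var, ForceToNull and Eq, the "A/B" join label, and the strip_force_to_null helper are all hypothetical stand-ins and not the real node definitions; dataclass equality plays the role of equal(). The helper is not a proposed fix, it is there only to show that the two trees differ in nothing but the wrapper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    rel: str
    col: str

@dataclass(frozen=True)
class ForceToNull:
    arg: object
    join: str  # label of the outer join that may null-extend arg (assumed field)

@dataclass(frozen=True)
class Eq:
    left: object
    right: object

# Pbc as it might be emitted when the query is entered as form 1:
# B's column is referenced above the A/B join, so it gets wrapped.
pbc_form1 = Eq(ForceToNull(Var("B", "x"), join="A/B"), Var("C", "y"))

# Pbc as it might be emitted when the query is entered as form 2:
# B's column is a plain Var, since nothing has nulled it yet.
pbc_form2 = Eq(Var("B", "x"), Var("C", "y"))

def strip_force_to_null(node):
    """Remove every ForceToNull wrapper from a toy tree."""
    if isinstance(node, ForceToNull):
        return strip_force_to_null(node.arg)
    if isinstance(node, Eq):
        return Eq(strip_force_to_null(node.left), strip_force_to_null(node.right))
    return node

print(pbc_form1 == pbc_form2)                       # structural comparison
print(strip_force_to_null(pbc_form1) == pbc_form2)  # same tree once unwrapped
```

The two predicates describe the same join condition under identity 3, yet a purely structural comparison treats them as different expressions.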

Another strange thing that's happening here is that after a
transformation from form 1 to form 2, Pbc would contain ForceToNull
references to a join that's actually above its evaluation point in
the tree.  We could presumably deal with that by decreeing such a
thing to be a no-op, but it seems mighty ugly, as well as being
a rule that would prevent detection of erroneous rearrangements.
And again we are faced with the realization that apparently equal()
trees might not mean the same thing, depending on where in the query
you are looking.
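
The position dependence can be sketched the same way. The following is only an illustration of the "treat it as a no-op" rule under assumed names: Var, ForceToNull, the "A/B" label, and the effective function with its joins_below parameter are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    rel: str
    col: str

@dataclass(frozen=True)
class ForceToNull:
    arg: object
    join: str  # label of the outer join being referenced (assumed field)

def effective(node, joins_below):
    """What a toy tree means at an evaluation point, given the set of joins
    that have been performed below that point."""
    if isinstance(node, ForceToNull):
        inner = effective(node.arg, joins_below)
        if node.join in joins_below:
            return ForceToNull(inner, node.join)
        # The referenced join is above the evaluation point: silently ignore it.
        return inner
    return node

b_x = ForceToNull(Var("B", "x"), join="A/B")

# Form 1: Pbc is evaluated at the upper join, with the A/B join below it.
at_upper_join = effective(b_x, joins_below={"A/B"})

# Form 2: Pbc is evaluated at the B/C join, with the A/B join above it.
at_lower_join = effective(b_x, joins_below=set())

print(at_upper_join == at_lower_join)
```

The same tree yields two different meanings depending on where it sits, and the no-op branch raises no complaint, so a rearrangement that moved the expression to the wrong place would go unnoticed.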

So I'm feeling a bit dissatisfied and wondering whether there isn't
a better way to do this.  Any thoughts out there?
        regards, tom lane

