Re: Improving planner variable handling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improving planner variable handling
Date
Msg-id 26957.1208566981@sss.pgh.pa.us
In response to Improving planner variable handling  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Improving planner variable handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
I wrote:
> I've been thinking about how to improve the planner's poor handling of
> variables in outer-join situations.
> ...
> I think the basic solution for this is that upper levels of the plan tree
> should refer to the nullable output columns of an outer join using
> "alias Vars" that name the join rel, not the underlying base relation,

After further thought about this, and about the new "ForceToNull"
expression node that I was suggesting, I have a more radical proposal
in mind: let's get rid of join alias variables, instead of expanding
their use.

What I'm now envisioning is that the parser would generate Vars that
always refer to base relations, never to join nodes; and if the
reference appears above an outer join that can null the variable, it would
plaster a ForceToNull node atop the Var.  ForceToNull would carry the
relid (rangetable index) of the RTE_JOIN RTE for the outer join, so that
it expresses exactly which outer join has done the nulling.  Actually,
rather than just an index of one join RTE, ForceToNull should carry a
bitmapset of relids of multiple outer joins, in case we are looking down
through multiple outer joins to the base relation.  The advantage of
doing it that way instead of stacking several ForceToNull nodes is that
the representation doesn't change if we change the order of application
of the outer joins.
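
To illustrate the order-independence point, here is a minimal standalone
sketch in C.  It is not PostgreSQL code: the names SketchVar,
SketchForceToNull, sketch_wrap, sketch_add_join and sketch_equal are
hypothetical, and a uint64_t bitmask stands in for the Bitmapset of join
relids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Var: it always names a base relation. */
typedef struct SketchVar
{
    int varno;      /* rangetable index of the base relation */
    int varattno;   /* attribute number within that relation */
} SketchVar;

/*
 * Hypothetical stand-in for the proposed ForceToNull node.  The bitmask
 * plays the role of the Bitmapset of RTE_JOIN rangetable indexes (this
 * sketch only supports indexes 0..63).
 */
typedef struct SketchForceToNull
{
    SketchVar arg;            /* the underlying base-relation Var */
    uint64_t  nulling_joins;  /* set of outer joins that can null it */
} SketchForceToNull;

/* Wrap a Var in a node with an initially empty set of nulling joins. */
static SketchForceToNull
sketch_wrap(SketchVar var)
{
    SketchForceToNull f = { var, 0 };
    return f;
}

/* Record that the outer join with rangetable index rti can null the Var. */
static void
sketch_add_join(SketchForceToNull *f, int rti)
{
    assert(rti >= 0 && rti < 64);
    f->nulling_joins |= UINT64_C(1) << rti;
}

/* Structural comparison, in the spirit of equal(). */
static bool
sketch_equal(const SketchForceToNull *a, const SketchForceToNull *b)
{
    return a->arg.varno == b->arg.varno &&
           a->arg.varattno == b->arg.varattno &&
           a->nulling_joins == b->nulling_joins;
}
```

Because the joins are kept as a set, adding join 3 and then join 5 gives a
node that compares equal to one built by adding 5 and then 3.  With stacked
single-join nodes the two nestings would be structurally different.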

In this approach ForceToNull is carried all the way through the
parsing/planning process, rather than being inserted on-the-fly in
some late stage of the planner.  At the last stage of the planner
(setrefs.c) we could mark it or modify it in join plan nodes to
let the executor know which side of the join needs to be checked
to decide whether to null at execution time.
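
One way the execution-time check could look, as a standalone C sketch.
Everything here is an assumption made for illustration: the names
SketchJoinSide, SketchJoinRow, SketchPlannedForceToNull and
sketch_eval_force_to_null are hypothetical, and the real executor
interfaces are not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Which input of the join plan node must be checked (hypothetical). */
typedef enum SketchJoinSide
{
    SKETCH_OUTER_SIDE,
    SKETCH_INNER_SIDE
} SketchJoinSide;

/* A nullable integer value, standing in for a Datum plus null flag. */
typedef struct SketchDatum
{
    bool isnull;
    int  value;
} SketchDatum;

/* Per-row join state: did each input side supply a real tuple? */
typedef struct SketchJoinRow
{
    bool outer_present;
    bool inner_present;
} SketchJoinRow;

/* ForceToNull as the last planner stage might annotate it (assumed). */
typedef struct SketchPlannedForceToNull
{
    SketchJoinSide check_side;
} SketchPlannedForceToNull;

/*
 * Pass the input through unchanged if the checked side produced a tuple
 * for this row; otherwise the row was null-extended, so return NULL.
 */
static SketchDatum
sketch_eval_force_to_null(const SketchPlannedForceToNull *f,
                          const SketchJoinRow *row,
                          SketchDatum input)
{
    bool present = (f->check_side == SKETCH_OUTER_SIDE)
        ? row->outer_present
        : row->inner_present;

    if (!present)
    {
        SketchDatum nullval = { true, 0 };
        return nullval;
    }
    return input;
}
```

For a LEFT JOIN the node would be marked to check the inner side, so an
unmatched outer row yields NULL for every inner-side variable.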

This representation has the nice property that two Var-and-optional-
ForceToNull expression trees are equal() if and only if they are
semantically equivalent --- a property we don't have right now, either
before or after smashing join alias vars to base vars (although it's
worse without doing that, which is why the planner is doing it
currently).  So we don't need flatten_join_alias_vars anymore.

There is one corner case that doesn't fit into this nicely, which is
merged output columns from FULL JOIN USING.  Currently we represent
those as join alias Vars initially, and expand them into
"COALESCE(left-side-var, right-side-var)" during
flatten_join_alias_vars.  We could keep doing that, since the planner
has no great intelligence about FULL JOIN anyway.  But I was hoping
to get rid of the flatten_join_alias_vars pass altogether.  Perhaps
it is worth adding a special parsetree representation for these
things --- I'm imagining something roughly like ForceToNull but with
two inputs not one.  I think the only reason we need a special
representation at all is so that ruleutils.c can decompile it
as a Var reference rather than COALESCE().
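
For concreteness, the value such a two-input node would compute is the same
as COALESCE(left-side-var, right-side-var).  A standalone C sketch, with the
hypothetical names SketchValue and sketch_merged_column, and plain ints
standing in for column values:

```c
#include <assert.h>
#include <stdbool.h>

/* A nullable integer value, standing in for a column value. */
typedef struct SketchValue
{
    bool isnull;
    int  value;
} SketchValue;

/*
 * Merged output column of FULL JOIN USING: take the left input unless it
 * is NULL, in which case take the right input (which may itself be NULL).
 */
static SketchValue
sketch_merged_column(SketchValue left, SketchValue right)
{
    return left.isnull ? right : left;
}
```

The evaluation is identical to COALESCE; a separate node type would exist
only so that decompilation can tell it apart from a user-written COALESCE().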

This representation makes ruleutils.c's decompilation job harder, since
it's no longer clear from inspection of a Var node which RTE entry it
should be displayed with reference to.  (If the base relation is
underneath an aliased JOIN then we *must* reference the JOIN instead,
not the base rel.)  But it's clearly possible, and I'm happy to push
complexity into decompilation if it means savings in the main parse/plan
code path.

Another nice thing is we won't need to widen AttrNumber; in fact, I don't
think we need to permanently store any per-Var data in join RTEs, except
maybe for those darn FULL JOIN USING vars.  Variables' varattnos only
refer to base relations and their range doesn't increase in a join nest.

Comments?
        regards, tom lane

