Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno
Date
Msg-id 23380.1576532442@sss.pgh.pa.us
In response to Re: Clarifying/rationalizing Vars' varno/varattno/varnoold/varoattno  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Dec 16, 2019 at 12:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'd like, in order to make progress with the planner rewrite,
>> is that all four Vars in the tlist have varno 3, showing that
>> they are (potentially) semantically distinct from the Vars in
>> the JOIN ON clause (which'd have varnos 1 and 2 in this example).

> I don't have an opinion about the merits of this change, but I'm
> curious how this manages to work. It seems like there would be a fair
> number of places that needed to map the join alias var back to some
> baserel that can supply it. And it seems like there could be multiple
> levels of join alias vars as well, since you could have joins nested
> inside of other joins, possibly with subqueries involved.

Sure.  Right now, we smash join aliases down to the ultimately-referenced
base vars early in planning (see flatten_join_alias_vars).  After the
patch that I'm proposing right now, that would continue to be the case,
so there'd be little change in most of the planner from this.  However,
the later changes that I speculated about in the other thread would
involve delaying that smashing in cases where the join output value is
possibly different from the input value, so that we would have a clear
representational distinction between those things, something we lack
today.
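
To make the "smashing" concrete, here is a minimal toy model in Python. It is not the code of flatten_join_alias_vars, only a sketch of the idea. The Var and RangeTblEntry classes, the flatten_join_alias_var helper, and the five-entry range table are all made up for illustration, and the sketch assumes every join alias is a plain Var, which need not hold in the real planner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    varno: int     # 1-based index into the range table
    varattno: int  # 1-based column number within that entry


@dataclass
class RangeTblEntry:
    kind: str                  # "relation" or "join" (toy distinction)
    joinaliasvars: tuple = ()  # for joins: one input Var per output column


def flatten_join_alias_var(var, rtable):
    """Follow join alias Vars down to the base-relation Var they refer to."""
    rte = rtable[var.varno - 1]
    while rte.kind == "join":
        var = rte.joinaliasvars[var.varattno - 1]
        rte = rtable[var.varno - 1]
    return var


# Hypothetical range table:
#   1 = t1(a, b), 2 = t2(a, b), 3 = t1 JOIN t2,
#   4 = t3(a),    5 = (join 3) JOIN t3
rtable = [
    RangeTblEntry("relation"),
    RangeTblEntry("relation"),
    RangeTblEntry("join", (Var(1, 1), Var(1, 2), Var(2, 1), Var(2, 2))),
    RangeTblEntry("relation"),
    RangeTblEntry("join", (Var(3, 1), Var(3, 2), Var(3, 3), Var(3, 4), Var(4, 1))),
]

# Column 3 of the outer join resolves through join 3 to t2.a.
print(flatten_join_alias_var(Var(5, 3), rtable))  # Var(varno=2, varattno=1)
```

The loop handles nested joins by following one level of alias at a time. Note that after flattening, nothing records that the value passed through a join at all, which is the loss of information at issue when the join can null the value.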

> At some point I had the idea that it might make sense to have
> equivalence classes that had both a list of full members (which are
> exactly equivalent) and nullable members (which are either equivalent
> or null).

Yeah, this is another way that you might get at the problem, but it
seems to me it's not really addressing the fundamental squishiness.
If the "nullable members" might be null, then what semantics are
you promising exactly?  You certainly haven't got anything that
defines a sort order for them.

> I'm not sure whether that idea is of any practical use,
> though. It does seem strange to me that the representation you are
> proposing gets at the question only indirectly. The nullable version
> of the Var has got a different varno and varattno than the
> non-nullable version of the Var, but other than that there's no
> connection between them. How do you go about matching those together?

You'd have to look into the join's joinaliasvars list (or more likely,
some new planner data structure derived from that) to discover that
there's any connection.  That seems fine to me, because AFAICS
relatively few places would need to do that.  It's certainly better
than using a representation that suggests that two values are the same
when they're not.  (TBH, I've spent the last dozen years waiting for
someone to come up with an example that completely breaks equivalence
classes, if not our entire approach to outer joins.  So far we've been
able to work around every case, but we've sometimes had to give up on
optimizations that would be nice to have.)
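
As a rough illustration of what such a derived lookup might look like, the toy Python sketch below builds a reverse index from each join input Var to the join-output Vars that alias it. Everything here is hypothetical: the Var tuple, build_alias_index, and the layout of the joins dictionary are invented for the example and do not correspond to any existing or proposed planner structure.

```python
from collections import defaultdict
from typing import NamedTuple


class Var(NamedTuple):
    varno: int     # 1-based range table index
    varattno: int  # 1-based column number


def build_alias_index(joins):
    """Map each input Var to the join-output Vars that alias it.

    joins: {join_varno: [input Var for output column 1, 2, ...]}
    """
    index = defaultdict(list)
    for join_varno, aliasvars in joins.items():
        for attno, input_var in enumerate(aliasvars, start=1):
            index[input_var].append(Var(join_varno, attno))
    return index


# Hypothetical join with varno 3 over base relations 1 and 2.
joins = {3: [Var(1, 1), Var(1, 2), Var(2, 1), Var(2, 2)]}
index = build_alias_index(joins)

# The two Vars compare as different; only the index connects them.
assert Var(2, 1) != Var(3, 3)
print(index[Var(2, 1)])  # [Var(varno=3, varattno=3)]
```

The point of the sketch is that the base Var and its possibly-nulled join output stay distinct by default, and only code that explicitly consults the index treats them as related.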

A related example that is bugging me is that the grouping-sets patch
broke the meaning of Vars that represent post-grouping values ---
there again, the value might have gone to null as a result of grouping,
but you can't tell it apart from values that haven't.  I think this is
less critical because such Vars can't appear in FROM/WHERE so they're
of little interest to most of the planner, but we've still had to put
in kluges like 90947674f because of that.  We might be well advised
to invent some join-alias-like mechanism for those.  (I have a vague
memory now that Gierth wanted to do something like that and I
discouraged it because it was unlike the way we did outer joins ...
so he was right, but what we should have done was fix outer joins not
double down on the kluge.)

> I guess varnoold/varoattno can do the trick, but if that's only being
> used by ruleutils.c then there must be some other mechanism.

Actually, they're nothing but debug support currently --- ruleutils
doesn't use them either.  It's possible that this change would allow
ruleutils to save cycles in a lot of cases by not having to drill down
through subplans to identify the ultimate referent of upper-plan Vars.
But I haven't investigated that yet.

            regards, tom lane


