Re: [BUGS] BUG #14646: performance hint to remove - Mailing list pgsql-bugs

From Tom Lane
Subject Re: [BUGS] BUG #14646: performance hint to remove
Date
Msg-id 12219.1494541963@sss.pgh.pa.us
Whole thread Raw
In response to Re: [BUGS] BUG #14646: performance hint to remove  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-bugs
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> Confirmed 9.6.2 on Ubuntu.  Not sure this is properly classified as a bug
> but its definitely an area where improvement would seem desirable.  I am a
> particularly heavy user of the join predicate "USING" and never really
> thought to look at this dynamic (without the WHERE clause it doesn't
> matter, both tables up end sequentially scanned).

Yeah, if you use USING the parser tries to define the merged join variable
as equivalent to just one or the other input variable.  With an inner join
it'll arbitrarily pick the left input, with a left or right join it will
use the outer-side variable (since the inner-side one might go to NULL).
With a FULL join, the merged variable is actually defined as
COALESCE(left-input, right-input), which is necessary for semantic
correctness but is pretty awful for optimization purposes.

If you put an equality constraint on the merged join variable, everything
works pretty well anyway because of deduction of implied equalities (ie,
we have left-input = right-input from the USING's join clause, and then
if we also know e.g. left-input = 42 we can deduce right-input = 42).
There are some hacks that make that carry through for outer join cases
as well.

The OP's WHERE clause isn't a simple equality so the equivalence-class
machinery doesn't help there.  There's been some speculation about
improving that, but there are a bunch of semantic and practical problems
in the way.  One of the biggest is that we can't simply duplicate the
restriction for each input variable, because that would cause the planner
to misestimate their selectivity and produce bad plans of a different
sort (not knowing that the restrictions are redundant).  The eclass
machinery knows enough to compensate for that, but it's not clear how
to do it for arbitrary restrictions.

Anyway, this isn't a bug but a missed optimization opportunity, and
I would counsel not holding your breath waiting for an improvement.
It's a difficult problem and it's not tremendously high priority.
        regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

pgsql-bugs by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: [BUGS] BUG #14646: performance hint to remove
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] [BUGS] Concurrent ALTER SEQUENCE RESTART Regression