BTW, a couple of thoughts came to me after considering this
issue awhile longer:

1. The really fundamental problem is that we don't update
SpecialJoinInfo's semi_operators/semi_rhs_exprs after discovering that
join-level comparisons of a particular value are unnecessary. We will
then not emit the actual join clause ("t1.a = t4.a" in this example),
but we still list t4.a as something that would need unique-ification.

I looked a little bit at whether it'd be reasonable to do that at the
end of construction of EquivalenceClasses, but I think it'd add a lot
more planning cycles than my v3 patch. A more invasive idea could be
to not compute semi_operators/semi_rhs_exprs at all until we've
finished building EquivalenceClasses. I'm not sure if that'd be
adequately cheap either, but maybe it would be.
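To make idea 1 a bit more concrete, here is a toy sketch in Python.
It is not planner code: ToySemiJoinInfo, ToyEquivalenceClass and
prune_semi_rhs are hypothetical names, and it assumes the simplest
case, where a join-level comparison is unnecessary because the RHS
expression sits in an equivalence class that contains a constant.

```python
from dataclasses import dataclass, field


@dataclass
class ToySemiJoinInfo:
    """Hypothetical stand-in for the semijoin fields of SpecialJoinInfo."""
    semi_operators: list = field(default_factory=list)
    semi_rhs_exprs: list = field(default_factory=list)


@dataclass
class ToyEquivalenceClass:
    """Hypothetical stand-in for an EquivalenceClass."""
    members: set
    has_const: bool = False  # True if some member is a constant


def prune_semi_rhs(sjinfo, eclasses):
    """Return a copy of sjinfo without RHS exprs that need no join clause.

    Toy assumption: the comparison is unnecessary exactly when the RHS
    expression belongs to an equivalence class containing a constant.
    """
    kept_ops, kept_rhs = [], []
    for op, rhs in zip(sjinfo.semi_operators, sjinfo.semi_rhs_exprs):
        pinned = any(ec.has_const and rhs in ec.members for ec in eclasses)
        if not pinned:
            kept_ops.append(op)
            kept_rhs.append(rhs)
    return ToySemiJoinInfo(kept_ops, kept_rhs)


# t4.a is equated to a constant, so "t1.a = t4.a" would not be emitted
# as a join clause; t4.b still needs a join-level comparison.
sj = ToySemiJoinInfo(["=", "="], ["t4.a", "t4.b"])
ecs = [
    ToyEquivalenceClass({"t1.a", "t4.a", "1"}, has_const=True),
    ToyEquivalenceClass({"t1.b", "t4.b"}),
]
pruned = prune_semi_rhs(sj, ecs)
print(pruned.semi_rhs_exprs)  # ['t4.b']
```

In this model only t4.b would remain a candidate for unique-ification.
The real question raised above, how cheaply such a pass could run once
EquivalenceClasses are complete, is not something this sketch answers.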

2. Another thing that's a bit worrisome is the recognition that
a3179ab69's logic for reconstruction of attr_needed might produce
different results than we had to begin with. It's not necessarily
worse: in the particular case we're considering here, the results
are arguably better. But it's scary to think that if there are
any other bugs in that code, we will only find them in queries that
use join removal. That's not a recipe for thorough test coverage.

I wonder if it could be sane to delete the existing logic that builds
attr_needed during initial jointree deconstruction, and instead
always fill attr_needed by running the new "reconstruction" logic.
The trouble with that is we'd have to do it just before starting
join removal, and then again after each successful join removal,
since join removal itself depends on the attr_needed results.
It seems likely that that would net out slower than the current
code. But maybe the difference would be small.
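The shape of that "always reconstruct" loop can be sketched as a toy
model in Python. Everything here is hypothetical: attr_needed is
reduced to a map from (rel, attr) to the set of consumers, and a rel
counts as removable when nothing but its own join clause uses its
attributes. The real join removal has further conditions, such as
proving uniqueness, which this ignores.

```python
def build_attr_needed(uses):
    """Rebuild the toy attr_needed map from scratch.

    uses: list of (consumer, rel, attr) triples.
    Returns {(rel, attr): set of consumers}.
    """
    needed = {}
    for consumer, rel, attr in uses:
        needed.setdefault((rel, attr), set()).add(consumer)
    return needed


def consumers_of(needed, rel):
    """Collect every consumer of any attribute of rel."""
    result = set()
    for (r, _attr), consumers in needed.items():
        if r == rel:
            result |= consumers
    return result


def remove_useless_joins(uses, candidates):
    """candidates: {rel: name of the join clause that brings rel in}."""
    removed = []
    needed = build_attr_needed(uses)  # once, just before join removal
    rebuilds = 1
    progress = True
    while progress:
        progress = False
        for rel, own_join in candidates.items():
            if rel in removed:
                continue
            # Toy criterion: only the rel's own join clause uses it.
            if consumers_of(needed, rel) <= {own_join}:
                uses = [u for u in uses if u[0] != own_join]
                removed.append(rel)
                needed = build_attr_needed(uses)  # again after each removal
                rebuilds += 1
                progress = True
                break
    return removed, needed, rebuilds


uses = [
    ("output", "t1", "a"),
    ("join_t2", "t1", "a"), ("join_t2", "t2", "a"),
    ("join_t3", "t2", "b"), ("join_t3", "t3", "b"),
]
candidates = {"t2": "join_t2", "t3": "join_t3"}
removed, needed, rebuilds = remove_useless_joins(uses, candidates)
print(removed, rebuilds)
```

Here t3 is removed first, which frees t2 for removal on the next pass,
so the map is rebuilt three times: once up front and once per
successful removal. That repeated rebuild is the cost in question.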

I'm not planning to work on either of these ideas right now,
but I thought I'd put them out there in case someone else is
interested.

regards, tom lane