David Rowley <david.rowley@2ndquadrant.com> writes:
> On 22 May 2017 at 16:10, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> I also just noticed that I don't think I've got ANTI join cases
>> correct in the patch I sent. I'll look at that now.
> I've attached an updated patch.
> This one is much less invasive than my original attempt.
Sorry for not dealing with this sooner. I put it on the back burner
for PGCon, and only just now got back to it.
First off, I agree that it was probably a mistake for this code to
special-case LEFT/FULL joins at all. eqjoinsel() does not do so;
it relies on the join-type correction applied later by
calc_joinrel_size_estimate(). Since that correction is also downstream
of this code, we should be able to do the same here, and indeed we may
be double-counting the join-type correction somehow if we don't.
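
To illustrate what "correction applied later" means, here is a toy model
of a join-type adjustment that runs downstream of the selectivity
estimate. It is only a sketch, not the PostgreSQL source: the function
name and arguments are hypothetical, and it leaves out details such as
pushed-down quals.

```python
def joinrel_rows(jointype, outer_rows, inner_rows, selec):
    """Toy row estimate: apply the join selectivity, then correct per join type.

    Hypothetical helper for illustration only; 'selec' is the combined
    selectivity of the join clauses, computed as if for an inner join.
    """
    rows = outer_rows * inner_rows * selec
    if jointype == "inner":
        return rows
    if jointype == "left":
        # Every outer row is emitted at least once (null-extended if unmatched).
        return max(rows, outer_rows)
    if jointype == "full":
        # Both sides are preserved, so clamp against each input.
        return max(rows, outer_rows, inner_rows)
    raise ValueError("unsupported join type: %r" % (jointype,))
```

Because the clamp happens here, a selectivity function upstream that also
adjusts for LEFT/FULL would be applying the same idea twice.
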
What that leaves is the "smallest per-column selectivity" hack, which is
now used only for SEMI/ANTI joins, where we can't really get anything
helpful from knowledge of the FK. What your patch does is to fall back on
the traditional clauselist_selectivity calculation for the relevant clauses.
But we'd be better off ignoring the FK altogether and leaving those
clauses to be processed later. That saves some cycles, and it might allow
those clauses to be used more fruitfully with a different FK. Even if
that doesn't happen, it's better to let clauselist_selectivity see as
many clauses at once as possible.
So I whacked the patch around to do it like that and pushed it.
I'm not totally satisfied that there isn't any case where the smallest
selectivity hack is appropriate. In the example you're showing here,
the FK columns are independent so that we get more or less the right
answer with or without the FK. But in some quick testing I could not
produce a counterexample showing that the heuristic is actually helpful,
so for now let's drop it.
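
For reference, the two estimates being compared differ roughly as
follows. This is a simplified sketch: it treats the ordinary clause-list
estimate as a plain product of per-clause selectivities under an
independence assumption, which is a simplification of what
clauselist_selectivity does, and both function names are hypothetical.

```python
def product_selectivity(selectivities):
    """Independence assumption: multiply the per-clause selectivities."""
    result = 1.0
    for s in selectivities:
        result *= s
    return result


def smallest_selectivity(selectivities):
    """The 'smallest per-column selectivity' hack: keep only the tightest clause."""
    return min(selectivities)
```

With per-column selectivities of 0.5 and 0.25, the product gives 0.125
while the hack gives 0.25. The hack could only win when the columns are
strongly correlated; for independent columns the product is already about
right.
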
Thanks, and sorry again for the delay.
regards, tom lane