I wrote:
> I experimented a bit with the attached patch, which modifies
> eqjoinsel_semi in two ways. First, if the number-of-distinct-values
> estimate for the inner rel is just a default rather than having any
> real basis, it replaces it with inner_rel->rows, effectively assuming
> that the inside of the IN or EXISTS is unique.
That might or might not be a good idea ...
> Second, it drops the
> fallback to selectivity 0.5 altogether, just applying the nd1 vs nd2
> heuristic all the time.
... but this probably isn't. A review of the history shows me that
this change amounts to reverting 3f5d2fe30, which doesn't seem like
a good plan because that was fixing user-reported misbehavior.
The problem is still that the nd2/nd1 calculation can produce garbage
if nd2 or nd1 is made-up. It's possible that we can get away with using
nd2 = inner_rel->rows as a suitable substitute for a default nd2, but
I'm much less happy to assume something like that for nd1.
Now, there's still not much to defend the 0.5 selectivity in particular;
according to the commit log for 3f5d2fe30, I used that because it
reproduced the behavior of previous versions that didn't understand what
a semijoin was at all. So we could perhaps substitute some other rule
there, but I don't know what.
I also note that the precise behavior of HEAD in this area only dates
back to ca5f88502, which hasn't even shipped in a release yet, so it
surely doesn't have a lot of field experience justifying it.
Other useful-though-not-terribly-recent discussions in this area include
https://www.postgresql.org/message-id/flat/201201112110.40403.andres%40anarazel.de
https://www.postgresql.org/message-id/13290.1335976455@sss.pgh.pa.us
Given that eqjoinsel_semi has been more or less a constant source of
issues, maybe we need to think about a wholesale redesign rather than
just incremental tweaking. But I have few ideas about what would be
better.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers