Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct
Date
Msg-id 29570.1326326461@sss.pgh.pa.us
In response to Botched estimation in eqjoinsel_semi for cases without reliable ndistinct  (Andres Freund <andres@anarazel.de>)
Responses Re: Botched estimation in eqjoinsel_semi for cases without reliable ndistinct  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
Andres Freund <andres@anarazel.de> writes:
> What's your opinion on this?

Looks pretty bogus to me.  You're essentially assuming that the side of
the join without statistics is unique, which is a mighty dubious
assumption.  (In cases where we *know* it's unique, something like this
could be reasonable, but I believe get_variable_numdistinct already
accounts for such cases.)  The reason for the reversion to pre-8.4
behavior was that with the other behavior, we might sometimes make
extremely optimistic estimates (i.e., conclude that the join result is
very small) on the basis of, really, nothing at all.  AFAICS this
proposal just reintroduces unwarranted assumptions, and therefore will
probably produce as many worse results as better ones.
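
To make the concern concrete, here is a minimal sketch in C of the two
estimation policies being contrasted. It is not the actual eqjoinsel_semi
code: the ColStats struct, both function names, and the 0.5 fallback are
assumptions chosen for illustration. The point is only that a ratio built
from a placeholder ndistinct can produce a very small selectivity with no
real evidence behind it.

```c
#include <stdbool.h>

/* Hypothetical, simplified per-column inputs (not PostgreSQL's structs). */
typedef struct
{
    double ndistinct;   /* estimated number of distinct values */
    bool   is_default;  /* true if ndistinct is a placeholder, not from stats */
    double nullfrac;    /* fraction of NULLs in the column */
} ColStats;

/*
 * Conservative policy: use the ndistinct ratio only when both sides have
 * real statistics; otherwise fall back to a neutral guess (0.5 here, an
 * illustrative choice) scaled by the outer side's non-null fraction.
 */
double
semi_sel_conservative(ColStats outer, ColStats inner)
{
    double nonnull = 1.0 - outer.nullfrac;

    if (!outer.is_default && !inner.is_default)
    {
        if (outer.ndistinct <= inner.ndistinct)
            return nonnull;     /* every non-null outer value may match */
        return (inner.ndistinct / outer.ndistinct) * nonnull;
    }
    return 0.5 * nonnull;       /* no reliable basis for a sharper estimate */
}

/*
 * Optimistic policy: always trust the ratio, even if one ndistinct is only
 * a placeholder. This is the kind of unwarranted assumption at issue.
 */
double
semi_sel_optimistic(ColStats outer, ColStats inner)
{
    double nonnull = 1.0 - outer.nullfrac;

    if (outer.ndistinct <= inner.ndistinct)
        return nonnull;
    return (inner.ndistinct / outer.ndistinct) * nonnull;
}
```

With an outer side of 100000 known distinct values and an inner side whose
ndistinct is only a placeholder of 200, the optimistic variant yields about
0.002, while the conservative variant stays at 0.5.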

Also, why the asymmetry in null handling?  And why did you only touch
one of the two code paths in eqjoinsel_semi?  They have both got this
issue of how to estimate with inadequate stats.

            regards, tom lane
