> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > What I am more concerned about is a join that uses the most common
> > value. We do an index scan in that case.
>
> No, we do whichever plan looks cheapest. Again, it's all about
> statistics.
>
> Right now, eqjoinsel() is just a stub that returns a constant
> selectivity estimate. It might be useful to compute some more
> sophisticated value based on pg_statistic entries for the two
> columns, but right now I doubt you could tell much. Should keep
> the join case in mind when we extend the statistics...

OK, let me be more specific. Suppose the most common value in a column
is 3. For a query "col = 3", we know 3 is the most common value, so we use
the most-common-value statistic rather than the dispersion statistic, right?

Now assume that using the most-common-value statistic leads to a
sequential scan, while using the dispersion statistic leads to an index scan.

The query "col = 3" then uses a sequential scan. In the query
"col = tab2.col2", the dispersion statistic is used, causing an index scan.

However, suppose tab2.col2 equals 3. I assume this would still cause an
index scan because the executor doesn't know about the most common value,
right? Is it worth trying to improve that?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026