Re: Understanding histograms - Mailing list pgsql-performance

From Tom Lane
Subject Re: Understanding histograms
Date
Msg-id 26230.1209597464@sss.pgh.pa.us
In response to Re: Understanding histograms  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Understanding histograms
List pgsql-performance
Jeff Davis <pgsql@j-davis.com> writes:
> On Wed, 2008-04-30 at 10:43 -0400, Tom Lane wrote:
>> Surely that's not very sane?  The MCV list plus histogram generally
>> don't include every value in the table.

> My understanding of Len's question is that, although the MCV list plus
> the histogram don't include every distinct value in the general case,
> they do include every value in the specific case where the histogram is
> not full.

I don't believe that's true.  It's possible that a small histogram means
that you are seeing every value that was in ANALYZE's sample, but it's
a mighty long leap from that to the assumption that there are no other
values in the table.  In any case that seems more an artifact of the
implementation than a property the histogram would be guaranteed to
have.
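
To illustrate the point about sampling, here is a minimal sketch in Python. It is not PostgreSQL's actual ANALYZE sampling code; it assumes a plain uniform random sample, and the table contents, the sample size, and the function name distinct_values_in_sample are all hypothetical. It shows that every value in the sample can be accounted for while many values in the table never appear in the sample at all.

```python
import random


def distinct_values_in_sample(table, sample_size, seed=0):
    """Return the set of distinct values seen in a uniform random sample.

    A stand-in for what a statistics collector could observe; the real
    ANALYZE sampling method is not modeled here.
    """
    rng = random.Random(seed)
    return set(rng.sample(table, sample_size))


# Hypothetical table: 100 common values with 1000 rows each,
# plus 5000 rare values with a single row each.
table = [v for v in range(100) for _ in range(1000)]
table += list(range(100, 5100))

seen = distinct_values_in_sample(table, sample_size=3000)
missing = set(table) - seen

print(f"distinct values in table:  {len(set(table))}")
print(f"distinct values in sample: {len(seen)}")
print(f"values the sample missed:  {len(missing)}")
```

Because the sample holds 3000 rows and the table holds 5100 distinct values, at least 2100 values must be absent from the sample, however short the resulting statistics look.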

> ... the statistics aren't guaranteed to be perfectly up-to-date, so an
> estimate of zero might be risky.

Right.  As a matter of policy we never estimate less than one matching
row; and I've seriously considered pushing that up to at least two rows
except when we see that the query condition matches a unique constraint.
You can get really bad join plans from overly-small estimates.
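
The policy can be sketched as a simple clamp. This is an illustration in Python, not the planner's code: the function name clamp_row_estimate and its parameters are hypothetical, and the two-row floor is shown only as the idea under consideration above, not as current behavior.

```python
def clamp_row_estimate(raw_rows, matches_unique_constraint=False,
                       floor_non_unique=1):
    """Round a raw row estimate and never let it drop below a floor.

    floor_non_unique=1 models the stated policy (never estimate less
    than one row). Passing floor_non_unique=2 models the considered
    variant: at least two rows unless the condition matches a unique
    constraint, in which case the floor stays at one.
    """
    floor = 1 if matches_unique_constraint else floor_non_unique
    return max(floor, round(raw_rows))


# Stated policy: a raw estimate of zero still yields one row.
print(clamp_row_estimate(0.0))
print(clamp_row_estimate(0.0003))
print(clamp_row_estimate(42.4))

# Considered variant: a floor of two rows, except for unique matches.
print(clamp_row_estimate(0.2, floor_non_unique=2))
print(clamp_row_estimate(0.2, matches_unique_constraint=True,
                         floor_non_unique=2))
```

The floor matters most on the inner side of a join: an estimate of zero or nearly zero rows makes almost any plan look free, which is how an overly small estimate turns into a bad join order.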

            regards, tom lane
