Re: "Bug" in statistics for v7.2? - Mailing list pgsql-hackers

From: Zeugswetter Andreas SB SD
Subject: Re: "Bug" in statistics for v7.2?
Msg-id: 46C15C39FEB2C44BA555E356FBCD6FA488780E@m0114.s-mxs.net
In response to: "Bug" in statistics for v7.2?  ("Marc G. Fournier" <scrappy@hub.org>)
Responses: Re: "Bug" in statistics for v7.2?  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

> That explains it ...
> 
>  profiles_faith | count
> ----------------+--------
>               0 | 485938
>               1 |      2
>               2 |      6
>               7 |      2
>               8 |     21
> (5 rows)
> 
> Cool, another waste of space *sigh*
> 
> thanks ...
> 
> 
> On Wed, 13 Feb 2002, Tom Lane wrote:
> 
> > "Marc G. Fournier" <scrappy@hub.org> writes:
> > > Okay, if I'm understanding pg_stats at all, which I may not be, n_distinct
> > > should represent # of distinct values in that row, no?
> > > But, I have one field that has 5 distinct values:
> > > But pg_stats is reporting 1:
> >
> > The pg_stats values are only, um, statistical.  If 99.9% of the table is
> > the same value and the other four values appear only once or twice, it's
> > certainly possible for ANALYZE's sample to include only the common value
> > and miss the rare ones.  AFAIK that will not break anything; if you have
> > an example where the planner seems to be fooled because of this, let's
> > see it.

Hmm? How about "select * from xxx where profiles_faith = 7"?
With n_distinct = 1 that would be estimated to return all rows, no?
Instead of 2.
That is why I think a bin for "very uncommon" values could also be
useful sometimes.
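
To put a rough number on how easily the sample can miss every rare value,
here is a minimal sketch. It assumes ANALYZE takes a simple random sample
of 3000 rows without replacement (300 rows times a statistics target of 10;
that sample size is an assumption, not something stated in this thread), and
the function name is purely illustrative. The row counts are the ones from
the quoted query output.

```python
def prob_sample_misses_rare(total_rows, rare_rows, sample_size):
    """Probability that a simple random sample drawn without replacement
    contains none of the rare rows (hypergeometric, zero successes)."""
    p = 1.0
    for i in range(rare_rows):
        # Chance that the i-th rare row also falls outside the sample.
        p *= (total_rows - sample_size - i) / (total_rows - i)
    return p


# Row counts per profiles_faith value, taken from the quoted output.
counts = {0: 485938, 1: 2, 2: 6, 7: 2, 8: 21}
total = sum(counts.values())
rare = total - counts[0]  # rows holding any value other than 0

SAMPLE_SIZE = 3000  # assumed ANALYZE sample size, see note above

p_miss = prob_sample_misses_rare(total, rare, SAMPLE_SIZE)
print(f"{rare} rare rows out of {total}")
print(f"P(sample contains only the common value) = {p_miss:.3f}")
```

Under these assumptions the probability comes out a little over 0.8, so
seeing n_distinct = 1 for this column is the expected outcome rather than
an unlucky one.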

Andreas

