We tried 1000 as the default statistics target and found that the plans
were good and consistent, but the contents of pg_statistic were not
exactly the same from run to run.
We took Tom's advice and tried SET SEED=0 (actually SELECT setseed(0)).
We did runs last night on our project machine, and they produced
consistent pg_statistic data and (of course) the same plans.
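A minimal sketch of one such run, for anyone who wants to repeat it.
It assumes the seed and the statistics target are set in the same
session that runs ANALYZE, and it uses the pg_stats view to dump the
results for comparison between runs. The columns selected are only an
illustration, and whether setseed() affects ANALYZE's sampling may
depend on the PostgreSQL version.

```sql
-- Fix the random seed for this session so that sampling is repeatable
-- (assumption: ANALYZE draws its sample from the session's random()).
SELECT setseed(0);

-- Use the largest histogram size tried so far.
SET default_statistics_target = 1000;

-- Recollect statistics for every table in the current database.
ANALYZE;

-- Dump the collected statistics in a stable order so that the output
-- of two runs can be compared with diff.
SELECT tablename, attname, n_distinct,
       most_common_vals, histogram_bounds
FROM pg_stats
ORDER BY tablename, attname;
```

Saving the output of the last query from each run and diffing the files
is one way to check that the statistics really are identical.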
We will next try runs where we vary the default number of buckets.
Besides 10 and 1000, what values would you like us to try? The number
100 was mentioned previously. Are there others?
On Wed, 2003-09-10 at 12:44, Bruce Momjian wrote:
> Tom Lane wrote:
> > Mary Edie Meredith <maryedie@osdl.org> writes:
> > > Stephan Szabo kindly responded to our earlier queries suggesting we look
> > > at default_statistics_target and ALTER TABLE ALTER COLUMN SET
> > > STATISTICS.
> >
> > > These determine the number of bins in the histogram for a given column.
> > > But for a large number of rows (for example 6 million) the maximum value
> > > (1000) does not guarantee that ANALYZE will do a full scan of the table.
> > > We do not see a way to guarantee the same statistics run to run without
> > > forcing ANALYZE to examine every row of every table.
> >
> > Do you actually still have a problem with the plans changing when the
> > stats target is above 100 or so? I think the notion of "force ANALYZE
> > to do a full scan" is inherently wrongheaded ... it certainly would not
> > produce numbers that have anything to do with ordinary practice.
> >
> > If you have data statistics that are so bizarre that the planner still
> > gets things wrong with a target of 1000, then I'd like to know more
> > about why.
>
> Has there been any progress in determining if the number of default
> buckets (10) is the best value?
--
Mary Edie Meredith <maryedie@osdl.org>
Open Source Development Lab