Re: Simple postgresql.conf wizard - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Simple postgresql.conf wizard
Date
Msg-id 603c8f070811271415s158f3a1vcaa61630071fceec@mail.gmail.com
Whole thread Raw
In response to Re: Simple postgresql.conf wizard  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Simple postgresql.conf wizard  (Joshua Tolley <eggyknap@gmail.com>)
Re: Simple postgresql.conf wizard  (Gregory Stark <stark@enterprisedb.com>)
Re: Simple postgresql.conf wizard  (tomas@tuxteam.de)
List pgsql-hackers
> Even though we all agree default_statistics_target = 10 is too low,
> proposing a 40X increase in the default value requires more evidence
> than this.  In particular, the prospect of a 1600-fold increase in
> the typical cost of eqjoinsel() is a mite scary.

I just did some very quick testing of a couple values for
default_statistics_target on a database for work whose contents I am
not at liberty to share.  Suffice it to say it is a database with
small tables, but a lot of the queries involve many joins.

ANALYZE with default_statistics_target set to 10 takes 13 s.  With
100, 92 s.  With 1000, 289 s.

I tried several representative queries and there was no measurable
change in run-time.  I compared a couple of the plans and the plans
didn't change either.

So then I said - gee, how much can I reduce this?  I looked at two of
the queries in more detail.  The lowest default_statistics_target that
produced the same plan for both queries was 5.  Reducing the
default_statistics_target all the way down to 1 changed the plans, but
the new plans were just as fast as the old plans.

Given the amount of clamor for a higher value for
default_statistics_target, I'm a little surprised by these results.
It may be that the queries I'm running are not ones for which more
statistics generate better plans, but I think they really are
representative of what we run.  Maybe someone could suggest some types
of query that would be likely to helped by better statistics?

A random thought: maybe the reason I'm not seeing any benefit is
because my tables are just too small - most contain at most a few
thousand rows, and some are much smaller.  Maybe
default_statistics_target should vary with the table size?  Something
like, 0.1% of the rows to a maximum of 100...  and then 0.01% of the
rows after that to some higher maximum.... but always a minimum of at
least 10.  I'm guessing that people with really big tables are less
likely to mind longer planning times and more likely to benefit from
finding better plans...

...Robert


pgsql-hackers by date:

Previous
From: Gregory Stark
Date:
Subject: Re: [PATCHES] GIN improvements
Next
From: Joshua Tolley
Date:
Subject: Re: Simple postgresql.conf wizard