Re: Overhauling GUCS - Mailing list pgsql-hackers

From: Josh Berkus
Subject: Re: Overhauling GUCS
Date:
Msg-id: 484D6C1C.5070004@agliodbs.com
In response to: Re: Overhauling GUCS  (Gregory Stark <stark@enterprisedb.com>)
Responses: Re: Overhauling GUCS  (Gregory Stark <stark@enterprisedb.com>)
           Re: Overhauling GUCS  ("Greg Sabino Mullane" <greg@turnstep.com>)
List: pgsql-hackers
Tom,

>>> Actually, the reason it's still 10 is that the effort expended to get it
>>> changed has been *ZERO*.  I keep asking for someone to make some
>>> measurements, do some benchmarking, anything to make a plausible case
>>> for a specific higher value as being a reasonable place to set it.
>>> The silence has been deafening.
>> Not surprising really. It is a simple adjustment to make and it also is
easy to spot when it's a problem. However it is not trivial to test for
>> (in terms of time and effort). I know 10 is wrong and so do you. If you
>> don't I am curious why I see so many posts from you saying, "Your
>> estimates are off, what is your default_statistics_target?" with yet
>> even more responses saying, "Uhh 10." 

I tried (back in 7.4) to do some systematic testing of this.  The
problem is that the cases where a higher d_s_t is required are
specifically ones with complex, unbalanced data distributions and/or
very large databases.  This makes test cases extremely difficult and
time-consuming to generate; further, I found that the test cases I had
from my clients' databases were not portable (in addition to being
confidential).

Also, I'd actually assert that "10" seems to be perfectly adequate for
the majority of users.  That is, the number of users for whom I've
recommended increasing d_s_t for the whole database is smaller than the
number for whom I haven't, and of course we never hear from most users
at all.  So I'm pretty happy recommending: "Leave the default.  If you
encounter problem queries, increase it to 100 and analyze the database.
If you're running a data warehouse, increase it to 1000."
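
In concrete terms, that advice looks something like this (just a
sketch; the session-level SET lets you try a larger target before
making it permanent in postgresql.conf):

  -- Try a higher target for the current session only:
  SET default_statistics_target = 100;
  -- Re-gather statistics so the planner actually uses the larger sample:
  ANALYZE;

Once you're satisfied with the plans, set
default_statistics_target = 100 in postgresql.conf so every backend
gets it.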

Where ANALYZE does systematically fall down is with databases over 500GB
in size, but that's not a function of d_s_t but rather of our tiny
sample size.
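
And if only a handful of columns have the problematic distributions, a
per-column override is cheaper than raising the target globally (table
and column names here are hypothetical):

  -- Raise the target for just one badly-skewed column:
  ALTER TABLE orders ALTER COLUMN customer_id SET STATISTICS 1000;
  ANALYZE orders;

For context, ANALYZE samples on the order of 300 * statistics_target
rows regardless of table size, which is why the sample becomes
proportionally tinier as tables grow into the hundreds of GB.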

--Josh

