Default Stats Revisited - Mailing list pgsql-hackers

From Josh Berkus
Subject Default Stats Revisited
Date
Msg-id 200403101125.07325.josh@agliodbs.com
Responses Re: Default Stats Revisited
List pgsql-hackers
Folks,

Early on in the default_stats thread, I made a proposal that got dropped 
without discussion.   I'd like to revisit it, because I still think it's a 
good idea.

The Issue:   The low default_stats_target of 10 is not sufficient for many 
complex queries involving multi-column correlation or oddly distributed data.  
Yet even a modest increase in the stats target for *all* columns, as 
demonstrated, substantially increases the time required for ANALYZE, with no 
gain for most queries.  

If only there were a way to automatically increase the default stats on only 
"important" columns, and not on other columns!  Yet if we burden the DBA with 
flagging important columns all over the database, we haven't saved him/her 
any work.

Ah, but there is a way!   Most "important" columns are already indicated ... 
because they are indexed.    If we implemented a system where indexed columns 
would have a significantly higher stats_target than non-indexed columns, this 
might improve our default behavior without overburdening Analyze.

Proposal:  That we consider:
-- adding a new GUC, default_stats_indexed
-- that this GUC be set initially to 100 if stats_target is 10
-- that the system be adjusted so that indexed columns take their 
   stats_target from default_stats_indexed and not default_stats_target
-- that expressional indexes be ignored for this purpose, as implementation 
   would be too complex, and they have their own stats anyway
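
To make the rule concrete, here is a rough sketch of the selection logic in Python. It is only an illustration, not PostgreSQL code: the function names and the dictionary layout for indexes are hypothetical, and the two constants use the shorthand setting names from this message with the initial values suggested above.

```python
# Hypothetical illustration of the proposed rule; not PostgreSQL source code.
DEFAULT_STATS_TARGET = 10     # current default, shorthand name from this message
DEFAULT_STATS_INDEXED = 100   # proposed new GUC, initial value suggested above


def plain_indexed_columns(indexes):
    """Collect the columns covered by ordinary indexes.

    Expression indexes are skipped, as the proposal suggests.
    Each index is a dict with "columns" and "is_expression" keys
    (an invented layout, used only for this sketch).
    """
    cols = set()
    for idx in indexes:
        if idx["is_expression"]:
            continue
        cols.update(idx["columns"])
    return cols


def effective_stats_target(column, indexed_cols):
    """Return the stats target a column would get under the proposal."""
    if column in indexed_cols:
        return DEFAULT_STATS_INDEXED
    return DEFAULT_STATS_TARGET


# Example: one plain index and one expression index on the same table.
indexes = [
    {"columns": ["customer_id"], "is_expression": False},
    {"columns": ["lower(email)"], "is_expression": True},
]
indexed = plain_indexed_columns(indexes)
targets = {c: effective_stats_target(c, indexed)
           for c in ["customer_id", "email", "notes"]}
```

In this example only customer_id would be analyzed at the higher target; email and notes stay at the ordinary default because the only index touching email is an expression index.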
 

If this proposal is worth considering, I will spend some time building up a 
test case to demonstrate the cost and utility of the plan.  With Neil's help, 
of course!

-- 
-Josh Berkus
Aglio Database Solutions
San Francisco


