Re: [GENERAL] Idea for the statistics collector - Mailing list pgsql-hackers

From Christopher Kings-Lynne
Subject Re: [GENERAL] Idea for the statistics collector
Date
Msg-id 4265C50D.1000709@familyhealth.com.au
Whole thread Raw
In response to Re: [GENERAL] Idea for the statistics collector  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: [GENERAL] Idea for the statistics collector  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Good god - how old was that email? 2002???

Chris

Bruce Momjian wrote:
> Added to TODO:
> 
>     * Add tool to query pg_stat_* tables and report indexes that aren't needed
>       or tables that might need indexes
> 
> ---------------------------------------------------------------------------
> 
> Christopher Kings-Lynne wrote:
> 
>>I was thinking of writing a command line tool like 'pgtune' that looks at
>>the stats views and will generate SQL code for, or do automatically the
>>following:
>>
>>* Dropping indices that are never used
>>* Creating appropriate indices to avoid large, expensive sequential scans.
>>
>>This would put us in the 'mysql makes my indices for me by magic' league -
>>but would be far more powerful and flexible.  How to do multikey indices is
>>beyond me tho.
>>
>>*sigh* I'm recovering from a septoplasty on my nose atm, so I might have
>>some time to do some coding!
>>
>>Chris
>>
>>----- Original Message -----
>>From: "Bruce Momjian" <pgman@candle.pha.pa.us>
>>To: "Martijn van Oosterhout" <kleptog@svana.org>
>>Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
>>Sent: Friday, June 21, 2002 10:50 AM
>>Subject: Re: [HACKERS] [GENERAL] Idea for the statistics collector
>>
>>
>>
>>>Martijn van Oosterhout wrote:
>>>
>>>>Since it's currently all for collecting statistics on tables, why can't
>>
>>it
>>
>>>>collect another type of statistic, like:
>>>>
>>>>- How often the estimator gets it wrong?
>>>>
>>>>At the end of an index scan, the executor could compare the number of
>>
>>rows
>>
>>>>returned against what was estimated, and if it falls outside a certain
>>>>range, flag it.
>>>>
>>>>Also, the average ratio of rows coming out of a distinct node vs the
>>
>>number
>>
>>>>going in.
>>>>
>>>>For a join clause, the amount of correlation between two columns (hard).
>>>>
>>>>etc
>>>>
>>>>Ideally, the planner could then use this info to make better plans.
>>>>Eventually, the whole system could become somewhat self-tuning.
>>>>
>>>>Does anyone see any problems with this?
>>>
>>>[ Discussion moved to hackers.]
>>>
>>>I have thought that some type of feedback from the executor back into
>>>the optimizer would be a good feature.  Not sure how to do it, but your
>>>idea makes sense.  It certainly could update the table statistics after
>>>a sequential scan.
>>>
>>>--
>>>  Bruce Momjian                        |  http://candle.pha.pa.us
>>>  pgman@candle.pha.pa.us               |  (610) 853-3000
>>>  +  If your life is a hard drive,     |  830 Blythe Avenue
>>>  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>>>
>>>---------------------------(end of broadcast)---------------------------
>>>TIP 6: Have you searched our list archives?
>>>
>>>http://archives.postgresql.org
>>>
>>
>>
>>
>>
>>---------------------------(end of broadcast)---------------------------
>>TIP 4: Don't 'kill -9' the postmaster
>>
>>
>>
> 
> 


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [GENERAL] Idea for the statistics collector
Next
From: Bruce Momjian
Date:
Subject: Re: [GENERAL] Idea for the statistics collector