Re: [GENERAL] Idea for the statistics collector - Mailing list pgsql-hackers

From Christopher Kings-Lynne
Subject Re: [GENERAL] Idea for the statistics collector
Date
Msg-id 027d01c219c1$3062d080$0200a8c0@SOL
Whole thread Raw
In response to Re: [GENERAL] Idea for the statistics collector  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: [GENERAL] Idea for the statistics collector
Re: [GENERAL] Idea for the statistics collector
Re: [GENERAL] Idea for the statistics collector
List pgsql-hackers
I was thinking of writing a command line tool like 'pgtune' that looks at
the stats views and will generate SQL code for, or do automatically the
following:

* Dropping indices that are never used
* Creating appropriate indices to avoid large, expensive sequential scans.

This would put us in the 'mysql makes my indices for me by magic' league -
but would be far more powerful and flexible.  How to do multikey indices is
beyond me tho.

*sigh* I'm recovering from a septoplasty on my nose atm, so I might have
some time to do some coding!

Chris

----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Martijn van Oosterhout" <kleptog@svana.org>
Cc: "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Sent: Friday, June 21, 2002 10:50 AM
Subject: Re: [HACKERS] [GENERAL] Idea for the statistics collector


> Martijn van Oosterhout wrote:
> > Since it's currently all for collecting statistics on tables, why can't
it
> > collect another type of statistic, like:
> >
> > - How often the estimator gets it wrong?
> >
> > At the end of an index scan, the executor could compare the number of
rows
> > returned against what was estimated, and if it falls outside a certain
> > range, flag it.
> >
> > Also, the average ratio of rows coming out of a distinct node vs the
number
> > going in.
> >
> > For a join clause, the amount of correlation between two columns (hard).
> >
> > etc
> >
> > Ideally, the planner could then use this info to make better plans.
> > Eventually, the whole system could become somewhat self-tuning.
> >
> > Does anyone see any problems with this?
>
> [ Discussion moved to hackers.]
>
> I have thought that some type of feedback from the executor back into
> the optimizer would be a good feature.  Not sure how to do it, but your
> idea makes sense.  It certainly could update the table statistics after
> a sequential scan.
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org
>





pgsql-hackers by date:

Previous
From: "Julian Mehnle"
Date:
Subject: Re: ODBC Driver 7.02.0001 (Win32) (Unicode mode): CRLF->LF works, LF->CRLF doesn't
Next
From: Vince Vielhaber
Date:
Subject: Re: Our archive searching stinks