Re: Super Optimizing Postgres - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Super Optimizing Postgres
Date
Msg-id 3BF7578F.7000108@sid.tm.ee
In response to Re: Super Optimizing Postgres  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers

Justin Clift wrote:

>>In my example, two computers with exactly the same hardware, except one has a
>>5400 RPM IDE drive and the other has a 10,000 RPM IDE drive. These machines
>>should not use the same settings; it is obvious that a sequential scan block
>>read on one will be faster than on the other.
>>
>
>If we're going to do this bit properly, then we'll have to take into
>consideration that many database objects will need their own individual
>statistics.  For example, let's say we have a database with a bunch of
>10k rpm SCSI drives which the tables are on, and the system also has one
>or more 15k rpm SCSI drives (say, Seagate Cheetah II drives) on which
>the indices have been placed.  With the 10k rpm drives, the tables
>needing the fastest throughput or having the highest usage are put on
>the outer edges of the disk media, and the rest of the tables are placed
>in the available space.
>
>On this theoretical system, we would be better off measuring the
>performance of each table and index in turn, then generating and storing
>costs for each one which are as "accurate as possible at this point in
>time".
>
That would mean that these statistic values must be stored in pg_class
and not be SET variables at all. This would probably have the added
benefit that some caching effects of small/big tables are accounted for
automatically, so you don't have to do that in the optimizer.
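
For context, a minimal sketch of the per-relation statistics pg_class
already holds; the proposal would amount to adding measured cost factors
(the column names in the comments are hypothetical) next to them:

  -- What pg_class already stores per relation (runnable as-is); the
  -- idea above would add measured read-cost factors, e.g. hypothetical
  -- relseqcost / relrandcost columns, alongside these:
  SELECT relname, relkind, relpages, reltuples
  FROM pg_class
  WHERE relkind IN ('r', 'i');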

>  A model like this would probably have these costs re-calculated
>each time the ANALYZE command is run to ensure their accuracy through
>database growth and changes.
>
Then ANALYZE should be run on both tables and indexes. AFAIK we
currently analyze only the real table data, not the indexes.
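
For reference, ANALYZE is invoked per table (the table name below is
hypothetical), and pg_class carries relpages/reltuples for both the heap
and its indexes, so the planner's current estimates are at least visible:

  -- Minimal sketch; the table name "orders" is hypothetical.
  ANALYZE orders;

  -- relpages/reltuples exist for the heap ('r') and its indexes ('i'):
  SELECT c.relname, c.relkind, c.relpages, c.reltuples
  FROM pg_class c
  WHERE c.oid = 'orders'::regclass
     OR c.oid IN (SELECT indexrelid FROM pg_index
                  WHERE indrelid = 'orders'::regclass);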

>
>I think this would be decently accurate, and RAID systems would be
>accurately analysed.  Don't know how to take into account large cache
>sizes though.  :)
>
Maybe keep some volatile statistics on how much of a table "may be"
cached in the disk/filesystem caches, assuming that we currently know
how much of each is in shared memory.
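
As one rough, after-the-fact proxy, the block-level statistics views can
show the shared-buffer hit ratio per table (a sketch; it assumes
block-level statistics collection is enabled and says nothing about the
OS/filesystem cache):

  SELECT relname, heap_blks_read, heap_blks_hit,
         round(heap_blks_hit::numeric
               / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio
  FROM pg_statio_user_tables
  ORDER BY hit_ratio;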

-----------
Hannu