Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date
Msg-id 1019064489.1924.22.camel@taru.tm.ee
Whole thread Raw
In response to Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Hannu Krosing <hannu@tm.ee>)
Responses Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, 2002-04-17 at 19:15, Hannu Krosing wrote:
> On Wed, 2002-04-17 at 17:16, Tom Lane wrote:
> > 
> > It's entirely possible that the default value of random_page_cost is too
> > high, at least for many modern machines.  The experiments I did to get
> > the 4.0 figure were done a couple years ago, on hardware that wasn't
> > exactly new at the time.  I have not heard of anyone else trying to
> > measure it though.
> > 
> > I don't think I have the source code I used anymore, but the principle
> > is simple enough:
> > 
> > 1. Make a large file (several times the size of your machine's RAM, to
> > ensure you swamp out kernel disk buffering effects).  Fill with random
> > data. (NB: do not fill with zeroes, some filesystems optimize this away.)
> 
> People running postgres often already have large files of random data
> under $PGDATA directory :)

OTOH, it is also important where the file is on disk. As seen from disk
speed test graphs on http://www.tomshardware.com , the speed difference
of sequential reads is 1.5 to 2.5 between inner and outer tracks. 

> > 2. Time reading the file sequentially, 8K per read request.
> > Repeat enough to get a statistically trustworthy number.
> > 
> > 3. Time reading randomly-chosen 8K pages from the file.  Repeat
> > enough to get a trustworthy number (the total volume of pages read
> > should be several times the size of your RAM).
> > 
> > 4. Divide.
> > 
> > The only tricky thing about this is making sure you are measuring disk
> > access times and not being fooled by re-accessing pages the kernel still
> > has cached from a previous access.  (The PG planner does try to account
> > for caching effects, but that's a separate estimate;

Will it make the random and seq read cost equal when cache size >
database size and enough queries are performed to assume that all data
is in cache.

Also, can it distinguish between data in pg internal cache (shared
memory) and data in OS filesystem cache ?

> > the value of
> > random_page_cost isn't supposed to include caching effects.)  AFAIK the
> > only good way to do that is to use a large test, which means it takes
> > awhile to run; and you need enough spare disk space for a big test file.
> 
> If you have the machine all for yourself you can usually tell it to use
> less RAM at boot time.
> 
> On linux it is append=" mem=32M" switch in lilo.conf or just mem=32M on
> lilo boot command line.
> 
> > It'd be interesting to get some numbers for this across a range of
> > hardware, filesystems, etc ...
> 
> ---------------
> Hannu
> 
> 
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
> 
> http://archives.postgresql.org




pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Next
From: mlw
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE