Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From: Lincoln Yeoh
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date:
Msg-id: 5.1.0.14.1.20020420113553.02ff6740@192.228.128.13
In response to: Re: Index Scans become Seq Scans after VACUUM ANALYSE (mlw <markw@mohawksoft.com>)
Responses: Re: Index Scans become Seq Scans after VACUUM ANALYSE (Tom Lane <tgl@sss.pgh.pa.us>)
           Re: Index Scans become Seq Scans after VACUUM ANALYSE (Bruce Momjian <pgman@candle.pha.pa.us>)
List: pgsql-hackers
At 10:48 AM 4/18/02 -0400, mlw wrote:
>Bruce Momjian wrote:
> >
> > Have you tried reducing 'random_page_cost' in postgresql.conf.  That
> > should solve most of your problems if you would like more index scans.
>
>My random page cost is 1 :-)

What happens when you set random_page_cost to 1? Between an index scan 
covering 50% of a table and a full table scan, which would the optimizer 
pick? With the cost at 1, at what selectivity would the plan switch over?
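
For instance, I'd probe it with something like the following (bigtable, its 
indexed column val, and the constant are all just made up; adjust the WHERE 
clause to hit whatever selectivity you're curious about):

    -- Sketch: see which plan the optimizer picks at a given selectivity.
    SET random_page_cost = 1;
    EXPLAIN SELECT * FROM bigtable WHERE val < 500000;
    -- Compare with the default cost to find where the plan flips:
    SET random_page_cost = 4;
    EXPLAIN SELECT * FROM bigtable WHERE val < 500000;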

Because I'm thinking that for _repeated_ queries, where caching comes into 
play, the effective random page cost for "small" selections may be very low 
after the first, very costly select (and even that one may not be so costly 
on smart SCSI drives). So selecting 10% of a table randomly may not be that 
costly after the first select, whereas for a sequential scan 100% of the 
table must fit in the cache. If the cache is big enough, whichever plan 
reads less should be faster (noting that sequential RAM reads are typically 
still faster than random RAM reads). If the cache is not big enough, 
selecting less may be better up to the point where the total amount 
repeatedly selected no longer fits in the cache, at which point sequential 
scans should win. This is of course for queries run serially, not in 
parallel. How would one take these issues into account in an optimizer?
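
(As far as I know, the closest existing knob for the caching side of this is 
effective_cache_size, which tells the planner how much of the data it may 
assume is already cached. The numbers below are invented, just to show the 
idea:)

    -- Sketch: hint the planner that much of the table/index is likely cached.
    -- effective_cache_size is measured in 8 kB disk pages here, so 50000 ~= 400 MB.
    SET effective_cache_size = 50000;
    EXPLAIN SELECT * FROM bigtable WHERE val < 500000;   -- same made-up query as above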

Mark's problems with the optimizer seem to be something else though: the 
statistics being off.

>I had a database where I had to have "enable_seqscan=false" in the config
>file. The nature of the data always makes the statistics bogus, and it
>always refused to use the index.
>My one most important experience (I've had more than one) with this whole
>topic is DMN's music database: when PostgreSQL uses the index, the query
>executes in a fraction of a second. When "enable_seqscan=true" PostgreSQL
>refuses to use the index, and the query takes about a minute. No matter how
>much I analyze, I have to disable sequential scan for the system to work
>correctly.

I'm just wondering: why not set enable_seqscan=false for only those 
problematic queries, as a "hint"? Unless your query does need some seq scans 
as well?
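
Something like this, I mean, with the real statement run between the two 
settings:

    -- Per-session "hint": this makes seq scans look very expensive to the planner.
    SET enable_seqscan = off;
    -- ... run the problematic query here ...
    RESET enable_seqscan;       -- back to the default for everything else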

By the way, are updates treated the same as selects by the optimizer?
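
(I'd guess EXPLAIN on the UPDATE itself would show it; the table and columns 
below are invented again:)

    -- EXPLAIN works on UPDATE too, so the scan the planner chose is visible directly.
    EXPLAIN UPDATE bigtable SET flag = true WHERE val < 500000;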

Regards,
Link.


