Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From mlw
Subject Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date
Msg-id 3CBEDCC2.D9E44A0E@mohawksoft.com
Whole thread Raw
In response to Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Lincoln Yeoh <lyeoh@pop.jaring.my>)
List pgsql-hackers
Bruce Momjian wrote:
> 
> mlw wrote:
> > I don't think we will agree, we have seen different behaviors, and our
> > experiences seem to conflict. This however does not mean that either of us is
> > in error, it just may mean that we use data with very different
> > characteristics.
> >
> > This thread is kind of frustrating for me because over the last couple years I
> > have seen this problem many times and the answer is always the same, "The
> > statistics need to be improved." Tom, you and I have gone back and forth about
> > this more than once.
> >
> 
> Have you tried reducing 'random_page_cost' in postgresql.conf.  That
> should solve most of your problems if you would like more index scans.

My random page cost is 1 :-)

I had a database where I had to have "enable_seqscan=false" in the config file.
The nature of the data always makes the statistics bogus, and it always refused
to use the index. 

It is frustrating because sometimes it *is* a problem for some unknown number
of users (including myself), as evidenced by the perenial "why isn't postgres
using my index" posts, and for the last two years you guys keep saying it isn't
a problem, or that the statistics just need improvement. Sorry for my tone, but
I have pulled out my hair numerous times on this very problem.

This whole process has lead me to change my mind. I don't think adding weight
to an index scan is the answer, I think having the ability to submit hints to
the planner is the only way to really address this or any future issues.

Just so you understand my perspective, I am not thinking of the average web
monkey. I am thinking of the expert DBA or archetect who want to deploy a
system, and needs to have real control over performance in critical areas.

My one most important experience (I've had more than one) with this whole topic
is DMN's music database, when PostgreSQL uses the index, the query executes in
a fraction of a second. When "enable_seqscan=true" PostgreSQL refuses to use
the index, and the query takes a about a minute. No matter how much I analyze,
I have to disable sequential scan for the system to work correctly. 

cdinfo=# set enable_seqscan=false ;
SET VARIABLE
cdinfo=# explain select * from ztitles, zsong where ztitles.muzenbr =
zsong.muzenbr and ztitles.artistid = 100 ;
NOTICE:  QUERY PLAN:

Merge Join  (cost=3134.95..242643.42 rows=32426 width=356) ->  Sort  (cost=3134.95..3134.95 rows=3532 width=304)
-> Index Scan using ztitles_artistid on ztitles  (cost=0.00..3126.62
 
rows=3532 width=304) ->  Index Scan using zsong_muzenbr on zsong  (cost=0.00..237787.51
rows=4298882 width=52)

EXPLAIN
cdinfo=# set enable_seqscan=true ;
SET VARIABLE
cdinfo=# explain select * from ztitles, zsong where ztitles.muzenbr =
zsong.muzenbr and ztitles.artistid = 100 ;
NOTICE:  QUERY PLAN:

Hash Join  (cost=3126.97..61889.37 rows=32426 width=356) ->  Seq Scan on zsong  (cost=0.00..52312.66 rows=4298882
width=52)->  Hash  (cost=3126.62..3126.62 rows=3532 width=304)       ->  Index Scan using ztitles_artistid on ztitles
(cost=0.00..3126.62
rows=3532 width=304)

EXPLAIN
cdinfo=# select count(*) from zsong ; count
---------4298882
(1 row)


pgsql-hackers by date:

Previous
From: Michael Loftis
Date:
Subject: Re: Index Scans become Seq Scans after VACUUM ANALYSE
Next
From: Tom Lane
Date:
Subject: Re: timeout implementation issues