Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers
From | mlw |
---|---|
Subject | Re: Index Scans become Seq Scans after VACUUM ANALYSE |
Date | |
Msg-id | 3CBEDCC2.D9E44A0E@mohawksoft.com Whole thread Raw |
In response to | Re: Index Scans become Seq Scans after VACUUM ANALYSE (Bruce Momjian <pgman@candle.pha.pa.us>) |
Responses |
Re: Index Scans become Seq Scans after VACUUM ANALYSE
Re: Index Scans become Seq Scans after VACUUM ANALYSE Re: Index Scans become Seq Scans after VACUUM ANALYSE |
List | pgsql-hackers |
Bruce Momjian wrote: > > mlw wrote: > > I don't think we will agree, we have seen different behaviors, and our > > experiences seem to conflict. This however does not mean that either of us is > > in error, it just may mean that we use data with very different > > characteristics. > > > > This thread is kind of frustrating for me because over the last couple years I > > have seen this problem many times and the answer is always the same, "The > > statistics need to be improved." Tom, you and I have gone back and forth about > > this more than once. > > > > Have you tried reducing 'random_page_cost' in postgresql.conf. That > should solve most of your problems if you would like more index scans. My random page cost is 1 :-) I had a database where I had to have "enable_seqscan=false" in the config file. The nature of the data always makes the statistics bogus, and it always refused to use the index. It is frustrating because sometimes it *is* a problem for some unknown number of users (including myself), as evidenced by the perenial "why isn't postgres using my index" posts, and for the last two years you guys keep saying it isn't a problem, or that the statistics just need improvement. Sorry for my tone, but I have pulled out my hair numerous times on this very problem. This whole process has lead me to change my mind. I don't think adding weight to an index scan is the answer, I think having the ability to submit hints to the planner is the only way to really address this or any future issues. Just so you understand my perspective, I am not thinking of the average web monkey. I am thinking of the expert DBA or archetect who want to deploy a system, and needs to have real control over performance in critical areas. My one most important experience (I've had more than one) with this whole topic is DMN's music database, when PostgreSQL uses the index, the query executes in a fraction of a second. When "enable_seqscan=true" PostgreSQL refuses to use the index, and the query takes a about a minute. No matter how much I analyze, I have to disable sequential scan for the system to work correctly. cdinfo=# set enable_seqscan=false ; SET VARIABLE cdinfo=# explain select * from ztitles, zsong where ztitles.muzenbr = zsong.muzenbr and ztitles.artistid = 100 ; NOTICE: QUERY PLAN: Merge Join (cost=3134.95..242643.42 rows=32426 width=356) -> Sort (cost=3134.95..3134.95 rows=3532 width=304) -> Index Scan using ztitles_artistid on ztitles (cost=0.00..3126.62 rows=3532 width=304) -> Index Scan using zsong_muzenbr on zsong (cost=0.00..237787.51 rows=4298882 width=52) EXPLAIN cdinfo=# set enable_seqscan=true ; SET VARIABLE cdinfo=# explain select * from ztitles, zsong where ztitles.muzenbr = zsong.muzenbr and ztitles.artistid = 100 ; NOTICE: QUERY PLAN: Hash Join (cost=3126.97..61889.37 rows=32426 width=356) -> Seq Scan on zsong (cost=0.00..52312.66 rows=4298882 width=52)-> Hash (cost=3126.62..3126.62 rows=3532 width=304) -> Index Scan using ztitles_artistid on ztitles (cost=0.00..3126.62 rows=3532 width=304) EXPLAIN cdinfo=# select count(*) from zsong ; count ---------4298882 (1 row)
pgsql-hackers by date: