Re: Index usage - Mailing list pgsql-general
| From | Dennis Björklund |
|---|---|
| Subject | Re: Index usage |
| Date | |
| Msg-id | Pine.LNX.4.44.0308261321210.31606-100000@zigo.dhs.org |
| In response to | Re: Index usage (Pedro Alves <pmalves@think.pt>) |
| List | pgsql-general |
On Tue, 26 Aug 2003, Pedro Alves wrote:

> The vacuum analyze is run on a daily basis, so that cannot be the point.

On all tables, I assume.

> Is there any way to force the planner to use the index?

For experiments you can set enable_seqscan to false, but the real solution is not to force it to always use index scans, since sometimes a seq scan is faster.

> Below is the explain analyze of the queries. Indeed, the second range has
> more rows (9105 / 21503), but is this SO big that the planner cannot
> handle?

"Cannot handle" is saying too much. If you are going to use a lot of the rows in a table, a sequential scan is going to be faster than using the index. The problem here seems to be that it thinks it needs a lot of rows, so it chooses to do a seq scan. The best approach is to understand why the estimate is off. Usually it's because vacuum analyze has not been run on that table.

> This is running on a dedicated machine with 512 MB of RAM. Is there any
> configuration parameter so that I can increase the "index to seq turn point"? :)

If you have a fast disk you might, for example, set random_page_cost to 2 or so. Here is a small text about tuning pg:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html

> Aggregate (cost=10660.84..10660.84 rows=1 width=0) (actual time=172.41..172.41 rows=1 loops=1)
>   -> Index Scan using requisicaoanalise_datacolh on requisicaoanalise (cost=0.00..10654.06 rows=2711 width=0) (actual time=0.13..145.50 rows=9105 loops=1)
>        Index Cond: ((ra_datacolh >= '2003-04-01'::date) AND (ra_datacolh < '2003-05-01'::date))
>        Filter: (isactive = 0)
> Total runtime: 172.62 msec

Here it estimated 2711 rows and chose an index scan.
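For concreteness, the two settings mentioned above can be tried from a psql session. This is only a sketch: the table and column names are taken from the quoted plans, but the original SELECT is not shown in the post, so the count(*) form is a guess, and enable_seqscan should only be turned off for experiments, never as a fix.

```sql
-- Experiment only: disable seq scans for this session to see what the
-- index-scan plan would cost and how long it actually takes.
SET enable_seqscan = off;

EXPLAIN ANALYZE
SELECT count(*)                      -- guessed query shape; the post only shows the plan
FROM requisicaoanalise
WHERE ra_datacolh >= '2003-06-01'::date
  AND ra_datacolh <  '2003-07-01'::date
  AND isactive = 0;

-- Put it back when done.
SET enable_seqscan = on;

-- The gentler knob: lower random_page_cost (default 4) so the planner
-- charges less for the random I/O an index scan implies.
SET random_page_cost = 2;
```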
The actual number of rows was 9105.

> Aggregate (cost=31019.00..31019.00 rows=1 width=0) (actual time=43252.40..43252.40 rows=1 loops=1)
>   -> Seq Scan on requisicaoanalise (cost=0.00..30965.24 rows=21503 width=0) (actual time=8.43..43224.01 rows=9248 loops=1)
>        Filter: ((ra_datacolh >= '2003-06-01'::date) AND (ra_datacolh < '2003-07-01'::date) AND (isactive = 0))
> Total runtime: 43252.57 msec

Here the estimate was 21503 rows, but the actual number was 9248. Both estimates are way off. Sure, you can lower the point where it switches from a seq scan to an index scan, but then you may get even bigger mismatches in the future.

What I would do is run vacuum analyze manually on the tables used in the query and try again, just to be really sure. If that does not help, one could try increasing the statistics gathering for the date column (using alter table). I would also try to understand what is in pg_statistic. I don't know much about these last parts and have never looked inside pg_statistic, but that is where I would look.

Maybe someone else has other ideas or can help you figure out why the estimates are wrong.

--
/Dennis