Re: Query planner refuses to use index - Mailing list pgsql-general
From | Janning Vygen |
---|---|
Subject | Re: Query planner refuses to use index |
Date | |
Msg-id | 200507211824.01573.vygen@gmx.de Whole thread Raw |
In response to | Query planner refuses to use index (Kilian Hagemann <hagemann1@egs.uct.ac.za>) |
Responses |
Re: Query planner refuses to use index
|
List | pgsql-general |
Am Donnerstag, 21. Juli 2005 17:45 schrieb Kilian Hagemann: > Hi there, > > I know this subject has come up before many times, but I'm struggling for > hours with the following problem and none of the posts seem to have a > solution. I have a table with a good 13 million entries with > > station_data=# \d speed > Table "public.speed" > Column | Type | Modifiers > --------------+----------+----------- > set_id | smallint | not null > rec_time | abstime | not null > wind_speed | smallint | > Indexes: > "speed_pkey" primary key, btree (set_id, rec_time) > > I use lots of queries of the form SELECT * FROM speed WHERE set_id=xxx AND > rec_time >=yyy where xxx is an integer and yyy is an abstime. At first, I > battled to get the query planner to use an index at all, even when forcing, > but > > http://archives.postgresql.org/pgsql-general/2001-09/msg01120.php > > suggested I need to use set_id=5::smallint. It works, but why is pg not > intelligent enough to figure out that the literal 5 and smallint are > compatible? > > So I thought I had solved my problem, but then it still refused to use the > index, even though sequential scans are prohibitively expensive: > > station_data=# EXPLAIN ANALYZE SELECT * FROM speed WHERE set_id=5::smallint > AND rec_time >= '1999/01/01'::abstime; > QUERY PLAN > --------------------------------------------------------------------------- >------------------------------------------- Seq Scan on speed > (cost=0.00..276640.28 rows=677372 width=8) (actual > time=14024.081..78236.525 rows=652389 loops=1) > Filter: ((set_id = 5::smallint) AND (rec_time >= '1999-01-01 > 00:00:00+02'::abstime)) > Total runtime: 80156.330 ms > > When I explicitly turn off sequential scans by issuing "SET enable_seqscan > TO OFF;", I get what I want: > station_data=# EXPLAIN ANALYZE SELECT * FROM speed WHERE set_id=5::smallint > AND rec_time >= '1999/01/01'::abstime; > QUERY PLAN > --------------------------------------------------------------------------- >----------------------------------------------------------- Index Scan using > speed_pkey on speed (cost=0.00..2009924.87 rows=677372 width=8) (actual > time=50.070..5775.698 rows=652389 loops=1) > Index Cond: ((set_id = 5::smallint) AND (rec_time >= '1999-01-01 > 00:00:00+02'::abstime)) > Total runtime: 8819.371 ms > > which is 10 times faster. We're down to the last recommendation of section > 11.8 in the documentation, so I increased the statistics gathered with "SET > default_statistics_target TO 50;", but that makes no difference either. > > Am I left to disable seqscans for eternity (in which case may I file a bug) > or is there something else I might be missing? some hints: - You never want to set enable_seq off in production database. - did you run "VACUUM ANALYZE speed" lately? - what version are you running? - if you look at the cost values you can see what cost the planner expects. "actual time" is what it tooks to really execute the query. You see in example 1 with seq-scan the estimated cost is 276,640 and with index scan it is 2,009,924. So the planner chooses the seq scan. Maybe because you have not analyzed lately? - another parameter to look at is random_page_cost: "Sets the planner's estimate of the cost of a nonsequentially fetched disk page. This is measured as a multiple of the cost of a sequential page fetch. A higher value makes it more likely a sequential scan will be used, a lower value makes it more likely an index scan will be used. The default is four." kind regards, janning
pgsql-general by date: