Index works with foo.val=bar.val but not foo.val<=bar.val - Mailing list pgsql-performance
From | 8lhhxba02@sneakemail.com |
---|---|
Subject | Index works with foo.val=bar.val but not foo.val<=bar.val |
Date | |
Msg-id | 908-19450@sneakemail.com Whole thread Raw |
List | pgsql-performance |
can anyone imagine why my query refuses to use indexes for a foo.val<=bar.val query (but works for foo.val=bar.val foo.val<=1). the i_value fields are ints (int4), and selectivity of both of these queries is 2/10000. and I'm running pgsql 7.4.3. Also, the default_statistics_target is 1000 and I increased the effective_cache_size from 1000 to 80000 but it still doesn't work (but seems sort of unrelated. there seems to be something fundamentally wrong, because even if I set enable_seqscan=off and all the other crazy plans it tries, it never gets to index scan). If anyone has any ideas, please let me know, I've been trying to get this working for a long time now. this query produces the correct plan using Index Scan: INSERT INTO answer SELECT foo.cid as pid, bar.cid as sid,count(bar.cid) as cnt, 1 FROM foo, bar WHERE foo.field=bar.field AND ( foo.op='i=i' AND bar.op='i=i' AND bar.i_value=foo.i_value ) GROUP BY foo.cid,bar.cid; ------explain analyze---------- Subquery Scan "*SELECT*" (cost=9.04..9.07 rows=2 width=20) (actual time=1.000..1.000 rows=2 loops=1) -> HashAggregate (cost=9.04..9.04 rows=2 width=8) (actual time=1.000..1.000 rows=2 loops=1) -> Nested Loop (cost=0.00..9.02 rows=2 width=8) (actual time=1.000..1.000 rows=2 loops=1) Join Filter: (("outer".field)::text = ("inner".field)::text) -> Seq Scan on foo (cost=0.00..1.01 rows=1 width=17) (actual time=0.000..0.000 rows=1 loops=1) Filter: ((op)::text = 'i=i'::text) -> Index Scan using bar_index_op_i_value on bar (cost=0.00..7.98 rows=2 width=17) (actual time=1.000..1.000 rows=2 loops=1) Index Cond: (((bar.op)::text = 'i=i'::text) AND (bar.i_value = "outer".i_value)) Total runtime: 1.000 ms ------------------------- this almost identical query doesn't (with = as <=): INSERT INTO answer SELECT foo.cid as pid, bar.cid as sid,count(bar.cid) as cnt, 5 FROM foo, bar WHERE foo.field=bar.field AND ( foo.op='i<=i' AND bar.op='i=i' AND bar.i_value<=foo.i_value ) GROUP BY foo.cid,bar.cid; -------------Explain------- Table contains 9 rows QUERY PLAN ---------- Subquery Scan "*SELECT*" (cost=385.02..435.03 rows=3334 width=20) (actual time=50.000..50.000 rows=2 loops=1) -> HashAggregate (cost=385.02..393.35 rows=3334 width=8) (actual time=50.000..50.000 rows=2 loops=1) -> Nested Loop (cost=0.00..360.01 rows=3334 width=8) (actual time=9.000..50.000 rows=2 loops=1) Join Filter: ((("outer".field)::text = ("inner".field)::text) AND ("inner".i_value <= "outer".i_value)) -> Seq Scan on foo (cost=0.00..1.01 rows=1 width=17) (actual time=0.000..0.000 rows=1 loops =1) Filter: ((op)::text = 'i<=i'::text) -> Seq Scan on bar (cost=0.00..209.00 rows=10000 width=17) (actual time=0.000..29.000 rows= 10000 loops=1) Filter: ((op)::text = 'i=i'::text) Total runtime: 51.000 ms ---------------------------- These queries both are on tables foo with 10,000 values (with i_value values from 1-5000) and one single entry in bar (with i_value=1) If table bar has more than 1 entry, it resorts to a merge join, why can't I get this to use Index Scan also? This is what happens if bar has like 20 values (1-20): INSERT INTO answer SELECT foo.cid as pid, bar.cid as sid,count(bar.cid) as cnt, 5 FROM foo, bar WHERE foo.field=bar.field AND ( foo.op='i<=i' AND bar.op='i=i' AND bar.i_value<=foo.i_value ) GROUP BY foo.cid,bar.cid; -------------Explain------- Table contains 8 rows QUERY PLAN ---------- Subquery Scan "*SELECT*" (cost=2.74..3.04 rows=20 width=20) (actual time=0.000..0.000 rows=20 loops=1) -> HashAggregate (cost=2.74..2.79 rows=20 width=8) (actual time=0.000..0.000 rows=20 loops=1) -> Merge Join (cost=1.29..2.59 rows=20 width=8) (actual time=0.000..0.000 rows=20 loops=1) Merge Cond: ("outer".i_value = "inner".i_value) Join Filter: (("inner".field)::text = ("outer".field)::text) -> Index Scan using bar_index on bar (cost=0.00..500.98 rows=10000 width=17) (actu al time=0.000..0.000 rows=21 loops=1) Filter: ((op)::text = 'i=i'::text) -> Sort (cost=1.29..1.32 rows=10 width=17) (actual time=0.000..0.000 rows=19 loops=1) Sort Key: foo.i_value -> Seq Scan on foo (cost=0.00..1.12 rows=10 width=17) (actual time=0.000..0.000 rows= 10 loops=1) Filter: ((op)::text = 'i=i'::text) Total runtime: 4.000 ms Thanks! --Kris "Love is like Pi: natural, irrational, and very important." -- Lisa Hoffman
pgsql-performance by date: