Re: [repost] partial index / funxtional idx or bad sql? - Mailing list pgsql-performance
From | Tom Lane |
---|---|
Subject | Re: [repost] partial index / funxtional idx or bad sql? |
Date | |
Msg-id | 8275.1052837172@sss.pgh.pa.us Whole thread Raw |
In response to | Re: [repost] partial index / funxtional idx or bad sql? ("Jim C. Nasby" <jim@nasby.net>) |
List | pgsql-performance |
"Jim C. Nasby" <jim@nasby.net> writes: > Wow, I'll have to keep that in mind. Shouldn't the optimizer be able to > handle that? Could this get added to the TODO? No, 'cause it's done (in CVS tip). I'm actually a bit hesitant now to recommend that people do such things, because the 7.4 optimizer is likely to produce a better plan from the unmodified IN query than it will from any explicitly "improved" version. The 7.4 code knows several ways to do IN efficiently, but when you hand-transform the query you are forcing the choice; perhaps wrongly. An example from CVS tip and the regression database in which hand transformation forces a less efficient plan choice: regression=# explain analyze select * from tenk1 a where unique1 in (select ten from tenk1); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Merge Join (cost=483.17..484.91 rows=10 width=248) (actual time=407.14..409.16 rows=10 loops=1) Merge Cond: ("outer".unique1 = "inner".ten) -> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..1571.97 rows=10000 width=244) (actual time=0.41..1.60 rows=11loops=1) -> Sort (cost=483.17..483.19 rows=10 width=4) (actual time=406.57..406.65 rows=10 loops=1) Sort Key: tenk1.ten -> HashAggregate (cost=483.00..483.00 rows=10 width=4) (actual time=406.08..406.26 rows=10 loops=1) -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.19..261.84 rows=10000 loops=1) Total runtime: 410.74 msec (8 rows) regression=# explain analyze select * from tenk1 a, (select distinct ten from tenk1) b regression-# where a.unique1 = b.ten; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=1122.39..1232.59 rows=10 width=248) (actual time=476.67..666.02 rows=10 loops=1) -> Subquery Scan b (cost=1122.39..1172.39 rows=10 width=4) (actual time=475.94..662.00 rows=10 loops=1) -> Unique (cost=1122.39..1172.39 rows=10 width=4) (actual time=475.89..661.65 rows=10 loops=1) -> Sort (cost=1122.39..1147.39 rows=10000 width=4) (actual time=475.85..559.27 rows=10000 loops=1) Sort Key: ten -> Seq Scan on tenk1 (cost=0.00..458.00 rows=10000 width=4) (actual time=0.37..274.87 rows=10000 loops=1) -> Index Scan using tenk1_unique1 on tenk1 a (cost=0.00..6.01 rows=1 width=244) (actual time=0.27..0.31 rows=1 loops=10) Index Cond: (a.unique1 = "outer".ten) Total runtime: 687.53 msec (9 rows) So, for now, make the transformation ... but keep a note about the IN version to try whenever you update to 7.4. regards, tom lane
pgsql-performance by date: