Thread: Wrong query plan when using a left outer join
I have the following setup: A table called hand: Table "stage.hand_meta" Column | Type | Modifiers ---------------+--------------------------+-------------------------------------------------------------hand_id | integer | not null default nextval('hand_meta_hand_id_seq'::regclass)hand_no | bigint | not nullsite_id | smallint | not nullgame_id | smallint | not nulltime | timestamp with time zone | notnulltournament_id | bigint | Indexes: "hand_meta_pkey" PRIMARY KEY, btree (hand_id) CLUSTER "hand_meta_hand_no_site_unq" UNIQUE, btree (hand_no, site_id) "hand_meta_time_idx" btree ("time") "hand_meta_tournament_id_idx" btree (tournament_id) Referenced by: TABLE "handhistory_plain" CONSTRAINT "handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) TABLE "handhistory_staged" CONSTRAINT "staged_hand_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) Getting the max hand_id (primary key) results in using an index: feiketracker=> explain analyze select max(hand_id) from stage.hand; QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------------------------Result (cost=0.03..0.04 rows=1 width=0) (actual time=0.379..0.383 rows=1 loops=1) InitPlan 1 (returns $0) -> Limit (cost=0.00..0.03 rows=1 width=4) (actual time=0.337..0.340 rows=1 loops=1) -> Index Scan Backward using hand_meta_pkey on hand_meta (cost=0.00..82667.12 rows=2479440 width=4) (actual time=0.319..0.319 rows=1 loops=1) Index Cond: (hand_id IS NOT NULL)Total runtime: 0.823 ms (6 rows) Now, if i create a view which left outer joins another table and select max hand_id it uses a seq_scan, which I think it should'nt use, as it only needs to query hand_meta and then use the index: feiketracker=> create view seqscan_example as (select * from hand_meta left join handhistory_plain using(hand_id)); CREATE VIEW Time: 72.736 ms feiketracker=> explain analyze select max(hand_id) from seqscan_example; QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------Aggregate (cost=49261.00..49261.01 rows=1 width=4) (actual time=34672.052..34672.054 rows=1 loops=1) -> Seq Scan on hand_meta (cost=0.00..43062.40 rows=2479440 width=4) (actual time=0.180..16725.109 rows=2479440 loops=1)Total runtime: 34672.874 ms (3 rows) feiketracker=> select version(); version ------------------------------------------------------------------------------------------------------------------------------------PostgreSQL 9.0.6on armv5tejl-unknown-linux-gnueabi, compiled by GCC gcc (GCC) 3.4.4 (release) (CodeSourcery ARM 2005q3-2), 32-bit (1 row) I cannot think of a reason to use a seqscan, the left join should indicate all results from hand_meta should be used, hand_id is the primary key, so selecting max(hand_id) from the table or the view should result in the same execution plan or am I thinking wrong?
On Tue, Jan 17, 2012 at 7:54 AM, Feike Steenbergen <feikesteenbergen@gmail.com> wrote: > I have the following setup: > > A table called hand: > > > Table "stage.hand_meta" > Column | Type | > Modifiers > ---------------+--------------------------+------------------------------------------------------------- > hand_id | integer | not null default > nextval('hand_meta_hand_id_seq'::regclass) > hand_no | bigint | not null > site_id | smallint | not null > game_id | smallint | not null > time | timestamp with time zone | not null > tournament_id | bigint | > Indexes: > "hand_meta_pkey" PRIMARY KEY, btree (hand_id) CLUSTER > "hand_meta_hand_no_site_unq" UNIQUE, btree (hand_no, site_id) > "hand_meta_time_idx" btree ("time") > "hand_meta_tournament_id_idx" btree (tournament_id) > Referenced by: > TABLE "handhistory_plain" CONSTRAINT > "handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES > hand_meta(hand_id) > TABLE "handhistory_staged" CONSTRAINT "staged_hand_hand_id_fkey" > FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) > > Getting the max hand_id (primary key) results in using an index: > > > feiketracker=> explain analyze select max(hand_id) from stage.hand; > > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------------------------------------- > Result (cost=0.03..0.04 rows=1 width=0) (actual time=0.379..0.383 > rows=1 loops=1) > InitPlan 1 (returns $0) > -> Limit (cost=0.00..0.03 rows=1 width=4) (actual > time=0.337..0.340 rows=1 loops=1) > -> Index Scan Backward using hand_meta_pkey on hand_meta > (cost=0.00..82667.12 rows=2479440 width=4) (actual time=0.319..0.319 > rows=1 loops=1) > Index Cond: (hand_id IS NOT NULL) > Total runtime: 0.823 ms > (6 rows) > > > Now, if i create a view which left outer joins another table and > select max hand_id it uses a seq_scan, which I think it should'nt use, > as it only needs to query hand_meta and then use the index: > > > feiketracker=> create view seqscan_example as (select * from hand_meta > left join handhistory_plain using(hand_id)); > CREATE VIEW > Time: 72.736 ms > > feiketracker=> explain analyze select max(hand_id) from seqscan_example; > QUERY PLAN > ----------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=49261.00..49261.01 rows=1 width=4) (actual > time=34672.052..34672.054 rows=1 loops=1) > -> Seq Scan on hand_meta (cost=0.00..43062.40 rows=2479440 > width=4) (actual time=0.180..16725.109 rows=2479440 loops=1) > Total runtime: 34672.874 ms > (3 rows) > > > feiketracker=> select version(); > version > ------------------------------------------------------------------------------------------------------------------------------------ > PostgreSQL 9.0.6 on armv5tejl-unknown-linux-gnueabi, compiled by GCC > gcc (GCC) 3.4.4 (release) (CodeSourcery ARM 2005q3-2), 32-bit > (1 row) > > > I cannot think of a reason to use a seqscan, the left join should > indicate all results from hand_meta should be used, hand_id is the > primary key, so selecting max(hand_id) from the table or the view > should result in the same execution plan or am I thinking wrong? > it's not always so simple for the planner to eliminate left join... imagine that the view on the right side of join has some side effects. so postgres will never "cut off" the right join side. but postgres will still try to choose best execution plan. seq scan may simply be faster here. breaking point is somewhere near 50% selectivity. when handhistory_plain starts geting much bigger, plan will change. try to experiment with SET enable_seqscan TO false; - and see what happens. BTW, add a foreign key and index on handhistory_plain.hand_id (unless you have it already). BTW2, if you really don't care on handhistory you can just use original query with no join. Filip
> BTW, add a foreign key and index on handhistory_plain.hand_id (unless> you have it already). It's there already: feiketracker=# \d+ handhistory_plain; Table "stage.handhistory_plain"Column | Type | Modifiers | Storage |Description ---------+---------+-----------+----------+-------------hand_id | integer | not null | plain |history | text | notnull | extended | Indexes: "handhistory_plain_pkey" PRIMARY KEY, btree (hand_id) CLUSTER Foreign-key constraints: "handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id) > BTW2, if you really don't care on handhistory you can just use > original query with no join. Well, sometimes I do, sometimes I don't. For easier application access I wanted to create a view that joins both these tables together: easier application design and better performance, as the analyzer should know best when not to use the handhistory_plain table. The design is as follows: hand_meta - holds all metadata for a pokerhand handhistory_plain holds the history for a pokerhand hand_meta is going to be used the most, it is around 165 bytes per tuple handhistory_plain is not going to be used often (it is there as a reference); it is around 5000 bytes per tuple. They both hold the same column as primary key, handhistory_plain holds a fraction of the tuples of hand_meta, the split was only made to make sure the processed data (hand_meta) is smaller in size and should therefore require less I/O and thus increase performance. I'm not sure what to make of: > imagine that the view on the right side of join has some side effects. I can see some side effects may occur, but as it is a left join, the left hand side will always be part of the returning set (there is no where clause), so the index should be used. Even though I don't understand, you seem to be right, a natural join is 30 times faster: feiketracker=# explain analyze select max(hand_id) from hand_meta left join handhistory_plain using(hand_id); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------Aggregate (cost=10000049261.00..10000049261.01 rows=1 width=4) (actual time=31179.238..31179.241 rows=1 loops=1) -> Seq Scan on hand_meta (cost=10000000000.00..10000043062.40 rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440 loops=1)Total runtime: 31179.725 ms (3 rows) Time: 31185.088 ms feiketracker=# explain analyze select max(hand_id) from hand_meta join handhistory_plain using(hand_id); QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------Aggregate (cost=53043.61..53043.62 rows=1 width=4) (actual time=962.242..962.245 rows=1 loops=1) -> Nested Loop (cost=0.00..53029.93 rows=5470 width=4) (actual time=0.400..920.582 rows=5470 loops=1) -> Index Scan using handhistory_plain_pkey on handhistory_plain (cost=0.00..14494.27 rows=5470 width=4) (actual time=0.215..101.177 rows=5470 loops=1) -> Index Scan using hand_meta_pkey on hand_meta (cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1 loops=5470) Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id)Total runtime: 962.968 ms > try to experiment with SET enable_seqscan TO false; - and see what happens. Didn't make a difference; therefore I think postgres determines it is unable to use the index, is that correct? Thank you for now: I'll use the inner join (or natural join in this case) for this specific view
oops, but ofcourse, a natural view will not give the correct answer, back to the drawing board ... On Tue, Jan 17, 2012 at 19:53, Feike Steenbergen <feikesteenbergen@gmail.com> wrote: >> BTW, add a foreign key and index on handhistory_plain.hand_id (unless> you have it already). > It's there already: > > feiketracker=# \d+ handhistory_plain; > Table "stage.handhistory_plain" > Column | Type | Modifiers | Storage | Description > ---------+---------+-----------+----------+------------- > hand_id | integer | not null | plain | > history | text | not null | extended | > Indexes: > "handhistory_plain_pkey" PRIMARY KEY, btree (hand_id) CLUSTER > Foreign-key constraints: > "handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES > hand_meta(hand_id) > >> BTW2, if you really don't care on handhistory you can just use >> original query with no join. > > Well, sometimes I do, sometimes I don't. For easier application access > I wanted to create a view that joins both these tables together: > easier application design and better performance, as the analyzer > should know best when not to use the handhistory_plain table. > > > The design is as follows: > > hand_meta - holds all metadata for a pokerhand > handhistory_plain holds the history for a pokerhand > > hand_meta is going to be used the most, it is around 165 bytes per tuple > handhistory_plain is not going to be used often (it is there as a > reference); it is around 5000 bytes per tuple. > > They both hold the same column as primary key, handhistory_plain holds > a fraction of the tuples of hand_meta, the split was only made to make > sure the processed data (hand_meta) is smaller in size and should > therefore require less I/O and thus increase performance. > > I'm not sure what to make of: >> imagine that the view on the right side of join has some side effects. > I can see some side effects may occur, but as it is a left join, the > left hand side will always be part of the returning set (there is no > where clause), so the index should be used. > Even though I don't understand, you seem to be right, a natural join > is 30 times faster: > > feiketracker=# explain analyze select max(hand_id) from hand_meta left > join handhistory_plain using(hand_id); > QUERY > PLAN > --------------------------------------------------------------------------------------------------------------------------------------------- > Aggregate (cost=10000049261.00..10000049261.01 rows=1 width=4) > (actual time=31179.238..31179.241 rows=1 loops=1) > -> Seq Scan on hand_meta (cost=10000000000.00..10000043062.40 > rows=2479440 width=4) (actual time=0.131..16039.886 rows=2479440 > loops=1) > Total runtime: 31179.725 ms > (3 rows) > > Time: 31185.088 ms > > feiketracker=# explain analyze select max(hand_id) from hand_meta join > handhistory_plain using(hand_id); > > QUERY PLAN > ------------------------------------------------------------------------------------------------------------------------------------------------------------------ > Aggregate (cost=53043.61..53043.62 rows=1 width=4) (actual > time=962.242..962.245 rows=1 loops=1) > -> Nested Loop (cost=0.00..53029.93 rows=5470 width=4) (actual > time=0.400..920.582 rows=5470 loops=1) > -> Index Scan using handhistory_plain_pkey on > handhistory_plain (cost=0.00..14494.27 rows=5470 width=4) (actual > time=0.215..101.177 rows=5470 loops=1) > -> Index Scan using hand_meta_pkey on hand_meta > (cost=0.00..7.03 rows=1 width=4) (actual time=0.100..0.115 rows=1 > loops=5470) > Index Cond: (hand_meta.hand_id = handhistory_plain.hand_id) > Total runtime: 962.968 ms > > >> try to experiment with SET enable_seqscan TO false; - and see what happens. > Didn't make a difference; therefore I think postgres determines it is > unable to use the index, is that correct? > > > Thank you for now: I'll use the inner join (or natural join in this > case) for this specific view
2012/1/17 Filip Rembiałkowski <plk.zuber@gmail.com>: > postgres will still try to choose best execution plan. seq scan may simply be > faster here. breaking point is somewhere near 50% selectivity. The tipping point is usually far lower than that; in fact, it's more often around 10%. Random IO is *very* expensive, as compared to sequential IO (at least on spinning rust; SSDs are a different matter, of course). It's usually vastly cheaper to read in an entire table and filter the rows you want than to seek left and right (and then left, left, and right again) to cherry-pick the pages you need. rls -- :wq