Wrong query plan when using a left outer join - Mailing list pgsql-sql

From Feike Steenbergen
Subject Wrong query plan when using a left outer join
Date
Msg-id CAK_s-G3pBtXqD2FW=zuELC9rQLS4fyL=ww9vD3C1O1oBmY=eOA@mail.gmail.com
Whole thread Raw
Responses Re: Wrong query plan when using a left outer join  (Filip Rembiałkowski <plk.zuber@gmail.com>)
List pgsql-sql
I have the following setup:

A table called hand:

                                       Table "stage.hand_meta"   Column     |           Type           |
Modifiers
---------------+--------------------------+-------------------------------------------------------------hand_id       |
integer                 | not null default
 
nextval('hand_meta_hand_id_seq'::regclass)hand_no       | bigint                   | not nullsite_id       | smallint
             | not nullgame_id       | smallint                 | not nulltime          | timestamp with time zone |
notnulltournament_id | bigint                   |
 
Indexes:   "hand_meta_pkey" PRIMARY KEY, btree (hand_id) CLUSTER   "hand_meta_hand_no_site_unq" UNIQUE, btree (hand_no,
site_id)  "hand_meta_time_idx" btree ("time")   "hand_meta_tournament_id_idx" btree (tournament_id)
 
Referenced by:   TABLE "handhistory_plain" CONSTRAINT
"handhistory_plain_hand_id_fkey" FOREIGN KEY (hand_id) REFERENCES
hand_meta(hand_id)   TABLE "handhistory_staged" CONSTRAINT "staged_hand_hand_id_fkey"
FOREIGN KEY (hand_id) REFERENCES hand_meta(hand_id)

Getting the max hand_id (primary key) results in using an index:


feiketracker=> explain analyze select max(hand_id) from stage.hand;
QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------------Result
(cost=0.03..0.04 rows=1 width=0) (actual time=0.379..0.383
 
rows=1 loops=1)  InitPlan 1 (returns $0)    ->  Limit  (cost=0.00..0.03 rows=1 width=4) (actual
time=0.337..0.340 rows=1 loops=1)          ->  Index Scan Backward using hand_meta_pkey on hand_meta
(cost=0.00..82667.12 rows=2479440 width=4) (actual time=0.319..0.319
rows=1 loops=1)                Index Cond: (hand_id IS NOT NULL)Total runtime: 0.823 ms
(6 rows)


Now, if i create a view which left outer joins another table and
select max hand_id it uses a seq_scan, which I think it should'nt use,
as it only needs to query hand_meta and then use the index:


feiketracker=> create view seqscan_example as (select * from hand_meta
left join handhistory_plain using(hand_id));
CREATE VIEW
Time: 72.736 ms

feiketracker=> explain analyze select max(hand_id) from seqscan_example;
       QUERY PLAN
 

-----------------------------------------------------------------------------------------------------------------------------Aggregate
(cost=49261.00..49261.01 rows=1 width=4) (actual
 
time=34672.052..34672.054 rows=1 loops=1)  ->  Seq Scan on hand_meta  (cost=0.00..43062.40 rows=2479440
width=4) (actual time=0.180..16725.109 rows=2479440 loops=1)Total runtime: 34672.874 ms
(3 rows)


feiketracker=> select version();                                                             version

------------------------------------------------------------------------------------------------------------------------------------PostgreSQL
9.0.6on armv5tejl-unknown-linux-gnueabi, compiled by GCC
 
gcc (GCC) 3.4.4 (release) (CodeSourcery ARM 2005q3-2), 32-bit
(1 row)


I cannot think of a reason to use a seqscan, the left join should
indicate all results from hand_meta should be used, hand_id is the
primary key, so selecting max(hand_id) from the table or the view
should result in the same execution plan or am I thinking wrong?


pgsql-sql by date:

Previous
From: Gera Mel Handumon
Date:
Subject: UPDATE COMPATIBILITY
Next
From: Thomas Kellerer
Date:
Subject: Re: UPDATE COMPATIBILITY