Can someone explain the problem with this select - Mailing list pgsql-sql

From Richard Ray
Subject Can someone explain the problem with this select
Date
Msg-id Pine.LNX.4.64.0612051316260.20963@rray.drdc.mstc.ms.gov
Whole thread Raw
Responses Re: Can someone explain the problem with this select
Re: Can someone explain the problem with this select
Re: Can someone explain the problem with this select
List pgsql-sql
Allow me to demonstrate my pitiful SQL knowledge
I have tables documents and comments
If I run join and list doc_nums the query is quite fast
If I run join and use subselect the query is extremely slow
Can someone offer analysis

Thanks
Richard

dcc=# EXPLAIN ANALYZE select doc_num from documents limit 10;
QUERYPLAN
 
----------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1.01 rows=10 width=13) (actual time=0.013..0.061 
 
rows=10 loops=1)   ->  Seq Scan on documents  (cost=0.00..909333.85 rows=9014885 width=13) 
(actual time=0.008..0.027 rows=10 loops=1) Total runtime: 0.125 ms
(3 rows)

dcc=#



dcc=#  EXPLAIN ANALYZE select * from documents left outer join comments 
on (documents.doc_num = comments.doc_num) where documents.doc_num in 

('105364107','105513059','105513095','105513112','105585627','102933195','014650340','014650361','014650362','105419865');

QUERY PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashLeft Join  (cost=21.23..61.54 rows=10 width=444) (actual 
 
time=0.507..0.574 rows=10 loops=1)   Hash Cond: ("outer".doc_num = "inner".doc_num)   ->  Bitmap Heap Scan on documents
(cost=20.03..60.28 rows=10 
 
width=361) (actual time=0.397..0.432 rows=10 loops=1)         Recheck Cond: ((doc_num = '105364107'::bpchar) OR
(doc_num= 
 
'105513059'::bpchar) OR (doc_num = '105513095'::bpchar) OR (doc_num = 
'105513112'::bpchar) OR (doc_num = '105585627'::bpchar) OR (doc_num = 
'102933195'::bpchar) OR (doc_num = '014650340'::bpchar) OR (doc_num = 
'014650361'::bpchar) OR (doc_num = '014650362'::bpchar) OR (doc_num = 
'105419865'::bpchar))         ->  BitmapOr  (cost=20.03..20.03 rows=10 width=0) (actual 
time=0.383..0.383 rows=0 loops=1)               ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
rows=1 width=0) (actual time=0.059..0.059 rows=1 loops=1)                     Index Cond: (doc_num =
'105364107'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.039..0.039 rows=1 loops=1)                     Index Cond: (doc_num =
'105513059'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)                     Index Cond: (doc_num =
'105513095'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)                     Index Cond: (doc_num =
'105513112'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.034..0.034 rows=1 loops=1)                     Index Cond: (doc_num =
'105585627'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1)                     Index Cond: (doc_num =
'102933195'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.036..0.036 rows=1 loops=1)                     Index Cond: (doc_num =
'014650340'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)                     Index Cond: (doc_num =
'014650361'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.031..0.031 rows=1 loops=1)                     Index Cond: (doc_num =
'014650362'::bpchar)              ->  Bitmap Index Scan on documents_pkey  (cost=0.00..2.00 
 
rows=1 width=0) (actual time=0.035..0.035 rows=1 loops=1)                     Index Cond: (doc_num =
'105419865'::bpchar)  ->  Hash  (cost=1.16..1.16 rows=16 width=83) (actual time=0.080..0.080 
 
rows=16 loops=1)         ->  Seq Scan on comments  (cost=0.00..1.16 rows=16 width=83) 
(actual time=0.005..0.037 rows=16 loops=1) Total runtime: 0.775 ms
(28 rows)

dcc=#



dcc=# EXPLAIN ANALYZE select * from documents left outer join comments on 
(documents.doc_num = comments.doc_num) where documents.doc_num in (select 
doc_num from documents limit 10);

QUERY PLAN

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-- Merge IN Join  (cost=100000002.19..136154797.93 rows=10 width=654) 
(actual time
=23.534..2216180.550 rows=10 loops=1)   Merge Cond: ("outer".doc_num = "inner".doc_num)   ->  Merge Left Join
(cost=0.00..36129585.92rows=10083868 width=654) 
 
(actual time=23.239..2188733.430 rows=6696218 loops=1)         Merge Cond: ("outer".doc_num = "inner".doc_num)
-> Index Scan using documents_pkey on documents 
 
(cost=0.00..35723277.
60 rows=10083868 width=569) (actual time=6.845..2107300.767 rows=6695853 
loops=1
)         ->  Index Scan using doc_num_idx on comments 
(cost=0.00..377203.50 row
s=311612 width=85) (actual time=16.368..6984.365 rows=243797 loops=1)   ->  Sort  (cost=100000002.19..100000002.22
rows=10width=13) (actual 
 
time=0.2
52..0.293 rows=10 loops=1)         Sort Key: "IN_subquery".doc_num         ->  Limit  (cost=100000000.00..100000001.92
rows=10width=13) 
 
(actual t
ime=0.019..0.128 rows=10 loops=1)               ->  Seq Scan on documents  (cost=100000000.00..101940460.68 
rows=
10083868 width=13) (actual time=0.010..0.055 rows=10 loops=1) Total runtime: 2216180.973 ms
(11 rows)

dcc=#


pgsql-sql by date:

Previous
From: "Collin Peters"
Date:
Subject: Question about "AT TIME ZONE"
Next
From: Din Adrian
Date:
Subject: Re: transaction in function