Re: sub-query optimization - Mailing list pgsql-sql

From Brad Hilton
Subject Re: sub-query optimization
Date
Msg-id 1045502407.7192.10.camel@aragorn.vpop.net
Whole thread Raw
In response to Re: sub-query optimization  ("Tomasz Myrta" <jasiek@klaster.net>)
List pgsql-sql
On Fri, 2003-02-14 at 17:52, Tomasz Myrta wrote:
> It looks ok now. Probably it needs some cosmetics changes.
> >
> > select a.*
> > from
> >  categories c cross join category_map m
> >  join article_categories ac on (c.id = ac.category_id and m.child_id =
> > ac.category_id)
> >  join articles a on (a.id = ac.article_id)
> > where
> >  m.parent_id=1 and
> >  not c.restrict_views and
> >  m.child_id = c.id and
> >  a.post_status='publish'
> >
> Can you send explain analyze this query? Maybe table
> joins should be reordered or they need other indexes they have?
> 

Sure, I appreciate your interest and help.  I modified the above query
to do a "select distinct a.*" since I need to fetch all fields and need
them grouped by id.  I'm also including a query which uses derived
tables and avoids the penalty of "distinct a.*."  The second query is
much faster because of this, but it's still slower than I was hoping.  
~3 seconds for a query isn't going to fly. :(  If you can see any
warning signs from the "explain" output, I'd love to hear from you.

Thanks,
-Brad

----------------------------------
explain analyze select distinct a.*
fromcategories c cross join category_map mjoin article_categories ac on (c.id = ac.category_id and m.child_id =
ac.category_id)join articles a on (a.id = ac.article_id)
wherem.parent_id=1 andnot c.restrict_views andm.child_id = c.id anda.post_status='publish'

---------------------------------
Unique  (cost=61058.89..68058.89 rows=20000 width=203) (actual
time=7649.35..8465.96 rows=100000 loops=1)  ->  Sort  (cost=61058.89..61558.89 rows=200000 width=203) (actual
time=7649.35..7898.56 rows=200000 loops=1)        Sort Key: a.id, a.user_id, a.blog_id, a.remote_ip,
a.create_time, a.publish_time, a.update_time, a.title, a.body,
a.long_body, a.excerpt, a.post_status, a.publish_date        ->  Hash Join  (cost=5133.25..17614.25 rows=200000
width=203)
(actual time=590.36..6029.65 rows=200000 loops=1)              Hash Cond: ("outer".article_id = "inner".id)
->  Hash Join  (cost=18.33..6499.33 rows=200000 width=16)
 
(actual time=2.08..801.69 rows=200000 loops=1)                    Hash Cond: ("outer".category_id = "inner".id)
          ->  Seq Scan on article_categories ac 
 
(cost=0.00..2981.00 rows=200000 width=8) (actual time=0.01..293.28
rows=200000 loops=1)                    ->  Hash  (cost=18.01..18.01 rows=131 width=8)
(actual time=1.72..1.72 rows=0 loops=1)                          ->  Hash Join  (cost=6.64..18.01 rows=131
width=8) (actual time=0.84..1.58 rows=131 loops=1)                                Hash Cond: ("outer".child_id =
"inner".id)                                ->  Seq Scan on category_map m 
(cost=0.00..9.07 rows=131 width=4) (actual time=0.02..0.43 rows=131
loops=1)                                      Filter: (parent_id = 1)                                ->  Hash
(cost=6.31..6.31rows=131
 
width=4) (actual time=0.40..0.40 rows=0 loops=1)                                      ->  Seq Scan on categories c 
(cost=0.00..6.31 rows=131 width=4) (actual time=0.02..0.24 rows=131
loops=1)                                            Filter: (NOT
restrict_views)              ->  Hash  (cost=2885.00..2885.00 rows=100000 width=187)
(actual time=588.13..588.13 rows=0 loops=1)                    ->  Seq Scan on articles a  (cost=0.00..2885.00
rows=100000 width=187) (actual time=0.03..429.67 rows=100000 loops=1)                          Filter: (post_status =
'publish'::character
varying)Total runtime: 18544.75 msec

--------------------------------
explain analyze select articles.* from(select article_id from article_categories ac, categories c, category_map cm
whereac.category_id = c.id and ac.category_id = cm.child_id and c.id = cm.child_id and c.restrict_views = FALSE and
cm.parent_id= 1group by article_id) Xjoin articles on (articles.id = X.article_id) where articles.post_status =
'publish'

--------------------------------Merge Join  (cost=26538.07..30754.75 rows=20000 width=191) (actual
time=1736.36..3079.64 rows=100000 loops=1)  Merge Cond: ("outer".id = "inner".article_id)  ->  Index Scan using
articles_pkeyon articles  (cost=0.00..3616.68
 
rows=100000 width=187) (actual time=0.06..674.13 rows=100000 loops=1)        Filter: (post_status =
'publish'::charactervarying)  ->  Sort  (cost=26538.07..26588.07 rows=20000 width=16) (actual
 
time=1736.27..1800.42 rows=100000 loops=1)        Sort Key: x.article_id        ->  Subquery Scan x
(cost=24109.30..25109.30rows=20000
 
width=16) (actual time=1073.54..1594.71 rows=100000 loops=1)              ->  Group  (cost=24109.30..25109.30
rows=20000width=16)
 
(actual time=1073.54..1447.94 rows=100000 loops=1)                    ->  Sort  (cost=24109.30..24609.30 rows=200000
width=16) (actual time=1073.52..1191.72 rows=200000 loops=1)                          Sort Key: ac.article_id
              ->  Hash Join  (cost=18.66..6499.66
 
rows=200000 width=16) (actual time=2.14..777.38 rows=200000 loops=1)                                Hash Cond:
("outer".category_id=
 
"inner".id)                                ->  Seq Scan on article_categories ac 
(cost=0.00..2981.00 rows=200000 width=8) (actual time=0.01..304.06
rows=200000 loops=1)                                ->  Hash  (cost=18.33..18.33 rows=131
width=8) (actual time=1.80..1.80 rows=0 loops=1)                                      ->  Hash Join  (cost=6.97..18.33
rows=131 width=8) (actual time=0.78..1.65 rows=131 loops=1)                                            Hash Cond:
("outer".child_id = "inner".id)                                            ->  Seq Scan on
category_map cm  (cost=0.00..9.07 rows=131 width=4) (actual
time=0.01..0.54 rows=131 loops=1)                                                  Filter: (parent_id =
1)                                            ->  Hash  (cost=6.64..6.64
rows=131 width=4) (actual time=0.39..0.39 rows=0 loops=1)                                                  ->  Seq Scan
on
categories c  (cost=0.00..6.64 rows=131 width=4) (actual time=0.04..0.25
rows=131 loops=1)                                                        Filter:
(restrict_views = false)Total runtime: 3221.05 msec




pgsql-sql by date:

Previous
From: Terry Yapt
Date:
Subject: Re: rownum
Next
From: Tomasz Myrta
Date:
Subject: Re: sub-query optimization