Re: Nested loops overpriced - Mailing list pgsql-performance

From Peter Eisentraut
Subject Re: Nested loops overpriced
Date
Msg-id 200705091817.45056.peter_e@gmx.net
Whole thread Raw
In response to Re: Nested loops overpriced  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Nested loops overpriced
List pgsql-performance
Am Mittwoch, 9. Mai 2007 16:11 schrieb Tom Lane:
> Well, there's something funny going on here.  You've got for instance
>
>            ->  Index Scan using email_pkey on email  (cost=0.00..3.85
> rows=1 width=8) (actual time=0.005..0.005 rows=0 loops=280990) Index Cond:
> (email.email_id = eh_from.email_id)
>                  Filter: (("time" >= '2007-05-05 17:01:59'::timestamp
> without time zone) AND ("time" < '2007-05-05 18:01:59'::timestamp without
> time zone))
>
> on the inside of a nestloop whose outer side is predicted to return
> 107156 rows.  That should've been discounted to *way* less than 3.85
> cost units per iteration.

This is the new plan with 8.2.4.  It's still got the same problem, though.

                                                                                        QUERY PLAN
                                                                   

-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=5627064.21..5627718.73 rows=32726 width=184) (actual time=4904.834..5124.585 rows=35000 loops=1)
   ->  Sort  (cost=5627064.21..5627146.03 rows=32726 width=184) (actual time=4904.771..4947.892 rows=35000 loops=1)
         Sort Key: eh_subj.header_body
         ->  Nested Loop  (cost=0.00..5624610.06 rows=32726 width=184) (actual time=0.397..4628.141 rows=35000 loops=1)
               ->  Nested Loop  (cost=0.00..1193387.12 rows=28461 width=120) (actual time=0.322..3960.360 rows=35000
loops=1)
                     ->  Nested Loop  (cost=0.00..1081957.26 rows=28648 width=112) (actual time=0.238..3572.023
rows=35000loops=1) 
                           ->  Index Scan using dummy_index on email_header eh_from  (cost=0.00..13389.15 rows=280662
width=104)(actual time=0.133..1310.248 rows=280990 loops=1) 
                           ->  Index Scan using email_pkey on email  (cost=0.00..3.79 rows=1 width=8) (actual
time=0.005..0.005rows=0 loops=280990) 
                                 Index Cond: (email.email_id = eh_from.email_id)
                                 Filter: (("time" >= '2007-05-05 17:01:59'::timestamp without time zone) AND ("time" <
'2007-05-0518:01:59'::timestamp without time zone)) 
                     ->  Index Scan using mime_part_pkey on mime_part  (cost=0.00..3.88 rows=1 width=12) (actual
time=0.005..0.006rows=1 loops=35000) 
                           Index Cond: ((email.email_id = mime_part.email_id) AND (mime_part.mime_part_id = 0))
               ->  Index Scan using idx__email_header__email_id__mime_part_id on email_header eh_subj
(cost=0.00..155.47rows=18 width=104) (actual time=0.009..0.014 rows=1 loops=35000) 
                     Index Cond: ((email.email_id = eh_subj.email_id) AND (0 = eh_subj.mime_part_id))
                     Filter: (header_name = 'subject'::text)
 Total runtime: 5161.390 ms

> Are you using any nondefault planner settings?

random_page_cost = 3
effective_cache_size = 384MB

> How big are these tables, anyway?

email        35 MB
email_header    421 MB
mime_part    37 MB

Everything is analyzed, vacuumed, and reindexed.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Apparently useless bitmap scans
Next
From: Jignesh Shah
Date:
Subject: ZFS and Postgresql - WASRe: Best OS for Postgres 8.2