Thread: Nested Loop

Nested Loop

From
"Gauri Kanekar"
Date:
Hi List,
 
how to speedup nested loop queries and by which parameters.
--
Regards
Gauri

Re: Nested Loop

From
Michael Fuhr
Date:
On Mon, Mar 26, 2007 at 05:34:39PM +0530, Gauri Kanekar wrote:
> how to speedup nested loop queries and by which parameters.

Please post a query you're trying to tune and the EXPLAIN ANALYZE
output, as well as any changes you've already made in postgresql.conf
or configuration variables you've set in a particular session.
Without more information we can't give much advice other than to
make sure you're vacuuming and analyzing the tables often enough
to keep them from becoming bloated with dead rows and to keep the
statistics current, and to review a configuration checklist such
as this one:

http://www.powerpostgresql.com/PerfList

--
Michael Fuhr

Re: Nested Loop

From
"Gauri Kanekar"
Date:
Sorry,
 
this are the Confg Setting
max_connections = 100                   # (change requires restart)
shared_buffers = 300MB
work_mem = 256MB
max_fsm_pages = 400000
max_fsm_relations = 500
wal_buffers = 512
checkpoint_segments = 20
checkpoint_timeout = 900
enable_bitmapscan = on
enable_seqscan = off
enable_tidscan = on
random_page_cost = 2
cpu_index_tuple_cost = 0.001
effective_cache_size = 800MB
join_collapse_limit = 1                 # JOINs
datestyle = 'iso, mdy'
lc_messages = 'en_US.UTF-8'                     # locale for system error message
lc_monetary = 'en_US.UTF-8'                     # locale for monetary formatting
lc_numeric = 'en_US.UTF-8'                      # locale for number formatting
lc_time = 'en_US.UTF-8'                         # locale for time formatting
 
all other are the default values.
 
 
                                                                                                  QUERY PLAN                
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=1116330.73..1116432.34 rows=6774 width=128) (actual time=438565.297..440455.386 rows=646881 loops=1)
   ->  Hash Join  (cost=10802.93..1116093.64 rows=6774 width=128) (actual time=1904.797..377717.036 rows=10438694 loops=1)
         Hash Cond: (rm.ck = rc.k)
         ->  Hash Join  (cost=10651.73..1115840.83 rows=6774 width=105) (actual time=1890.765..347169.113 rows=10438694 loops=1)
               Hash Cond: (rm.chk = rc.ky)
               ->  Hash Join  (cost=9835.35..1114905.90 rows=6774 width=83) (actual time=1873.463..317623.437 rows=10438694 loops=1)
                     Hash Cond: ( rm.ckey = rc.k)
                     ->  Hash Join  (cost=615.77..1105533.91 rows=6774 width=85) (actual time=1842.309..288198.666 rows=10438694 loops=1)
                           Hash Cond: ( rm.sk = rs.k)
                           ->  Hash Join  (cost=77.32..1104885.39 rows=6774 width=58) (actual time=1831.908..259147.154 rows=10438694 loops=1)
                                 Hash Cond: (rm.advk = ra.k)
                                 ->  Nested Loop  (cost=0.00..1104714.83 rows=6801 width=44) (actual time=1820.153..229779.814 rows=10945938 loops=1)
                                       Join Filter: ( rm.nk = rn.k)
                                       ->  Index Scan using r_idx on rn  (cost=0.00..4.27 rows=1 width=4) (actual time=0.093..0.095 rows=1 loops=1)
                                             Index Cond: (id = 607)
                                       ->  Nested Loop  (cost=0.00..1104370.50 rows=27205 width=48) (actual time=7.920..202878.054 rows=10945998 loops=1)
                                             ->  Index Scan using rpts_ldt_idx on rd  (cost= 0.00..4.27 rows=1 width=12) (actual time=0.097..0.352 rows=30 loops=1)
                                                   Index Cond: ((sdt >= '2006-12-01 00:00:00'::timestamp without time zone) AND (sd <= '2006-12-30 00:00:00'::timestamp without time zone))
                                             ->  Index Scan using rmidx on rm  (cost=0.00..1100192.24 rows=333919 width=44) (actual time=3.109..5835.861 rows=364867 loops=30)
                                                   Index Cond: (rmdkey = rd.k)
                                 ->  Hash  (cost=68.15..68.15 rows=734 width=22) (actual time=11.692..11.692 rows=734 loops=1)
                                       ->  Index Scan using radvki on radvt  (cost= 0.00..68.15 rows=734 width=22) (actual time=9.112..10.517 rows=734 loops=1)
                                             Filter: ((name)::text <> 'SYSTEM'::text)
                           ->  Hash  (cost= 500.35..500.35 rows=3048 width=35) (actual time=10.377..10.377 rows=3048 loops=1)
                                 ->  Index Scan using rskidx on rs  (cost=0.00..500.35 rows=3048 width=35) (actual time=0.082..5.589 rows=3048 loops=1)
                     ->  Hash  (cost=9118.63..9118.63 rows=8076 width=6) (actual time=31.124..31.124 rows=8076 loops=1)
                           ->  Index Scan using rcridx on rcr  (cost=0.00..9118.63 rows=8076 width=6) (actual time= 2.036..19.218 rows=8076 loops=1)
               ->  Hash  (cost=769.94..769.94 rows=3715 width=30) (actual time=17.275..17.275 rows=3715 loops=1)
                     ->  Index Scan using ridx on rcl  (cost=0.00..769.94 rows=3715 width=30) (actual time=4.238..11.432 rows=3715 loops=1)
         ->  Hash  (cost=120.38..120.38 rows=2466 width=31) (actual time=14.010..14.010 rows=2466 loops=1)
               ->  Index Scan using rckdx on rcpn  (cost= 0.00..120.38 rows=2466 width=31) (actual time=4.564..9.926 rows=2466 loops=1)
 Total runtime: 441153.878 ms
(32 rows)
 
 
we are using 8.2 version

 
On 3/26/07, Michael Fuhr <mike@fuhr.org> wrote:
On Mon, Mar 26, 2007 at 05:34:39PM +0530, Gauri Kanekar wrote:
> how to speedup nested loop queries and by which parameters.

Please post a query you're trying to tune and the EXPLAIN ANALYZE
output, as well as any changes you've already made in postgresql.conf
or configuration variables you've set in a particular session.
Without more information we can't give much advice other than to
make sure you're vacuuming and analyzing the tables often enough
to keep them from becoming bloated with dead rows and to keep the
statistics current, and to review a configuration checklist such
as this one:

http://www.powerpostgresql.com/PerfList

--
Michael Fuhr



--
Regards
Gauri

Re: Nested Loop

From
"Dave Dutcher"
Date:
-----Original Message-----
>From: pgsql-performance-owner@postgresql.org On Behalf Of Gauri Kanekar
>Subject: Re: [PERFORM] Nested Loop
>
>join_collapse_limit = 1                 # JOINs

Is there a reason you have this set to 1?  Postgres can't consider multiple
join orders when you do that.  I would try setting that back to the default
and seeing if this query is any faster.

Other than that it looked like the problems with the query might be bad
estimates of rows.  One is that postgres expects there to be 1 matching row
from rd when there are actually 30.  You might try increasing the statistics
targets on rd.sd and rd.sdt, reanalyzing, and seeing if that helps.  Also
postgres expects the join of rd and rm to return about 27205 rows when it
actually returns 10 million.  I'm not sure what you can do about that.
Maybe if Postgres gets a better estimate for rd it would then estimate the
join better.

Dave




Re: Nested Loop

From
Ragnar
Date:
On mán, 2007-03-26 at 20:33 +0530, Gauri Kanekar wrote:

you did not show your query, nor did you answer whather you had vacuumed
and analyzed.

> enable_seqscan = off

why this? this is unlikely to help


>
> QUERY PLAN
> ...
>                                  ->  Nested Loop
> (cost=0.00..1104714.83 rows=6801 width=44) (actual
> time=1820.153..229779.814 rows=10945938 loops=1)

the estimates are way off here. you sure you have analyzed?

gnari

>


Re: Nested Loop

From
"Gauri Kanekar"
Date:
Hi,
 
here is the query
 
SELECT rs.id AS sid, rs.name AS sname, rc.id AS campid, rc.name AS campname, rc.rev_type AS revtype, rc.act_type AS actntype, ra.id AS advid, ra.name AS advname, rpt_chn.id AS chanid, rpt_chn.name AS channame, rpt_cre.dn AS dn, SUM(rm.imdel) AS impression, SUM(rm.cdel) AS click, rd.sqldate AS date FROM rm, rn CROSS JOIN rd, ra, rs, rc, rpt_chn, rpt_cre WHERE rm.date_key = rd.key AND rm.net_key = rn.key AND rm.adv_key = ra.key AND rm.camp_key = rc.key AND rm.s_key = rs.key AND rm.chn_key = rpt_chn.key AND rm.cre_key = rpt_cre.key AND ra.name != 'SYSTEM' AND rd.sqldate BETWEEN '12/1/2006' AND '12/30/2006' AND ( rn.id IN ( 607 ) ) GROUP BY rd.sqldate , rs.id, rs.name, ra.id, ra.name, rc.id, rc.name, rc.rev_type , rc.act_type, rpt_chn.id, rpt_chn.name, rpt_cre.dn;

 
 
On 3/26/07, Ragnar <gnari@hive.is> wrote:
On mán, 2007-03-26 at 20:33 +0530, Gauri Kanekar wrote:

you did not show your query, nor did you answer whather you had vacuumed
and analyzed.

> enable_seqscan = off

why this? this is unlikely to help


>
> QUERY PLAN
> ...
>                                  ->  Nested Loop
> (cost=0.00..1104714.83 rows=6801 width=44) (actual
> time=1820.153..229779.814 rows=10945938 loops=1)

the estimates are way off here. you sure you have analyzed?

gnari

>




--
Regards
Gauri

Re: Nested Loop

From
Ragnar
Date:
On þri, 2007-03-27 at 16:13 +0530, Gauri Kanekar wrote:
>
> SELECT rs.id AS sid, rs.name AS sname, rc.id AS campid, rc.name AS
> campname, rc.rev_type AS revtype, rc.act_type AS actntype, ra.id AS
> advid, ra.name AS advname, rpt_chn.id AS chanid, rpt_chn.name AS
> channame, rpt_cre.dn AS dn, SUM(rm.imdel) AS impression, SUM(rm.cdel)
> AS click, rd.sqldate AS date FROM rm, rn CROSS JOIN rd, ra, rs, rc,
> rpt_chn, rpt_cre WHERE rm.date_key = rd.key AND rm.net_key = rn.key
> AND rm.adv_key = ra.key AND rm.camp_key = rc.key AND rm.s_key = rs.key
> AND rm.chn_key = rpt_chn.key AND rm.cre_key = rpt_cre.key AND
> ra.name != 'SYSTEM' AND rd.sqldate BETWEEN '12/1/2006' AND
> '12/30/2006' AND ( rn.id IN ( 607 ) ) GROUP BY rd.sqldate , rs.id,
> rs.name, ra.id, ra.name, rc.id, rc.name, rc.rev_type , rc.act_type,
> rpt_chn.id, rpt_chn.name, rpt_cre.dn;

you did not answer other questions, so do this:
1) VACUUM ANALYZE your database
2) set these in your postgresql.conf:
enable_seqscan = true
join_collapse_limit = 8
3) restart postgresql
4) do the EXPLAIN ANALYZE again, and send us it's output

gnari