Thread: left outer join terrible slow compared to inner join

left outer join terrible slow compared to inner join

From
Thomas Beutin
Date:
Hi,

i've a speed problem withe the following statement:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p
LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

This is terrible slow compared to the inner join:
SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
FROM ot_adresse AS a, ot_produkt AS p, ot_kat_prod AS pz
WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37'
AND p.p_id = pz.p_id;

These are the EXPLAIN ANALYZE output of both statements on
postgres 7.2.4:

EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ot_adresse AS a, ot_produkt AS p LEFT OUTER
JOIN ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37'; 
NOTICE:  QUERY PLAN:

Unique  (cost=22061.28..22061.30 rows=1 width=272) (actual time=13332.01..13332.97 rows=11 loops=1)
  ->  Sort  (cost=22061.28..22061.28 rows=2 width=272) (actual time=13332.00..13332.03 rows=46 loops=1)
        ->  Nested Loop  (cost=21627.92..22061.27 rows=2 width=272) (actual time=13303.51..13328.98 rows=46 loops=1)
              ->  Index Scan using o_adresse_id_uidx on o_adresse  (cost=0.00..5.96 rows=1 width=34) (actual
time=0.16..0.19rows=1 loops=1) 
              ->  Materialize  (cost=21900.98..21900.98 rows=12347 width=238) (actual time=13071.53..13111.92
rows=51394loops=1) 
                    ->  Merge Join  (cost=21627.92..21900.98 rows=12347 width=238) (actual time=11724.45..12908.46
rows=51394loops=1) 
                          ->  Sort  (cost=16815.61..16815.61 rows=6640 width=68) (actual time=4283.02..4307.07
rows=26049loops=1) 
                                ->  Seq Scan on o_produkt  (cost=0.00..16394.06 rows=6640 width=68) (actual
time=0.06..1126.96rows=26049 loops=1) 
                          ->  Sort  (cost=4812.31..4812.31 rows=40851 width=170) (actual time=7441.36..7481.73
rows=51521loops=1) 
                                ->  Subquery Scan pz  (cost=0.00..1683.51 rows=40851 width=170) (actual
time=0.14..1161.81rows=40896 loops=1) 
                                      ->  Seq Scan on o_kat_prod  (cost=0.00..1683.51 rows=40851 width=170) (actual
time=0.13..419.07rows=40896 loops=1) 
Total runtime: 13377.02 msec

EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ot_adresse AS a, ot_produkt AS p,
ot_kat_prodAS pz WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37' AND p.p_id = pz.p_id; 
NOTICE:  QUERY PLAN:

Unique  (cost=41.29..41.31 rows=1 width=272) (actual time=6.67..7.64 rows=11 loops=1)
  ->  Sort  (cost=41.29..41.29 rows=2 width=272) (actual time=6.67..6.71 rows=46 loops=1)
        ->  Nested Loop  (cost=0.00..41.28 rows=2 width=272) (actual time=0.68..3.73 rows=46 loops=1)
              ->  Nested Loop  (cost=0.00..23.80 rows=1 width=102) (actual time=0.46..0.87 rows=11 loops=1)
                    ->  Index Scan using o_adresse_id_uidx on o_adresse  (cost=0.00..5.96 rows=1 width=34) (actual
time=0.16..0.17rows=1 loops=1) 
                    ->  Index Scan using o_produkt_a_id_idx on o_produkt  (cost=0.00..17.83 rows=1 width=68) (actual
time=0.29..0.65rows=11 loops=1) 
              ->  Index Scan using o_kat_prod_p_id_idx on o_kat_prod  (cost=0.00..17.42 rows=5 width=170) (actual
time=0.16..0.24rows=4 loops=11) 
Total runtime: 7.96 msec

Do i've any chance to get the indexes used in the OUTER JOIN?

Thanks for any hints!
-tb
--
Thomas Beutin                             tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> i've a speed problem withe the following statement:
> SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
> FROM ot_adresse AS a, ot_produkt AS p
> LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
> WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

In 7.2 (and 7.3), this syntax forces the planner to join ot_produkt to
ot_kat_prod first, which is terribly inefficient because the WHERE
constraints don't constrain that join at all.  You could work around
this by writing instead

 FROM (ot_adresse AS a CROSS JOIN ot_produkt AS p)
 LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
 WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';

See http://www.postgresql.org/docs/7.3/static/explicit-joins.html
for discussion.  (Note: 7.4 will be less rigid about this issue.)

            regards, tom lane

Re: left outer join terrible slow compared to inner join

From
Thomas Beutin
Date:
On Thu, Aug 28, 2003 at 11:42:00AM -0400, Tom Lane wrote:
> Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> > i've a speed problem withe the following statement:
> > SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id
> > FROM ot_adresse AS a, ot_produkt AS p
> > LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
> > WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
>
> In 7.2 (and 7.3), this syntax forces the planner to join ot_produkt to
> ot_kat_prod first, which is terribly inefficient because the WHERE
> constraints don't constrain that join at all.  You could work around
> this by writing instead
>
>  FROM (ot_adresse AS a CROSS JOIN ot_produkt AS p)
>  LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id )
>  WHERE  p.a_id = a.id AND a.id = '105391105424941' AND a.m_id = '37';
>
Thanks for the suggestion, but the result is close to the original outer
join without the explicit cross join but far away from the speed of the
inner join.

This uses the index o_produkt_a_id_idx on o_produkt, but the index
o_kat_prod_p_id_idx on o_kat_prod is still not used:


EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt AS
p)LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id = '105391105424941' AND
a.m_id= '37'; 
NOTICE:  QUERY PLAN:

Unique  (cost=2217.96..2217.98 rows=1 width=272) (actual time=6776.21..6777.17 rows=11 loops=1)
  ->  Sort  (cost=2217.96..2217.96 rows=2 width=272) (actual time=6776.20..6776.24 rows=46 loops=1)
        ->  Nested Loop  (cost=0.00..2217.95 rows=2 width=272) (actual time=721.82..6773.09 rows=46 loops=1)
              ->  Nested Loop  (cost=0.00..23.80 rows=1 width=102) (actual time=0.69..1.74 rows=11 loops=1)
                    ->  Index Scan using o_adresse_id_uidx on o_adresse  (cost=0.00..5.96 rows=1 width=34) (actual
time=0.29..0.31rows=1 loops=1) 
                    ->  Index Scan using o_produkt_a_id_idx on o_produkt  (cost=0.00..17.83 rows=1 width=68) (actual
time=0.38..1.31rows=11 loops=1) 
              ->  Subquery Scan pz  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917
loops=11)
                    ->  Seq Scan on o_kat_prod  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77
rows=40917loops=11) 
Total runtime: 6777.55 msec

Is there any chance to use an index on the joined table o_kat_prod?


Thanks for any hints!
-tb
--
Thomas Beutin                             tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> Thanks for the suggestion, but the result is close to the original outer
> join without the explicit cross join but far away from the speed of the
> inner join.

> EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt
ASp) LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id = '105391105424941' AND
a.m_id= '37'; 
> NOTICE:  QUERY PLAN:

>               ->  Subquery Scan pz  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917
loops=11)
>                     ->  Seq Scan on o_kat_prod  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77
rows=40917loops=11) 

Hmm, I don't understand why ot_kat_prod is being treated as a subquery
here.  It isn't a view or something is it?

            regards, tom lane

Re: left outer join terrible slow compared to inner join

From
"Clay Luther"
Date:
Actually, I was about to post some problems we have with large left outer joins as well we've discovered in a porting
projectfrom NT/SQL Server -> Linux/Postgres. 

We have a particular query that is rather large, left outer joining across several tables.  Under SQL Server, with
identicaldata and schema, this particular query takes 2 seconds. 

Under PostgreSQL, this same query takes 90 seconds -- that's right, 90 seconds.  45x longer than SQL Server.  This was
quitea shock to us (we'd not seen such a performance deficit between the two dbs until this) and could, in fact, force
usaway from Postgres. 

I'd be happy to forward the explain to anyone who'd care to look at it...

cwl


> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, August 28, 2003 1:10 PM
> To: Thomas Beutin
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
> Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> > Thanks for the suggestion, but the result is close to the
> original outer
> > join without the explicit cross join but far away from the
> speed of the
> > inner join.
>
> > EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id,
> pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN
> ot_produkt AS p) LEFT OUTER JOIN  ot_kat_prod AS pz ON (
> p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id =
> '105391105424941' AND a.m_id = '37';
> > NOTICE:  QUERY PLAN:
>
> >               ->  Subquery Scan pz  (cost=0.00..1683.51
> rows=40851 width=170) (actual time=0.03..485.07 rows=40917 loops=11)
> >                     ->  Seq Scan on o_kat_prod
> (cost=0.00..1683.51 rows=40851 width=170) (actual
> time=0.02..281.77 rows=40917 loops=11)
>
> Hmm, I don't understand why ot_kat_prod is being treated as a subquery
> here.  It isn't a view or something is it?
>
>             regards, tom lane
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html
>

Re: left outer join terrible slow compared to inner join

From
Sean Chittenden
Date:
> Actually, I was about to post some problems we have with large left
> outer joins as well we've discovered in a porting project from
> NT/SQL Server -> Linux/Postgres.
>
> We have a particular query that is rather large, left outer joining
> across several tables.  Under SQL Server, with identical data and
> schema, this particular query takes 2 seconds.
>
> Under PostgreSQL, this same query takes 90 seconds -- that's right,
> 90 seconds.  45x longer than SQL Server.  This was quite a shock to
> us (we'd not seen such a performance deficit between the two dbs
> until this) and could, in fact, force us away from Postgres.
>
> I'd be happy to forward the explain to anyone who'd care to look at
> it...

Post an EXPLAIN ANALYZE of the query...

-sc

--
Sean Chittenden

Re: left outer join terrible slow compared to inner join

From
"Clay Luther"
Date:
Interstingly enough, the EXPLAIN ANALYZE itself took 90+ seconds:

claycle@gkar:.../PhoneInserter/PhoneInserter > time psql -f e_stationd.sql ccm > analyzed.txt

real    1m46.770s
user    0m0.010s
sys     0m0.010s

Here is the output of the above explain analyze execution:


                                                 QUERY PLAN
                                                                                                              

---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=17157472946.49..17157472949.56 rows=1229 width=2066) (actual time=106513.59..106515.67 rows=1780 loops=1)
   Sort Key: d.name
   ->  Nested Loop  (cost=1.15..17157472883.44 rows=1229 width=2066) (actual time=2724.18..106407.50 rows=1780 loops=1)
         Join Filter: ("outer".fkcallingsearchspace = "inner".pkid)
         ->  Nested Loop  (cost=1.15..17157470797.64 rows=1229 width=1976) (actual time=2723.56..104804.55 rows=1780
loops=1)
               Join Filter: (("inner".tkservice = 11) AND ("inner".paramname = 'DefaultNetworkLocale'::character
varying))
               ->  Nested Loop  (cost=1.15..17157452120.74 rows=1229 width=1941) (actual time=2721.32..97560.43
rows=1780loops=1) 
                     Join Filter: (("inner".tkservice = 11) AND ("inner".paramname = 'DefaultUserLocale'::character
varying))
                     ->  Nested Loop  (cost=1.15..17157433443.83 rows=1229 width=1906) (actual time=2719.93..90288.85
rows=1780loops=1) 
                           Join Filter: (("inner".tkservice = 11) AND ("inner".paramname =
'MLPPDomainIdentifier'::charactervarying)) 
                           ->  Nested Loop  (cost=1.15..17157414766.93 rows=1229 width=1871) (actual
time=2717.94..83141.71rows=1780 loops=1) 
                                 Join Filter: (("inner".tkservice = 11) AND ("inner".paramname =
'MLPPPreemptionSetting'::charactervarying)) 
                                 ->  Nested Loop  (cost=1.15..17157396090.02 rows=1229 width=1836) (actual
time=2714.90..76002.08rows=1780 loops=1) 
                                       Join Filter: (("inner".tkservice = 11) AND ("inner".paramname =
'MLPPIndicationStatus'::charactervarying)) 
                                       ->  Nested Loop  (cost=1.15..17157377413.11 rows=1229 width=1801) (actual
time=2712.05..68810.87rows=1780 loops=1) 
                                             Join Filter: ((("outer".fkvoicemessagingpilot IS NULL) OR
("outer".fkvoicemessagingpilot= "inner".pkid)) AND (("inner".isdefault = true) OR ("outer".fkvoicemessagingpilot =
"inner".pkid)))
                                             ->  Nested Loop  (cost=1.15..17157376083.00 rows=1229 width=1759) (actual
time=2711.44..68020.45rows=1780 loops=1) 
                                                   Join Filter: ((("outer".fkvoicemessagingprofile IS NULL) OR
("outer".fkvoicemessagingprofile= "inner".pkid)) AND (("inner".isdefault = true) OR ("outer".fkvoicemessagingprofile =
"inner".pkid)))
                                                   ->  Nested Loop  (cost=1.15..17157375032.94 rows=946 width=1722)
(actualtime=2710.83..67240.60 rows=1780 loops=1) 
                                                         Join Filter: ("outer".fkroutefilter = "inner".pkid)
                                                         ->  Nested Loop  (cost=1.15..17157373703.81 rows=946
width=1623)(actual time=2710.21..66328.33 rows=1780 loops=1) 
                                                               Join Filter: ("outer".fkdialplan = "inner".pkid)
                                                               ->  Nested Loop  (cost=1.15..17157372736.52 rows=946
width=1599)(actual time=2709.74..65695.13 rows=1780 loops=1) 
                                                                     Join Filter: ("outer".fkdigitdiscardinstruction =
"inner".pkid)
                                                                     ->  Nested Loop  (cost=1.15..17157370184.69
rows=946width=1579) (actual time=2709.01..64628.50 rows=1780 loops=1) 
                                                                           Join Filter: ("outer".fkcallmanager =
"inner".pkid)
                                                                           ->  Nested Loop  (cost=1.15..17157369196.12
rows=946width=1563) (actual time=2708.58..63948.35 rows=1780 loops=1) 
                                                                                 Join Filter:
("inner".fkcallmanagergroup= "outer".pkid) 
                                                                                 ->  Nested Loop
(cost=1.15..17157368659.26rows=473 width=1531) (actual time=2708.13..63271.44 rows=890 loops=1) 
                                                                                       Join Filter:
("outer".fkcallmanagergroup= "inner".pkid) 
                                                                                       ->  Nested Loop
(cost=1.15..17157368154.34rows=473 width=1515) (actual time=2707.69..62936.58 rows=890 loops=1) 
                                                                                             Join Filter:
("outer".fkmediaresourcelist= "inner".pkid) 
                                                                                             ->  Nested Loop
(cost=1.15..17157367660.05rows=473 width=1499) (actual time=2707.29..62613.05 rows=890 loops=1) 
                                                                                                   Join Filter:
("outer".fkdevicepool= "inner".pkid) 
                                                                                                   ->  Nested Loop
(cost=1.15..17157366235.14rows=473 width=1431) (actual time=2706.54..61664.68 rows=890 loops=1) 
                                                                                                         Join Filter:
("outer".fkcallingsearchspace_translation= "inner".pkid) 
                                                                                                         ->  Nested
Loop (cost=1.15..17157365432.22 rows=473 width=1341) (actual time=2705.92..61180.46 rows=890 loops=1) 
                                                                                                               Join
Filter:("outer".fkcallingsearchspace_sharedlineappear = "inner".pkid) 
                                                                                                               ->
NestedLoop  (cost=1.15..17157364629.30 rows=473 width=1251) (actual time=2705.19..60567.86 rows=890 loops=1) 

JoinFilter: ("outer".fkroutepartition = "inner".pkid) 
                                                                                                                     ->
Nested Loop  (cost=1.15..17157363751.89 rows=473 width=1221) (actual time=2704.59..59883.81 rows=890 loops=1) 

  Join Filter: ("outer".fknumplan = "inner".pkid) 

  ->  Nested Loop  (cost=1.15..17157345954.08 rows=473 width=605) (actual time=2691.46..32912.02 rows=890 loops=1) 

        Join Filter: ("outer".pkid = "inner".fkdevice) 

        ->  Nested Loop  (cost=1.15..17157317358.59 rows=462 width=504) (actual time=2657.87..5239.00 rows=808 loops=1) 

              Join Filter: ("outer".fkmediaresourcelist = "inner".pkid) 

              ->  Hash Join  (cost=1.15..17157316875.80 rows=462 width=472) (actual time=2657.66..5139.98 rows=808
loops=1)

                    Hash Cond: ("outer".tkdeviceprotocol = "inner".enum) 

                    ->  Nested Loop  (cost=0.00..17157316866.57 rows=462 width=452) (actual time=2656.96..5062.74
rows=808loops=1) 

                          Join Filter: ("outer".fklocation = "inner".pkid) 

                          ->  Nested Loop  (cost=0.00..17157316860.79 rows=462 width=432) (actual time=2656.88..4992.47
rows=808loops=1) 

                                Join Filter: ("outer".fkcallingsearchspace_aar = "inner".pkid) 

                                ->  Nested Loop  (cost=0.00..17157316076.55 rows=462 width=342) (actual
time=2656.46..4706.90rows=808 loops=1) 

                                      Join Filter: ("outer".fkcallingsearchspace = "inner".pkid) 

                                      ->  Seq Scan on device d  (cost=0.00..17157315292.30 rows=462 width=252) (actual
time=2655.97..4380.22rows=808 loops=1) 

                                            Filter: ((tkdeviceprofile = 0) AND (subplan)) 

                                            SubPlan 

                                              ->  Materialize  (cost=12307973.61..12307973.61 rows=40 width=145)
(actualtime=2.87..3.44 rows=455 loops=924) 

                                                    ->  Unique  (cost=0.00..12307973.61 rows=40 width=145) (actual
time=803.51..2650.40rows=808 loops=1) 

                                                          ->  Nested Loop  (cost=0.00..12307972.62 rows=396 width=145)
(actualtime=803.50..2645.21 rows=808 loops=1) 

                                                                Join Filter: ("outer".fkcallmanager = "inner".pkid) 

                                                                Filter: ("inner".name = 'DAL-CCM2'::character varying) 

                                                                ->  Nested Loop  (cost=0.00..12307556.90 rows=396
width=116)(actual time=803.38..2603.11 rows=808 loops=1) 

                                                                      Join Filter: ("inner".fkcallmanagergroup =
"outer".pkid)

                                                                      Filter: ("inner".priority = 0) 

                                                                      ->  Nested Loop  (cost=0.00..12307101.59 rows=396
width=80)(actual time=803.23..2540.22 rows=808 loops=1) 

                                                                            Join Filter: ("outer".fkcallmanagergroup =
"inner".pkid)

                                                                            ->  Nested Loop  (cost=0.00..12306678.94
rows=396width=64) (actual time=803.12..2505.12 rows=808 loops=1) 

                                                                                  Join Filter: ("outer".fkdevicepool =
"inner".pkid)

                                                                                  ->  Index Scan using device_pkey on
deviced  (cost=0.00..12305486.23 rows=396 width=32) (actual time=802.92..2160.98 rows=808 loops=1) 

                                                                                        Filter: (((tkclass = 1) OR
(tkclass= 10)) AND (subplan)) 

                                                                                        SubPlan 

                                                                                          ->  Materialize
(cost=8827.26..8827.26rows=1394 width=145) (actual time=0.99..1.55 rows=451 loops=810) 

                                                                                                ->  Nested Loop
(cost=0.00..8827.26rows=1394 width=145) (actual time=1.44..799.37 rows=898 loops=1) 

                                                                                                      Join Filter:
("outer".fkcallmanager= "inner".pkid) 

                                                                                                      Filter:
("inner".name= 'DAL-SJ-CCM3'::character varying) 

                                                                                                      ->  Nested Loop
(cost=0.00..7363.56rows=1394 width=116) (actual time=1.30..750.11 rows=898 loops=1) 

                                                                                                            Join
Filter:("inner".fkcallmanagergroup = "outer".pkid) 

                                                                                                            Filter:
("inner".priority<> 0) 

                                                                                                            ->  Nested
Loop (cost=0.00..5760.46 rows=1394 width=80) (actual time=0.76..655.61 rows=1394 loops=1) 

                                                                                                                  Join
Filter:("outer".fkcallmanagergroup = "inner".pkid) 

                                                                                                                  ->
NestedLoop  (cost=0.00..4272.37 rows=1394 width=64) (actual time=0.62..597.83 rows=1394 loops=1) 


JoinFilter: ("outer".fkdevicepool = "inner".pkid) 


-> Seq Scan on device d  (cost=0.00..72.94 rows=1394 width=32) (actual time=0.02..11.42 rows=1394 loops=1) 


-> Seq Scan on devicepool dp  (cost=0.00..2.45 rows=45 width=32) (actual time=0.01..0.28 rows=45 loops=1394) 

                                                                                                                  ->
SeqScan on callmanagergroup cmg  (cost=0.00..1.03 rows=3 width=16) (actual time=0.01..0.02 rows=3 loops=1394) 

                                                                                                            ->  Seq
Scanon callmanagergroupmember gm  (cost=0.00..1.06 rows=6 width=36) (actual time=0.01..0.04 rows=6 loops=1394) 

                                                                                                      ->  Seq Scan on
callmanagercm  (cost=0.00..1.02 rows=2 width=29) (actual time=0.01..0.02 rows=2 loops=898) 

                                                                                  ->  Seq Scan on devicepool dp
(cost=0.00..2.45rows=45 width=32) (actual time=0.01..0.28 rows=45 loops=808) 

                                                                            ->  Seq Scan on callmanagergroup cmg
(cost=0.00..1.03rows=3 width=16) (actual time=0.01..0.02 rows=3 loops=808) 

                                                                      ->  Seq Scan on callmanagergroupmember gm
(cost=0.00..1.06rows=6 width=36) (actual time=0.01..0.04 rows=6 loops=808) 

                                                                ->  Seq Scan on callmanager cm  (cost=0.00..1.02 rows=2
width=29)(actual time=0.01..0.02 rows=2 loops=808) 

                                      ->  Seq Scan on callingsearchspace cs  (cost=0.00..1.31 rows=31 width=90) (actual
time=0.01..0.21rows=31 loops=808) 

                                ->  Seq Scan on callingsearchspace aarcs  (cost=0.00..1.31 rows=31 width=90) (actual
time=0.01..0.18rows=31 loops=808) 

                          ->  Seq Scan on "location" l  (cost=0.00..0.00 rows=1 width=20) (actual time=0.00..0.00
rows=0loops=808) 

                    ->  Hash  (cost=1.12..1.12 rows=12 width=20) (actual time=0.16..0.16 rows=0 loops=1) 

                          ->  Seq Scan on typedeviceprotocol p  (cost=0.00..1.12 rows=12 width=20) (actual
time=0.05..0.11rows=12 loops=1) 

              ->  Seq Scan on mediaresourcelist ol  (cost=0.00..1.02 rows=2 width=32) (actual time=0.01..0.02 rows=2
loops=808)

        ->  Seq Scan on devicenumplanmap dnp  (cost=0.00..43.62 rows=1462 width=101) (actual time=0.03..27.03 rows=1462
loops=808)

  ->  Seq Scan on numplan np  (cost=0.00..28.39 rows=739 width=616) (actual time=0.05..23.73 rows=739 loops=890) 
                                                                                                                     ->
Seq Scan on routepartition rp  (cost=0.00..1.38 rows=38 width=30) (actual time=0.02..0.24 rows=38 loops=890) 
                                                                                                               ->  Seq
Scanon callingsearchspace css  (cost=0.00..1.31 rows=31 width=90) (actual time=0.01..0.21 rows=31 loops=890) 
                                                                                                         ->  Seq Scan
oncallingsearchspace cst  (cost=0.00..1.31 rows=31 width=90) (actual time=0.01..0.19 rows=31 loops=890) 
                                                                                                   ->  Seq Scan on
devicepooldp  (cost=0.00..2.45 rows=45 width=68) (actual time=0.03..0.61 rows=45 loops=890) 
                                                                                             ->  Seq Scan on
mediaresourcelistdl  (cost=0.00..1.02 rows=2 width=16) (actual time=0.01..0.02 rows=2 loops=890) 
                                                                                       ->  Seq Scan on callmanagergroup
cmg (cost=0.00..1.03 rows=3 width=16) (actual time=0.01..0.03 rows=3 loops=890) 
                                                                                 ->  Seq Scan on callmanagergroupmember
gm (cost=0.00..1.06 rows=6 width=32) (actual time=0.01..0.05 rows=6 loops=890) 
                                                                           ->  Seq Scan on callmanager cm
(cost=0.00..1.02rows=2 width=16) (actual time=0.01..0.02 rows=2 loops=1780) 
                                                                     ->  Seq Scan on digitdiscardinstruction ddi
(cost=0.00..2.31rows=31 width=20) (actual time=0.01..0.18 rows=31 loops=1780) 
                                                               ->  Seq Scan on dialplan dlp  (cost=0.00..1.01 rows=1
width=24)(actual time=0.01..0.02 rows=1 loops=1780) 
                                                         ->  Seq Scan on routefilter rf  (cost=0.00..1.18 rows=18
width=99)(actual time=0.01..0.12 rows=18 loops=1780) 
                                                   ->  Seq Scan on voicemessagingprofile vm  (cost=0.00..1.04 rows=4
width=37)(actual time=0.02..0.05 rows=4 loops=1780) 
                                             ->  Seq Scan on voicemessagingpilot vmp  (cost=0.00..1.03 rows=3 width=42)
(actualtime=0.02..0.04 rows=3 loops=1780) 
                                       ->  Seq Scan on processconfig pc  (cost=0.00..9.68 rows=368 width=35) (actual
time=0.02..2.49rows=368 loops=1780) 
                                 ->  Seq Scan on processconfig pcc  (cost=0.00..9.68 rows=368 width=35) (actual
time=0.02..2.45rows=368 loops=1780) 
                           ->  Seq Scan on processconfig pcd  (cost=0.00..9.68 rows=368 width=35) (actual
time=0.02..2.42rows=368 loops=1780) 
                     ->  Seq Scan on processconfig pcf  (cost=0.00..9.68 rows=368 width=35) (actual time=0.02..2.45
rows=368loops=1780) 
               ->  Seq Scan on processconfig pcg  (cost=0.00..9.68 rows=368 width=35) (actual time=0.02..2.43 rows=368
loops=1780)
         ->  Seq Scan on callingsearchspace cssv  (cost=0.00..1.31 rows=31 width=90) (actual time=0.02..0.24 rows=31
loops=1780)
 Total runtime: 106530.61 msec
(120 rows)


> -----Original Message-----
> From: Sean Chittenden [mailto:sean@chittenden.org]
> Sent: Thursday, August 28, 2003 1:32 PM
> To: Clay Luther
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
> > Actually, I was about to post some problems we have with large left
> > outer joins as well we've discovered in a porting project from
> > NT/SQL Server -> Linux/Postgres.
> >
> > We have a particular query that is rather large, left outer joining
> > across several tables.  Under SQL Server, with identical data and
> > schema, this particular query takes 2 seconds.
> >
> > Under PostgreSQL, this same query takes 90 seconds -- that's right,
> > 90 seconds.  45x longer than SQL Server.  This was quite a shock to
> > us (we'd not seen such a performance deficit between the two dbs
> > until this) and could, in fact, force us away from Postgres.
> >
> > I'd be happy to forward the explain to anyone who'd care to look at
> > it...
>
> Post an EXPLAIN ANALYZE of the query...
>
> -sc
>
> --
> Sean Chittenden
>

Re: left outer join terrible slow compared to inner join

From
Stephen Frost
Date:
* Clay Luther (claycle@cisco.com) wrote:
> I'd be happy to forward the explain to anyone who'd care to look at it...

Could you try 7.4b1?  It sounds like maybe things are better there from
what Tom was saying for left outer joins..

    Stephen

Attachment

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
Stephen Frost <sfrost@snowman.net> writes:
> * Clay Luther (claycle@cisco.com) wrote:
>> I'd be happy to forward the explain to anyone who'd care to look at it...

> Could you try 7.4b1?  It sounds like maybe things are better there from
> what Tom was saying for left outer joins..

If it's not better in 7.4, please send me the details.

            regards, tom lane

Re: left outer join terrible slow compared to inner join

From
Thomas Beutin
Date:
On Thu, Aug 28, 2003 at 02:10:15PM -0400, Tom Lane wrote:
> Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> > Thanks for the suggestion, but the result is close to the original outer
> > join without the explicit cross join but far away from the speed of the
> > inner join.
>
> > EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id, pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN ot_produkt
ASp) LEFT OUTER JOIN  ot_kat_prod AS pz ON ( p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id = '105391105424941' AND
a.m_id= '37'; 
> > NOTICE:  QUERY PLAN:
>
> >               ->  Subquery Scan pz  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.03..485.07 rows=40917
loops=11)
> >                     ->  Seq Scan on o_kat_prod  (cost=0.00..1683.51 rows=40851 width=170) (actual time=0.02..281.77
rows=40917loops=11) 
>
> Hmm, I don't understand why ot_kat_prod is being treated as a subquery
> here.  It isn't a view or something is it?
Sorry, You're correct, actually this is a view of an other table called
o_kat_prod defined as
SELECT * FROM o_kat_prod;
but supposed to be in my next step something like
SELECT * FROM o_kat_prod WHERE <a_single_field> IS NOT NULL;

And doing the explicit cross join statement on o_kat_prod instead of
ot_kat_prod gives the expected performance to me ( 7.42 msec instead
of 7324.49 msec with EXPLAIN ANALYZE).

Do i've any chance to get the same performance on the view?

Thanks for any help!
-tb
--
Thomas Beutin                             tb@laokoon.IN-Berlin.DE
Beam me up, Scotty. There is no intelligent live down in Redmond.

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
>> Hmm, I don't understand why ot_kat_prod is being treated as a subquery
>> here.  It isn't a view or something is it?

> Sorry, You're correct, actually this is a view of an other table called
> o_kat_prod defined as
> SELECT * FROM o_kat_prod;
> but supposed to be in my next step something like
> SELECT * FROM o_kat_prod WHERE <a_single_field> IS NOT NULL;

Still curious, since I'd expect 7.2 to flatten a simple view into the
upper query.

> Do i've any chance to get the same performance on the view?

Try 7.3, it may be better.  Or consider 7.4 beta.

            regards, tom lane

Re: left outer join terrible slow compared to inner join

From
Greg Stark
Date:
"Clay Luther" <claycle@cisco.com> writes:

> Interstingly enough, the EXPLAIN ANALYZE itself took 90+ seconds:

"explain" would produce just the plan, normally in a few ms. This query might
take a while though.

"explain analyze" says to produce the plan and then actually run the query and
annotate the plan with the actual timing results at each node. Note the
"actual time" labels on each row. So it's not suprising that it took 90s.


Now, uh, there are 37 tables involved in this query. That's kind of a lot.
Like, really, a lot. It's possible this is a sane, if extremely normalized
design, but well, still. 37 is a big number.

Postgres has to consider 37 factorial different ways of combining these
tables. or about 13,763,750,000,000,000,000,000,000,000,000,000,000,000,000
different combinations. That makes it harder for it to come up with the best
combination. You might consider rewriting it to use the ANSI join syntax "LEFT
JOIN" and "RIGHT JOIN" if you haven't already. That might help it out.


That said. My first guess as to the real problem. Of the 37 tables 36 of them
aren't being accessed using indexes. Do you have indexes on the join columns?
Perhaps you should? Postgres performs better when it has indexes.


I'm a bit puzzled how one could get up to 37 tables in a single query other
than just having taken normalization a bit too far. But if that was the
thinking then I would expect the joins to be on the primary keys of all the
tables, which would presumably have indexes. So, well, I guess I'll just stay
puzzled.

--
greg

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
"Clay Luther" <claycle@cisco.com> writes:
> Here is the output of the above explain analyze execution:

Given the presence of this sort of thing:

>          ->  Nested Loop  (cost=1.15..17157470797.64 rows=1229 width=1976) (actual time=2723.56..104804.55 rows=1780
loops=1)
>                Join Filter: (("inner".tkservice = 11) AND ("inner".paramname = 'DefaultNetworkLocale'::character
varying))

I think you are running into 7.3's inability to determine that certain
outer joins can be simplified to regular joins (specifically, if there
is a strict operator above the OUTER JOIN that will reject null-extended
rows, then there's no need to generate null-extended rows at all).

7.4 should do better.

            regards, tom lane

Re: left outer join terrible slow compared to inner join

From
Jonathan Bartlett
Date:
You can easily get 37 tables if you use a lot of views, especially those
involving unions and unions of views or joins of views.

Jon


Re: left outer join terrible slow compared to inner join

From
"Clay Luther"
Date:
> That said. My first guess as to the real problem. Of the 37
> tables 36 of them
> aren't being accessed using indexes. Do you have indexes on
> the join columns?
> Perhaps you should? Postgres performs better when it has indexe

Heh...well, first let me say:

1) Our database is highly normalized.
2) All joins in the query are performed across indeces.
3) It IS a huge query.

There is a reason to the madness.  Namely, this query was driven by a client application requirement.  In a particular
operationalcase, the application needed to be able to say "give me all this information now!" without making round
tripsto the database.  The query itself has grown over the years (last time I looked at it, it was only 24 joins, not
37). But, as I said before,  

1) It works
2) It works VERY fast (in SQLServer)
3) It works in production and has for years now

When I looked at the query for the first time in years last week, here is an email I sent to other people on my team:

--snip--
Here's the ugly one, the station-d statement:

How or why we ever wrote such a monster is beyond me :-)
--snip--

But, it exists, it works, and it currently not scheduled to be changed.  Such is the nature of inertia.

:-)

cwl

> -----Original Message-----
> From: Greg Stark [mailto:gsstark@mit.edu]
> Sent: Thursday, August 28, 2003 4:32 PM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
> "Clay Luther" <claycle@cisco.com> writes:
>
> > Interstingly enough, the EXPLAIN ANALYZE itself took 90+ seconds:
>
> "explain" would produce just the plan, normally in a few ms.
> This query might
> take a while though.
>
> "explain analyze" says to produce the plan and then actually
> run the query and
> annotate the plan with the actual timing results at each
> node. Note the
> "actual time" labels on each row. So it's not suprising that
> it took 90s.
>
>
> Now, uh, there are 37 tables involved in this query. That's
> kind of a lot.
> Like, really, a lot. It's possible this is a sane, if
> extremely normalized
> design, but well, still. 37 is a big number.
>
> Postgres has to consider 37 factorial different ways of
> combining these
> tables. or about
> 13,763,750,000,000,000,000,000,000,000,000,000,000,000,000
> different combinations. That makes it harder for it to come
> up with the best
> combination. You might consider rewriting it to use the ANSI
> join syntax "LEFT
> JOIN" and "RIGHT JOIN" if you haven't already. That might help it out.
>
>
> That said. My first guess as to the real problem. Of the 37
> tables 36 of them
> aren't being accessed using indexes. Do you have indexes on
> the join columns?
> Perhaps you should? Postgres performs better when it has indexes.
>
>
> I'm a bit puzzled how one could get up to 37 tables in a
> single query other
> than just having taken normalization a bit too far. But if
> that was the
> thinking then I would expect the joins to be on the primary
> keys of all the
> tables, which would presumably have indexes. So, well, I
> guess I'll just stay
> puzzled.
>
> --
> greg
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to
> majordomo@postgresql.org
>

Re: left outer join terrible slow compared to inner join

From
Mike Mascari
Date:
Clay Luther wrote:

> Heh...well, first let me say:
>
> 1) Our database is highly normalized.

Excellent. When faced with the choice of ensuring integrity myself in
the face of redundancy vs. Tom Lane's ability to improve the planner,
optimizer, and executor, I always vote for the latter!

> 2) All joins in the query are performed across indeces.
> 3) It IS a huge query.
>
> There is a reason to the madness.  Namely, this query was driven by a client application requirement.  In a
particularoperational case, the application needed to be able to say "give me all this information now!" without making
roundtrips to the database.  The query itself has grown over the years (last time I looked at it, it was only 24 joins,
not37).  But, as I said before,  
>
> 1) It works
> 2) It works VERY fast (in SQLServer)
> 3) It works in production and has for years now

I have faced these issues before in older versions of PostgreSQL:


http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&frame=right&th=67a713f0107dc77a&seekm=01C0CF88.292AB320.mascarm%40mascari.com#link1

I'd suggest a few things:

1) How long does it take to execute just a plain EXPLAIN? I suspect it
might be spending more time planning than actually executing

2) You might be able to play around with explicit join syntax in part
of your queries:

http://www.postgresql.org/docs/7.3/static/explicit-joins.html

3) I've found the GEQO threshold to be way too low:

http://www.postgresql.org/docs/7.3/static/runtime-config.html#RUNTIME-CONFIG-OPTIMIZER

4) If you have any UDF's used in the WHERE clause, attempt to rewrite
the query without them or use #2 to defer their evaluation if they are
costly. I've found that PostgreSQL, when left to its own devices, can
often choose to evaluate a UDF before a join, where the join would
have been far less costly to evaluate first.

I haven't tried 7.4beta though. It may solve all your problems and
answer all your questions. For me, each release has reduced planning
time by an order of magnitude. Hopefully, that trend will continue ad
infinitum. :-)

Mike Mascari
mascarm@mascari.com



Re: left outer join terrible slow compared to inner join

From
Greg Stark
Date:
Mike Mascari <mascarm@mascari.com> writes:

> > 1) Our database is highly normalized.

If anything I was worried it was "excessively" normalized. Sometimes people go
overboard, taking columns that really could be simple attributes and make them
reference tables. But that usually doesn't cause performance problems, just
programmer headaches. It was just a first impression, the simple number of
tables isn't evidence.

> > 2) All joins in the query are performed across indeces.

Ok, well only one of the table accesses is actually using an index in that
plan. I don't understand what's going on in enough detail to explain why.

Perhaps the optimizer thinks the tables are just small enough to not make them
worthwhile.

Which columns actually have indexes, are they all the same data type as the
value they're being compared against?

--
greg

Re: left outer join terrible slow compared to inner join

From
"Clay Luther"
Date:
> Perhaps the optimizer thinks the tables are just small enough
> to not make them
> worthwhile.
>
> Which columns actually have indexes, are they all the same
> data type as the
> value they're being compared against?


Yup, this was my guess.  Many of the tables being joined in are not excessively large (10s or 100s of records), while
tableslike device and numplan are VERY large (or can be). 

Most, if not all of the joins -- if memory serves -- are being made across foreign keys of either uuid type or simple
ints.

The uuid type I created (implements libuuid as a type for postgresql), and, yes, I defined all the operator classes for
btree-ing(it does work nicely). 

I'm going to follow Tom's suggestion at try it against 7.4 next week.  Until then...

cwl

> -----Original Message-----
> From: Greg Stark [mailto:gsstark@mit.edu]
> Sent: Thursday, August 28, 2003 8:20 PM
> To: Mike Mascari
> Cc: Clay Luther; Greg Stark; pgsql-general@postgresql.org
> Subject: Re: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
>
> Mike Mascari <mascarm@mascari.com> writes:
>
> > > 1) Our database is highly normalized.
>
> If anything I was worried it was "excessively" normalized.
> Sometimes people go
> overboard, taking columns that really could be simple
> attributes and make them
> reference tables. But that usually doesn't cause performance
> problems, just
> programmer headaches. It was just a first impression, the
> simple number of
> tables isn't evidence.
>
> > > 2) All joins in the query are performed across indeces.
>
> Ok, well only one of the table accesses is actually using an
> index in that
> plan. I don't understand what's going on in enough detail to
> explain why.
>
> Perhaps the optimizer thinks the tables are just small enough
> to not make them
> worthwhile.
>
> Which columns actually have indexes, are they all the same
> data type as the
> value they're being compared against?
>
> --
> greg
>
>

Re: left outer join terrible slow compared to inner join

From
Tom Lane
Date:
Greg Stark <gsstark@mit.edu> writes:
> Now, uh, there are 37 tables involved in this query. That's kind of a lot.

> Postgres has to consider 37 factorial different ways of combining these
> tables. or about 13,763,750,000,000,000,000,000,000,000,000,000,000,000,000
> different combinations.

Of course, we don't do that ... long before you get to 37 tables, the
optimizer gives up on exhaustive search and goes for approximate
answers.  So part of the problem might be that the partial GEQO plan
search isn't finding an especially good plan.  But it looked to me like
this is basically a star schema and the exact order in which we join the
detail tables isn't much of an issue.

I thought the problem is more likely to be that 7.3 isn't doing enough
analysis of when it can simplify outer joins --- there are conditions
in the join steps of Clay's example that really should have been pushed
down to the scan steps.  We'll have to await his report of whether 7.4
does better.

            regards, tom lane

select taking forever

From
Steven Tower
Date:
I have a basic SQL call that takes forever because Postgresql seems to want to use a Seq row scan on the table Products which has around 41k rows.  Below is the sql call and the explain.

explain select * from ChargeCodes where AccountID = '{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' and ChargeCodeID IN   (Select ChargeCodeID from Products where ProductID in     (select ProductID from OrderRules where WebUserRoleID in         (Select WebUserRoleID from WebUsers where WebUserID = '{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}')))

Seq Scan on chargecodes  (cost=0.00..19217292988.42 rows=36 width=108)
  Filter: ((accountid = '{58C215AA-2C71-446F-88F3-BC2C0D23EF28}'::text) AND (subplan))
  SubPlan
    ->  Materialize  (cost=263250588.84..263250588.84 rows=20535 width=42)
          ->  Seq Scan on products  (cost=0.00..263250588.84 rows=20535 width=42)
                Filter: (subplan)
                SubPlan
                  ->  Materialize  (cost=6409.75..6409.75 rows=554 width=42)
                        ->  Seq Scan on orderrules  (cost=0.00..6409.75 rows=554 width=42)
                              Filter: (subplan)
                              SubPlan
                                ->  Materialize  (cost=5.75..5.75 rows=1 width=42)
                                      ->  Index Scan using webusers_pkey on webusers  (cost=0.00..5.75 rows=1 width=42)
                                            Index Cond: (webuserid = '{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}'::text)

All the above tables have proper index's, all of them excuse their individual calls quickly, but when put together it's slow as can be.  In the cases of where it chose to do a Seq scan, all except the one for products are correct (tables with 10-100 rows at most).

Can anyone help and make a suggestion as to why it's doing a sequence scan, what gets even more interesting is that even if I turn force index scans on, it still seq scan's products yet when I make any individual calls to products outside of the above context it always uses the index.

Thanks,

Steven



Re: select taking forever

From
"Shridhar Daithankar"
Date:
On 29 Aug 2003 at 7:04, Steven Tower wrote:

>
> I have a basic SQL call that takes forever because Postgresql seems to want to
> use a Seq row scan on the table Products which has around 41k rows. Below is
> the sql call and the explain.
>
> explain select * from ChargeCodes where AccountID = '{58C215AA-2C71-446F-88F3-
> BC2C0D23EF28}' and ChargeCodeID IN (Select ChargeCodeID from Products where
> ProductID in (select ProductID from OrderRules where WebUserRoleID
> in (Select WebUserRoleID from WebUsers where WebUserID = '{3CD5D4F5-
> 448B-11D5-83DB-0001023EA2FA}')))

So many in's nested? I doubt a 7.3.x would survive that.

Upgrade to 7.4 beta2. The 'in' problem is taken care of, in there..

Can you somehow rewrite that as nested joins? They could be much better..

Bye
 Shridhar

--
There comes to all races an ultimate crisis which you have yet to face.... One day our minds became so powerful we
daredthink of ourselves as gods.        -- Sargon, "Return to Tomorrow", stardate 4768.3 


Re: select taking forever

From
Steven Tower
Date:
I will give 7.4 a try.  Actually I have played around with rewriting this, it's not my select statement, personally I would have designed the tables a little differently in order to facilitate better statements.  I agree, the IN's are a little much. :-)

Steven

On Fri, 2003-08-29 at 07:14, Shridhar Daithankar wrote:
On 29 Aug 2003 at 7:04, Steven Tower wrote:

> 
> I have a basic SQL call that takes forever because Postgresql seems to want to 
> use a Seq row scan on the table Products which has around 41k rows. Below is 
> the sql call and the explain.
> 
> explain select * from ChargeCodes where AccountID = '{58C215AA-2C71-446F-88F3-
> BC2C0D23EF28}' and ChargeCodeID IN (Select ChargeCodeID from Products where 
> ProductID in (select ProductID from OrderRules where WebUserRoleID 
> in (Select WebUserRoleID from WebUsers where WebUserID = '{3CD5D4F5-
> 448B-11D5-83DB-0001023EA2FA}')))

So many in's nested? I doubt a 7.3.x would survive that.

Upgrade to 7.4 beta2. The 'in' problem is taken care of, in there..

Can you somehow rewrite that as nested joins? They could be much better..

Bye Shridhar

--
There comes to all races an ultimate crisis which you have yet to face.... One day our minds became so powerful we dared think of ourselves as gods.		-- Sargon, "Return to Tomorrow", stardate 4768.3


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate      subscribe-nomail command to majordomo@postgresql.org so that your      message can get through to the mailing list cleanly
Attachment

Re: select taking forever

From
Nick Barr
Date:
Steven Tower wrote:
> I have a basic SQL call that takes forever because Postgresql seems to
> want to use a Seq row scan on the table Products which has around 41k
> rows.  Below is the sql call and the explain.
>
> explain select * from ChargeCodes where AccountID =
> '{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' and ChargeCodeID IN   (Select
> ChargeCodeID from Products where ProductID in     (select ProductID from
> OrderRules where WebUserRoleID in         (Select WebUserRoleID from
> WebUsers where WebUserID = '{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}')))
>

[snip]

Anything before version 7.4 does not handle the IN statement very well,
and the recomendation from the people in the know is to use EXISTS. See

http://www.postgresql.org/docs/7.3/static/functions-subquery.html#AEN10407

for more detials.

So either upgrade to 7.4, although its still in beta at the moment, or
change the query to something like:

SELECT * FROM ChargeCodes t1 WHERE
AccountID='{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' AND EXISTS ( SELECT 1
FROM Products t2 WHERE t2.ChargeCodeID=t1.ChargeCodeID AND EXISTS (
SELECT 1 FROM OrderRules t3 WHERE t3.ProductID=t2.ProductID AND EXISTS (
SELECT 1 FROM WebUsers t4 WHERE
t4.WebUserId='{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}' AND
t4.WebUserRoleID=t3.WebUserRoleID )

or perhaps simpler but you will have to compare outputs.....

SELECT * FROM ChargeCodes t1 WHERE
AccountID='{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' AND EXISTS (SELECT 1
FROM Products t2, OrderRules t3, WebUsers t4 WHERE
t1.ChargeCodeID=t2.ChargeCodeID AND t2.ProductID=t3.ProductID AND
t3.WebUserRoleID=t4.WebUserRoleID AND
t4.WebUserId='{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}')


Nick


Re: select taking forever

From
Steven Tower
Date:
Nick,

Thanks much.  I haven't been in the SQL frame of mind and sure enough as soon as you said it, I laughed because I personally have never used in, I have always used exists. 

I will give some revision on the below a try, uh, sometimes the brain just doesn't want to help you. :-)  Glad you are all here.

Steven

On Fri, 2003-08-29 at 08:17, Nick Barr wrote:
Steven Tower wrote:
> I have a basic SQL call that takes forever because Postgresql seems to 
> want to use a Seq row scan on the table Products which has around 41k 
> rows.  Below is the sql call and the explain.
> 
> explain select * from ChargeCodes where AccountID = 
> '{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' and ChargeCodeID IN   (Select 
> ChargeCodeID from Products where ProductID in     (select ProductID from 
> OrderRules where WebUserRoleID in         (Select WebUserRoleID from 
> WebUsers where WebUserID = '{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}')))
> 

[snip]

Anything before version 7.4 does not handle the IN statement very well, 
and the recomendation from the people in the know is to use EXISTS. See

http://www.postgresql.org/docs/7.3/static/functions-subquery.html#AEN10407

for more detials.

So either upgrade to 7.4, although its still in beta at the moment, or 
change the query to something like:

SELECT * FROM ChargeCodes t1 WHERE 
AccountID='{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' AND EXISTS ( SELECT 1 
FROM Products t2 WHERE t2.ChargeCodeID=t1.ChargeCodeID AND EXISTS ( 
SELECT 1 FROM OrderRules t3 WHERE t3.ProductID=t2.ProductID AND EXISTS ( 
SELECT 1 FROM WebUsers t4 WHERE 
t4.WebUserId='{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}' AND 
t4.WebUserRoleID=t3.WebUserRoleID )

or perhaps simpler but you will have to compare outputs.....

SELECT * FROM ChargeCodes t1 WHERE 
AccountID='{58C215AA-2C71-446F-88F3-BC2C0D23EF28}' AND EXISTS (SELECT 1 
FROM Products t2, OrderRules t3, WebUsers t4 WHERE 
t1.ChargeCodeID=t2.ChargeCodeID AND t2.ProductID=t3.ProductID AND 
t3.WebUserRoleID=t4.WebUserRoleID AND 
t4.WebUserId='{3CD5D4F5-448B-11D5-83DB-0001023EA2FA}')


Nick


---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
               http://www.postgresql.org/docs/faqs/FAQ.html
Attachment

Re: left outer join terrible slow compared to inner join

From
Jonathan Bartlett
Date:
> And doing the explicit cross join statement on o_kat_prod instead of
> ot_kat_prod gives the expected performance to me ( 7.42 msec instead
> of 7324.49 msec with EXPLAIN ANALYZE).
>
> Do i've any chance to get the same performance on the view?

I've had this problem and it was due to improper typing - It was a join on
an int8 column, which has to be explicitly cast if it's a parameter to the
query.

Jon


Re: left outer join terrible slow compared to inner join

From
"Clay Luther"
Date:
I had an opportunity to test this massive left outer join this with 7.4b2 today.

It took <3 seconds on an untuned, new install...which is MUCH better.  In fact, it performed as well as Oracle 9i.

cwl


> -----Original Message-----
> From: Clay Luther
> Sent: Thursday, August 28, 2003 1:26 PM
> To: 'Tom Lane'; Thomas Beutin
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] left outer join terrible slow compared to inner
> join
>
>
> Actually, I was about to post some problems we have with
> large left outer joins as well we've discovered in a porting
> project from NT/SQL Server -> Linux/Postgres.
>
> We have a particular query that is rather large, left outer
> joining across several tables.  Under SQL Server, with
> identical data and schema, this particular query takes 2 seconds.
>
> Under PostgreSQL, this same query takes 90 seconds -- that's
> right, 90 seconds.  45x longer than SQL Server.  This was
> quite a shock to us (we'd not seen such a performance deficit
> between the two dbs until this) and could, in fact, force us
> away from Postgres.
>
> I'd be happy to forward the explain to anyone who'd care to
> look at it...
>
> cwl
>
>
> > -----Original Message-----
> > From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> > Sent: Thursday, August 28, 2003 1:10 PM
> > To: Thomas Beutin
> > Cc: pgsql-general@postgresql.org
> > Subject: Re: [GENERAL] left outer join terrible slow
> compared to inner
> > join
> >
> >
> > Thomas Beutin <tyrone@laokoon.IN-Berlin.DE> writes:
> > > Thanks for the suggestion, but the result is close to the
> > original outer
> > > join without the explicit cross join but far away from the
> > speed of the
> > > inner join.
> >
> > > EXPLAIN ANALYZE SELECT DISTINCT pz.l1_id, pz.l2_id,
> > pz.l3_id, pz.l4_id FROM ( ot_adresse AS a CROSS JOIN
> > ot_produkt AS p) LEFT OUTER JOIN  ot_kat_prod AS pz ON (
> > p.p_id = pz.p_id ) WHERE  p.a_id = a.id AND a.id =
> > '105391105424941' AND a.m_id = '37';
> > > NOTICE:  QUERY PLAN:
> >
> > >               ->  Subquery Scan pz  (cost=0.00..1683.51
> > rows=40851 width=170) (actual time=0.03..485.07 rows=40917 loops=11)
> > >                     ->  Seq Scan on o_kat_prod
> > (cost=0.00..1683.51 rows=40851 width=170) (actual
> > time=0.02..281.77 rows=40917 loops=11)
> >
> > Hmm, I don't understand why ot_kat_prod is being treated as
> a subquery
> > here.  It isn't a view or something is it?
> >
> >             regards, tom lane
> >
> > ---------------------------(end of
> > broadcast)---------------------------
> > TIP 5: Have you checked our extensive FAQ?
> >
> >                http://www.postgresql.org/docs/faqs/FAQ.html
> >
>