Thread: Help with nested loop left join performance

Help with nested loop left join performance

From
"George Woodring"
Date:
I am running 7.4.8 and have a query that I have been running for a while
that has recently have experienced a slowdown.  The original query
involves a UNION but I have narrowed it down to this half of the query
as being my issue.  (The other half take 4 seconds).

The only issue that I have had is index bloat which I had to reindex the
entire cluster to get rid of.  My query has been slow since the start of
the index bloat.

Thanks in advance

Woody




explain analyze SELECT column1, column2, column3, column4, column5,
column6, column7, column8 FROM (SELECT CASE status WHEN 0 THEN 0 WHEN 1
THEN 1 ELSE -1 END AS column1,  mac AS column2, account AS column3,
number || ' ' || address AS column4, 'qmod' || '.' || 'dmod' AS column5,
node AS column6, grid AS column7, boxtype AS column8, number, address
FROM settop_billing  LEFT OUTER JOIN (dhct JOIN dhct_davic USING(mac))
USING (mac) WHERE region='GTown1E' AND node='1E012' ) AS foo;


QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
--------------
 Nested Loop Left Join  (cost=38659.85..55988.94 rows=2 width=84)
(actual time=14054.297..294973.046 rows=418 loops=1)
   Join Filter: ("outer".mac = "inner".mac)
   ->  Index Scan using settop_billing_region_node_index on
settop_billing  (cost=0.00..7.99 rows=2 width=82) (actual
time=0.115..8.582 rows=418 loops=1)         Index Cond: (((region)::text
= 'GTown1E'::text) AND ((node)::text = '1E012'::text))
   ->  Materialize  (cost=38659.85..42508.98 rows=384913 width=8)
(actual time=2.211..286.267 rows=382934 loops=418)
         ->  Hash Join  (cost=14784.66..38659.85 rows=384913 width=8)
(actual time=923.855..13647.840 rows=382934 loops=1)
               Hash Cond: ("outer".mac = "inner".mac)
               ->  Append  (cost=0.00..8881.11 rows=384912 width=8)
(actual time=0.023..10914.365 rows=384900 loops=1)
                     ->  Seq Scan on dhct_davic  (cost=0.00..0.00 rows=1
width=8) (actual time=0.002..0.002 rows=0 loops=1)
                     ->  Seq Scan on probe_dhct_davic dhct_davic
(cost=0.00..8881.11 rows=384911 width=8) (actual time=0.018..10505.255
rows=384900 loops=1)
               ->  Hash  (cost=12154.13..12154.13 rows=410613 width=6)
(actual time=923.433..923.433 rows=0 loops=1)
                     ->  Seq Scan on dhct  (cost=0.00..12154.13
rows=410613 width=6) (actual time=0.019..534.641 rows=409576 loops=1)
 Total runtime: 294994.440 ms
(13 rows)

The tables involved are defined as follows:

                                               Table
"public.settop_billing"
     Column      |          Type          |
Modifiers
-----------------+------------------------+-----------------------------
---------------------------------------------------
 cable_billingid | integer                | not null default
nextval('public.new_cable_billing_cable_billingid_seq'::text)
 mac             | macaddr                | not null
 account         | character varying(20)  |
 number          | character varying(12)  |
 address         | character varying(200) |
 name            | character varying(100) |
 phone           | character varying(10)  |
 region          | character varying(30)  |
 node            | character varying(10)  |
 grid            | character varying(15)  |
 lat             | numeric                |
 long            | numeric                |
 boxtype         | character(1)           |
Indexes:
    "settop_billing_mac_index" unique, btree (mac)
    "settop_billing_account_index" btree (account)
    "settop_billing_lat_log_index" btree (lat, long)
    "settop_billing_region_node_index" btree (region, node)
Inherits: cable_billing

                              Table "public.dhct"
   Column   |         Type          |                 Modifiers
------------+-----------------------+-----------------------------------
--------
 dhctid     | integer               | not null default
nextval('dhct_id'::text)
 mac        | macaddr               | not null
 ip         | inet                  |
 serial     | macaddr               |
 admin_stat | integer               |
 oper_stat  | integer               |
 qmod       | character varying(50) |
 dmod       | integer               |
 hub        | character varying(50) |
 dncs       | character varying(50) |
 auth       | text                  |
 updtime    | integer               |
Indexes:
    "dhct_pkey" primary key, btree (mac)
    "dhct_qmod_index" btree (qmod)

                              Table "iprobe024.probe_dhct_davic"
 Column  |         Type          |                         Modifiers
---------+-----------------------+--------------------------------------
----------------------
 davicid | integer               | not null default
nextval('public.davic_davicid_seq'::text)
 mac     | macaddr               | not null
 type    | character varying(10) | default 'davic'::character varying
 source  | character varying(20) |
 status  | smallint              |
 updtime | integer               |
 avail1  | integer               |
Indexes:
    "probe_dhct_davic_mac_index" unique, btree (mac)
Inherits: dhct_davic


----------------------------------------
iGLASS Networks
211-A S. Salem St
Apex NC 27502
(919) 387-3550 x813
www.iglass.net

Re: Help with nested loop left join performance

From
Richard Huxton
Date:
George Woodring wrote:
>
> explain analyze SELECT column1, column2, column3, column4, column5,
> column6, column7, column8 FROM (SELECT CASE status WHEN 0 THEN 0 WHEN 1
> THEN 1 ELSE -1 END AS column1,  mac AS column2, account AS column3,
> number || ' ' || address AS column4, 'qmod' || '.' || 'dmod' AS column5,
> node AS column6, grid AS column7, boxtype AS column8, number, address
> FROM settop_billing  LEFT OUTER JOIN (dhct JOIN dhct_davic USING(mac))
> USING (mac) WHERE region='GTown1E' AND node='1E012' ) AS foo;

Ach y fi! Let's format that a bit better, eh?

explain analyze
SELECT column1, column2, column3, column4, column5,column6, column7, column8
FROM (
     SELECT
         CASE status WHEN 0 THEN 0 WHEN 1 THEN 1 ELSE -1 END AS column1,
         mac AS column2,
         account AS column3,
         number || ' ' || address AS column4,
         'qmod' || '.' || 'dmod' AS column5,
         node AS column6,
         grid AS column7,
         boxtype AS column8,
         number,
         address
     FROM
         settop_billing
     LEFT OUTER JOIN
         (dhct JOIN dhct_davic USING(mac))
     USING
         (mac)
     WHERE
         region='GTown1E' AND node='1E012'
) AS foo;

Now we can see what's happening. Well, looking at it laid out like that,
I'm suspcious of the (dhct JOIN dhct_davic) on the outside of an outer
join. Looking at your explain we do indeed have two sequential scans
over the tables in question - the big one being dhct...

>                ->  Append  (cost=0.00..8881.11 rows=384912 width=8)
> (actual time=0.023..10914.365 rows=384900 loops=1)
>                      ->  Seq Scan on dhct_davic  (cost=0.00..0.00 rows=1
> width=8) (actual time=0.002..0.002 rows=0 loops=1)
>                      ->  Seq Scan on probe_dhct_davic dhct_davic
> (cost=0.00..8881.11 rows=384911 width=8) (actual time=0.018..10505.255
> rows=384900 loops=1)
>                ->  Hash  (cost=12154.13..12154.13 rows=410613 width=6)
> (actual time=923.433..923.433 rows=0 loops=1)
>                      ->  Seq Scan on dhct  (cost=0.00..12154.13
> rows=410613 width=6) (actual time=0.019..534.641 rows=409576 loops=1)

With 7.4 I seem to remember that explicit JOINs force the evaluation
order, but I'm not if even later versions will rewrite your query. It's
too early in the morning for me to figure out if it's safe in all cases.

Anyway, for your purposes, I'd say something more like:
   FROM settop_billing LEFT JOIN dhct LEFT JOIN dhct_davic

That should let the planner do the joins in a more reasonable order.
--
   Richard Huxton
   Archonet Ltd

Re: Help with nested loop left join performance

From
Tom Lane
Date:
Richard Huxton <dev@archonet.com> writes:
> George Woodring wrote:
>>      FROM
>>          settop_billing
>>      LEFT OUTER JOIN
>>          (dhct JOIN dhct_davic USING(mac))
>>      USING
>>          (mac)
>>      WHERE
>>          region='GTown1E' AND node='1E012'

> With 7.4 I seem to remember that explicit JOINs force the evaluation
> order, but I'm not if even later versions will rewrite your query. It's
> too early in the morning for me to figure out if it's safe in all cases.

CVS HEAD can re-order left joins in common cases, but no existing
release will touch the ordering of outer joins at all.

It's impossible to tell here which tables the WHERE-clause restrictions
actually bear on, so there's no way to say whether a different join
order would help.  My guess though is that George may be stuck --- in
general you can't move a join into or out of the right side of a left
join without changing the answers.

            regards, tom lane