BUG #4934: regression in IN with joins in subselect - Mailing list pgsql-bugs

From Benjamin Reed
Subject BUG #4934: regression in IN with joins in subselect
Date
Msg-id 200907221847.n6MIlql9035866@wwwmaster.postgresql.org
Whole thread Raw
Responses Re: BUG #4934: regression in IN with joins in subselect  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
The following bug has been logged online:

Bug reference:      4934
Logged by:          Benjamin Reed
Email address:      ranger@opennms.org
PostgreSQL version: 8.4.0
Operating system:   Mac OS X 10.5
Description:        regression in IN with joins in subselect
Details:

I've hit a regression vs. PostgreSQL 8.2 and 8.3 (and probably others)
related to subselects.  This query:

---(snip!)---
SELECT DISTINCT ipInterface.ipAddr FROM ipInterface
    JOIN node ON (ipInterface.nodeID = node.nodeID)
    JOIN ifServices ON (ipInterface.id = ifServices.ipInterfaceId)
    JOIN service ON (ifServices.serviceID = service.serviceID)
WHERE
    (
        node.nodeID IN (
            SELECT category_node.nodeID FROM category_node, categories
            WHERE categories.categoryID = category_node.categoryID
            AND categories.categoryName = 'IMP_mid'
        )
    ) AND
    (
        node.nodeID IN (
            SELECT category_node.nodeID FROM category_node, categories
            WHERE categories.categoryID = category_node.categoryID
            AND categories.categoryName = 'DEV_AC'
        )
    ) AND
    (
        node.nodeID IN (
            SELECT category_node.nodeID FROM category_node, categories
            WHERE categories.categoryID = category_node.categoryID
            AND categories.categoryName = 'OPS_Online'
        )
    ) AND
    (node.nodeId = 1) AND
    (ipInterface.ipAddr = '192.168.1.1') AND
    (service.serviceName = 'ICMP')
LIMIT 1;
---(snip!)---

...passes in PostgreSQL 8.2 and 8.3 (which I have handy to test), but fail
in 8.4.0, as well as current origin/REL8_4_STABLE in git.  I reported it in
IRC, and the original hope was that it was related to bug #4906, but since
testing latest git, it appears that is not the case.

The query plan in 8.3 is this:

---(snip!)---
 Limit  (cost=4.27..76.68 rows=1 width=50)
   ->  Unique  (cost=4.27..76.68 rows=1 width=50)
         ->  Nested Loop IN Join  (cost=4.27..76.68 rows=1 width=50)
               ->  Nested Loop IN Join  (cost=4.27..60.12 rows=1 width=54)
                     ->  Nested Loop IN Join  (cost=4.27..43.56 rows=1
width=54)
                           ->  Nested Loop  (cost=4.27..26.99 rows=1
width=54)
                                 ->  Nested Loop  (cost=4.27..18.72 rows=1
width=54)
                                       ->  Nested Loop  (cost=4.27..17.90
rows=2 width=58)
                                             ->  Index Scan using
ipinterface_nodeid_ipaddr_ismanaged_idx on ipinterface  (cost=0.00..8.27
rows=1 width=58)
                                                   Index Cond: ((1 = nodeid)
AND ((ipaddr)::text = '192.168.1.1'::text))
                                             ->  Bitmap Heap Scan on
ifservices  (cost=4.27..9.60 rows=2 width=8)
                                                   Recheck Cond:
(ipinterface.id = ifservices.ipinterfaceid)
                                                   ->  Bitmap Index Scan on
ifservicves_ipinterfaceid_idx  (cost=0.00..4.27 rows=2 width=0)
                                                         Index Cond:
(ipinterface.id = ifservices.ipinterfaceid)
                                       ->  Index Scan using pk_serviceid on
service  (cost=0.00..0.40 rows=1 width=4)
                                             Index Cond:
(ifservices.serviceid = service.serviceid)
                                             Filter: ((servicename)::text =
'ICMP'::text)
                                 ->  Index Scan using node_id_type_idx on
node  (cost=0.00..8.27 rows=1 width=4)
                                       Index Cond: (nodeid = 1)
                           ->  Nested Loop  (cost=0.00..16.55 rows=1
width=4)
                                 ->  Index Scan using category_idx on
categories  (cost=0.00..8.27 rows=1 width=4)
                                       Index Cond: ((categoryname)::text =
'IMP_mid'::text)
                                 ->  Index Scan using catenode_unique_idx on
category_node  (cost=0.00..8.27 rows=1 width=8)
                                       Index Cond:
((public.categories.categoryid = public.category_node.categoryid) AND (1 =
public.category_node.nodeid))
                     ->  Nested Loop  (cost=0.00..16.55 rows=1 width=4)
                           ->  Index Scan using category_idx on categories
(cost=0.00..8.27 rows=1 width=4)
                                 Index Cond: ((categoryname)::text =
'OPS_Online'::text)
                           ->  Index Scan using catenode_unique_idx on
category_node  (cost=0.00..8.27 rows=1 width=8)
                                 Index Cond: ((public.categories.categoryid
= public.category_node.categoryid) AND (1 = public.category_node.nodeid))
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=4)
                     ->  Index Scan using category_idx on categories
(cost=0.00..8.27 rows=1 width=4)
                           Index Cond: ((categoryname)::text =
'DEV_AC'::text)
                     ->  Index Scan using catenode_unique_idx on
category_node  (cost=0.00..8.27 rows=1 width=8)
                           Index Cond: ((public.categories.categoryid =
public.category_node.categoryid) AND (1 = public.category_node.nodeid))
---(snip!)---

...and in the stable branch, this:

---(snip!)---
 Limit  (cost=16.56..72.51 rows=1 width=50)
   ->  Unique  (cost=16.56..72.51 rows=1 width=50)
         ->  Nested Loop Semi Join  (cost=16.56..72.51 rows=1 width=50)
               ->  Nested Loop Semi Join  (cost=16.56..55.94 rows=1
width=54)
                     ->  Nested Loop  (cost=16.56..39.38 rows=1 width=54)
                           ->  Nested Loop Semi Join  (cost=16.56..38.57
rows=2 width=58)
                                 ->  Index Scan using node_id_type_idx on
node  (cost=0.00..8.27 rows=1 width=4)
                                       Index Cond: (nodeid = 1)
                                 ->  Nested Loop  (cost=16.56..37.16 rows=2
width=62)
                                       ->  Nested Loop  (cost=16.56..24.85
rows=1 width=62)
                                             ->  Index Scan using
ipinterface_nodeid_ipaddr_ismanaged_idx on ipinterface  (cost=0.00..8.27
rows=1 width=58)
                                                   Index Cond: ((nodeid = 1)
AND ((ipaddr)::text = '192.168.1.1'::text))
                                             ->  HashAggregate
(cost=16.56..16.57 rows=1 width=4)
                                                   ->  Nested Loop
(cost=0.00..16.55 rows=1 width=4)
                                                         ->  Index Scan
using category_idx on categories  (cost=0.00..8.27 rows=1 width=4)
                                                               Index Cond:
((categoryname)::text = 'DEV_AC'::text)
                                                         ->  Index Scan
using catenode_unique_idx on category_node  (cost=0.00..8.27 rows=1
width=8)
                                                               Index Cond:
((public.category_node.categoryid = public.categories.categoryid) AND
(public.category_node.nodeid = 1))
                                       ->  Index Scan using
ifservicves_ipinterfaceid_idx on ifservices  (cost=0.00..12.29 rows=2
width=8)
                                             Index Cond:
(ifservices.ipinterfaceid = ipinterface.id)
                           ->  Index Scan using pk_serviceid on service
(cost=0.00..0.39 rows=1 width=4)
                                 Index Cond: (service.serviceid =
ifservices.serviceid)
                                 Filter: ((service.servicename)::text =
'ICMP'::text)
                     ->  Nested Loop  (cost=0.00..16.55 rows=1 width=4)
                           ->  Index Scan using category_idx on categories
(cost=0.00..8.27 rows=1 width=4)
                                 Index Cond: ((categoryname)::text =
'IMP_mid'::text)
                           ->  Index Scan using catenode_unique_idx on
category_node  (cost=0.00..8.27 rows=1 width=8)
                                 Index Cond:
((public.category_node.categoryid = public.categories.categoryid) AND
(public.category_node.nodeid = 1))
               ->  Nested Loop  (cost=0.00..16.55 rows=1 width=4)
                     ->  Index Scan using category_idx on categories
(cost=0.00..8.27 rows=1 width=4)
                           Index Cond: ((categoryname)::text =
'OPS_Online'::text)
                     ->  Index Scan using catenode_unique_idx on
category_node  (cost=0.00..8.27 rows=1 width=8)
                           Index Cond: ((public.category_node.categoryid =
public.categories.categoryid) AND (public.category_node.nodeid = 1))
---(snip!)---

If I cut the select down to:

---(snip!)---
SELECT DISTINCT ipInterface.ipAddr FROM ipInterface
    JOIN node ON (ipInterface.nodeID = node.nodeID)
    JOIN ifServices ON (ipInterface.id = ifServices.ipInterfaceId)
    JOIN service ON (ifServices.serviceID = service.serviceID)
WHERE
    (node.nodeId = 1) AND
    (ipInterface.ipAddr = '192.168.1.1') AND
    (service.serviceName = 'ICMP')
LIMIT 1;
---(snip!)---

...it passes, but as soon as I add one of the node.nodeID IN() bits, it
fails.  Also, if I change it to hardcode one of the passing node IDs:

---(snip!)---
SELECT DISTINCT ipInterface.ipAddr FROM ipInterface
JOIN node ON (ipInterface.nodeID = node.nodeID)
JOIN ifServices ON (ipInterface.id = ifServices.ipInterfaceId)
JOIN service ON (ifServices.serviceID = service.serviceID)
WHERE
(
node.nodeID IN ( SELECT 1 )
) AND
(node.nodeId = 1) AND
(ipInterface.ipAddr = '192.168.1.1') AND
(service.serviceName = 'ICMP')
LIMIT 1;
---(snip!)---

...it passes, so it appears to be related specifically to the joined
subselect.

If there's anything else you need (database dumps, whatever) to help debug
this, let me know.

Thanks.

pgsql-bugs by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: BUG #4932: Upgrade 8.2.13 -> 8.4.0 - Kerberos option missing
Next
From: Tom Lane
Date:
Subject: Re: BUG #4934: regression in IN with joins in subselect