BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize
Date
Msg-id 17213-988ed34b225a2862@postgresql.org
Whole thread Raw
Responses Re: BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17213
Logged by:          Elvis Pranskevichus
Email address:      elprans@gmail.com
PostgreSQL version: 14.0
Operating system:   Linux
Description:

It appears a combination of Merge Semi Join and Memoize in PostgreSQL 14
produces incorrect results on a particular query.  The bug might be
present
in earlier versions, but I was only able to reproduce a particular plan
under version 14.

======
Schema
======

CREATE TABLE issues (
    id int PRIMARY KEY
);

CREATE TABLE users (
    id int PRIMARY KEY
);

CREATE TABLE watchers (
    issue_id int,
    user_id int,

    CONSTRAINT unique_pair UNIQUE (issue_id, user_id)
);

CREATE INDEX watchers_user_idx ON watchers(user_id);

INSERT INTO issues (id) VALUES (1);
INSERT INTO issues (id) VALUES (2);
INSERT INTO issues (id) VALUES (3);
INSERT INTO issues (id) VALUES (4);

INSERT INTO users (id) VALUES (1);
INSERT INTO users (id) VALUES (2);

INSERT INTO watchers (issue_id, user_id) VALUES (1, 2);
INSERT INTO watchers (issue_id, user_id) VALUES (2, 1);
INSERT INTO watchers (issue_id, user_id) VALUES (3, 1);


=====
Query
=====


SELECT
    i1.id
FROM 
    issues AS i1
WHERE
    EXISTS (
    SELECT
    FROM
    (SELECT
            i3.id
        FROM
            issues AS i3
        WHERE
            i3.id 
            IN
            (SELECT
                w3.issue_id
            FROM
                (SELECT
                    u1.id
                FROM
                    users AS u1
                WHERE
                    EXISTS (
                        SELECT
                            FROM
                            watchers AS w1
                            INNER JOIN users AS u2 ON (w1.user_id = u2.id)
                            INNER JOIN watchers AS w2 ON (u1.id =
w2.user_id)
                        WHERE
                            (i1.id = w1.issue_id)
                            AND (w2.issue_id != i1.id)
                            AND u1.id = u2.id
                    )
                ) AS q
                INNER JOIN users AS superfluous ON (q.id = superfluous.id)
                INNER JOIN watchers AS w3 ON (q.id = w3.user_id)
            )
    ) AS q
    );


===============
Expected Output
===============

 id
----
 2
 3
(2 rows)


===============================
Actual Output (with below plan)
===============================

 id
----
(0 rows)


=================
Plan with Memoize
=================

 Seq Scan on issues i1  (cost=0.00..11577.47 rows=145 width=248) (actual
time=0.091..0.093 rows=0 loops=1)
   Filter: (SubPlan 2)
   Rows Removed by Filter: 4
   SubPlan 2
     ->  Merge Semi Join  (cost=0.61..5694.44 rows=145 width=0) (actual
time=0.020..0.020 rows=0 loops=4)
           Merge Cond: (i3.id = w3.issue_id)
           ->  Index Only Scan using issues_pkey on issues i3
(cost=0.15..52.50 rows=290 width=16) (actual time=0.002..0.003 rows=1
loops=4)
                 Heap Fetches: 4
           ->  Nested Loop  (cost=0.46..5632.72 rows=680 width=16) (actual
time=0.017..0.017 rows=0 loops=4)
                 ->  Nested Loop  (cost=0.30..344.25 rows=1360 width=48)
(actual time=0.004..0.010 rows=3 loops=4)
                       ->  Index Only Scan using unique_pair on watchers w3
(cost=0.15..68.55 rows=1360 width=32) (actual time=0.001..0.002 rows=3
loops=4)
                             Heap Fetches: 12
                       ->  Index Only Scan using users_pkey on users
superfluous  (cost=0.15..0.20 rows=1 width=16) (actual time=0.001..0.001
rows=1 loops=12)
                             Index Cond: (id = w3.user_id)
                             Heap Fetches: 12
                 ->  Memoize  (cost=0.16..7.02 rows=1 width=16) (actual
time=0.002..0.002 rows=0 loops=12)
                       Cache Key: superfluous.id
                       Hits: 10  Misses: 2  Evictions: 0  Overflows: 0
Memory Usage: 1kB
                       ->  Index Only Scan using users_pkey on users u1
(cost=0.15..7.01 rows=1 width=16) (actual time=0.008..0.009 rows=0
loops=2)
                             Index Cond: (id = superfluous.id)
                             Filter: (SubPlan 1)
                             Rows Removed by Filter: 1
                             Heap Fetches: 2
                             SubPlan 1
                               ->  Nested Loop  (cost=0.45..44.73 rows=7
width=0) (actual time=0.006..0.006 rows=0 loops=2)
                                     ->  Index Scan using watchers_user_idx
on watchers w2  (cost=0.15..28.29 rows=7 width=0) (actual time=0.003..0.003
rows=1 loops=2)
                                           Index Cond: (user_id = u1.id)
                                           Filter: (issue_id <> i1.id)
                                           Rows Removed by Filter: 0
                                     ->  Materialize  (cost=0.30..16.36
rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=2)
                                           ->  Nested Loop
(cost=0.30..16.35 rows=1 width=0) (actual time=0.002..0.003 rows=0
loops=1)
                                                 ->  Index Only Scan using
unique_pair on watchers w1  (cost=0.15..8.17 rows=1 width=16) (actual
time=0.002..0.002 rows=0 loops=1)
                                                       Index Cond:
((issue_id = i1.id) AND (user_id = u1.id))
                                                       Heap Fetches: 0
                                                 ->  Index Only Scan using
users_pkey on users u2  (cost=0.15..8.17 rows=1 width=16) (never executed)
                                                       Index Cond: (id =
u1.id)
                                                       Heap Fetches: 0
 Planning Time: 0.755 ms
 Execution Time: 0.146 ms


=============================
With SET enable_memoize = off
=============================

 Seq Scan on issues i1  (cost=0.00..19705.90 rows=145 width=32) (actual
time=0.076..0.116 rows=2 loops=1)
   Filter: (SubPlan 2)
   Rows Removed by Filter: 2
   SubPlan 2
     ->  Merge Semi Join  (cost=0.60..9760.10 rows=145 width=0) (actual
time=0.025..0.025 rows=0 loops=4)
           Merge Cond: (i3.id = w3.issue_id)
           ->  Index Only Scan using issues_pkey on issues i3
(cost=0.15..52.50 rows=290 width=16) (actual time=0.003..0.003 rows=2
loops=4)
                 Heap Fetches: 6
           ->  Nested Loop  (cost=0.45..9698.38 rows=680 width=16) (actual
time=0.021..0.022 rows=0 loops=4)
                 Join Filter: (u1.id = superfluous.id)
                 ->  Nested Loop  (cost=0.30..9552.04 rows=680 width=48)
(actual time=0.020..0.020 rows=0 loops=4)
                       ->  Index Only Scan using unique_pair on watchers w3
(cost=0.15..68.55 rows=1360 width=32) (actual time=0.001..0.002 rows=2
loops=4)
                             Heap Fetches: 10
                       ->  Index Only Scan using users_pkey on users u1
(cost=0.15..6.98 rows=1 width=16) (actual time=0.007..0.007 rows=0
loops=10)
                             Index Cond: (id = w3.user_id)
                             Filter: (SubPlan 1)
                             Rows Removed by Filter: 1
                             Heap Fetches: 10
                             SubPlan 1
                               ->  Nested Loop  (cost=0.45..44.73 rows=7
width=0) (actual time=0.005..0.005 rows=0 loops=10)
                                     ->  Index Scan using watchers_user_idx
on watchers w2  (cost=0.15..28.29 rows=7 width=0) (actual time=0.001..0.002
rows=1 loops=10)
                                           Index Cond: (user_id = u1.id)
                                           Filter: (issue_id <> i1.id)
                                           Rows Removed by Filter: 0
                                     ->  Materialize  (cost=0.30..16.36
rows=1 width=0) (actual time=0.002..0.002 rows=0 loops=13)
                                           ->  Nested Loop
(cost=0.30..16.35 rows=1 width=0) (actual time=0.002..0.002 rows=0
loops=9)
                                                 ->  Index Only Scan using
unique_pair on watchers w1  (cost=0.15..8.17 rows=1 width=16) (actual
time=0.001..0.001 rows=0 loops=9)
                                                       Index Cond:
((issue_id = i1.id) AND (user_id = u1.id))
                                                       Heap Fetches: 2
                                                 ->  Index Only Scan using
users_pkey on users u2  (cost=0.15..8.17 rows=1 width=16) (actual
time=0.001..0.001 rows=1 loops=2)
                                                       Index Cond: (id =
u1.id)
                                                       Heap Fetches: 2
                 ->  Index Only Scan using users_pkey on users superfluous
(cost=0.15..0.20 rows=1 width=16) (actual time=0.002..0.002 rows=1
loops=2)
                       Index Cond: (id = w3.user_id)
                       Heap Fetches: 2
 Planning Time: 0.706 ms
 Execution Time: 0.164 ms


pgsql-bugs by date:

Previous
From: David.Oberg
Date:
Subject: Repo error for pgdg14
Next
From: David Rowley
Date:
Subject: Re: BUG #17213: Wrong result from a query involving Merge Semi Join and Memoize