Re: BUG #15857: Parallel Hash Join makes join instead of exists - Mailing list pgsql-bugs

From Thomas Munro
Subject Re: BUG #15857: Parallel Hash Join makes join instead of exists
Date
Msg-id CA+hUKG+AT83m_r_N-StwcTC4p5U0ZH12SCbmOn2E54p9aWKiTg@mail.gmail.com
Whole thread Raw
In response to Re: BUG #15857: Parallel Hash Join makes join instead of exists  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: BUG #15857: Parallel Hash Join makes join instead of exists
List pgsql-bugs
On Tue, Jun 18, 2019 at 8:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Tue, Jun 18, 2019 at 6:40 PM PG Bug reporting form
> <noreply@postgresql.org> wrote:
> > -- This gives an incorrect result of 999991, when 100000 is expected on
> > Postgres 11.3 and 12 beta 1.
>
> Reproduced here.  Investigating.

First clue is that if you change "WHERE c.base_id = a.id" to "WHERE
c.base_id = b.base_id", you get Parallel Hash Semi Join instead of
Parallel Hash Join, but an otherwise identical plan with the same Hash
Cond, and the result changes to 100000 instead of 999991.

Second clue is that if you set enable_parallel_hash to off, you get a
Hash Semi Join for "WHERE c.base_id = b.base_id", but if you use
"WHERE c.base_id = a.id" you get a Hash Join over Hash of Unique of
Sort of c, instead of a Hash Semi Join.

That points to the problem: for JOIN_UNIQUE_INNER we plan a Parallel
Hash Join, but that's nonsense, there is no code to unique-ify the
partial inner side (because that's not possible).   There may be
something better we can do here (like understanding that this should
really be a semi-join), but this works for me to prevent the bad plan:

diff --git a/src/backend/optimizer/path/joinpath.c
b/src/backend/optimizer/path/joinpath.c
index 501ad775cbe..e42c82c2bb4 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -1869,7 +1869,8 @@ hash_inner_and_outer(PlannerInfo *root,
                         * Can we use a partial inner plan too, so
that we can build a
                         * shared hash table in parallel?
                         */
-                       if (innerrel->partial_pathlist != NIL &&
enable_parallel_hash)
+                       if (innerrel->partial_pathlist != NIL &&
+                               save_jointype != JOIN_UNIQUE_INNER &&
enable_parallel_hash)
                        {
                                cheapest_partial_inner =
                                        (Path *)
linitial(innerrel->partial_pathlist);

-- 
Thomas Munro
https://enterprisedb.com



pgsql-bugs by date:

Previous
From: Pantelis Theodosiou
Date:
Subject: Re: BUG #15857: Parallel Hash Join makes join instead of exists
Next
From: PG Bug reporting form
Date:
Subject: BUG #15858: could not stat file - over 4GB