Re: Wrong results from Parallel Hash Full Join - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Wrong results from Parallel Hash Full Join
Date
Msg-id CAAKRu_aaaGtjJK5Jt2NKbs-X2v9md-sW8em7kzMHpwZstrLT-w@mail.gmail.com
Whole thread Raw
In response to Wrong results from Parallel Hash Full Join  (Richard Guo <guofenglinux@gmail.com>)
Responses Re: Wrong results from Parallel Hash Full Join
List pgsql-hackers
On Wed, Apr 12, 2023 at 7:36 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
> I came across $subject and reduced the repro query as below.
>
> create table a (i int);
> create table b (i int);
> insert into a values (1);
> insert into b values (2);
> update b set i = 2;
>
> set min_parallel_table_scan_size to 0;
> set parallel_tuple_cost to 0;
> set parallel_setup_cost to 0;
>
> # explain (costs off) select * from a full join b on a.i = b.i;
>                 QUERY PLAN
> ------------------------------------------
>  Gather
>    Workers Planned: 2
>    ->  Parallel Hash Full Join
>          Hash Cond: (a.i = b.i)
>          ->  Parallel Seq Scan on a
>          ->  Parallel Hash
>                ->  Parallel Seq Scan on b
> (7 rows)
>
> # select * from a full join b on a.i = b.i;
>  i | i
> ---+---
>  1 |
> (1 row)
>
> Tuple (NULL, 2) is missing from the results.

Thanks so much for reporting this, Richard. This is a fantastic minimal
repro!

So, I looked into this, and it seems that, as you can imagine, the tuple
in b is hot updated, resulting in a heap only tuple.

 t_ctid |                              raw_flags
--------+----------------------------------------------------------------------
 (0,2)  | {HEAP_XMIN_COMMITTED,HEAP_XMAX_COMMITTED,HEAP_HOT_UPDATED}
 (0,2)  | {HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}

In ExecParallelScanHashTableForUnmatched() we don't emit the
NULL-extended tuple because HeapTupleHeaderHasMatch() is true for our
desired tuple.

        while (hashTuple != NULL)
        {
            if (!HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(hashTuple)))
            {

HeapTupleHeaderHasMatch() checks if HEAP_TUPLE_HAS_MATCH is set.

In htup_details.h, you will see that HEAP_TUPLE_HAS_MATCH is defined as
HEAP_ONLY_TUPLE
/*
 * HEAP_TUPLE_HAS_MATCH is a temporary flag used during hash joins.  It is
 * only used in tuples that are in the hash table, and those don't need
 * any visibility information, so we can overlay it on a visibility flag
 * instead of using up a dedicated bit.
 */
#define HEAP_TUPLE_HAS_MATCH    HEAP_ONLY_TUPLE /* tuple has a join match */

If you redefine HEAP_TUPLE_HAS_MATCH as something that isn't already
used, say 0x1800, the query returns correct results.

                QUERY PLAN
------------------------------------------
 Gather
   Workers Planned: 2
   ->  Parallel Hash Full Join
         Hash Cond: (a.i = b.i)
         ->  Parallel Seq Scan on a
         ->  Parallel Hash
               ->  Parallel Seq Scan on b
(7 rows)

 i | i
---+---
 1 |
   | 2
(2 rows)

The question is, why does this only happen for a parallel full hash join?

unpa
postgres=# explain (costs off) select * from a full join b on a.i = b.i;
        QUERY PLAN
---------------------------
 Hash Full Join
   Hash Cond: (a.i = b.i)
   ->  Seq Scan on a
   ->  Hash
         ->  Seq Scan on b
(5 rows)

postgres=#  select * from a full join b on a.i = b.i;
 i | i
---+---
 1 |
   | 2
(2 rows)

I imagine it has something to do with what tuples are put in the
parallel hashtable. I am about to investigate that but just wanted to
share what I had so far.

- Melanie



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: longfin missing gssapi_ext.h
Next
From: Justin Pryzby
Date:
Subject: Re: refactoring basebackup.c