Re: BUG #15577: Query returns different results when executedmultiple times - Mailing list pgsql-bugs

From Bartosz Polnik
Subject Re: BUG #15577: Query returns different results when executedmultiple times
Date
Msg-id CAM37Zeshup0OkzJ2z=8_jBPrv-=dQZvs+v0HDmKnFf4wLn7jzQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #15577: Query returns different results when executedmultiple times  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: BUG #15577: Query returns different results when executedmultiple times  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-bugs
Sure. 

SET force_parallel_mode = off;
SET parallel_leader_participation = off;

alter table_c set (parallel_workers=2);
explain (costs false, analyze true)
  SELECT ta.id AS table_a_id,
         tc.id as table_c_id,
         tba.id AS table_b_id
  FROM test.table_b_active tba
         INNER JOIN test.table_c tc ON tba.target_id = tc.id
         INNER JOIN test.table_d td ON tc.table_d_id = td.id
         LEFT JOIN test.table_a ta ON ta.table_c_id = tc.id AND ta.date = '2018-08-31' :: DATE
  WHERE tba.date BETWEEN '2018-08-10' :: DATE AND '2018-09-01' :: DATE
    AND td.group = 'A'
    AND tc.table_e_id = 4
    AND (
      (tba.target_id = tc.id AND tba.group_type = 'A')
      OR tba.source_id = tc.id
    );

Output:
                                                                                                              QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join (actual time=203.910..346.758 rows=31 loops=1)
   ->  Gather (actual time=203.802..373.228 rows=31 loops=1)
         Workers Planned: 2
         Workers Launched: 2
         ->  Nested Loop (actual time=157.358..253.888 rows=16 loops=2)
               ->  Hash Join (actual time=0.365..15.749 rows=945 loops=2)
                     Hash Cond: (tc.table_d_id = td.id)
                     ->  Parallel Seq Scan on table_c tc (actual time=0.113..15.043 rows=1751 loops=2)
                           Filter: (table_e_id = 4)
                           Rows Removed by Filter: 49117
                     ->  Hash (actual time=0.078..0.078 rows=8 loops=2)
                           Buckets: 1024  Batches: 1  Memory Usage: 9kB
                           ->  Seq Scan on table_d td (actual time=0.058..0.065 rows=8 loops=2)
                                 Filter: (group = 'A'::test.group)
                                 Rows Removed by Filter: 55
               ->  Index Scan using table_b_idx_target_id on table_b (actual time=0.248..0.252 rows=0 loops=1890)
                     Index Cond: (target_id = tc.id)
                     Filter: ((date >= '2018-08-10'::date) AND (date <= '2018-09-01'::date) AND (((target_id = tc.id) AND (group_type = 'A'::test.group_type)) OR (source_id = tc.id)))
                     Rows Removed by Filter: 26
   ->  Index Scan using table_a_uq_001 on table_a ta (actual time=0.030..0.030 rows=1 loops=31)
         Index Cond: ((table_c_id = tc.id) AND (date = '2018-08-31'::date))
 Planning Time: 4.923 ms
 Execution Time: 374.408 ms

31 rows - correct result.


On Tue, Jan 8, 2019 at 2:30 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
On Tue, Jan 8, 2019 at 1:48 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
> On Tue, 8 Jan 2019 at 13:43, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> > It's as if it thought it was executing a semi-join.
>
> EXPLAIN VERBOSE would show the inner_unique flag. If that's set it
> would act as a semi-join.

Hmm, yes, node->js.single_match == true in one (but not both!) process
would fit these results.  But I don't see a mechanism for that.  I
*guess* it's the worker that is skipping duplicates, because the
leader usually has time to emit a few tuples while the worker's
warming up and we see some duplicates (348539) in first few results
(at least it usually does on my machines for eager plans, though maybe
on Windows it's different?)

Bartosz, can we please try with force_parallel_mode = off, but also
parallel_leader_participation = off?  (I meant to write that in an
earlier email but accidentally wrote "on".  Gah.)  Then we should get
the same plan (the forced parallel plan is a bit different as you
noted; maybe whatever is broken isn't triggered that way).  You might
need to set parallel_workers to 2 on the table for it to pick a
parallel plan without leader participation.

--
Thomas Munro
http://www.enterprisedb.com

pgsql-bugs by date:

Previous
From: Thomas Munro
Date:
Subject: Re: BUG #15577: Query returns different results when executedmultiple times
Next
From: CNG L
Date:
Subject: Is this a bug in auto vacuum worker?