Re: parallel sequential scan returns extraneous rows - Mailing list pgsql-bugs

From Michael Day
Subject Re: parallel sequential scan returns extraneous rows
Date
Msg-id 6388A361-BE20-44B5-9F07-58ABDE24DFAD@rcmail.com
In response to Re: parallel sequential scan returns extraneous rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: parallel sequential scan returns extraneous rows
List pgsql-bugs
I was able to reproduce with this set of data.

create table users (id integer);
create table address (id integer, users_id integer);

insert into users select s from generate_series(1,1000000) s;
insert into address select s, s/2 from generate_series(1,2000000) s;

analyze users;
analyze address;

set max_parallel_workers_per_gather = 0;

select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);

set max_parallel_workers_per_gather = 1;

select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);
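
A minimal sketch of how the two runs could be compared, assuming the tables created above. The expected count follows from the data: users_id is s/2 with integer division, so ids 1 through 999999 each match two address rows and id 1000000 matches one, for 1999999 rows. The exists clause is true for every joined row, so it should not change that number.

```sql
-- Show the plan chosen with parallelism disabled.
set max_parallel_workers_per_gather = 0;
explain
select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);

-- Show the plan chosen with one parallel worker allowed.
set max_parallel_workers_per_gather = 1;
explain
select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address where users_id = u.id);

-- Sanity check: every joined row already satisfies the exists clause,
-- so dropping it should give the same count (1999999 for this data).
select count(*)
from users u
join address a on (a.users_id = u.id);
```

A count larger than the sanity-check result under either setting would mean the exists subquery is multiplying rows instead of only filtering them.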


On 11/29/16, 11:19 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

    Michael Day <blake@rcmail.com> writes:
    > I have found a nasty bug when using parallel sequential scans with an exists clause on postgresql 9.6.1.  I have found that the rows returned using parallel sequential scan plans are incorrect (though I haven’t dug sufficiently to know in what ways).  See below for an example of the issue.

    Hm, looks like a planner error: it seems to be forgetting that the join
    to "address" should be a semijoin.  "address" should either be on the
    inside of a "Semi" join (as in your first, correct-looking plan) or be
    passed through a unique-ification stage such as a HashAgg.  Clearly,
    neither thing is happening in the second plan.

    I couldn't reproduce this in a bit of trying, however.  Can you come
    up with a self-contained test case?

                regards, tom lane
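
The two legal plan shapes described in the quoted reply can also be written by hand. This is a sketch assuming the users and address tables from the test case; the derived-table alias `d` and the subquery alias `x` are illustrative names. Both queries should return the same count.

```sql
-- Semijoin form: the exists clause only filters users rows,
-- so it cannot add duplicates to the result.
select count(*)
from users u
join address a on (a.users_id = u.id)
where exists (select 1 from address x where x.users_id = u.id);

-- Unique-ified form: de-duplicate address.users_id first,
-- after which a plain inner join is safe.
select count(*)
from users u
join address a on (a.users_id = u.id)
join (select distinct users_id from address) d on (d.users_id = u.id);
```

Joining address a second time without the distinct step would count each matching pair of address rows, which is the kind of inflation a missing semijoin would produce.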
