On Fri, Nov 2, 2018 at 3:00 PM Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Fri, Nov 2, 2018 at 2:29 PM Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
> > On Fri, Nov 2, 2018 at 2:07 PM Andrew Gierth
> > <andrew@tao11.riddles.org.uk> wrote:
> > > >>>>> "Paul" == Paul Schaap <ps@ipggroup.com> writes:
> > > Paul> Hi Andrew,
> > > Paul> Bingo, set enable_parallel_hash=false; gets a correct result
> > > Paul> whereas set enable_parallel_hash=true; gets 0.
> > >
> > > Paul> Yes I might have reversed some of the explains, my excuse its
> > > Paul> Friday and I went to bed late and am burnt out today :-)
> > >
> > > Are all the values of the my_citext column actually null?
> >
> > Thanks for the report Paul and the analysis Andrew. Discussed with
> > Andrew a bit on IRC. Summary: multi-batch left joins are not handling
> > NULLs correctly in the left table when partitioning. Looking into
> > this now.
>
> Here's a repro.
>
> create table r as select generate_series(1, 1000000) i, null::int j;
> update r set j = i where i <= 10;
> create table s as select generate_series(1, 1000000) i;
> analyze;
> select count(*), count(r.j) from r left join s on r.j = s.i;
>
> Unpatched master gives me a 16 batch Parallel Hash Join with the
> incorrect answer:
>
> count | count
> -------+-------
> 10 | 10
>
> With the attached patch the answer is correct:
>
> count | count
> ---------+-------
> 1000000 | 10
>
> The brown-paper-bag level fix is:
>
> - false, /* outer join, currently unsupported */
> + HJ_FILL_OUTER(hjstate),
>
> It is right and full outer joins that are not yet supported by
> Parallel Hash Join. Left outer joins *are* supported. The effect of that
> thinko is to make them behave like inner joins (but only in multi-batch
> joins, ie when work_mem is exceeded).
Pushed.
--
Thomas Munro
http://www.enterprisedb.com