Re: pgsql: Add parallel-aware hash joins. - Mailing list pgsql-committers

From Thomas Munro
Subject Re: pgsql: Add parallel-aware hash joins.
Date
Msg-id CAEepm=2H80cY=DncWKjoSiwaX=xbLcW7dqee+1H5-CQ6pJJnAQ@mail.gmail.com
In response to Re: pgsql: Add parallel-aware hash joins.  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: pgsql: Add parallel-aware hash joins.  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-committers
On Thu, Dec 28, 2017 at 5:15 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Thu, Dec 28, 2017 at 3:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> !                            Buckets: 1024 (originally 2048)  Batches: 1 (originally 1)  Memory Usage: 0kB
>> !  Execution time: 243.120 ms
>>
>> I don't have enough insight to be totally sure what this means, but the
>> "Memory Usage: 0kB" bit is obviously bogus, so I'd venture that at least
>> part of the issue is failure to return stats from a worker.
>
> Hmm.  Yeah, that seems quite likely -- thanks.  Investigating now.

This is explained by the early exit case in
ExecParallelHashEnsureBatchAccessors().  With just the right timing,
it ends up neither reporting the true nbatch number nor calling
ExecParallelHashUpdateSpacePeak().

In my patch for commit 5bcf389e (before PHJ), I had extracted and
rejiggered some parts of my PHJ work to fix a problem with EXPLAIN for
parallel-oblivious hash joins running in parallel queries, but I
failed to readapt it properly for PHJ.  EXPLAIN needs to scan all
participants' HashInstrumentation to collect the greatest space
report, not just the first one it finds.  I'll test and post a patch
to fix this tomorrow.
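
To illustrate the difference between "first one it finds" and "greatest
space report", here is a minimal standalone sketch.  It is not the
PostgreSQL source and not the eventual patch: HashReport is a simplified,
hypothetical stand-in for HashInstrumentation, and both function names are
made up for this example.  It also assumes that nbuckets == 0 means a
participant reported nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for one participant's hash stats. */
typedef struct HashReport
{
    int         nbuckets;       /* 0 is assumed to mean "nothing reported" */
    int         nbatch;
    size_t      space_peak;     /* stays 0 if the peak was never updated */
} HashReport;

/* The problematic approach: stop at the first participant that reported. */
static int
pick_first_report(const HashReport *reports, int nparticipants)
{
    assert(reports != NULL || nparticipants == 0);

    for (int i = 0; i < nparticipants; i++)
    {
        if (reports[i].nbuckets > 0)
            return i;
    }
    return -1;                  /* nobody reported */
}

/* The intended approach: scan everyone and keep the greatest space_peak. */
static int
pick_greatest_report(const HashReport *reports, int nparticipants)
{
    int         best = -1;

    assert(reports != NULL || nparticipants == 0);

    for (int i = 0; i < nparticipants; i++)
    {
        if (reports[i].nbuckets == 0)
            continue;           /* skip participants with no report */
        if (best < 0 || reports[i].space_peak > reports[best].space_peak)
            best = i;
    }
    return best;
}
```

If the first participant happened to take the early exit and left
space_peak at 0 while a later one recorded a real peak, the first-found
version would select the zero entry, which matches the bogus
"Memory Usage: 0kB" above; the full scan would select the later one.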

-- 
Thomas Munro
http://www.enterprisedb.com

