Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Thom Brown
Subject Re: Parallel Seq Scan
Date
Msg-id CAA-aLv5zrjCNpei1VFfAcA7sszGZCorzOp=cekV5FHR3xAk5Yw@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: Parallel Seq Scan  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On 12 November 2015 at 15:23, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Nov 11, 2015 at 11:29 PM, Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
>>
>> Hi
>>
>> I have a first query
>>
>> I looked on EXPLAIN ANALYZE output and the numbers of filtered rows are
>> differen
>>
>
> Thanks for the report.  The reason for this problem is that instrumentation
> information from workers is getting aggregated multiple times.  In
> ExecShutdownGatherWorkers(), we call ExecParallelFinish where it
> will wait for workers to finish and then accumulate stats from workers.
> Now ExecShutdownGatherWorkers() could be called multiple times
> (once we read all tuples from workers, at end of node) and it should be
> ensured that repeated calls should not try to redo the work done by first
> call.
> The same is ensured for tuplequeues, but not for parallel executor info.
> I think we can safely assume that we need to call ExecParallelFinish() only
> when there are workers started by the Gathers node, so on those lines
> attached patch should fix the problem.

That fixes the count issue for me, although not the number of buffers
hit, or the actual time taken.

Thom



pgsql-hackers by date:

Previous
From: Greg Stark
Date:
Subject: Re: LLVM miscompiles numeric.c access to short numeric var headers
Next
From: Andres Freund
Date:
Subject: Re: checkpointer continuous flushing