Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Parallel Seq Scan
Date
Msg-id CAA4eK1KnpMv6DHWqQznjVwy4mfBqXZDLzMWXoCCnYvdk-VwvXQ@mail.gmail.com
Whole thread Raw
In response to Re: Parallel Seq Scan  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Parallel Seq Scan
List pgsql-hackers
On Thu, Jan 22, 2015 at 7:23 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Jan 22, 2015 at 5:57 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > 1. Scanning block-by-block has negative impact on performance and
> > I thin it will degrade more if we increase parallel count as that can lead
> > to more randomness.
> >
> > 2. Scanning in fixed chunks improves the performance. Increasing
> > parallel count to a very large number might impact the performance,
> > but I think we can have a lower bound below which we will not allow
> > multiple processes to scan the relation.
>
> I'm confused.  Your actual test numbers seem to show that the
> performance with the block-by-block approach was slightly higher with
> parallelism than without, where as the performance with the
> chunk-by-chunk approach was lower with parallelism than without, but
> the text quoted above, summarizing those numbers, says the opposite.
>

Sorry for causing confusion, I should have been more explicit about
explaining the numbers.  Let me try again,
Values in columns is time in milliseconds to complete the execution,
so higher means it took more time.  If you see in block-by-block, the
time taken to complete the execution with 2 workers is more than
no workers which means parallelism has degraded the performance.

> Also, I think testing with 2 workers is probably not enough.  I think
> we should test with 8 or even 16.
>

Sure, will do this and post the numbers.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Parallel Seq Scan
Next
From: Robert Haas
Date:
Subject: Re: Parallel Seq Scan