Re: Parallel Index Scans - Mailing list pgsql-hackers

From	Amit Kapila
Subject	Re: Parallel Index Scans
Date	October 20, 2016 03:07:51
Msg-id	CAA4eK1+V9L9Tdv19S17=AqveDiSdZhWqfq8Eypvm3Go0q_EsTg@mail.gmail.com
In response to	Re: Parallel Index Scans (Peter Geoghegan <pg@heroku.com>)
Responses	Re: Parallel Index Scans Re: Parallel Index Scans
List	pgsql-hackers

On Thu, Oct 20, 2016 at 7:39 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Mon, Oct 17, 2016 at 8:08 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> Create Index .... With (parallel_workers = 4);
>>
>> If above syntax looks sensible, then we might need to think what
>> should be used for parallel index build.  It seems to me that parallel
>> tuple sort patch [1] proposed by Peter G. is using above syntax for
>> getting the parallel workers input from user for parallel index
>> builds.
>
> Apparently you see a similar issue with other major database systems,
> where similar storage parameter things are kind of "overloaded" like
> this (they are used by both index creation, and by the optimizer in
> considering whether it should use a parallel index scan). That can be
> a kind of a gotcha for their users, but maybe it's still worth it.
>

I checked as well, and you are right.  In SQL Server, the max degree
of parallelism (MAXDOP) setting is, I think, common to all SQL
statements.

> In
> any case, the complaints I saw about that were from users who used
> parallel CREATE INDEX with the equivalent of my parallel_workers index
> storage parameter, and then unexpectedly found this also forced the
> use of parallel index scan. Not the other way around.
>

I can understand that this can be confusing to users, so another option
would be to provide separate parameters, such as parallel_workers_build
and parallel_workers, where the first is used for the index build and
the second for scans.  My personal preference is to have one
parameter, so that users have one less thing to learn about
parallelism.
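
To make the two options concrete, here is a rough sketch of how each
might look.  This is proposed syntax under discussion only, not
something a released server accepts for indexes; the table and index
names (orders, idx_orders_customer) and the worker counts are made up
for illustration.

```sql
-- Option 1 (one parameter): a single setting is consulted both when
-- building the index and when the optimizer considers a parallel scan.
-- Proposed syntax only; names are hypothetical.
CREATE INDEX idx_orders_customer ON orders (customer_id)
    WITH (parallel_workers = 4);

-- Option 2 (separate parameters): parallel_workers_build applies to the
-- index build, parallel_workers applies to scans of the index.
-- Proposed syntax only; names are hypothetical.
CREATE INDEX idx_orders_customer ON orders (customer_id)
    WITH (parallel_workers_build = 4, parallel_workers = 2);
```

With option 1, a value chosen to speed up the build would also affect
scans, which is the gotcha described above; option 2 avoids that at the
cost of a second parameter.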

> Ideally, the parallel_workers storage parameter will rarely be
> necessary because the optimizer will generally do the right thing in
> all cases.
>

Yeah, we can choose not to provide any parameter for parallel index
scans, but some users might want a parameter like the one that exists
for parallel table scans, so it could be handy for them.
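
For reference, the table-level parameter looks like the first two
statements below (parallel_workers is an existing table storage
parameter as of 9.6).  The last statement is only an assumed index-level
analogue of the kind being discussed here, not existing behaviour; the
object names are hypothetical.

```sql
-- Existing: set or reset the worker count the planner uses for
-- parallel scans of this table (table name is hypothetical).
ALTER TABLE orders SET (parallel_workers = 4);
ALTER TABLE orders RESET (parallel_workers);

-- Assumed analogue for indexes, as discussed in this thread.
-- Not accepted by a released server; shown for illustration only.
ALTER INDEX idx_orders_customer SET (parallel_workers = 2);
```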

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
