Re: [HACKERS] Parallel Index Scans - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Parallel Index Scans
Date
Msg-id CA+TgmoZ+-BR28fgxvbJ0vQ52C=8Ch90GW0G-KxC4REAsXCHiUg@mail.gmail.com
In response to Re: [HACKERS] Parallel Index Scans  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Thu, Feb 9, 2017 at 5:34 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>> What about parallel CREATE INDEX? The patch currently uses
>> min_parallel_relation_size as an input into the optimizer's custom
>> cost model. I had wondered if that made sense. Note that another such
>> input is the projected size of the final index.
>
> If projected index size is available, then I think Create Index can
> also use a somewhat similar formula where we cap the maximum number of
> workers based on the size of the index.  Now, I am not sure if the
> threshold values of guc's kept for the scan are realistic for Create
> Index operation.

I think that would be an abuse of the GUC, because the existing GUC -
and the new one we're proposing to create here - has always been about
the amount of data being fed into the parallel operation.  In the case
of CREATE INDEX, the resulting index is an
output, not an input.  So if I were Peter and wanted to reuse the
existing GUCs, I'd reuse the one for the table size, because that's
what is being scanned.  No index is going to get scanned.

Of course, it's possible that the sensible amount of parallelism for
CREATE INDEX is higher or lower than for other sequential scans, so
that might not be the right thing to do.  It might need its own knob,
or some other refinement.
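
To make the input-versus-output point concrete, here is a rough sketch
of sizing the worker count from the pages being scanned.  It is not
PostgreSQL source: the function name, its parameters, and the tripling
rule as written here are assumptions for illustration, loosely modeled
on a threshold such as min_parallel_relation_size.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 *
 * Picks a worker count from the size of the INPUT to the parallel
 * operation: zero workers below the threshold, one worker at the
 * threshold, and one more each time the input triples, capped at
 * max_workers.
 */
int
workers_for_input_pages(uint64_t input_pages, uint64_t threshold_pages,
                        int max_workers)
{
    int workers;

    assert(max_workers >= 0);

    /* Treat a zero threshold as one page so the loop below terminates. */
    if (threshold_pages < 1)
        threshold_pages = 1;

    if (max_workers == 0 || input_pages < threshold_pages)
        return 0;

    workers = 1;

    /*
     * Dividing the input by 3, rather than multiplying the threshold
     * first, keeps the comparison free of integer overflow.
     */
    while (workers < max_workers && input_pages / 3 >= threshold_pages)
    {
        workers++;
        threshold_pages *= 3;
    }

    return workers;
}
```

For CREATE INDEX under this sketch, the caller would pass the page
count of the table being scanned as input_pages, not the projected size
of the index.  Whether the same threshold and growth rate suit an index
build is exactly the open question.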

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
