Re: [HACKERS] Cost model for parallel CREATE INDEX - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: [HACKERS] Cost model for parallel CREATE INDEX
Date:
Msg-id: CA+TgmoY_99pjNvdJ2M=XA82Opq1577mVVt=yDxSDKD=d8Qw84Q@mail.gmail.com
In response to: Re: [HACKERS] Cost model for parallel CREATE INDEX (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: [HACKERS] Cost model for parallel CREATE INDEX (Peter Geoghegan <pg@bowt.ie>)
List: pgsql-hackers
On Sat, Mar 4, 2017 at 2:17 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Sat, Mar 4, 2017 at 12:43 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> Oh.  But then I don't see why you need min_parallel_anything.  That's
>> just based on an estimate of the amount of data per worker vs.
>> maintenance_work_mem, isn't it?
>
> Yes -- and it's generally a pretty good estimate.
>
> I don't really know what minimum amount of memory to insist workers
> have, which is why I provisionally chose one of those GUCs as the
> threshold.
>
> Any better ideas?

I don't understand how min_parallel_anything is telling you anything
about memory.  It has, in general, nothing to do with that.

If you think parallelism isn't worthwhile unless the sort was going to
be external anyway, then it seems like the obvious thing to do is
divide the projected size of the sort by maintenance_work_mem, round
down, and cap the number of workers to the result.  If the result of
compute_parallel_workers() based on min_parallel_table_scan_size is
smaller, then use that value instead.  I must be confused, because I
actually thought that was the exact algorithm you were describing, and
it sounded good to me.
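
For what it's worth, here is a rough C sketch of the capping rule I
have in mind.  The function and parameter names are purely
illustrative (nothing here is from your patch), and I'm glossing over
how the projected sort size and the scan-based worker count actually
get computed:

/*
 * Illustrative only: cap the worker count so that each worker's share
 * of the projected sort input is at least maintenance_work_mem, then
 * take the smaller of that and the scan-size-based worker count
 * (i.e. what compute_parallel_workers() would return).
 */
static int
cap_parallel_index_workers(double projected_sort_bytes,
                           double maintenance_work_mem_bytes,
                           int scan_based_workers)
{
    /* Divide projected sort size by maintenance_work_mem, rounding down. */
    int mem_based_workers = (int) (projected_sort_bytes /
                                   maintenance_work_mem_bytes);

    /* Use whichever limit is smaller. */
    return (scan_based_workers < mem_based_workers) ?
        scan_based_workers : mem_based_workers;
}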

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


