Re: [HACKERS] Cost model for parallel CREATE INDEX - Mailing list pgsql-hackers

From: Peter Geoghegan
Subject: Re: [HACKERS] Cost model for parallel CREATE INDEX
Msg-id: CAH2-Wzn3O=1NFP3epKsuuLXGuChmzcLVBSJeBDvgYZZRmHmm8A@mail.gmail.com
In response to: Re: [HACKERS] Cost model for parallel CREATE INDEX (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [HACKERS] Cost model for parallel CREATE INDEX (Stephen Frost <sfrost@snowman.net>)
List: pgsql-hackers
On Sat, Mar 4, 2017 at 12:50 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> If you think parallelism isn't worthwhile unless the sort was going to
> be external anyway,

I don't -- that's just when it starts to look like a safe bet that
parallelism is worthwhile. There are quite a few cases where an
external sort is faster than an internal sort these days, actually.

> then it seems like the obvious thing to do is
> divide the projected size of the sort by maintenance_work_mem, round
> down, and cap the number of workers to the result.

I'm sorry, I don't follow.

> If the result of
> compute_parallel_workers() based on min_parallel_table_scan_size is
> smaller, then use that value instead.  I must be confused, because I
> actually though that was the exact algorithm you were describing, and
> it sounded good to me.

It is, but I was using that with index size, not table size. I can
change it to use table size, based on what you said. But the
workMem-related cap, which probably won't end up being applied all that
often in practice, *should* still be based on projected index size,
since that really is what we're sorting, and it could be very different
from the table size (e.g. with partial indexes).
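
To make the arithmetic concrete, here is a minimal standalone sketch of
the capping rule being discussed (this is not the actual patch or any
PostgreSQL function; the names cap_parallel_workers, scan_based_workers,
and the example sizes are made up for illustration): divide the
projected sort size by maintenance_work_mem, round down, and take the
smaller of that and the scan-size-based worker count.

    /*
     * Illustrative sketch only, not PostgreSQL source. Caps the number of
     * parallel workers by how many maintenance_work_mem-sized chunks the
     * projected sort input (the index, not the table) occupies, then takes
     * the smaller of that and the scan-size-based worker count.
     */
    #include <stdio.h>

    static int
    cap_parallel_workers(long long projected_index_bytes,
                         long long maintenance_work_mem_bytes,
                         int scan_based_workers)
    {
        /* Projected sort size divided by maintenance_work_mem, rounded down. */
        int mem_based_cap = (int) (projected_index_bytes / maintenance_work_mem_bytes);

        /* Use whichever limit is smaller. */
        return (scan_based_workers < mem_based_cap) ? scan_based_workers : mem_based_cap;
    }

    int
    main(void)
    {
        /*
         * Hypothetical example: a 3 GB projected index, 64 MB
         * maintenance_work_mem, and a table-scan-size heuristic that
         * suggested 4 workers; the memory-based cap is 48, so 4 is used.
         */
        long long index_bytes = 3LL * 1024 * 1024 * 1024;
        long long m_w_m_bytes = 64LL * 1024 * 1024;

        printf("workers = %d\n", cap_parallel_workers(index_bytes, m_w_m_bytes, 4));
        return 0;
    }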

-- 
Peter Geoghegan


