Re: [HACKERS] Cost model for parallel CREATE INDEX - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: [HACKERS] Cost model for parallel CREATE INDEX
Date
Msg-id CAH2-WzkBsJ5tCrtV449n7-dTDSbWnEAtS2ST8t2uMnJx5JNQYA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Cost model for parallel CREATE INDEX  (Stephen Frost <sfrost@snowman.net>)
Responses Re: [HACKERS] Cost model for parallel CREATE INDEX  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Sat, Mar 4, 2017 at 6:00 AM, Stephen Frost <sfrost@snowman.net> wrote:
>> It is, but I was using that with index size, not table size. I can
>> change it to be table size, based on what you said. But the workMem
>> related cap, which probably won't end up being applied all that often
>> in practice, *should* still do something with projected index size,
>> since that really is what we're sorting, which could be very different
>> (e.g. with partial indexes).
>
> Isn't that always going to be very different, unless you're creating a
> single index across every column in the table..?  Or perhaps I've
> misunderstood what you're comparing as being 'very different' in your
> last sentence.

I mean: though a primary key index or similar is smaller than the
table by maybe 5X, they are still generally within an order of
magnitude. Given that the number of workers is determined at
logarithmic intervals, it may not actually matter that much whether
the scaling is based on heap size (input size) or index size (output
size), at a very high level. Despite a 5X difference. I'm referring to
the initial determination of the number of workers to be used, based
on the scan the parallel CREATE INDEX has to do. So, I'm happy to go
along with Robert's suggestion for V9, and have this number determined
based on heap input size rather than index output size. It's good to
be consistent with what we do for parallel seq scan (care about input
size), and it probably won't change things by much anyway. This is
generally the number that the cost model will end up going with, in
practice.

However, we then need to consider that since maintenance_work_mem is
doled out as maintenance_work_mem/nworkers slices for parallel CREATE
INDEX, there is a sensitivity to how much memory is left per worker as
workers are added. This clear needs to be based on projected/estimated
index size (output size), since that is what is being sorted, and
because partial indexes imply that the size of the index could be
*vastly* less than heap input size with still-sensible use of the
feature. This will be applied as a cap on the first number.

So, I agree with Robert that we should actually use heap size for the
main, initial determination of # of workers to use, but we still need
to estimate the size of the final index [1], to let the cost model cap
the initial determination when maintenance_work_mem is just too low.
(This cap will rarely be applied in practice, as I said.)

[1] https://wiki.postgresql.org/wiki/Parallel_External_Sort#bt_estimated_nblocks.28.29_function_in_pageinspect
-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Re: check failure with -DRELCACHE_FORCE_RELEASE -DCLOBBER_FREED_MEMORY
Next
From: Jim Mlodgenski
Date:
Subject: Re: [HACKERS] mat views stats