Re: [HACKERS] Cost model for parallel CREATE INDEX - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [HACKERS] Cost model for parallel CREATE INDEX
Date	March 2, 2017 16:50:31
Msg-id	CA+Tgmob16p5EN5TMQHzbNuA-BJtEPFj860XnegM2-3OW=1CEbA@mail.gmail.com
In response to	[HACKERS] Cost model for parallel CREATE INDEX (Peter Geoghegan <pg@bowt.ie>)
Responses	Re: [HACKERS] Cost model for parallel CREATE INDEX
List	pgsql-hackers

On Wed, Mar 1, 2017 at 12:58 AM, Peter Geoghegan <pg@bowt.ie> wrote:
> * This scales based on output size (projected index size), not input
> size (heap scan input). Apparently, that's what we always do right
> now.

Actually, I'm not aware of any precedent for that. I'd just pass the
heap size to compute_parallel_workers(), leaving the index size as 0,
and call it good.  What you're doing now seems exactly backwards from
parallel query generally.
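
To make the suggestion concrete, here is a rough sketch of sizing the
worker count from the heap (input) size alone. It is a simplified model,
not PostgreSQL source: the names workers_for_pages and
sketch_parallel_workers, the page thresholds, and the worker cap are all
assumptions chosen for illustration, and the real
compute_parallel_workers() takes different arguments. The only ideas it
tries to show are that a size of 0 is ignored and that the count grows by
one each time the input triples past a threshold.

```python
# Simplified model for illustration only; NOT PostgreSQL source code.
# All names, thresholds, and the cap below are hypothetical.

def workers_for_pages(pages, threshold_pages):
    """Return 0 below the threshold, 1 at it, and one more per tripling."""
    if pages < threshold_pages:
        return 0
    workers = 1
    threshold = max(threshold_pages, 1)
    while pages >= threshold * 3:
        workers += 1
        threshold *= 3
    return workers


def sketch_parallel_workers(heap_pages, index_pages,
                            min_table_pages=1024,  # assumed: 8MB of 8kB pages
                            min_index_pages=64,    # assumed: 512kB of 8kB pages
                            max_workers=8):        # assumed cap
    """Pick a worker count; a size of 0 means 'do not consider this input'."""
    candidates = []
    if heap_pages > 0:
        candidates.append(workers_for_pages(heap_pages, min_table_pages))
    if index_pages > 0:
        candidates.append(workers_for_pages(index_pages, min_index_pages))
    if not candidates:
        return 0
    # When both sizes are given, the smaller estimate wins (assumed behavior).
    return min(min(candidates), max_workers)


# Heap-only sizing, as suggested above: index size left as 0.
heap_only = sketch_parallel_workers(heap_pages=9216, index_pages=0)
```

With the index size left at 0, only the heap scan input drives the
result, which is the direction parallel query generally takes.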

> So, the main factor that
> discourages parallel sequential scans doesn't really exist for
> parallel CREATE INDEX.

Agreed.

> We could always defer the cost model to another release, and only
> support the storage parameter for now, though that has disadvantages,
> some less obvious [4].

I think it's totally counter-intuitive that any hypothetical index
storage parameter would affect the degree of parallelism involved in
creating the index and also the degree of parallelism involved in
scanning it.  Whether or not other systems do such crazy things seems
to me to be beside the point.  I think if CREATE INDEX allows an explicit
specification of the degree of parallelism (a decision I would favor)
it should have a syntactically separate place for unsaved build
options vs. persistent storage parameters.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
