On Wed, Mar 8, 2017 at 5:33 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> pg_restore will avoid parallelism (that will happen by setting
>> "max_parallel_workers_maintenance = 0" when it runs), not because it
>> cannot trust the cost model, but because it prefers to parallelize
>> things its own way (with multiple restore jobs), and because execution
>> speed may not be the top priority for pg_restore, unlike a live
>> production system.
>
> This part I'm not sure about. I think people care quite a lot about
> pg_restore speed, because they are often down when they're running it.
> And they may have oodles more CPUs that parallel restore can use
> without help from parallel query. I would be inclined to leave
> pg_restore alone and let the chips fall where they may.

I thought that we might want to err on the side of preserving the
existing behavior, but arguably that's actually what I failed to do.
That is, since we don't currently have a pg_restore flag that controls
the maintenance_work_mem used by pg_restore, "let the chips fall where
they may" is arguably the standard that I didn't uphold.

It might still make sense to take a leaf out of the parallel query
book on this question. That is, add an open item along the lines of
"review behavior of pg_restore with parallel CREATE INDEX" that we
plan to deal with close to the release of Postgres 10.0, when feedback
from beta testing is in. There are a number of options, none of which
are difficult to write code for. The hard part is determining what
makes most sense for users on balance.

--
Peter Geoghegan