Re: [HACKERS] Parallel Append implementation - Mailing list pgsql-hackers

From: Robert Haas
Subject: Re: [HACKERS] Parallel Append implementation
Msg-id: CA+Tgmob2YuUo5uJfim4ufFHsS+N5QqLErtFhjHVnDdp8tRfRxA@mail.gmail.com
In response to: Re: [HACKERS] Parallel Append implementation ("Tels" <nospam-pg-abuse@bloodgate.com>)
Responses: Re: [HACKERS] Parallel Append implementation ("Tels" <nospam-pg-abuse@bloodgate.com>)
List: pgsql-hackers
On Fri, Mar 10, 2017 at 6:01 AM, Tels <nospam-pg-abuse@bloodgate.com> wrote:
> Just a question for me to understand the implementation details vs. the
> strategy:
>
> Have you considered how the scheduling decision might impact performance
> due to "inter-plan parallelism vs. in-plan parallelism"?
>
> So what would be the scheduling strategy? And should there be a fixed one
> or user-influencable? And what could be good ones?
>
> A simple example:
>
> E.g. if we have 5 subplans, and each can have at most 5 workers and we
> have 5 workers overall.
>
> So, do we:
>
>   Assign 5 workers to plan 1. Let it finish.
>   Then assign 5 workers to plan 2. Let it finish.
>   and so on
>
> or:
>
>   Assign 1 worker to each plan until no workers are left?

Currently, we do the first of those, but I'm pretty sure the second is
way better.  For example, suppose each subplan has a startup cost.  If
you have all the workers pile on each plan in turn, every worker pays
the startup cost for every subplan.  If you spread them out, then
subplans can get finished without being visited by all workers, and
then the other workers never pay those costs.  Moreover, you reduce
contention for spinlocks, condition variables, etc.

It's not
impossible to imagine a scenario where having all workers pile on one
subplan at a time works out better: for example, suppose you have a
table with lots of partitions all of which are on the same disk, and
it's actually one physical spinning disk, not an SSD or a disk array
or anything, and the query is completely I/O-bound.  Well, it could
be, in that scenario, that spreading out the workers is going to turn
sequential I/O into random I/O and that might be terrible.  In most
cases, though, I think you're going to be better off.  If the
partitions are on different spindles or if there's some slack I/O
capacity for prefetching, you're going to come out ahead, maybe way
ahead.  If you come out behind, then you're evidently totally I/O
bound and have no capacity for I/O parallelism; in that scenario, you
should probably just turn parallel query off altogether, because
you're not going to benefit from it.
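
To make the "spread them out" choice concrete, here's a tiny standalone
sketch in plain C. It is not the actual patch or executor code; the
struct and function names are made up for illustration. Each worker asks
for the unfinished subplan that currently has the fewest workers and
hasn't hit its per-subplan limit, so with 5 workers and 5 subplans every
subplan gets exactly one worker, and a worker never pays the startup
cost of a subplan it never touches.

#include <stdio.h>

typedef struct SubplanSlot      /* hypothetical per-subplan bookkeeping */
{
    int         nworkers;       /* workers currently running this subplan */
    int         max_workers;    /* most workers this subplan can use */
    int         finished;       /* nonzero once it has no more tuples */
} SubplanSlot;

/* Pick the least-loaded unfinished subplan, or -1 if none remain. */
static int
choose_next_subplan(SubplanSlot *subplans, int nsubplans)
{
    int         best = -1;

    for (int i = 0; i < nsubplans; i++)
    {
        if (subplans[i].finished ||
            subplans[i].nworkers >= subplans[i].max_workers)
            continue;
        if (best < 0 || subplans[i].nworkers < subplans[best].nworkers)
            best = i;
    }
    return best;
}

int
main(void)
{
    /* 5 subplans, each able to use up to 5 workers, as in your example. */
    SubplanSlot subplans[5] = {
        {0, 5, 0}, {0, 5, 0}, {0, 5, 0}, {0, 5, 0}, {0, 5, 0}
    };

    for (int w = 0; w < 5; w++)
    {
        int         chosen = choose_next_subplan(subplans, 5);

        if (chosen < 0)
            break;              /* nothing left to work on */
        subplans[chosen].nworkers++;
        printf("worker %d -> subplan %d\n", w, chosen);
    }
    return 0;
}

The "pile on" strategy would instead just return the first unfinished
subplan with room, which is exactly the behavior described above where
every worker ends up paying every subplan's startup cost.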

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


