Re: [HACKERS] Parallel Hash take II - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Parallel Hash take II
Date
Msg-id CA+TgmoYinb5M0f+mhbQw3DAXmnJYjNw5ZEiTO+XeUz=1RRzYhQ@mail.gmail.com
In response to Re: [HACKERS] Parallel Hash take II  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Mon, Jul 31, 2017 at 9:11 PM, Andres Freund <andres@anarazel.de> wrote:
> - Echoing concerns from other threads (Robert: ping): I'm doubtful that
>   it makes sense to size the number of parallel workers solely based on
>   the parallel scan node's size.  I don't think it's this patch's job to
>   change that, but to me it seriously amplifies that - I'd bet there's a
>   lot of cases with nontrivial joins where the benefit from parallelism
>   on the join level is bigger than on the scan level itself.  And the
>   number of rows in the upper nodes might also be bigger than on the
>   scan node level, making it more important to have higher number of
>   nodes.

Well, I feel like a broken record here but ... yeah, I agree we need
to improve that.  It's probably generally true that the more parallel
operators we add, the more potential benefit there is in doing
something about that problem.  But, like you say, not in this patch.

http://postgr.es/m/CA+TgmoYL-SQZ2gRL2DpenAzOBd5+SW30QB=A4CseWtOgejz4aQ@mail.gmail.com

I think we could improve things significantly by generating multiple
partial paths with different numbers of parallel workers, instead of
just picking a number of workers based on the table size and going
with it.  For that to work, though, you'd need something built into
the costing to discourage picking paths with too many workers.  And
you'd need to be OK with planning taking a lot longer when parallelism
is involved, because you'd be carrying around more paths for longer.
There are other problems to solve, too.
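
To make the idea above concrete, here is a minimal sketch of generating
one candidate partial path per worker count and letting a cost penalty
discourage too many workers.  Everything in it is an assumption made for
illustration: the names PartialPath, cost_partial_path, candidate_paths
and cheapest are hypothetical, and the constants are invented, not
PostgreSQL's real cost parameters or planner code.

```python
from dataclasses import dataclass

# Invented constants for illustration only; not PostgreSQL's cost parameters.
CPU_COST_PER_ROW = 0.01        # assumed cost to process one row
WORKER_STARTUP_COST = 1000.0   # assumed penalty charged per worker launched


@dataclass(frozen=True)
class PartialPath:
    """A candidate plan fragment tagged with its worker count (hypothetical)."""
    workers: int
    total_cost: float


def cost_partial_path(rows: int, workers: int) -> PartialPath:
    """Toy cost model: CPU work is divided among workers, while the
    per-worker startup penalty grows with the worker count."""
    cpu = rows * CPU_COST_PER_ROW / workers
    startup = WORKER_STARTUP_COST * workers
    return PartialPath(workers, cpu + startup)


def candidate_paths(rows: int, max_workers: int) -> list[PartialPath]:
    """Generate one path per worker count instead of fixing the count
    up front from the table size."""
    return [cost_partial_path(rows, w) for w in range(1, max_workers + 1)]


def cheapest(paths: list[PartialPath]) -> PartialPath:
    """Pick the path with the lowest total cost."""
    return min(paths, key=lambda p: p.total_cost)


if __name__ == "__main__":
    for rows in (100_000, 10_000_000):
        paths = candidate_paths(rows, max_workers=16)
        best = cheapest(paths)
        print(f"rows={rows}: {len(paths)} candidates, best uses {best.workers} workers")
```

With these made-up constants the small input settles on 1 worker and the
large one on 10, even though 16 are allowed.  The list of 16 candidates
per relation is also the extra planning work mentioned above: each one
has to be carried around until something cheaper eliminates it.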

I still think, though, that it's highly worthwhile to get at least a
few more parallel operators - and this one in particular - done before
we attack that problem in earnest.  Even with a dumb calculation of
the number of workers, this helps a lot.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


