Re: [HACKERS] Parallel Hash take II - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] Parallel Hash take II
Date
Msg-id CAEepm=0th8Le2SDCv32zN7tMyCJYR9oGYJ52fXNYJz1hrpGW+Q@mail.gmail.com
In response to Re: [HACKERS] Parallel Hash take II  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] Parallel Hash take II
List pgsql-hackers
On Tue, Oct 24, 2017 at 10:10 PM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> Here is an updated patch set that does that ^.

It's a bit hard to understand what's going on with the v21 patch set I
posted yesterday, because EXPLAIN ANALYZE doesn't tell you anything
interesting.  Also, if you apply the multiplex_gather patch[1] I
posted recently and set multiplex_gather to off, then it doesn't tell
you anything at all, because the leader has no hash table (I suppose
that could happen with unpatched master given sufficiently bad
timing).  Here's a new version with an extra patch that adds some
basic information about load balancing to EXPLAIN ANALYZE, inspired by
what commit bf11e7ee did for Sort.

Example output:

enable_parallel_hash = on, multiplex_gather = on:

 ->  Parallel Hash (actual rows=1000000 loops=3)
       Buckets: 131072  Batches: 16
       Leader:    Shared Memory Usage: 3552kB  Hashed: 396120  Batches Probed: 7
       Worker 0:  Shared Memory Usage: 3552kB  Hashed: 276640  Batches Probed: 6
       Worker 1:  Shared Memory Usage: 3552kB  Hashed: 327240  Batches Probed: 6
       ->  Parallel Seq Scan on simple s (actual rows=333333 loops=3)

 ->  Parallel Hash (actual rows=10000000 loops=8)
       Buckets: 131072  Batches: 256
       Leader:    Shared Memory Usage: 2688kB  Hashed: 1347720  Batches Probed: 36
       Worker 0:  Shared Memory Usage: 2688kB  Hashed: 1131360  Batches Probed: 33
       Worker 1:  Shared Memory Usage: 2688kB  Hashed: 1123560  Batches Probed: 38
       Worker 2:  Shared Memory Usage: 2688kB  Hashed: 1231920  Batches Probed: 38
       Worker 3:  Shared Memory Usage: 2688kB  Hashed: 1272720  Batches Probed: 34
       Worker 4:  Shared Memory Usage: 2688kB  Hashed: 1234800  Batches Probed: 33
       Worker 5:  Shared Memory Usage: 2688kB  Hashed: 1294680  Batches Probed: 37
       Worker 6:  Shared Memory Usage: 2688kB  Hashed: 1363240  Batches Probed: 35
       ->  Parallel Seq Scan on big s2 (actual rows=1250000 loops=8)

enable_parallel_hash = on, multiplex_gather = off (i.e. no leader participation):

 ->  Parallel Hash (actual rows=1000000 loops=2)
       Buckets: 131072  Batches: 16
       Worker 0:  Shared Memory Usage: 3520kB  Hashed: 475920  Batches Probed: 9
       Worker 1:  Shared Memory Usage: 3520kB  Hashed: 524080  Batches Probed: 8
       ->  Parallel Seq Scan on simple s (actual rows=500000 loops=2)

enable_parallel_hash = off, multiplex_gather = on:

 ->  Hash (actual rows=1000000 loops=3)
       Buckets: 131072  Batches: 16
       Leader:    Memory Usage: 3227kB
       Worker 0:  Memory Usage: 3227kB
       Worker 1:  Memory Usage: 3227kB
       ->  Seq Scan on simple s (actual rows=1000000 loops=3)

enable_parallel_hash = off, multiplex_gather = off:

 ->  Hash (actual rows=1000000 loops=2)
       Buckets: 131072  Batches: 16
       Worker 0:  Memory Usage: 3227kB
       Worker 1:  Memory Usage: 3227kB
       ->  Seq Scan on simple s (actual rows=1000000 loops=2)

parallelism disabled (traditional single-line output, unchanged):

 ->  Hash (actual rows=1000000 loops=1)
       Buckets: 131072  Batches: 16  Memory Usage: 3227kB
       ->  Seq Scan on simple s (actual rows=1000000 loops=1)

(It actually says "Tuples Hashed", not "Hashed", but I edited the above
to fit on a standard punchcard.)  Thoughts?
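
As a rough illustration of how the per-participant lines can be read
(this is not part of the patch set), here is a small Python sketch that
parses the first example above and reports each participant's share of
the hashed tuples.  The regular expression, the function names and the
SAMPLE constant are assumptions made for this sketch only; it accepts
either "Hashed" or "Tuples Hashed" as the label.

```python
import re

# Text copied from the first example above (label abbreviated to "Hashed").
SAMPLE = """\
 ->  Parallel Hash (actual rows=1000000 loops=3)
       Buckets: 131072  Batches: 16
       Leader:    Shared Memory Usage: 3552kB  Hashed: 396120  Batches Probed: 7
       Worker 0:  Shared Memory Usage: 3552kB  Hashed: 276640  Batches Probed: 6
       Worker 1:  Shared Memory Usage: 3552kB  Hashed: 327240  Batches Probed: 6
"""

# Hypothetical pattern for one per-participant line of a Parallel Hash node.
LINE_RE = re.compile(
    r"^\s*(?P<who>Leader|Worker \d+):\s+"
    r"Shared Memory Usage: (?P<kb>\d+)kB\s+"
    r"(?:Tuples )?Hashed: (?P<hashed>\d+)\s+"
    r"Batches Probed: (?P<probed>\d+)\s*$"
)


def parse_participants(text):
    """Return (participant, tuples_hashed, batches_probed) for each matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((m["who"], int(m["hashed"]), int(m["probed"])))
    return rows


def hashed_shares(rows):
    """Map each participant to its fraction of all hashed tuples."""
    total = sum(hashed for _, hashed, _ in rows)
    return {who: hashed / total for who, hashed, _ in rows}


rows = parse_participants(SAMPLE)
total_hashed = sum(hashed for _, hashed, _ in rows)
shares = hashed_shares(rows)
for who, share in shares.items():
    print(f"{who}: {share:.1%} of {total_hashed} tuples")
```

For this example the per-participant "Hashed" counts add up to the
1000000 rows reported on the Parallel Hash line, so the shares give a
quick view of how evenly the build phase was spread across the leader
and the two workers.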

[1] https://www.postgresql.org/message-id/CAEepm%3D2U%2B%2BLp3bNTv2Bv_kkr5NE2pOyHhxU%3DG0YTa4ZhSYhHiw%40mail.gmail.com

-- 
Thomas Munro
http://www.enterprisedb.com
