Re: [HACKERS] WIP: [[Parallel] Shared] Hash - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Msg-id CAEepm=0fStoz2Bgu=zD6efXDYt17cNo-a5a6tWwooh=xrBvZRw@mail.gmail.com
In response to Re: [HACKERS] WIP: [[Parallel] Shared] Hash  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Out of archeological curiosity, I was digging around in the hash join
code and RCS history from Postgres 4.2[1], and I was astounded to
discover that it had a parallel executor for Sequent SMP systems and
was capable of parallel hash joins as of 1991.  At first glance, it
seems to follow approximately the same design as I propose: share a
hash table and use a barrier to coordinate the switch from build phase
to probe phase and deal with later batches.  It uses mmap to get space
and then works with relative pointers.  See
src/backend/executor/n_hash.c and src/backend/executor/n_hashjoin.c.
Some of this might be described in Wei Hong's PhD thesis[2] which I
haven't had the pleasure of reading yet.
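
The relative-pointer idea mentioned above can be sketched in a few
lines of C.  This is only an illustration under my own assumptions, not
the Postgres 4.2 code: the names rel_addr, segment_header, to_abs and
demo_roundtrip are hypothetical, a heap buffer stands in for the
mmap'd space, and there is no locking or barrier.  Links between
entries are stored as byte offsets from the start of the region, so a
bucket chain stays valid when the same bytes appear at a different
address.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names throughout; not the historical definitions. */
typedef uint32_t rel_addr;          /* byte offset from the region base */
#define REL_NULL ((rel_addr) 0)     /* offset 0 is reserved as "no entry" */
#define NBUCKETS 8

typedef struct {
    rel_addr next;                  /* next entry in the chain, as an offset */
    int      key;
} entry;

typedef struct {
    rel_addr size;                  /* total bytes in the region */
    rel_addr free_top;              /* offset of the first unused byte */
    rel_addr buckets[NBUCKETS];     /* chain heads, as offsets */
} segment_header;

/* Turn an offset into a pointer that is valid for this mapping only. */
static void *to_abs(void *base, rel_addr off)
{
    return off == REL_NULL ? NULL : (void *) ((char *) base + off);
}

static void segment_init(void *base, size_t size)
{
    segment_header *h = base;

    assert(size >= sizeof(segment_header));
    memset(h, 0, sizeof(*h));
    h->size = (rel_addr) size;
    h->free_top = (rel_addr) sizeof(segment_header);
}

/* Bump-allocate an entry and push it onto its bucket; 0 if out of space. */
static int segment_insert(void *base, int key)
{
    segment_header *h = base;
    rel_addr off = h->free_top;
    unsigned b = (unsigned) key % NBUCKETS;
    entry *e;

    if (h->size - off < sizeof(entry))
        return 0;
    e = to_abs(base, off);
    e->key = key;
    e->next = h->buckets[b];
    h->buckets[b] = off;
    h->free_top = off + (rel_addr) sizeof(entry);
    return 1;
}

/* Walk one bucket chain, converting each offset against the local base. */
static int segment_contains(void *base, int key)
{
    segment_header *h = base;
    entry *e = to_abs(base, h->buckets[(unsigned) key % NBUCKETS]);

    for (; e != NULL; e = to_abs(base, e->next))
        if (e->key == key)
            return 1;
    return 0;
}

/* Build in one buffer, probe a byte-for-byte copy at another address. */
static int demo_roundtrip(void)
{
    enum { SIZE = 256 };
    void *built = calloc(1, SIZE);
    void *copy = malloc(SIZE);
    int ok;

    if (built == NULL || copy == NULL) {
        free(built);
        free(copy);
        return 0;
    }
    segment_init(built, SIZE);
    ok = segment_insert(built, 7) && segment_insert(built, 15) &&
         segment_insert(built, 42);
    memcpy(copy, built, SIZE);
    ok = ok && segment_contains(copy, 7) && segment_contains(copy, 15) &&
         segment_contains(copy, 42) && !segment_contains(copy, 8);
    free(built);
    free(copy);
    return ok;
}
```

demo_roundtrip() inserts three keys (two of which share a bucket),
copies the raw bytes to a second buffer, and probes the copy; absolute
pointers stored in the entries would dangle after the copy, while
offsets do not.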

The parallel support is absent from the first commit in our repo
(1996), but there are some vestiges like RelativeAddr and ABSADDR used
to access the hash table (presumably needlessly) and also some
mentions of parallel machines in comments that survived up until
commit 26069a58 (1999).

[1] http://db.cs.berkeley.edu/postgres.html
[2] http://db.cs.berkeley.edu/papers/ERL-M93-28.pdf

-- 
Thomas Munro
http://www.enterprisedb.com


