Re: Parallel Full Hash Join - Mailing list pgsql-hackers

From Melanie Plageman
Subject Re: Parallel Full Hash Join
Date
Msg-id CAAKRu_bt-GCf0FmqmfB+fmDCmf7Kw=RQ38Fk1-kYNofi=jAHaw@mail.gmail.com
Whole thread Raw
In response to Parallel Full Hash Join  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Fri, Mar 5, 2021 at 8:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> On Tue, Mar 2, 2021 at 11:27 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Fri, Feb 12, 2021 at 11:02 AM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> > > I just attached the diff.
> >
> > Squashed into one patch for the cfbot to chew on, with a few minor
> > adjustments to a few comments.
>
> I did some more minor tidying of comments and naming.  It's been on my
> to-do-list to update some phase names after commit 3048898e, and while
> doing that I couldn't resist the opportunity to change DONE to FREE,
> which somehow hurts my brain less, and makes much more obvious sense
> after the bugfix in CF #3031 that splits DONE into two separate
> phases.  It also pairs obviously with ALLOCATE.  I include a copy of
> that bugix here too as 0001, because I'll likely commit that first, so
> I rebased the stack of patches that way.  0002 includes the renaming I
> propose (master only).  Then 0003 is Melanie's patch, using the name
> SCAN for the new match bit scan phase.  I've attached an updated
> version of my "phase diagram" finger painting, to show how it looks
> with these three patches.  "scan*" is new.

Feedback on
v6-0002-Improve-the-naming-of-Parallel-Hash-Join-phases.patch

I like renaming DONE to FREE and ALLOCATE TO REALLOCATE in the grow
barriers. FREE only makes sense for the Build barrier if you keep the
added PHJ_BUILD_RUN phase, though, I assume you would change this patch
if you decided not to add the new build barrier phase.

I like the addition of the asterisks to indicate a phase is executed by
a single arbitrary process. I was thinking, shall we add one of these to
HJ_FILL_INNER since it is only done by one process in parallel right and
full hash join? Maybe that's confusing because serial hash join uses
that state machine too, though. Maybe **? Maybe we should invent a
complicated symbolic language :)

One tiny, random, unimportant thing: The function prototype for
ExecParallelHashJoinPartitionOuter() calls its parameter "node" and, in
the definition, it is called "hjstate". This feels like a good patch to
throw in that tiny random change to make the name the same.

static void ExecParallelHashJoinPartitionOuter(HashJoinState *node);

static void
ExecParallelHashJoinPartitionOuter(HashJoinState *hjstate)



pgsql-hackers by date:

Previous
From: John Naylor
Date:
Subject: Re: Force lookahead in COPY FROM parsing
Next
From: Mike Palmiotto
Date:
Subject: Re: Process initialization labyrinth