Re: Avoiding hash join batch explosions with extreme skew and weird stats - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Avoiding hash join batch explosions with extreme skew and weird stats
Date
Msg-id CA+hUKG+JHKQRJT_u+sYwX5MZfOVG_JKTtNAr0G2QvNZDL_+XGg@mail.gmail.com
Whole thread Raw
In response to Re: Avoiding hash join batch explosions with extreme skew and weird stats  (Melanie Plageman <melanieplageman@gmail.com>)
List pgsql-hackers
On Fri, May 1, 2020 at 2:30 AM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> I made a few edits to the message and threw it into a draft patch (on
> top of master, of course). I didn't want to junk up peoples' inboxes, so
> I didn't start a separate thread, but, it will be pretty hard to
> collaboratively edit the comment/ever register it for a commitfest if it
> is wedged into this thread. What do you think?

+1, this is a good description and I'm sure you're right about the
name of the algorithm.  It's a "hybrid" between a simple no partition
hash join, and partitioning like the Grace machine, since batch 0 is
processed directly without touching the disk.

You mention that PHJ finalises the number of batches during build
phase while SHJ can extend it later.  There's also a difference in the
probe phase: although inner batch 0 is loaded into the hash table
directly and not written to disk during the build phase (= classic
hybrid, just like the serial algorithm), outer batch 0 *is* written
out to disk at the start of the probe phase (unlike classic hybrid at
least as we have it for serial hash join).  That's because I couldn't
figure out how to begin emitting tuples before partitioning was
finished, without breaking the deadlock-avoidance programming rule
that you can't let the program counter escape from the node when
someone might wait for you.  So maybe it's erm, a hybrid between
hybrid and Grace...



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: design for parallel backup
Next
From: Robert Haas
Date:
Subject: Re: design for parallel backup