Re: pgsql: Add parallel-aware hash joins. - Mailing list pgsql-committers

From Thomas Munro
Subject Re: pgsql: Add parallel-aware hash joins.
Date
Msg-id CAEepm=2r1svRUfeR1c9Z4UegkPdMw4gxyok4jCWf-h7kdqHHAA@mail.gmail.com
In response to Re: pgsql: Add parallel-aware hash joins.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-committers
On Fri, Jan 5, 2018 at 5:00 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The early returns indicate that that problem is fixed;

Thanks for your help and patience with that.  I've made a list over
here so we don't lose track of the various things that should be
improved in this area, and will start a new thread when I have patches
to propose:  https://wiki.postgresql.org/wiki/Parallel_Hash

> but now that the
> noise level is down, it's possible to see that brolga is showing an actual
> crash in the PHJ test, perhaps one time in four.  So we're not out of
> the woods yet.  It seems to consistently look like this:
>
> 2017-12-21 17:34:52.092 EST [2252:4] LOG:  background worker "parallel worker" (PID 3584) was terminated by signal 11
> 2017-12-21 17:34:52.092 EST [2252:5] DETAIL:  Failed process was running: select count(*) from foo
>           left join (select b1.id, b1.t from bar b1 join bar b2 using (id)) ss
>           on foo.id < ss.id + 1 and foo.id > ss.id - 1;
> 2017-12-21 17:34:52.092 EST [2252:6] LOG:  terminating any other active server processes

That is a test of a parallel-aware hash join with a rescan (i.e., workers
get restarted repeatedly by the gather node reusing the DSM; maybe I
misunderstood some detail of the protocol for that).  I'll go and
review that code and try to reproduce the failure.  Andrew, on the
off-chance: do you have a core dump on brolga that you could pull a
backtrace out of?

-- 
Thomas Munro
http://www.enterprisedb.com

