Re: [HACKERS] WIP: [[Parallel] Shared] Hash - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Re: [HACKERS] WIP: [[Parallel] Shared] Hash
Date	March 18, 2017 07:30:23
Msg-id	CAEepm=2fE0UBOXzaBvvW4HsQZDQG4MpHBFai_T0iou0oA_VBPw@mail.gmail.com
In response to	Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses	Re: [HACKERS] WIP: [[Parallel] Shared] Hash (Thomas Munro <thomas.munro@enterprisedb.com>)
List	pgsql-hackers

On Tue, Mar 14, 2017 at 8:03 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> On Mon, Mar 13, 2017 at 8:40 PM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>> In an attempt to test v7 of this patch on TPC-H 20 scale factor I found a
>> few regressions,
>> Q21: 52 secs on HEAD and  400 secs with this patch
>
> Thanks Rafia.  Robert just pointed out off-list that there is a bogus
> 0 row estimate in here:
>
> ->  Parallel Hash Semi Join  (cost=1006599.34..1719227.30 rows=0
> width=24) (actual time=38716.488..100933.250 rows=7315896 loops=5)
>
> Will investigate, thanks.

There are two problems here.

1.  There is a pre-existing cardinality estimate problem for
semi-joins with <> filters.  The big Q21 regression reported by Rafia
is caused by that problem, probably exacerbated by another bug that
allowed 0 cardinality estimates to percolate through the planner.
Since her report, commit 1ea60ad6 has clamped estimates at or above
1.0.
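
To illustrate why a 0 row estimate is harmful and what clamping at or
above 1.0 achieves, here is a minimal sketch.  It is not the code from
commit 1ea60ad6 and not the planner's real cost model: the names
clamp_row_estimate and toy_nested_cost are hypothetical, and the cost
formula is a deliberately crude assumption that charges the inner side
once per estimated outer row.

```python
import math


def clamp_row_estimate(nrows: float) -> float:
    """Force a row estimate to be a whole number no smaller than 1.0.

    Hypothetical helper for illustration only.
    """
    if math.isnan(nrows) or nrows <= 1.0:
        return 1.0
    if math.isinf(nrows):
        return nrows
    return float(round(nrows))


def toy_nested_cost(outer_rows: float, inner_cost_per_outer_row: float) -> float:
    """Toy cost (an assumption, not PostgreSQL's formula): the inner side
    is charged once per estimated outer row."""
    return outer_rows * inner_cost_per_outer_row


# With a 0 row estimate, any amount of per-row work looks free.
print(toy_nested_cost(0.0, 250.0))                      # 0.0
# After clamping, the same work has a non-zero cost.
print(toy_nested_cost(clamp_row_estimate(0.0), 250.0))  # 250.0
```

In this toy model, a plan whose outer side is estimated at 0 rows can
never look expensive, however costly its inner side is, which is the
kind of distortion the clamp guards against.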

I started a new thread to discuss that because it's unrelated to this
patch, except insofar as it confuses the planner about Q21 (with or
without parallelism).  Using one possible selectivity tweak suggested
by Tom Lane, I was able to measure significant speedups on otherwise
unpatched master:

https://www.postgresql.org/message-id/CAEepm%3D11BiYUkgXZNzMtYhXh4S3a9DwUP8O%2BF2_ZPeGzzJFPbw%40mail.gmail.com

2.  If you compare master tweaked as above against the latest version
of my patch series with the tweak, then the patched version always
runs faster with 4 or more workers, but with only 1 or 2 workers Q21
is a bit slower... but not always.  I realised that there was a
bi-modal distribution of execution times.  It looks like my 'early
exit' protocol, designed to make tuple-queue deadlock impossible, is
often causing us to lose a worker.  I am working on that now.

I have code changes for Peter G's and Andres's feedback queued up and
will send a v8 series shortly, hopefully with a fix for problem 2
above.

-- 
Thomas Munro
http://www.enterprisedb.com
