Re: Costing bug in hash join logic for semi joins - Mailing list pgsql-hackers

From David Rowley
Subject Re: Costing bug in hash join logic for semi joins
Date
Msg-id CAKJS1f9Ky9gxKPdR5Azv65jgTb8rKAFN+j_QCA9WDoh0r+TuVw@mail.gmail.com
Whole thread Raw
In response to Costing bug in hash join logic for semi joins  (RK <korlapati@gmail.com>)
Responses Re: Costing bug in hash join logic for semi joins  (David Rowley <david.rowley@2ndquadrant.com>)
Re: Costing bug in hash join logic for semi joins  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 10 July 2018 at 11:44, RK <korlapati@gmail.com> wrote:
> There is a costing bug in hash join logic seems to have been introduced by
> the patch related to inner_unique enhancements(commit:
> 9c7f5229ad68d7e0e4dd149e3f80257893e404d4). Specifically, "hashjointuples"
> which tracks the number of matches for hash clauses is computed wrong for
> inner unique scenario. This leads to lot of semi-joins  incorrectly turn to
> inner joins with unique on inner side. Function "final_cost_hashjoin" has
> special handling to cost semi/anti joins to account for early stop after the
> first match. This is enhanced by the above said commit to be used for
> inner_unique scenario also. However, "hashjointuples" computation is not
> fixed to handle this new scenario which is incorrectly stepping into the
> anti join logic and assuming unmatched rows. Fix is to handle inner_unique
> case when computing "hashjointuples".  Here is the outline of the code that
> shows the bug.

Thanks for the analysis and the report. I agree the code is wrong.
Looks simple enough just to handle the semi and unique joins in the
else clause and make the if handle the ANTI join.

I've done that in the attached.  Also on reading the comment above, it
looks slightly incorrect. To me, it looks like it's applying a
twentieth of the cost and not a tenth as the comment claims. I
couldn't resist updating that too.

I didn't feel the need to add any regression tests around this. It
seems like an unlikely bug to ever appear again.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] PoC plpgsql - possibility to force custom or genericplan
Next
From: Ashutosh Bapat
Date:
Subject: Re: Expression errors with "FOR UPDATE" and postgres_fdw withpartition wise join enabled.