Costing bug in hash join logic for semi joins - Mailing list pgsql-hackers

From RK Korlapati
Subject Costing bug in hash join logic for semi joins
Date
Msg-id CAHupG3gMjnOFpnL=zPaMhOibJ5J4qJ5yH6GjY0dJV5uQAdJcRA@mail.gmail.com
Whole thread Raw
List pgsql-hackers
There is a costing bug in hash join logic seems to have been introduced by the patch related to inner_unique enhancements(commit: 9c7f5229ad68d7e0e4dd149e3f80257893e404d4). Specifically, "hashjointuples" which tracks the number of matches for hash clauses is computed wrong for inner unique scenario. This leads to lot of semi-joins  incorrectly turn to inner joins with unique on inner side. Function "final_cost_hashjoin" has special handling to cost semi/anti joins to account for early stop after the first match. This is enhanced by the above said commit to be used for inner_unique scenario also. However, "hashjointuples" computation is not fixed to handle this new scenario which is incorrectly stepping into the anti join logic and assuming unmatched rows. Fix is to handle inner_unique case when computing "hashjointuples".  Here is the outline of the code that shows the bug.

void
final_cost_hashjoin(PlannerInfo *root, HashPath *path,
                              JoinCostWorkspace *workspace,
                            JoinPathExtraData *extra)
{
         .....
        /* CPU costs */
        if (path->jpath.jointype == JOIN_SEMI ||
            path->jpath.jointype == JOIN_ANTI ||
            extra->inner_unique)

       {
                 ......
                                           
                 /* Get # of tuples that will pass the basic join */
               if (path->jpath.jointype == JOIN_SEMI)
                  hashjointuples = outer_matched_rows;
              else
                 hashjointuples = outer_path_rows - outer_matched_rows; 

        }
       else
       {
         .....
       }
}

Thanks, RK (Salesforce)

pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: cursors with prepared statements
Next
From: Emre Hasegeli
Date:
Subject: Re: [PATCH] Improve geometric types