Re: [HACKERS] Range Merge Join v1 - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: [HACKERS] Range Merge Join v1
Date
Msg-id CAMp0ube+NEJsJimo32N118F=qcq1GkHEwMpHLT9vEZC+dmGZZA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Range Merge Join v1  (Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru>)
Responses Re: [HACKERS] Range Merge Join v1  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Fri, Aug 25, 2017 at 10:19 AM, Alexander Kuzmenkov
<a.kuzmenkov@postgrespro.ru> wrote:
> Hi Jeff,

Hi,

Thank you for the review and suggestions!

> * At the moment, "mergejoinable clause" and "equality clause" mean the same
> thing to the planner, and those clauses are used both to create equivalence
> classes and to perform merge joins. If we start performing merge joins for
> different kinds of clauses, such as comparison or range intersection, it
> makes sense to separate these two meanings. I tried this in my patch and it
> didn't require many changes. I use RestrictInfo.equivopfamilies to build
> equivalence classes, and RestrictInfo.mergeopfamilies for mergejoinable
> clauses.

I like the idea. I will look in more detail and I can either change my
patch or piggyback on yours.

> * Semantically, MJCompare() is a function that determines whether you should
> emit a join result or advance one of the pointers. This meaning is not
> explicit in the code and is not well encapsulated: the function returns and
> int and 'compareNoMatch' flag, and the calling code combines them in various
> ways to derive the final result. This meaning can be made explicit by making
> MJCompare return enum values {Join, NextInner, NextOuter}, and putting
> inside it all the logic that decides whether we join or advance.
> ExecMergeJoin looks intimidating already, and I think this change would help
> readability. Again, you can find an illustration in my patch.

I actually tried doing something similar in my patch, but there are
four cases to consider in EXEC_MJ_SKIP_TEST:

1. Overlaps: mark and then join the tuples
2. left-of: SKIPOUTER_ADVANCE
3. right-of: SKIPINNER_ADVANCE
4. None of the above: mark and NEXTINNER

However, you are right that the current code is ugly, and I should
refactor into 4 explicit cases. I don't think I will call them by the
same names as the join states, though, because there's not a direct
mapping.

Updated patch attached. Changelog:

* Rebased
* Changed MJCompare to return an enum as suggested, but it has 4
possible values rather than 3.
* Added support for joining on contains or contained by (@> or <@) and
updated tests.

Regards,
    Jeff Davis

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Amit Khandekar
Date:
Subject: Re: [HACKERS] UPDATE of partition key
Next
From: Richard Guo
Date:
Subject: [HACKERS] CurrentUserId may be invalid during the rest of a session