Proposal for Merge Join for Non '=' Operators - Mailing list pgsql-hackers

From Dilip kumar
Subject Proposal for Merge Join for Non '=' Operators
Date
Msg-id 4205E661176A124FAF891E0A6BA913526595F303@SZXEML507-MBS.china.huawei.com
Whole thread Raw
Responses Re: Proposal for Merge Join for Non '=' Operators
List pgsql-hackers
<div class="WordSection1"><p class="MsoNormal" style="margin-left:0cm;mso-para-margin-left:0gd">I would like to propose
aNew merge join algorithm for optimizing non ‘=’ operators. (‘<’, ‘<=’, ‘>’, ‘>=’)<p class="MsoNormal"
style="margin-left:0cm;mso-para-margin-left:0gd"> <pclass="MsoListParagraphCxSpFirst"
style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l4level1 lfo1"><span
style="mso-list:Ignore">-<spanstyle="font:7.0pt "Times New Roman"">          </span></span>Currently  Merge join is
onlysupported for ‘=’ operator. For ‘<’ or ‘>’ operator it chooses Nest Loop Join, or Nest loop with
materialization.<pclass="MsoListParagraphCxSpMiddle"
style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l4level1 lfo1"><span
style="mso-list:Ignore">-<spanstyle="font:7.0pt "Times New Roman"">          </span></span>I think when tuple from
lowernode is sorted or sorting cost is very less, then we can use Merge Join for Non Equal operator also and which will
givebetter performance than NLJ (for selecting this new cost calculation can be implemented in planner).<p
class="MsoListParagraphCxSpMiddle"style="margin-left:36.0pt;mso-para-margin-left:0gd">  <p
class="MsoListParagraphCxSpMiddle"style="margin-left:36.0pt;mso-para-margin-left:0gd"><b>Example for using merge Join
for< operator.</b><p class="MsoListParagraphCxSpMiddle" style="margin-left:36.0pt;mso-para-margin-left:0gd">
                       T1                    T2<p class="MsoListParagraphCxSpMiddle"
style="margin-left:36.0pt;mso-para-margin-left:0gd">                        3                      1          <p
class="MsoListParagraphCxSpMiddle"style="margin-left:36.0pt;mso-para-margin-left:0gd">                        
4                     2<p class="MsoListParagraphCxSpMiddle" style="margin-left:36.0pt;mso-para-margin-left:0gd">
                       5                      4                      <p class="MsoListParagraphCxSpMiddle"
style="margin-left:36.0pt;mso-para-margin-left:0gd">Outer tuple (3) need to be compared with inner tuple one by one, so
itwill satisfy condition at third inner tuple (as 3<4). So here we can save this point of inner tuple so that next
outertuple can directly start comparison from this tuple.<p class="MsoListParagraphCxSpMiddle"
style="margin-left:57.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l2level1 lfo3"><span
style="mso-list:Ignore">1.<spanstyle="font:7.0pt "Times New Roman"">       </span></span>In this algorithm we can put
onemore optimization: Once outer tuple satisfies the Merge QUAL it can skip the Merge QUAL test with remaining inner
tupleand directly apply Other QUALs, as merge QUAL will always satisfy for remaining tuples.<p
class="MsoListParagraphCxSpMiddle"style="margin-left:36.0pt;mso-para-margin-left:0gd">  <p
class="MsoListParagraphCxSpMiddle"style="margin-left:36.0pt;mso-para-margin-left:0gd"><b>Implementation Detail:</b><p
class="MsoListParagraphCxSpMiddle"style="margin-left:72.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l1
level1lfo4"><span style="mso-list:Ignore">1.<span style="font:7.0pt "Times New Roman"">       </span></span>Need to add
newcost calculation mechanism for this. I still have to work on this part.<p class="MsoListParagraphCxSpMiddle"
style="margin-left:72.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l1level1 lfo4"><span
style="mso-list:Ignore">2.<spanstyle="font:7.0pt "Times New Roman"">       </span></span>Implementing in Executor<p
class="MsoListParagraphCxSpMiddle"style="margin-left:108.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l1
level2lfo4"><span style="mso-list:Ignore">a.<span style="font:7.0pt "Times New Roman"">       </span></span>This
algorithmis almost same as normal merge Join with some changes.<p class="MsoListParagraphCxSpLast"
style="margin-left:108.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l1level2 lfo4"><span
style="mso-list:Ignore">b.<spanstyle="font:7.0pt "Times New Roman"">       </span></span>Both Inner and Outer Data
Sourcesshould be sorted, same as normal merge Join.<p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd"><b>ALGORITHM:</b><pclass="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>MergeQual (R.A < Q.A)</i><p
class="MsoNormal"style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:6.0pt;line-height:normal"><i>r = first
tuplefrom R (Outer Relation)</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:6.0pt;line-height:normal"><i>q= first tuple in Q( Inner
Relation)</i><pclass="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:6.0pt;line-height:normal"><i>save_pos= q;  /* Position
tostart scanning in relation Q*/</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>While(fetch tuple r from R till relation
end)</i><pclass="MsoNormal" style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>{</i><p
class="MsoNormal"style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>            for each tuple q
inQ starting from save_pos</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>           {</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                            Merge Qual
Satisfy</i><pclass="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                            {</i><p
class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                                               
save_pos= q; </i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                                               
Consumeall subsequent tuples and project(just need to match Other Quals if any.)</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                            }</i><p
class="MsoNormal"style="margin-left:78.0pt;mso-para-margin-left:0gd;text-indent:6.0pt;line-height:normal"><i>Else
                                   </i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>                                   Fetch Next
tuplefrom Q;                                                 </i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>           }</i><p class="MsoNormal"
style="margin-left:36.0pt;mso-para-margin-left:0gd;line-height:normal"><i>}</i><pclass="MsoListParagraphCxSpFirst"
style="margin-left:36.0pt;mso-para-margin-left:0gd"> <p class="MsoListParagraphCxSpLast"
style="margin-left:36.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l4level1 lfo1"><span
style="mso-list:Ignore">-<spanstyle="font:7.0pt "Times New Roman"">          </span></span>Performance Comparison:<p
class="MsoNormal"style="margin-left:39.0pt;mso-para-margin-left:0gd">Suppose tuples of inner and outer is already
sortedor Index scan on inner and outer.<p class="MsoNormal"
style="margin-left:39.0pt;mso-para-margin-left:0gd">          <p class="MsoListParagraphCxSpFirst"
style="margin-left:75.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l0level1 lfo5"><span
style="font-family:Symbol"><spanstyle="mso-list:Ignore">·<span style="font:7.0pt "Times New Roman"">        
</span></span></span>Thencost of NLJ is always O (r*q).<p class="MsoListParagraphCxSpMiddle"
style="margin-left:75.0pt;mso-para-margin-left:0gd;text-indent:-18.0pt;mso-list:l0level1 lfo5"><span
style="font-family:Symbol"><spanstyle="mso-list:Ignore">·<span style="font:7.0pt "Times New Roman"">        
</span></span></span>Thecost of this MJ will be b/w: O (n) to O (r*q).<p class="MsoListParagraphCxSpLast"
style="margin-left:75.0pt;mso-para-margin-left:0gd"> <p class="MsoNormal"
style="margin-left:57.0pt;mso-para-margin-left:0gd">Wherer is number of tuple in R (outer relation) and q is number of
tuplein Q (inner Relation).<p class="MsoNormal" style="margin-left:57.0pt;mso-para-margin-left:0gd"> <p
class="MsoNormal"style="margin-left:21.0pt;mso-para-margin-left:0gd">Please provide your feedback/suggestions.<p
class="MsoNormal"style="margin-left:21.0pt;mso-para-margin-left:0gd"> <p class="MsoNormal"
style="margin-left:21.0pt;mso-para-margin-left:0gd">Thanks& Regards,<p class="MsoNormal"
style="margin-left:21.0pt;mso-para-margin-left:0gd">DilipKumar<p class="MsoNormal" style="margin-left:21.0pt"><span
style="font-size:11.0pt;line-height:150%;font-family:"Calibri","sans-serif""> </span></div>

pgsql-hackers by date:

Previous
From: Rajeev rastogi
Date:
Subject: Re: Autonomous Transaction (WIP)
Next
From: Rajeev rastogi
Date:
Subject: Re: Autonomous Transaction (WIP)