Re: polyphase merge? - Mailing list pgsql-hackers

From Don Marvick
Subject Re: polyphase merge?
Date
Msg-id d18e24870902041852t4f69a91ay54409fcf3297ab26@mail.gmail.com
Whole thread Raw
In response to Re: polyphase merge?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
This is what I understand from existing implementation:<br />- fix the merge fan in F, and number of tapes=F+1<br />-
foreach sorted run generated, we have a formula to determine which tape it belongs to<br />- the sorted run is added to
theend of the tape<br /> - all sorted runs have been added to tape. There is always an empty tape called output tape<br
/>-add dummy runs if necessary, to each tape<br />- merge the input tapes, write result to output tape, until one of
theinput tape is empty<br /> - repeat this process until 1 sorted run remains<br /><br /><br />I think the main
differenceis that at each step, the current impl does not merge the smallest k-runs. It merges from the beginning of
eachtape. For 1 pass, this does not make any difference. For 2 passes onwards there are some differences. In some
complexquery, where the memory is eaten by some other operators, more passes are needed and the differences become
larger.This is the only improvement that can be achieved. Let me know if this does not make any sense.<br /><br
/>Regardingdisk layout, I am open to suggestion whether we should reuse the tape module or not. <br /><br /><br /><br
/><divclass="gmail_quote">On Wed, Feb 4, 2009 at 11:20 PM, Tom Lane <span dir="ltr"><<a
href="mailto:tgl@sss.pgh.pa.us">tgl@sss.pgh.pa.us</a>></span>wrote:<br /><blockquote class="gmail_quote"
style="border-left:1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">Don
Marvick<<a href="mailto:donmarvick@gmail.com">donmarvick@gmail.com</a>> writes:<br /> > Since nobody replied,
Iwould give it a try. I am going to implement the<br /> > merge pattern described in Knuth Page 365 (5.4.9),
essentiallyit is as<br /> > follow:<br /> > - create initial runs using replacement selection (basically this is
asin<br /> > the current implementation)<br /> > - add enough dummy runs of size 0 so that the number of sorted
runminus one<br /> > can be divided by k-1 (k is merge fan in)<br /> > - repetitively merge k smallest runs until
only1 run left<br /><br /></div>How is this better than, or even different from, what is there now?<br /><br />        
              regards, tom lane<br /></blockquote></div><br /> 

pgsql-hackers by date:

Previous
From: Don Marvick
Date:
Subject: Re: polyphase merge?
Next
From: Don Marvick
Date:
Subject: Re: polyphase merge?