Re: Replacement Selection - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Replacement Selection
Date
Msg-id 4430.1196115427@sss.pgh.pa.us
Whole thread Raw
In response to Re: Replacement Selection  (<mac_man2005@hotmail.it>)
Responses Re: Replacement Selection  (Gregory Stark <stark@enterprisedb.com>)
Re: Replacement Selection  (<mac_man2005@hotmail.it>)
Re: Replacement Selection  (<mac_man2005@hotmail.it>)
List pgsql-hackers
<mac_man2005@hotmail.it> writes:
> Anyway, even in my RS implementation a longer run is created. The first M 
> initialization elements will surely form part of the current run. M is the 
> memory size so at least a run sized M will be created. After initialization, 
> the elements are not suddenly output, but an element from heap is output 
> into run as soon as I get an element from stream. In other words, for each 
> element from stream, the root element of the heap is output, and the input 
> element takes the root place into the heap. If that element is a "good 
> record" I just heapify (since the element will be placed at the now free 
> root place). If that input element is a dead record I swap it with the last 
> leaf and reduce the heap size.

AFAICS that produces runs that are *exactly* the same length as Knuth's
method --- you're just using a different technique for detecting when
the run is over, to wit "record is not in heap" vs "record is in heap
but with a higher run number".  I guess you would save some comparisons
while the heap is shrinking, but it's not at all clear that you'd save
more than what it will cost you to re-heapify all the dead records once
the run is over.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: 8.3devel slower than 8.2 under read-only load
Next
From: Mark Cave-Ayland
Date:
Subject: Re: Locating sharedir in PostgreSQL on Windows