Re: [HACKERS] Tuplesort merge pre-reading - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [HACKERS] Tuplesort merge pre-reading
Date
Msg-id 5db9d579-efdb-857e-82a8-7763313e7931@iki.fi
Whole thread Raw
In response to Re: [HACKERS] Tuplesort merge pre-reading  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On 04/14/2017 08:19 AM, Peter Geoghegan wrote:
> BTW, I'm skeptical of the idea of Heikki's around killing polyphase
> merge itself at this point. I think that keeping most tapes active per
> pass is useful now that our memory accounting involves handing over an
> even share to each maybe-active tape for every merge pass, something
> established by Heikki's work on external sorting.

The pre-read buffers are only needed for input tapes; the total number 
of tapes doesn't matter.

For comparison, imagine that you want to perform a merge, such that you 
always merge 7 runs into one. With polyphase merge, you would need 8 
tapes, so that you always read from 7 of them, and write onto one. With 
balanced merge, you would need 14 tapes: you always read from 7 tapes, 
and you would need up to 7 output tapes, of which one would be active at 
any given time.

Those extra idle output tapes are practically free in our 
implementation. The "pre-read buffers" are only needed for input tapes, 
the number of output tapes doesn't matter. Likewise, maintaining the 
heap is cheaper if you only merge a small number of tapes at a time, but 
that's also dependent on the number of *input* tapes, not the total 
number of tapes.

- Heikki




pgsql-hackers by date:

Previous
From: Kang Yuzhe
Date:
Subject: Re: [HACKERS] On How To Shorten the Steep Learning Curve Towards PG Hacking...
Next
From: Jan Michálek
Date:
Subject: Re: [HACKERS] Other formats in pset like markdown, rst, mediawiki