Re: Memory usage during sorting - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Memory usage during sorting
Date
Msg-id 6206.1332208630@sss.pgh.pa.us
In response to Re: Memory usage during sorting  (Greg Stark <stark@mit.edu>)
Responses Re: Memory usage during sorting  (Greg Stark <stark@mit.edu>)
List pgsql-hackers
Greg Stark <stark@mit.edu> writes:
> On Mon, Mar 19, 2012 at 7:23 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> There's no real reason why the tuples destined for the next run need
>> to be maintained in heap order; we could just store them unordered and
>> heapify the whole lot of them when it's time to start the next run.

> This sounded familiar....
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=cf627ab41ab9f6038a29ddd04dd0ff0ccdca714e

Yeah, see also the pgsql-hackers thread starting here:
http://archives.postgresql.org/pgsql-hackers/1999-10/msg00384.php

That was a long time ago, of course, but I have some vague recollection
that keeping next-run tuples in the current heap achieves a net savings
in the total number of comparisons needed to heapify both runs.
Robert's point about integer comparisons being faster than data
comparisons may or may not be relevant.  Certainly they are faster, but
there are never very many run numbers in the heap at once (possibly no
more than 2, I forget; and in any case often only 1).  So I'd expect
most tuple comparisons to end up having to do a data comparison anyway.
        regards, tom lane
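
The scheme discussed above, keeping next-run tuples in the same heap as the
current run and ordering by (run number, data key), can be sketched roughly
as follows. This is only an illustrative sketch in Python, not the tuplesort.c
code: the names replacement_selection, Entry, memory_slots and the stats
counters are all hypothetical, and Python's heapq stands in for the real heap.
It tallies how many heap comparisons are settled by the run number alone
versus how many fall through to a data comparison, and records the largest
number of distinct run numbers seen in the heap at once.

```python
import heapq


def replacement_selection(items, memory_slots):
    """Split items into sorted runs using one heap ordered by (run, key).

    Returns (runs, stats). All names here are illustrative assumptions.
    """
    stats = {"run_decided": 0, "data_decided": 0, "max_runs_in_heap": 0}

    class Entry:
        __slots__ = ("run", "key")

        def __init__(self, run, key):
            self.run = run
            self.key = key

        def __lt__(self, other):
            # Cheap integer comparison first; it only settles the order
            # when the two entries belong to different runs.
            if self.run != other.run:
                stats["run_decided"] += 1
                return self.run < other.run
            # Same run: the (potentially expensive) data comparison decides.
            stats["data_decided"] += 1
            return self.key < other.key

    source = iter(items)

    # Fill the available memory; every initial tuple belongs to run 0.
    heap = []
    for _ in range(memory_slots):
        try:
            heap.append(Entry(0, next(source)))
        except StopIteration:
            break
    heapq.heapify(heap)

    runs = []
    while heap:
        distinct = len({entry.run for entry in heap})
        stats["max_runs_in_heap"] = max(stats["max_runs_in_heap"], distinct)

        smallest = heapq.heappop(heap)
        while len(runs) <= smallest.run:
            runs.append([])
        runs[smallest.run].append(smallest.key)

        try:
            incoming = next(source)
        except StopIteration:
            continue
        # A tuple smaller than the one just written cannot extend the
        # current run, so it is tagged for the next run but still goes
        # into the same heap.
        if incoming >= smallest.key:
            next_run = smallest.run
        else:
            next_run = smallest.run + 1
        heapq.heappush(heap, Entry(next_run, incoming))

    return runs, stats
```

Calling replacement_selection(data, memory_slots) on a list of comparable
values returns the runs plus the counters, so the run_decided and
data_decided tallies can be compared for whatever input distribution is of
interest. The counts depend heavily on the input order, so no particular
ratio is claimed here.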

