Re: Memory usage during sorting - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Memory usage during sorting
Date
Msg-id 22449.1334325087@sss.pgh.pa.us
Whole thread Raw
In response to Re: Memory usage during sorting  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Jeff Davis <pgsql@j-davis.com> writes:
> On Sun, 2012-03-18 at 11:25 -0400, Tom Lane wrote:
>> Yeah, that was me, and it came out of actual user complaints ten or more
>> years back.  (It's actually not 2X growth but more like 4X growth
>> according to the comments in logtape.c, though I no longer remember the
>> exact reasons why.)  We knew when we put in the logtape logic that we
>> were trading off speed for space, and we accepted that.

> I skimmed through TAOCP, and I didn't find the 4X number you are
> referring to, and I can't think what would cause that, either. The exact
> wording in the comment in logtape.c is "4X the actual data volume", so
> maybe that's just referring to per-tuple overhead?

My recollection is that that was an empirical measurement using the
previous generation of code.  It's got nothing to do with per-tuple
overhead IIRC, but with the fact that the same tuple can be sitting on
multiple "tapes" during a polyphase merge, because some of the tapes can
be lying fallow waiting for future use --- but data on them is still
taking up space, if you do nothing to recycle it.  The argument in the
comment shows why 2X is the minimum space growth for a plain merge
algorithm, but that's only a minimum.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Last gasp
Next
From: Robert Haas
Date:
Subject: Re: Parameterized-path cost comparisons need some work