Re: Memory usage during sorting - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Memory usage during sorting
Date
Msg-id 4F68E7F7.6080004@nasby.net
In response to Re: Memory usage during sorting  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 3/18/12 10:25 AM, Tom Lane wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> On Wed, Mar 7, 2012 at 11:55 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>>> On Sat, Mar 3, 2012 at 4:15 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>>>> Anyway, I think the logtape could use redoing.
>> The problem there is that none of the files can be deleted until it
>> was entirely read, so you end up with all the data on disk twice.  I
>> don't know how often people run their databases so close to the edge
>> on disk space that this matters, but someone felt that that extra
>> storage was worth avoiding.
> Yeah, that was me, and it came out of actual user complaints ten or more
> years back.  (It's actually not 2X growth but more like 4X growth
> according to the comments in logtape.c, though I no longer remember the
> exact reasons why.)  We knew when we put in the logtape logic that we
> were trading off speed for space, and we accepted that.  It's possible
> that with the growth of hard drive sizes, real-world applications would
> no longer care that much about whether the space required to sort is 4X
> data size rather than 1X.  Or then again, maybe their data has grown
> just as fast and they still care.
>

I believe the case of tape sorts that fit entirely in filesystem cache is a big one as well... doubling or worse the
amount of data that needed to live "on disk" at once would likely suck in that case.
 

Also, it's not uncommon to be IO-bound on a database server... so even if we're not worried about storing everything 2
or more times from a disk space standpoint, we should be concerned about the IO bandwidth.
 
-- 
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

