Re: Threaded Sorting - Mailing list pgsql-hackers

From Greg Copeland
Subject Re: Threaded Sorting
Date
Msg-id 1033760681.13005.112.camel@mouse.copelandconsulting.net
In response to Re: Threaded Sorting  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Threaded Sorting
List pgsql-hackers
On Fri, 2002-10-04 at 14:31, Bruce Momjian wrote:
> We use tape sorts, ala Knuth, meaning we sort in memory as much as
> possible, but when there is more data than fits in memory, rather than
> swapping, we write to temp files then merge the temp files (aka tapes).
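
For readers who have not seen the technique, the approach quoted above can be sketched roughly as follows. This is only an illustration of the general idea (sort what fits in memory, spill sorted runs to temp files, then merge the runs), not PostgreSQL's actual sort code. The function name `external_sort` and the `max_in_memory` parameter are made up for the example, and a simple heap-based k-way merge stands in for the tape merge Knuth describes.

```python
import heapq
import tempfile


def external_sort(items, max_in_memory):
    """Yield integers from `items` in sorted order, holding at most
    `max_in_memory` of them in memory at once (hypothetical helper)."""
    runs = []   # temp files ("tapes"), each holding one sorted run
    buf = []    # in-memory work area

    def spill():
        # Sort the current buffer and write it out as one sorted run.
        buf.sort()
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{x}\n" for x in buf)
        f.seek(0)
        runs.append(f)
        buf.clear()

    for x in items:
        buf.append(x)
        if len(buf) >= max_in_memory:
            spill()

    if not runs:
        # Everything fit in memory: no temp files needed at all.
        buf.sort()
        yield from buf
        return

    if buf:
        spill()

    try:
        # Merge all runs; only one value per run is in memory at a time.
        streams = [(int(line) for line in f) for f in runs]
        yield from heapq.merge(*streams)
    finally:
        for f in runs:
            f.close()
```

Calling `list(external_sort(values, 3))` on seven values produces three runs on disk and merges them, while a limit larger than the input never touches a temp file.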

Right, which is what I originally assumed.  On lower end systems, that
works great.  Once you allow that people may actually have high-end
systems with multiple CPUs and lots of memory, wouldn't it be nice to
allow for huge improvements on large sorts?  Basically, you have two
ends of the spectrum.  At one end, you don't have enough memory and
become I/O bound.  At the other, you have enough memory but are CPU
bound, potentially with extra CPUs to spare.  Seems to me they
are not mutually exclusive.
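
To make the CPU-bound end of the spectrum concrete, here is a rough sketch of the shape I have in mind: split the input into chunks, sort each chunk in its own worker, then merge the sorted chunks. The name `parallel_sort` and the `workers` parameter are invented for illustration, and this is not a proposal for how the backend should do it. Note also that in CPython the threads below will not actually run the sorts on separate CPUs because of the global interpreter lock; the example only shows the structure.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(items, workers=4):
    """Sort `items` by sorting chunks in worker threads and merging
    the results (hypothetical helper, structural sketch only)."""
    data = list(items)
    if not data:
        return []

    # Never use more workers than there are elements.
    workers = max(1, min(workers, len(data)))

    # Ceiling division so every element lands in some chunk.
    size = -(-len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]

    # Each chunk is sorted independently; this is the part that could
    # be spread across spare CPUs.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sorted_chunks = list(pool.map(sorted, chunks))

    # The final k-way merge is the same step the tape sort performs.
    return list(heapq.merge(*sorted_chunks))
```

The merge at the end is the same operation used to combine on-disk runs, which is why the two approaches look like they could coexist.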

Unless I've missed something, the ideal case is to never use tapes for
sorting.  Which is to say, you're trying to optimize an already less than
ideal situation (which is of course good).  I'm trying to discuss making
it a near-ideal use of available resources.  I can understand why
addressing the seemingly more common I/O-bound case would receive
priority; however, I'm at a loss as to why the other would be completely
ignored.  Seems to me, implementing both would even work in a
complementary fashion on the low-end cases and yield more breathing room
for the high-end cases.

What am I missing for the other case to be completely ignored?

Greg

