> I didn't know much about it either until the comments in psort.c led me
> to read the relevant parts of Knuth. Tape rewind and disk seek times
Yes, that's what I did too.
> are not remotely comparable, especially when you're talking about N tape
> units versus one disk unit --- using more files (tapes) to minimize
> rewind time can be a big win on tapes, but put those same files on a
> disk and you're just spreading the seek time around differently. More
> to the point, though, tape-oriented algorithms tend to assume that tape
> space is free. Polyphase merge doesn't care in the least that it has
> several copies of any given record laying about on different tapes, only
> one of which is of interest anymore.
>
> I know how to get the sort space requirement down to 2x the actual data
> volume (just use a balanced merge instead of the "smarter" polyphase)
> but I am thinking about ways to get it down to data volume + a few
> percent with an extra layer of bookkeeping. The trick is to release
> and recycle space as soon as we have read in the tuples stored in it...
The thing I liked about the existing algorithm is that it did the
sorting without the page faulting/thrashing caused by many sort
algorithms. I can see better space reuse as being a big win because we
would get even less page faulting because the replaced page would
already be in memory.
-- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610)
853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill,
Pennsylvania19026