Re: Merge algorithms for large numbers of "tapes" - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: Merge algorithms for large numbers of "tapes"
Date:
Msg-id: 18919.1141859825@sss.pgh.pa.us
In response to: Re: Merge algorithms for large numbers of "tapes"  ("Dann Corbit" <DCorbit@connx.com>)
List: pgsql-hackers
"Dann Corbit" <DCorbit@connx.com> writes:
> Here are some suggestions of things that I know work really, really
> well:
> #1.  Two pass merge (none of that silly poly-tape merge goo)

This amounts to an assumption that you have infinite work_mem, in which
case you hardly need an external sort at all.  If your work_mem is in
fact finite, then at some point you need more than two passes.  I'm not
really interested in ripping out support for sort operations that are
much larger than work_mem.
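
To put rough numbers on that, here is a back-of-the-envelope sketch
(illustrative C, not anything from tuplesort.c; the 64MB work_mem and
256KB per-run read buffer below are assumptions picked just for the
example) of how the merge pass count grows once the input exceeds what
a single merge pass can cover:

    #include <math.h>
    #include <stdio.h>

    /*
     * Hypothetical calculation, not PostgreSQL code: with work_mem bytes
     * of sort memory, initial runs are about input_bytes / work_mem, and
     * each merge pass can combine roughly work_mem / buf_per_run runs.
     */
    static int
    merge_passes(double input_bytes, double work_mem, double buf_per_run)
    {
        double runs = ceil(input_bytes / work_mem);    /* initial run count */
        double fan_in = floor(work_mem / buf_per_run); /* runs merged per pass */
        int passes = 0;

        if (fan_in < 2)
            fan_in = 2;
        while (runs > 1)
        {
            runs = ceil(runs / fan_in);
            passes++;
        }
        return passes;
    }

    int
    main(void)
    {
        double work_mem = 64.0 * 1024 * 1024;     /* assumed 64MB sort memory */
        double buf = 256.0 * 1024;                /* assumed per-run read buffer */
        double sizes[] = {1e9, 1e10, 1e11, 1e12}; /* 1GB .. 1TB inputs */

        for (int i = 0; i < 4; i++)
            printf("%8.0f MB input: %d merge pass(es)\n",
                   sizes[i] / 1e6, merge_passes(sizes[i], work_mem, buf));
        return 0;
    }

Under those assumptions a single merge pass covers inputs up to roughly
work_mem^2 / buffer, i.e. about 16GB here; a 100GB or 1TB sort already
needs a second merge pass, which is exactly the case a strictly
two-pass scheme cannot handle.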

> #2.  Load ONLY the keys that are to be sorted into memory.  Use a
> pointer exchange sort, and do not move the physical rows of data at all.

This suggestion isn't a whole lot better; in general the rows to be
sorted don't exist until we compute them, and so proposing that we
"don't load them until later" is pretty much irrelevant.  Also, in
a lot of common cases the keys to be sorted are the bulk of the data
anyway.
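
For reference, the pointer-exchange part of the suggestion is easy
enough to sketch in a few lines of C (a minimal illustration with a
made-up row layout, not what PostgreSQL's tuplesort actually does):

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct Row
    {
        int  key;
        char payload[64];   /* rest of the row; never moved during the sort */
    } Row;

    static int
    cmp_row_ptr(const void *a, const void *b)
    {
        const Row *ra = *(const Row *const *) a;
        const Row *rb = *(const Row *const *) b;

        return (ra->key > rb->key) - (ra->key < rb->key);
    }

    int
    main(void)
    {
        Row  rows[4] = {{42, "delta"}, {7, "alpha"}, {99, "omega"}, {23, "beta"}};
        Row *ptrs[4];

        for (int i = 0; i < 4; i++)
            ptrs[i] = &rows[i];

        /* sort the pointer array only; rows[] itself is untouched */
        qsort(ptrs, 4, sizeof(Row *), cmp_row_ptr);

        for (int i = 0; i < 4; i++)
            printf("%d %s\n", ptrs[i]->key, ptrs[i]->payload);
        return 0;
    }

Only the pointer array moves; the rows stay put.  But as noted above,
that buys little when the key is most of the row, and the rows still
have to be materialized somewhere before there is anything to point at.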
        regards, tom lane

