Mark Lewis <mark.lewis@mir3.com> writes:
> operations != passes. If you were clever, you could probably write a
> modified bubble-sort algorithm that only made 2 passes. A pass is a
> disk scan, operations are then performed (hopefully in memory) on what
> you read from the disk. So there's no theoretical log N lower-bound on
> the number of disk passes.
Given infinite memory that might be true, but I don't think I believe it
for limited memory. If you have room for K tuples in memory then it's
impossible to perform more than K*N useful comparisons per pass (ie, as
each tuple comes off the disk you can compare it to all the ones
currently in memory; anything more is certainly redundant work). So if
K < logN it's clearly not gonna work.
It's possible that you could design an algorithm that works in a fixed
number of passes if you are allowed to assume you can hold O(log N)
tuples in memory --- and in practice that would probably work fine,
if the constant factor implied by the O() isn't too big. But it's not
really solving the general external-sort problem.
regards, tom lane