Ron Peacetree wrote:
> AFAIK, there are only 3 general purpose internal sorting techniques
> that have O(n) behavior:
Hum? NO "general purpose" sorting technique has O(n) behaviour.
The theoretical best scenario, _in general_, is O(n log n).
Insertion sort is expected to provide O(n^2) behaviour, and radix-like
schemes get arbitrarily memory hungry and have bad pathological results.
> All of the above have potentially nasty trade-offs in comparision to
> the standard heavily tuned median-of-three quicksort used by the sort
> unix command. Nonetheless, I could see some value in providing all of
> these with PostgeSQL (including a decent port of the unix sort routine
> for the Win folks). I'll note in passing that quicksort's Achille's
> heel is that it's unstable (all of the rest are stable), which can be
> a problem in a DB.
Making one sort algorithm work very efficiently is quite likely to be a
lot more effective than frittering away time trying to get some special
cases (that you can't regularly use) to work acceptably.
> All of this seems to imply that instead of mergesort (which is
> stable), PostgreSQL might be better off with the 4 sorts I've listed
> plus a viciously efficient merge utility for combining partial results
> that do fit into memory into result files on disk that don't.
>
> Or am I crazy?
More than likely. It is highly likely that it will typically take more
computational effort to figure out that one of the 4 sorts provided
/any/ improvement than any computational effort they would save.
That's a /very/ common problem. There's also a fair chance, seen in
practice, that the action of collecting additional statistics to improve
query optimization will cost more than the savings provided by the
optimizations.
--
(concatenate 'string "cbbrowne" "@acm.org")
http://www3.sympatico.ca/cbbrowne/wp.html
When ever in doubt consult a song. --JT Fletcher