A worst case for qsort - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | A worst case for qsort |
Date | |
Msg-id | CAM3SWZS3obANDaPqDy4rPSA962O7Tb1x-HDqE+AMkD8FwaFPrQ@mail.gmail.com Whole thread Raw |
Responses |
Re: A worst case for qsort
Re: A worst case for qsort Re: A worst case for qsort |
List | pgsql-hackers |
There was some informal discussion of worst case performance of my text sort support patch at the most recent developer's meeting in Ottawa. There is no need to repeat the details here, which aren't terribly interesting - I made a point about putting worst case performance in context. I also made an argument around quicksort, which despite being a widely used sorting algorithm is known to go quadratic in the worst case, a really bad outcome. I've heard stories about GUIs that hit the worst case and froze when a user clicked on a column header to sort its contents. This kind of thing occurred until surprisingly recently, and was one reason why Postgres eventually adopted its own qsort() rather than using the OS-supplied implementation. A lot of the measures proposed by Sedgewick in particular made these outcomes occur very infrequently in practice. Our quicksort routine is almost a verbatim copy of the implementation that appeared in NetBSD, which is itself almost a verbatim copy of the implementation that appears in the 1993 paper by J. L. Bentley and M. D. McIlroy, ‘Engineering a sort function’. McIlroy (with some input from Bentley) subsequently described a way of making many high quality implementations perform very badly, including his own: http://www.cs.dartmouth.edu/~doug/mdmspe.pdf I don't want to belabor the point here. I do not mean to imply that everything I've proposed to date wrt sort support for text is self-evidently reasonable, in the same way that our use of Bentley and McIlroy's qsort() seems reasonable on balance. I expect some interesting discussion there. Anyway, the paper says: "The adversarial method works for almost any polymorphic program recognizable as quicksort. The subject quicksort may copy values at will, or work with lists rather than arrays. It may even pick the pivot at random. The quicksort will be vulnerable provided only that it satisfies some mild assumptions that are met by every implementation I have seen". IMHO, while worst case performance is a very useful tool for analyzing algorithms (particularly their worst case time complexity), a worst case should be put in its practical context. For example, if we had reason to be concerned about *adversarial* inputs, I think that there is a good chance that our qsort() actually would be problematic to the point of driving us to prefer some generally slower alternative. -- Peter Geoghegan
pgsql-hackers by date: