On Thu, Aug 20, 2015 at 1:16 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Thu, Aug 20, 2015 at 12:42 PM, Feng Tian <ftian@vitessedata.com> wrote: > Just a quick anecdotal evidence. I did similar experiment about three years > ago. The conclusion was that if you have SSD, just do quick sort and > forget the longer runs, but if you are using hard drives, longer runs is the > winner (and safer, to avoid cliffs). I did not experiment with RAID0/5 on > many spindles though. > > Not limited to sort, more generally, SSD is different enough from HDD, > therefore it may worth the effort for backend to "guess" what storage device > it has, then choose the right thing to do.
The devil is in the details. I cannot really comment on such a general statement.
I would be willing to believe that that's true under unrealistic/unrepresentative conditions. Specifically, when multiple passes are required with a sort-merge strategy where that isn't the case with replacement selection. This could happen with a tiny work_mem setting (tiny in an absolute sense more than a relative sense). With an HDD, where sequential I/O is so much faster, this could be enough to make replacement selection win, just as it would have in the 1970s with magnetic tapes.
As I've said, the solution is to simply avoid multiple passes, which should be possible in virtually all cases because of the quadratic growth in a classic hybrid sort-merge strategy's capacity to avoid multiple passes (growth relative to work_mem's growth). Once you ensure that, then you probably have a mostly I/O bound workload, which can be made faster by adding sequential I/O capacity (or, on the Postgres internals side, adding asynchronous I/O, or with memory prefetching). You cannot really buy a faster CPU to make a degenerate heapsort faster. -- Peter Geoghegan
Agree everything in principal,except one thing -- no, random IO on HDD in 2010s (relative to CPU/Memory/SSD), is not any faster than tape in 1970s. :-)