Re: [PERFORM] A Better External Sort? - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: [PERFORM] A Better External Sort?
Date
Msg-id 1128118876.4045.53.camel@localhost.localdomain
In response to Re: [PERFORM] A Better External Sort?  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Fri, 2005-09-30 at 13:41 -0700, Josh Berkus wrote:
> Yeah, that's what I thought too.   But try sorting a 10GB table, and
> you'll see: disk I/O is practically idle, while CPU averages 90%+.   We're
> CPU-bound, because sort is being really inefficient about something. I
> just don't know what yet.
>
> If we move that CPU-binding to a higher level of performance, then we can
> start looking at things like async I/O, O_Direct, pre-allocation etc. that
> will give us incremental improvements.   But what we need now is a 5-10x
> improvement and that's somewhere in the algorithms or the code.

I'm trying to keep an open mind about what the causes are, and I think
we need to get a much better characterisation of what happens during a
sort before we start trying to write code. It is always too easy to jump
in and tune the wrong thing, which is not a good use of time.
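
As a rough illustration of the kind of characterisation meant here, the
sketch below compares CPU time with wall-clock time for a single sort. It
is only an assumption-laden toy: it times an in-memory Python list sort,
not PostgreSQL's external sort, and the function name characterise_sort
and its result keys are made up for this example. A CPU fraction near 1.0
suggests the work is CPU-bound; a much lower value suggests time spent
waiting on something else, such as I/O.

```python
import random
import time


def characterise_sort(n_items, seed=0):
    """Time one in-memory sort and report CPU time versus wall-clock time.

    Hypothetical helper for illustration only; it does not model
    PostgreSQL's tuplesort or any on-disk merge passes.
    """
    rng = random.Random(seed)
    data = [rng.random() for _ in range(n_items)]

    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # user + system CPU time of this process
    data.sort()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start

    # Fraction of elapsed time spent on CPU; guard against a zero-length interval.
    cpu_fraction = cpu_used / wall_used if wall_used > 0 else 0.0
    is_sorted = all(a <= b for a, b in zip(data, data[1:]))
    return {
        "wall_s": wall_used,
        "cpu_s": cpu_used,
        "cpu_fraction": cpu_fraction,
        "is_sorted": is_sorted,
    }


if __name__ == "__main__":
    print(characterise_sort(1_000_000))
```

The same two measurements, taken around a real sort on a real table, would
be one cheap way to confirm or refute the "CPU averages 90%+" observation
before any tuning starts.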

The actual sort algorithm looks damn fine to me and the code as it
stands is well optimised. That indicates to me that we've come to the
end of the current line of thinking and we need a new approach, possibly
in a number of areas.

For myself, I don't wish to be drawn further on solutions at this stage
but I am collecting performance data, so any test results are most
welcome.

Best Regards, Simon Riggs



