On Tue, 4 Dec 2007, Gregory Stark wrote:
> "Matthew" <matthew@flymine.org> writes:
>
> > Does Postgres issue requests to each random access in turn, waiting for
> > each one to complete before issuing the next request (in which case the
> > performance will not exceed that of a single disc), or does it use some
> > clever asynchronous access method to send a queue of random access
> > requests to the OS that can be distributed among the available discs?
>
> Sorry, it does the former, at least currently.
>
> That said, this doesn't really come up nearly as often as you might think.
Shame. It comes up a *lot* in my project. A while ago we converted a task
that processed a queue of objects one at a time into one that processes
them in groups of a thousand, which sped the task up considerably. So we
run an awful lot of queries with IN lists containing a thousand values.
They hit the indexes, then fetch the rows by random access. A full
sequential table scan would take much longer. It'd be awfully nice to
have those queries go twelve times faster.
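As an illustration of the batching described above (a sketch, not code
from our project; the table and column names here are hypothetical), the
queue is split into groups of a thousand ids, and each group becomes one
parameterized IN-list query:

```python
# Sketch: turn a queue of object ids into batched IN-list queries.
# Table name "objects" and column "id" are assumptions for illustration.

def chunk(ids, size=1000):
    """Yield successive batches of at most `size` ids."""
    for i in range(0, len(ids), size):
        yield ids[i:i + size]

def build_batch_queries(ids, size=1000):
    """Build one parameterized IN-list query per batch of ids."""
    queries = []
    for batch in chunk(ids, size):
        placeholders = ", ".join(["%s"] * len(batch))
        sql = "SELECT * FROM objects WHERE id IN (%s)" % placeholders
        queries.append((sql, batch))
    return queries
```

With 2500 queued ids this yields three queries: two with a thousand
placeholders and one with five hundred, each of which can use the index.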
> Normally queries fit mostly in either the large batch query domain or the
> small quick oltp query domain. For the former Postgres tries quite hard to do
> sequential i/o which the OS will do readahead for and you'll get good
> performance. For the latter you're normally running many simultaneous such
> queries and the raid array helps quite a bit.
Having twelve discs will certainly improve the sequential IO throughput!
However, if this were implemented (and I have *no* idea whatsoever how
hard it would be), large index scans would scale with the number of discs
in the system, which I imagine would be quite a win. Large index scans
can't be that rare!
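One way the "queue of random access requests" could be handed to the OS
is with readahead hints: announce every block you will need before
reading any of them, so the kernel can schedule the seeks across the
array in parallel. This is just a sketch of the idea, assuming a Linux
system with posix_fadvise available; it is not how Postgres works today:

```python
# Sketch, assuming Linux: hint the kernel about a batch of upcoming
# random block reads, then read them. With readahead in flight, a RAID
# array can service several of the requests concurrently.
import os

BLOCK = 8192  # assumed page size for illustration

def prefetch_then_read(path, offsets):
    fd = os.open(path, os.O_RDONLY)
    try:
        # First pass: tell the kernel about every block we will need.
        for off in offsets:
            os.posix_fadvise(fd, off, BLOCK, os.POSIX_FADV_WILLNEED)
        # Second pass: actually read; the reads can now be satisfied
        # from readahead the kernel has already started.
        blocks = []
        for off in offsets:
            os.lseek(fd, off, os.SEEK_SET)
            blocks.append(os.read(fd, BLOCK))
        return blocks
    finally:
        os.close(fd)
```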
Matthew
--
Software suppliers are trying to make their software packages more
'user-friendly'.... Their best approach, so far, has been to take all
the old brochures, and stamp the words, 'user-friendly' on the cover.
-- Bill Gates