Re: RAID arrays and performance - Mailing list pgsql-performance

From Mark Mielke
Subject Re: RAID arrays and performance
Date
Msg-id 47557519.6090506@mark.mielke.cc
In response to Re: RAID arrays and performance  (Matthew <matthew@flymine.org>)
Responses Re: RAID arrays and performance
Re: RAID arrays and performance
List pgsql-performance
Matthew wrote:
> On Tue, 4 Dec 2007, Gregory Stark wrote:
>> Also, it's true, you need to preread more than 12 blocks to handle a 12-disk
>> raid. My offhand combinatorics analysis seems to indicate you would expect to
>> need to read n(n-1)/2 blocks on average before you've hit all the blocks. There's
>> little penalty to prereading unless you use up too much kernel resources or
>> you do unnecessary i/o which you never use, so I would expect doing n^2 capped
>> at some reasonable number like 1,000 pages (enough to handle a 32-disk raid)
>> would be reasonable.
>
> It's better than that, actually. Let's assume a RAID 0 striped set of
> twelve discs. If you spread out twelve requests to twelve discs, then the
> expected number of requests to each disc is one. The probability that any
> single disc receives more than say three requests is rather small. As you
> increase the number of requests, the longest reasonably-expected queue
> length for a particular disc gets closer to the number of requests divided
> by the number of discs, as the requests get spread more and more evenly
> among the discs.
>
> The larger the set of requests, the closer the performance will scale to
> the number of discs.
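
For anyone who wants to check the queue-length argument numerically, here is a minimal sketch. It rests on assumptions that are not from the thread: every request lands on a uniformly random disc, every request costs the same, the batch finishes when the busiest disc finishes, and all the pages are known ahead of time. The function names are made up for illustration.

```python
import random
from collections import Counter


def max_queue_length(num_discs, num_requests, rng):
    """Longest per-disc queue when each request goes to a uniformly random disc."""
    counts = Counter(rng.randrange(num_discs) for _ in range(num_requests))
    return max(counts.values())


def mean_speedup(num_discs, num_requests, trials=2000, seed=0):
    """Average of num_requests / longest queue over many random batches.

    Under the stated assumptions this is the effective parallelism:
    it can never exceed num_discs, and equals num_discs only when the
    requests happen to be spread perfectly evenly.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += num_requests / max_queue_length(num_discs, num_requests, rng)
    return total / trials


if __name__ == "__main__":
    # Twelve discs, growing batch sizes: the ratio creeps towards 12
    # as the batch grows, but only for large batches.
    for requests in (12, 48, 192, 768, 3072):
        print(requests, round(mean_speedup(12, requests), 2))
```

This only models the best case in which every preread page is actually needed. It says nothing about how the pages to fetch would be chosen.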

This assumes that you can know which pages to fetch ahead of time, which you cannot, except for a sequential read of a single table.

I think it is possible that your particular case could be up to 6X faster, but most other people would see little or no speedup. If it reads the wrong pages, it is wasting its time.

I am not trying to discourage you - only trying to ensure that you have reasonable expectations. 12X is far too optimistic.

Please show one of your query plans and explain how you, as a person, would decide which pages to request reads for.

Cheers,
mark

-- 
Mark Mielke <mark@mielke.cc>
