Re: RAID arrays and performance - Mailing list pgsql-performance

From: Matthew
Subject: Re: RAID arrays and performance
Msg-id: Pine.LNX.4.58.0712041349380.3731@aragorn.flymine.org
In response to: Re: RAID arrays and performance (Mark Mielke <mark@mark.mielke.cc>)
List: pgsql-performance

On Tue, 4 Dec 2007, Mark Mielke wrote:
> The bitmap scan method does ordered reads of the table, which can
> partially take advantage of sequential reads. Not sure whether bitmap
> scan is optimal for your situation or whether your situation would allow
> this to be taken advantage of.

A bitmap scan may help where several randomly-accessed pages are next to
each other on disc. However, I would expect the operating system's
readahead and buffer cache to handle that case already.

> Do you know that there is a problem, or are you speculating about one? I
> think your case would be far more compelling if you could show a
> problem. :-)
>
> I would think that at a minimum, having 12 disks with RAID 0 or RAID 1+0
> would allow your insane queries to run concurrent with up to 12 other
> queries.

Yeah, I don't really care about concurrency. It's pretty obvious that
running x concurrent queries on an x-disc RAID system will allow the
utilisation of all the discs at once, therefore allowing the performance
to scale by x. What I'm talking about is a single query running on an
x-disc RAID array.

> Unless your insane query is the only query in use on the system...

That's exactly the case.

> I recall talk of more intelligent table scanning algorithms, and the use
> of asynchronous I/O to benefit from RAID arrays, but the numbers
> prepared to convince people that the change would have effect have been
> less than impressive.

I think a twelve-fold speed increase is impressive. Granted, under
highly concurrent access the benefit goes away, but I think it would be
worth it when only a few queries are running on the system.
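
To make the arithmetic behind that figure concrete, here is a
back-of-envelope model in Python. It is a sketch under stated assumptions,
not a measurement: the name fetch_time_ms, the 8 ms per random read and the
1200-page workload are all hypothetical, and it assumes the pages are spread
evenly across the discs and that each disc services one request at a time.

```python
def fetch_time_ms(pages, seek_ms, discs, in_flight):
    """Estimate the time to fetch `pages` random pages.

    At most min(discs, in_flight) requests can be serviced at once, so a
    single query that issues one read at a time keeps only one disc busy.
    """
    parallel = min(discs, in_flight)
    rounds = -(-pages // parallel)  # ceiling division
    return rounds * seek_ms


# Hypothetical workload: 1200 random pages, 8 ms per read, 12 discs.
one_at_a_time = fetch_time_ms(1200, 8, discs=12, in_flight=1)   # 9600 ms
all_discs_busy = fetch_time_ms(1200, 8, discs=12, in_flight=12)  # 800 ms
speedup = one_at_a_time / all_discs_busy                         # 12.0
```

The model also shows why the gain disappears under concurrency: once other
queries already keep every disc busy, raising in_flight for one query cannot
raise the total number of requests serviced at once.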

I don't think you would have to create a more intelligent table scanning
algorithm. What you would need to do is take the results of the index,
convert that to a list of page fetches, then pass that list to the OS as
an asynchronous "please fetch all these into the buffer cache" request,
then do the normal algorithm as is currently done. The requests would then
come out of the cache instead of from the disc. Of course, this is from a
simple Java programmer who doesn't know the OS interfaces for this sort of
thing.
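
One OS interface that appears to fit this description is POSIX
posix_fadvise() with the POSIX_FADV_WILLNEED hint, which Python exposes as
os.posix_fadvise on Unix. The following is only a minimal sketch of the idea,
not how PostgreSQL does or would do it: the function names, the 8192-byte
page size and the use of a plain file in place of a table are assumptions
made for illustration. The hint is advisory, and whether the reads really
proceed in parallel depends on the kernel and the storage underneath.

```python
import os

PAGE_SIZE = 8192  # assumed page size for this sketch


def prefetch_pages(fd, page_numbers, page_size=PAGE_SIZE):
    """Hint to the kernel that these pages will be needed soon.

    The hint asks the OS to start loading the pages into its cache and
    does not return any data. Returns the de-duplicated, sorted page list.
    """
    pages = sorted(set(page_numbers))
    for page in pages:
        os.posix_fadvise(fd, page * page_size, page_size,
                         os.POSIX_FADV_WILLNEED)
    return pages


def read_pages(path, page_numbers, page_size=PAGE_SIZE):
    """Issue the prefetch hints, then read the pages the ordinary way."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Step 1: pass the whole list of page fetches to the OS up front.
        prefetch_pages(fd, page_numbers, page_size)
        # Step 2: the normal algorithm, unchanged: one blocking read per
        # page, in the order the caller asked for them.
        return [os.pread(fd, page_size, page * page_size)
                for page in page_numbers]
    finally:
        os.close(fd)
```

A caller would take the page numbers produced by the index lookup and call
read_pages(path, page_numbers); if the hints took effect, the blocking reads
in step 2 are served from the cache instead of each waiting on the disc.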

Matthew

--
Here we go - the Fairy Godmother redundancy proof.
                                        -- Computer Science Lecturer
