Re: RAID arrays and performance - Mailing list pgsql-performance

From Matthew
Subject Re: RAID arrays and performance
Msg-id Pine.LNX.4.58.0712041551410.3731@aragorn.flymine.org
In response to Re: RAID arrays and performance  (Mark Mielke <mark@mark.mielke.cc>)
Responses Re: RAID arrays and performance
List pgsql-performance
On Tue, 4 Dec 2007, Mark Mielke wrote:
> > The larger the set of requests, the closer the performance will scale to
> > the number of discs
>
> This assumes that you can know which pages to fetch ahead of time -
> which you do not except for sequential read of a single table.

There are circumstances where it may be hard to locate all the pages ahead
of time - probably when you're doing a nested loop join. However, if you
look something up in an index and get a thousand row hits, then there you
go: a thousand page locations to load, all known up front.

> Please show one of your query plans and how you as a person would design
> which pages to request reads for.

How about the query that "cluster <skrald@amossen.dk>" was trying to get
to run faster a few days ago? Tom Lane wrote about it:

| Wouldn't help, because the accesses to "questions" are not the problem.
| The query's spending nearly all its time in the scan of "posts", and
| I'm wondering why --- doesn't seem like it should take 6400msec to fetch
| 646 rows, unless perhaps the data is just horribly misordered relative
| to the index.

Which is exactly what's going on. The disc is having to seek 646 times,
fetching a single row each time: 646 seeks at roughly 10ms apiece comes to
about 6400ms. That is what you would expect from a standard 5,400 or 7,200
rpm drive with a seek time around 10ms.
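
As a rough sketch of that arithmetic (the function name and the "discs"
parameter are made up for illustration, and the even spread of requests
over discs is an idealised assumption, not a measurement):

```python
def random_fetch_ms(rows, seek_ms=10.0, discs=1):
    """Estimate the time in ms to fetch `rows` randomly placed rows.

    Assumes one seek per row, and that all the requests are issued up
    front and spread evenly over `discs` (an idealised best case).
    """
    return rows * seek_ms / discs


# One disc, 646 single-row fetches at about 10ms per seek.
print(random_fetch_ms(646))

# The same 646 requests spread perfectly over a hypothetical 8-disc array.
print(random_fetch_ms(646, discs=8))
```

The single-disc figure lands close to the 6400ms in the plan; the
multi-disc figure is only the ceiling you could hope for if every page
location were handed to the array at once.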

Or in a similar vein, fill a table with completely random values, say ten
million rows with a column containing integer values ranging from zero to
ten thousand. Create an index on that column, analyse it. Then pick a
number between zero and ten thousand, and run:

"SELECT * FROM table WHERE that_column = the_number_you_picked"
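
To see why that query ends up seeking so much, here is a small simulation
of where the matching rows land. It is only a sketch: it assumes the rows
are placed uniformly at random and that about 100 rows fit on a page
(rows_per_page is a guess that depends on row width), and the function
name is made up.

```python
import random


def distinct_pages_hit(total_rows=10_000_000, distinct_values=10_001,
                       rows_per_page=100, seed=0):
    """Count the distinct pages holding the rows that match one value.

    Assumes values are uniformly distributed, so each value matches
    about total_rows / distinct_values rows at random positions.
    """
    rng = random.Random(seed)
    matches = total_rows // distinct_values
    # Pick random row positions for the matches, without replacement.
    positions = rng.sample(range(total_rows), matches)
    # Map each row position to the page it lives on and deduplicate.
    pages = {position // rows_per_page for position in positions}
    return matches, len(pages)


matches, pages = distinct_pages_hit()
print(f"{matches} matching rows spread over {pages} distinct pages")
```

Under those assumptions nearly every matching row sits on its own page,
so the index hands you roughly a thousand separate page locations for a
single lookup.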

Matthew

--
Experience is what allows you to recognise a mistake the second time you
make it.
