On Tue, 4 Dec 2007, Mark Mielke wrote:
> > The larger the set of requests, the closer the performance will scale to
> > the number of discs
>
> This assumes that you can know which pages to fetch ahead of time -
> which you do not except for sequential read of a single table.
There are circumstances where it may be hard to locate all the pages ahead
of time - a nested loop join is probably one of them. However, if you're
looking up in an index and get a thousand row hits, each index entry tells
you which heap page its row lives on. There you go. Page locations to load.
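(Incidentally, you can see those page locations from plain SQL: the first
component of a row's ctid is the heap block number. The table and column
names here are invented, just to illustrate:)

    -- ctid is (heap block, line offset); the block numbers are
    -- exactly the pages a prefetch would ask the discs for.
    SELECT ctid FROM some_table WHERE some_column = 42;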
> Please show one of your query plans and how you as a person would design
> which pages to request reads for.
How about the query that "cluster <skrald@amossen.dk>" was trying to get
to run faster a few days ago? Tom Lane wrote about it:
| Wouldn't help, because the accesses to "questions" are not the problem.
| The query's spending nearly all its time in the scan of "posts", and
| I'm wondering why --- doesn't seem like it should take 6400msec to fetch
| 646 rows, unless perhaps the data is just horribly misordered relative
| to the index.
Which is exactly what's going on. The disc is having to seek 646 times,
fetching a single row each time, and that takes 6400ms. Do the sums:
6400ms over 646 rows is almost exactly 10ms per row. He evidently has a
standard 5,400 or 7,200rpm drive with a seek time of around 10ms.
Or in a similar vein, fill a table with completely random values: say ten
million rows, with a column containing integer values ranging from zero to
ten thousand. Create an index on that column, and analyse it. Then pick a
number between zero and ten thousand, and run

    SELECT * FROM table WHERE that_column = the_number_you_picked;

That'll match around a thousand rows, scattered evenly over the whole
table - about a thousand seeks, one per row, unless the pages can be
fetched in parallel.
Matthew
--
Experience is what allows you to recognise a mistake the second time you
make it.