Home > mailing lists

Re: question about index cost estimates - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: question about index cost estimates
Date	May 17, 2000 21:08:56
Msg-id	18977.958601273@sss.pgh.pa.us Whole thread Raw
In response to	question about index cost estimates (Jeff Hoffmann <jeff@propertykey.com>)
List	pgsql-hackers

Tree view

Jeff Hoffmann <jeff@propertykey.com> writes:
> calculation of pages fetched from the relation seems high.  i'm not sure
> what the actual should be, but some queries have it at 2-4x the number
> of pages in the relation, which seemed high, so i started looking at
> that.

If the table is bigger than the disk cache, and we are hitting pages
more than once, then we are likely to have to fetch some pages more than
once.  The model is assuming that the tuples we want are scattered more
or less randomly over the whole table.  If #tuples to be fetched
approaches or exceeds #pages in table, then clearly we are going to be
touching some pages more than once because we want more than one tuple
from them.  The $64 question is did the page fall out of buffer cache
between references?  Worst case (if your table vastly exceeds your
available buffer space), practically every tuple fetch will require a
separate physical read, and then the number of page fetches is
essentially the number of tuples returned --- which of course can be
a lot more than the number of pages in the table.

Right now these considerations are divided between cost_index() and
cost_nonsequential_access() in a way that might well be wrong.
I've been intending to dig into the literature and try to find a better
cost model but haven't gotten to it yet.

> it just seems to me that there should be some notion of tuple size
> figured in to know how many tuples fit in a page.

It seemed to me that the critical ratios are #tuples fetched vs #pages
in table and table size vs. cache size.  I could be wrong though...
        regards, tom lane

pgsql-hackers by date:

From: Peter Eisentraut
Date: 17 May 2000, 21:04:00
Subject: Re: type conversion discussion

From: Michael Schout
Date: 17 May 2000, 21:53:52
Subject: Re: 7.0 RPMS and syslog problem. (more)

Re: question about index cost estimates - Mailing list pgsql-hackers

Previous

Next