question about index cost estimates - Mailing list pgsql-hackers

From Jeff Hoffmann
Subject question about index cost estimates
Date
Msg-id 3922F6C6.63483AF1@propertykey.com
Whole thread Raw
Responses Re: question about index cost estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
RE: question about index cost estimates  ("Hiroshi Inoue" <Inoue@tpf.co.jp>)
List pgsql-hackers
i know tom lane is probably sick of me, but i'm trying to figure out how
the cost estimators work for an index scan.  i'm logging a lot of the
calculations that go into the cost estimate for some sample queries to
see what factors are most important in a cost estimate.  it seems to me
that in the queries that i'm running, a vast majority of the cost comes
from reading the tuples from the relation.  i think the cost of
nonsequential access seems reasonable (it's about 2), but the
calculation of pages fetched from the relation seems high.  i'm not sure
what the actual should be, but some queries have it at 2-4x the number
of pages in the relation, which seemed high, so i started looking at
that.  i don't understand why the function that is used would be a good
model of what is actually happening.  here's the comment & the function:
* Estimate number of main-table tuples and pages fetched.** If the number of tuples is much smaller than the number of
pagesin* the relation, each tuple will cost a separate nonsequential fetch.* If it is comparable or larger, then
probablywe will be able to* avoid some fetches.  We use a growth rate of log(#tuples/#pages +* 1) --- probably totally
bogus,but intuitively it gives the right* shape of curve at least.
 
pages_fetched = ceil(baserel->pages * log(tuples_fetched /
baserel->pages + 1.0));

i'm at a total loss to explain how this works.  for all i know, it's
correct and it is that costly, i don't know.  it just seems to me that
there should be some notion of tuple size figured in to know how many
tuples fit in a page.  can somebody explain it to me?

thanks,

jeff


pgsql-hackers by date:

Previous
From: "Michael A. Olson"
Date:
Subject: RE: Berkeley DB license
Next
From: Peter Eisentraut
Date:
Subject: Re: OO Patch