Hiroshi Inoue wrote:
>
> > -----Original Message-----
> > From: pgsql-hackers-owner@hub.org [mailto:pgsql-hackers-owner@hub.org]On
> > Behalf Of Jeff Hoffmann
> >
>
> [snip]
>
> >
> > pages_fetched = ceil(baserel->pages * log(tuples_fetched /
> > baserel->pages + 1.0));
> >
>
> Unfortunately I didn't understand this well either.
>
> pages_fetched seems to be able to be greater than
> baserel->pages.
not only does it seem that way; you can expect it to happen fairly
frequently, even if you're pulling only 1-2% of the records with a
query. if you don't believe it, check the actual performance of a few
queries.
> But if there's sufficiently large buffer
> space pages_fetched would be <= baserel->pages.
> Are there any assumptions about buffer space ?
>
the # of pages fetched would be the same, it'd just be cheaper to pull
them from the buffer instead of from disk. that's what isn't being
taken into consideration properly in the estimate.
the real question is what assumptions can you make about buffer space?
you don't know how many concurrent accesses there are (all sharing
buffer space). i also don't think you can count on knowing the size of
the buffer space. therefore, the buffer space is set to some constant
intermediate value & it is taken into account, at least in the
cost_nonsequential_tuple.
the question is this: shouldn't you be able to make an educated guess at
this by dividing the total buffer space allocated by the backend by the
number of postmaster processes running at the time? or don't you know
those things?
jeff