Re: COUNT(*) and index-only scans - Mailing list pgsql-hackers

From Greg Stark
Subject Re: COUNT(*) and index-only scans
Date
Msg-id CAM-w4HMft5Mx-_dmff5Xk_icyXMgEHYV2_+MWd49MajzqJgqxA@mail.gmail.com
In response to Re: COUNT(*) and index-only scans  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: COUNT(*) and index-only scans  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Oct 12, 2011 at 3:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> What I suggest as a first cut for that is: simply derate the
>>> visibility fraction as the fraction of the table expected to be
>>> scanned gets smaller.
>
>> I think there's a statistically more rigorous way of accomplishing the
>> same thing. If you treat the pages we estimate we're going to read as
>> a random sample of the population of pages then your expected value is
>> the fraction of the overall population that is all-visible but your
>> 95th percentile confidence interval will be, uh, a simple formula we
>> can compute but I don't recall off-hand.
>
> The problem is precisely that the pages a query is going to read are
> likely to *not* be a random sample, but to be correlated with
> recently-dirtied pages.

Sure, but I was suggesting aiming for the nth percentile rather than
applying a linear factor, which as far as I can tell has no concrete
meaning.
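
To make the percentile idea concrete, here is a minimal sketch of one
candidate formula, not necessarily the one alluded to above. It assumes
the pages read are a simple random sample drawn without replacement
(exactly the assumption questioned in the quoted text) and uses a normal
approximation with a finite-population correction. The function name,
its parameters, and the default z value (the one-sided 95% normal
quantile) are hypothetical and do not correspond to anything in the
planner.

```python
import math


def pessimistic_visible_fraction(total_pages, visible_pages, pages_read, z=1.645):
    """Lower one-sided bound on the all-visible fraction among the pages read.

    Assumption: pages_read pages are a simple random sample, drawn without
    replacement, from a table of total_pages pages of which visible_pages
    are marked all-visible. z=1.645 gives an approximate 95% one-sided bound.
    """
    if total_pages <= 0 or not 0 <= visible_pages <= total_pages:
        raise ValueError("visible_pages must lie within [0, total_pages]")
    if not 0 < pages_read <= total_pages:
        raise ValueError("pages_read must lie within (0, total_pages]")

    # Expected value: the table-wide all-visible fraction.
    p = visible_pages / total_pages
    if total_pages == 1:
        return p

    # Variance of a sample proportion when sampling without replacement:
    # p(1-p)/n scaled by the finite-population correction (N-n)/(N-1).
    fpc = (total_pages - pages_read) / (total_pages - 1)
    std_err = math.sqrt(p * (1.0 - p) / pages_read * fpc)

    # Shift the estimate toward the pessimistic side and clamp at zero.
    return max(0.0, p - z * std_err)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, pessimistic_visible_fraction(1000, 900, n))
```

Under these assumptions the bound equals the table-wide fraction when
the whole table is read and drops as fewer pages are read, so it derates
small scans more heavily, with the size of the derating tied to a chosen
percentile rather than to an arbitrary linear factor.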


-- 
greg

