Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Improving N-Distinct estimation by ANALYZE
Date
Msg-id 25000.1136570880@sss.pgh.pa.us
In response to Re: Improving N-Distinct estimation by ANALYZE  ("Jim C. Nasby" <jnasby@pervasive.com>)
List pgsql-hackers
"Jim C. Nasby" <jnasby@pervasive.com> writes:
> Before we start debating merits of proposals based on random reads, can
> someone confirm that the sampling code actually does read randomly?

Well, it's not so much that it's not "random", as that it's not
sequential --- it skips blocks, and therefore you'd expect that
kernel-level read-ahead would not kick in, or at least not be very
effective.

If there weren't much else going on, though, you could still assume
that you'd be paying less seek cost than for a genuinely random-order
fetch of the same number of blocks.
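
To make the distinction concrete, here is a rough sketch (not the actual
ANALYZE sampling code) that compares the two access patterns. It rests on
several assumptions: block-number distance stands in for seek cost, the
table size and sample size are made-up figures, and sample_blocks() is a
hypothetical uniform sampler rather than anything in PostgreSQL.

```python
import random


def total_seek_distance(blocks):
    """Sum of absolute block-number jumps between consecutive reads.

    Assumption: distance in block numbers is a crude proxy for seek cost.
    """
    return sum(abs(b - a) for a, b in zip(blocks, blocks[1:]))


def sample_blocks(total_blocks, sample_size, rng):
    """Hypothetical stand-in for the sampler: distinct blocks, uniform."""
    return rng.sample(range(total_blocks), sample_size)


def count_adjacent(blocks):
    """Number of reads that hit the block right after the previous one,
    i.e. the only reads that sequential read-ahead could help with."""
    return sum(1 for a, b in zip(blocks, blocks[1:]) if b == a + 1)


rng = random.Random(0)            # fixed seed so runs are repeatable
TOTAL_BLOCKS = 1_000_000          # illustrative table size, in blocks
SAMPLE_SIZE = 3000                # illustrative number of sampled blocks

chosen = sample_blocks(TOTAL_BLOCKS, SAMPLE_SIZE, rng)

shuffled = list(chosen)           # genuinely random-order fetching
ascending = sorted(chosen)        # skip-forward order: never seeks backward

print("adjacent reads (ascending):", count_adjacent(ascending))
print("seek distance (ascending): ", total_seek_distance(ascending))
print("seek distance (random):    ", total_seek_distance(shuffled))
```

In ascending order the total distance can never exceed the span of the
table, whereas in random order each read may jump across a large part of
it. Very few reads land on the block right after the previous one, which
is why read-ahead would not be expected to help much. None of this models
concurrent I/O or real disk geometry, so it says nothing about actual
timings.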

Not sure how these effects would add up.  I agree that some
investigation would be wise before making any claims about how
expensive the current method actually is.
        regards, tom lane
