Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Improving N-Distinct estimation by ANALYZE
Date
Msg-id 87hd8bzqu5.fsf@stark.xeocode.com
In response to Re: Improving N-Distinct estimation by ANALYZE  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:

> On Mon, 2006-01-09 at 22:08 -0500, Greg Stark wrote:
> 
> > So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
> > It seems 32k readahead is the default for Linux, or perhaps it's the
> > sequential access pattern that's triggering it.
> 
> Nah, Linux 2.6 uses flexible readahead logic. It increases slowly when
> you read sequentially, but halves the readahead if you do another access
> type. Can't see that would give an average readahead size of 32k.

I actually read this code at one point in the past. IIRC the readahead is
capped at 32k, which I find interesting given the results. Since this test
uses a sequential access pattern, perhaps what's happening is that the
kernel's test for when to read ahead is too liberal.

All the numbers I'm getting are consistent with a 32k readahead. Even if I run
my program with a 4k block size, I get performance equivalent to a full table
scan very quickly. If I use a 32k block size then the breakeven point is just
over 20%.
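
For anyone who wants to try a similar experiment, here is a minimal sketch in
Python. It is not the test program discussed above; the function names, the
32k default, and the model are assumptions made for illustration. The first
function estimates what fraction of readahead windows get touched if the
kernel always reads whole, aligned windows and blocks are sampled
independently. The second reads a random subset of blocks in ascending offset
order and times it.

```python
import os
import random
import time


def touched_window_fraction(sample_fraction, block_size, readahead=32 * 1024):
    """Expected fraction of readahead windows hit by at least one sampled block.

    Simplified model (an assumption, not measured kernel behaviour): every
    read pulls in one whole, aligned readahead window, and each block is
    sampled independently with probability sample_fraction.
    """
    blocks_per_window = max(1, readahead // block_size)
    return 1.0 - (1.0 - sample_fraction) ** blocks_per_window


def read_sample(path, block_size, sample_fraction, seed=0):
    """Read a random subset of blocks in ascending offset order.

    Returns (blocks_read, elapsed_seconds). The OS page cache is not
    bypassed, so repeated runs on the same file will look faster.
    """
    total_blocks = os.path.getsize(path) // block_size
    wanted = round(total_blocks * sample_fraction)
    rng = random.Random(seed)
    # Sorting the chosen block numbers keeps the access pattern forward-only.
    chosen = sorted(rng.sample(range(total_blocks), wanted))
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for block_no in chosen:
            f.seek(block_no * block_size)
            f.read(block_size)
    return len(chosen), time.perf_counter() - start
```

Under this simplified model, sampling 20% of the blocks at a 4k block size
touches about 83% of the 32k windows (1 - 0.8^8), while at a 32k block size it
touches only 20%. Timings from read_sample are only meaningful against a cold
cache and on a file much larger than RAM.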

I suppose what I really ought to do is make some pretty graphs.

-- 
greg


