Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: Improving N-Distinct estimation by ANALYZE
Date	January 10, 2006 18:00:17
Msg-id	87hd8bzqu5.fsf@stark.xeocode.com
In response to	Re: Improving N-Distinct estimation by ANALYZE (Simon Riggs <simon@2ndquadrant.com>)
List	pgsql-hackers

Simon Riggs <simon@2ndquadrant.com> writes:

> On Mon, 2006-01-09 at 22:08 -0500, Greg Stark wrote:
> 
> > So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
> > It seems 32k readahead is the default for Linux, or perhaps it's the
> > sequential access pattern that's triggering it.
> 
> Nah, Linux 2.6 uses flexible readahead logic. It increases slowly when
> you read sequentially, but halves the readahead if you do another access
> type. Can't see that would give an average readahead size of 32k.

I've actually read this code at one point in the past. IIRC the readahead is
capped at 32k, which I find interesting given the results. Since this is
testing sequential access patterns perhaps what's happening is the test for
readahead is too liberal.  

All the numbers I'm getting are consistent with a 32k readahead. Even if I run
my program with a 4k block size, I get performance equivalent to a full table
scan very quickly. If I use a 32k block size then the breakeven point is just
over 20%.
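
To illustrate why a fixed 32k readahead would make small-block sampling converge on full-scan cost so quickly, here is a rough sketch. It is not the test program discussed above. It assumes blocks are sampled independently at a given fraction, that the file is divided into aligned 32k windows, and that touching any block in a window causes the kernel to read the whole window. The function name and parameters are made up for illustration, and the model ignores seek costs.

```python
def fraction_of_windows_touched(sample_fraction, block_size, readahead=32 * 1024):
    """Expected fraction of readahead-sized windows hit by a random block sample.

    Assumption (hypothetical model): each block is sampled independently with
    probability sample_fraction, and any hit pulls in the whole aligned window.
    """
    # Number of sample-sized blocks that fit in one readahead window.
    blocks_per_window = max(1, readahead // block_size)
    # A window is skipped only if every block inside it is skipped.
    return 1.0 - (1.0 - sample_fraction) ** blocks_per_window


if __name__ == "__main__":
    print("block_size  sample  fraction_of_file_read")
    for block_size in (4096, 8192, 32768):
        for p in (0.05, 0.10, 0.20):
            frac = fraction_of_windows_touched(p, block_size)
            print(f"{block_size:>10}  {p:>6.2f}  {frac:.3f}")
```

Under these assumptions, a 4k block size puts eight blocks in each window, so even a modest sample fraction touches most windows, while at a 32k block size the fraction of the file read is simply the sample fraction.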

I suppose what I really ought to do is make some pretty graphs.

-- 
greg
