Re: Improving N-Distinct estimation by ANALYZE - Mailing list pgsql-hackers

From	Greg Stark
Subject	Re: Improving N-Distinct estimation by ANALYZE
Date	January 10, 2006 02:08:52
Msg-id	874q4c230p.fsf@stark.xeocode.com
In response to	Re: Improving N-Distinct estimation by ANALYZE (Greg Stark <gsstark@mit.edu>)
Responses	Re: Improving N-Distinct estimation by ANALYZE
List	pgsql-hackers

Greg Stark <gsstark@MIT.EDU> writes:

> Well my theory was sort of half right. It has nothing to do with fooling Linux
> into thinking it's a sequential read. Apparently this filesystem was created
> with 32k blocks. I don't remember if that was intentional or if ext2/3 did it
> automatically based on the size of the filesystem.
> 
> So it doesn't have wide-ranging implications for Postgres's default 8k block
> size. But it is a good lesson about the importance of not using a larger
> filesystem block than Postgres's block size. The net effect is that if the
> filesystem block is N*8k then your random_page_cost goes up by a factor of N.
> That could be devastating for OLTP performance.

Hm, apparently I spoke too soon. tune2fs says the block size is in fact 4k.
Yet the block reading test program, run with 4k or 8k blocks, performs as if
Linux is reading 32k blocks. And in fact when I run it with 32k blocks I get
the kind of behaviour we were expecting, where the breakeven point is around
20%.

So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
It seems 32k readahead is the default for Linux, or perhaps it's the
sequential access pattern that's triggering it.

I'm trying to test it with posix_fadvise() set to random access but I'm having
trouble compiling. Do I need a special #define to get posix_fadvise from
glibc?
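
A minimal sketch of the kind of test described above, under stated
assumptions. On glibc, posix_fadvise() is exposed when a feature-test macro
such as _XOPEN_SOURCE >= 600 (or _POSIX_C_SOURCE >= 200112L) is defined
before the first #include. The helper name read_random_blocks() and its
parameters are hypothetical and are not taken from the actual test program.

```c
/* Must come before any #include so glibc declares posix_fadvise(). */
#define _XOPEN_SOURCE 600

#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: hint random access to the kernel, then read `nreads`
 * blocks of `block_size` bytes at random block-aligned offsets.
 * Returns the total number of bytes read, or -1 on error.
 */
long read_random_blocks(const char *path, size_t block_size, int nreads)
{
    assert(block_size > 0 && nreads >= 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Offset 0 with length 0 covers the whole file. posix_fadvise()
     * returns an error number directly rather than setting errno. */
    if (posix_fadvise(fd, 0, 0, POSIX_FADV_RANDOM) != 0) {
        close(fd);
        return -1;
    }

    off_t size = lseek(fd, 0, SEEK_END);
    off_t nblocks = (size < 0) ? 0 : size / (off_t) block_size;
    char *buf = malloc(block_size);
    if (buf == NULL || nblocks == 0) {
        free(buf);
        close(fd);
        return -1;
    }

    long total = 0;
    for (int i = 0; i < nreads; i++) {
        /* Pick a random block and read it with an explicit offset. */
        off_t blk = (off_t) (rand() % nblocks);
        ssize_t n = pread(fd, buf, block_size, blk * (off_t) block_size);
        if (n < 0) {
            total = -1;
            break;
        }
        total += n;
    }

    free(buf);
    close(fd);
    return total;
}
```

Timing calls such as read_random_blocks(path, 8192, n) with and without the
posix_fadvise() line would be one way to see whether the 32k reads come from
readahead rather than from the filesystem block size.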

-- 
greg
