Gavin Sherry wrote:
> On Mon, 5 Mar 2007, Mark Kirkwood wrote:
>
>> To add a little to this - forgetting the scan resistant point for the
>> moment... cranking down shared_buffers to be smaller than the L2 cache
>> seems to help *any* sequential scan immensely, even on quite modest HW:
>>
> (snipped)
>> When I've profiled this activity, I've seen a lot of time spent
>> searching for/allocating a new buffer for each page being fetched.
>> Obviously having less of them to search through will help, but having
>> less than the L2 cache-size worth of 'em seems to help a whole lot!
>
> Could you demonstrate that point by showing us timings for shared_buffers
> sizes from 512K up to, say, 2 MB? The two numbers you give there might
> just have to do with managing a large buffer.
Yeah - good point:
PIII 1.26 Ghz 512Kb L2 cache 2G RAM
Test is elapsed time for: SELECT count(*) FROM lineitem
lineitem has 1535724 pages (11997 MB)
Shared Buffers Elapsed IO rate (from vmstat)
-------------- ------- ---------------------
400MB 101 s 122 MB/s
2MB 100 s
1MB 97 s
768KB 93 s
512KB 86 s
256KB 77 s
128KB 74 s 166 MB/s
I've added the observed IO rate for the two extreme cases (the rest can
be pretty much deduced via interpolation).
Note that the system will do about 220 MB/s with the now (in)famous dd
test, so we have a bit of headroom (not too bad for a PIII).
Cheers
Mark