Introducing a new linux readahead framework - Mailing list pgsql-performance

From Wu Fengguang
Subject Introducing a new linux readahead framework
Date
Msg-id 20060420020853.GA4979@mail.ustc.edu.cn
List pgsql-performance
Greetings,

I'd like to introduce a new readahead framework of the linux kernel:
http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1021.html

HOW IT WORKS

In adaptive readahead, the context-based method may be of particular
interest to postgresql users. It works by peeking into the file cache
and checking whether any history pages are present or have been accessed.
In this way it can detect almost all forms of sequential / semi-sequential
read patterns, e.g.
    - parallel / interleaved sequential scans on one file
    - sequential reads across file open/close
    - mixed sequential / random accesses
    - sparse / skimming sequential read

It also has methods to detect some less common cases:
    - reading backward
    - seeking all over the file, reading N pages at each position
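
To make the context-based idea concrete, here is a toy simulation in
Python. It is only a sketch under simplifying assumptions, not the
kernel's actual algorithm: the page cache is a plain set of page
numbers, "history" is the run of cached pages immediately before the
page being read, and the window rule (twice the history, capped at
max_window) is invented for illustration. The names context_readahead
and simulate are hypothetical.

```python
def context_readahead(cache, page, max_window=32):
    """Return the pages a toy context-based readahead would prefetch.

    Counts how many pages immediately before `page` are already cached
    (the "history pages"). No history means the read looks random, so
    nothing is prefetched. The 2x growth rule is an assumption.
    """
    history = 0
    while history < max_window and (page - 1 - history) in cache:
        history += 1
    if history == 0:
        return []
    window = min(2 * history, max_window)
    return list(range(page + 1, page + 1 + window))


def simulate(reads, max_window=32):
    """Replay a sequence of page reads and return the number of cache misses."""
    cache = set()
    misses = 0
    for page in reads:
        if page not in cache:
            misses += 1
            cache.add(page)
        cache.update(context_readahead(cache, page, max_window))
    return misses


if __name__ == "__main__":
    # Two sequential scans of the same file, interleaved read by read.
    interleaved = []
    for i in range(100):
        interleaved += [i, 1000 + i]
    print("interleaved scans, misses:", simulate(interleaved))
    print("scattered reads, misses:", simulate([0, 50, 100, 150]))
```

Because the decision depends only on what is already in the cache, and
not on a single "last read position" per file, each of the two
interleaved streams is recognized as sequential on its own. In this toy
model only the first two reads of each stream miss the cache.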

WAYS TO BENEFIT FROM IT

As we know, postgresql relies on the kernel to do proper readahead.
The adaptive readahead might help performance in the following cases:
    - concurrent sequential scans
    - sequential scan on a fragmented table
      (some DBs suffer from this problem; not sure about pgsql)
    - index scan with clustered matches
    - index scan matching the majority of rows (in case the planner goes wrong)

TUNABLE PARAMETERS

There are two parameters which are described in this email:
http://www.ussg.iu.edu/hypermail/linux/kernel/0603.2/1024.html

Here are guidelines oriented more toward postgresql users:

- /proc/sys/vm/readahead_ratio
Since most DB servers have plenty of memory, the danger of readahead
thrashing is close to zero. In this case, you can set readahead_ratio to
100 (or even 200 :), which helps the readahead window scale up rapidly.

- /proc/sys/vm/readahead_hit_rate
Sparse sequential reads are read patterns like {0, 2, 4, 5, 8, 11, ...}.
In this case we might prefer to do readahead to get good I/O performance,
at the cost of reading some useless pages. If you prefer not to do so,
setting readahead_hit_rate to 1 will disable this feature.

- /sys/block/sd<X>/queue/read_ahead_kb
Set it to a large value (e.g. 4096), as you would have done before.
RAID users might want to use a bigger number.
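
The settings above could be applied with a small script such as the
following sketch. The paths and values are the ones given in this
email; the helper names plan_settings and apply_settings are
hypothetical, and the device name "sda" is only an example. The two
/proc/sys/vm files are assumed to exist only on a kernel built with the
adaptive readahead patch, so missing files are reported and skipped.
Writing requires root, and the script defaults to a dry run.

```python
from pathlib import Path


def plan_settings(dev, disable_sparse=False):
    """Build a {path: value} mapping from the guidelines in this email.

    `dev` is a block device name such as "sda" (an example, not a default
    taken from the patch).
    """
    plan = {
        "/proc/sys/vm/readahead_ratio": 100,
        f"/sys/block/{dev}/queue/read_ahead_kb": 4096,
    }
    if disable_sparse:
        # Per the text, a value of 1 disables readahead for sparse reads.
        plan["/proc/sys/vm/readahead_hit_rate"] = 1
    return plan


def apply_settings(plan, dry_run=True):
    """Apply a plan and return a list of (path, status) tuples.

    Status is "missing" (file absent, e.g. unpatched kernel),
    "would write" (dry run), or "written".
    """
    results = []
    for path, value in plan.items():
        target = Path(path)
        if not target.exists():
            results.append((path, "missing"))
        elif dry_run:
            results.append((path, "would write"))
        else:
            target.write_text(f"{value}\n")
            results.append((path, "written"))
    return results


if __name__ == "__main__":
    for path, status in apply_settings(plan_settings("sda")):
        print(f"{status:12} {path}")
```

Run it once as a dry run to see which files are present, then call
apply_settings(plan, dry_run=False) as root to write the values.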

TRYING IT OUT

The latest patch for stable kernels can be downloaded here:
http://www.vanheusden.com/ara/

Before compiling, make sure that the following options are enabled:
Processor type and features -> Adaptive file readahead
Processor type and features ->   Readahead debug and accounting


Advice on fine-tuning the patch is welcome.
Comments and benchmarking results are highly appreciated.

Thanks,
Wu
