On Thu, Nov 14, 2013 at 9:09 AM, KONDO Mitsumasa
<kondo.mitsumasa@lab.ntt.co.jp> wrote:
> I create a patch that is improvement of disk-read and OS file caches. It can
> optimize kernel readahead parameter using buffer access strategy and
> posix_fadvice() in various disk-read situations.
>
> In general OS, readahead parameter was dynamically decided by disk-read
> situations. If long time disk-read was happened, readahead parameter becomes big.
> However it is based on experienced or heuristic algorithm, it causes waste
> disk-read and throws out useful OS file caches in some case. It is bad for
> disk-read performance a lot.
It would be relevant to know which kernel did you use for those tests.
@@ -677,6 +677,7 @@ mdread(SMgrRelation reln, ForkNumber forknum,
BlockNumber blocknum, errmsg("could not seek to block %u in file \"%s\": %m",
blocknum,FilePathName(v->mdfd_vfd))));
+ BufferHintIOAdvise(v->mdfd_vfd, buffer, BLCKSZ, strategy); nbytes = FileRead(v->mdfd_vfd, buffer, BLCKSZ);
TRACE_POSTGRESQL_SMGR_MD_READ_DONE(forknum, blocknum,
A while back, I tried to use posix_fadvise to prefetch index pages. I
ended up finding out that interleaving posix_fadvise with I/O like
that severly hinders (ie: completely disables) the kernel's read-ahead
algorithm.
How exactly did you set up those benchmarks? pg_bench defaults?
pg_bench does not exercise heavy sequential access patterns, or long
index scans. It performs many single-page index lookups per
transaction and that's it. You may want to try your patch with more
real workloads, and maybe you'll confirm what I found out last time I
messed with posix_fadvise. If my experience is still relevant, those
patterns will have suffered a severe performance penalty with this
patch, because it will disable kernel read-ahead on sequential index
access. It may still work for sequential heap scans, because the
access strategy will tell the kernel to do read-ahead, but many other
access methods will suffer.
Try OLAP-style queries.