> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Hm. The theory about simple sequential reads is that we expect the
> >> kernel to optimize the disk access, since it'll recognize that we are
> >> doing sequential access to the table file and do read-aheads. Or that's
> >> the theory, anyway.
>
> > If it is Linux, they turn off read-ahead on the first fseek() and it
> > never gets turned on again, or so I am told. And because we have
> > virtual file descriptors, that table remains open after random access
> > and the readahead bit doesn't get set for the next sequential scan.
>
> Ugh. And even if we hacked the VFD code to close/reopen the file, the
> shared disk buffers might still have some entries for some blocks of
> the file, causing those blocks not to be requested during the seq scan,
> thus disabling read-ahead again.
>
> It sounds like we really ought to try to get this Linux behavior fixed
> to work more like BSD (ie, some reasonably small number of consecutive
> reads turns on read-ahead). Red Hat guys, are you listening?
I hit them with this yesterday, and sent an email this morning.
The solution is for the kernel to throttle readahead based on the number of
cache hits from previous read-aheads: grow the readahead window on
sequential reads, and turn readahead off when a lookup fails to find the
requested block in the cache (i.e., the access pattern looks random). That
way readahead can also recover in cases where the app alternates reads
between two parts of a file.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026