On Mon, Jan 22, 2007 at 05:11:03PM -0600, Jim C. Nasby wrote:
> On Mon, Jan 22, 2007 at 12:17:39PM -0800, Ron Mayer wrote:
> > Gregory Stark wrote:
> > >
> > > Actually no. A while back I did experiments to see how fast reading a file
> > > sequentially was compared to reading the same file sequentially but skipping
> > > x% of the blocks randomly. The results were surprising (to me) and depressing.
> > > The breakeven point was about 7%. [...]
> > >
> > > The theory online was that as long as you're reading one page from each disk
> > > track you're going to pay the same seek overhead as reading the entire track.
> >
> > Could one take advantage of this observation in designing the DSM?
> >
> > Instead of a separate bit representing every page, having each bit
> > represent 20 or so pages might be a more useful unit. It sounds
> > like the time spent reading would be similar; while the bitmap
> > would be significantly smaller.
>
> If we extended relations by more than one page at a time we'd probably
> have a better shot at the blocks on disk being contiguous and all read
> at the same time by the OS.
> --
> Jim Nasby jim@nasby.net
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>
Yes, most OSes do some read-ahead when reading a file from disk, so any
extension increment greater than one page would be an improvement. If
you used a counter with a time-based decay, you could scale the amount
by which the relation is extended based on temporal proximity: if the
relation has been extended several times recently, increase the size of
the next extension to reduce the overhead even further. The default
should be approximately the OS's standard read-ahead amount.
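
As a rough illustration only (this is not PostgreSQL code; the function
names, constants, and decay policy are all assumptions), a heuristic
along those lines could look something like this in C:

    #include <time.h>

    /*
     * Hypothetical sketch: pick how many pages to extend a relation by,
     * using a counter that decays while the relation sits idle.  Frequent
     * recent extensions push the counter up, so the next extension is
     * larger; after a quiet period the counter decays back down and the
     * increment returns to the default.
     */
    #define DEFAULT_EXTEND_PAGES  32    /* assumed OS read-ahead-sized default */
    #define MAX_EXTEND_PAGES      512   /* cap on a single extension */
    #define DECAY_INTERVAL_SECS   10    /* halve the counter per idle interval */

    typedef struct ExtendState
    {
        double  counter;        /* decayed count of recent extensions */
        time_t  last_extend;    /* time of the previous extension */
    } ExtendState;

    static int
    choose_extend_pages(ExtendState *st)
    {
        time_t  now = time(NULL);
        double  idle = difftime(now, st->last_extend);
        int     pages;

        /* Exponentially decay the counter based on idle time. */
        while (idle >= DECAY_INTERVAL_SECS && st->counter > 0.0)
        {
            st->counter /= 2.0;
            idle -= DECAY_INTERVAL_SECS;
        }

        /* Record this extension. */
        st->counter += 1.0;
        st->last_extend = now;

        /* Scale the extension size with the recent-extension counter. */
        pages = DEFAULT_EXTEND_PAGES * (int) st->counter;
        if (pages > MAX_EXTEND_PAGES)
            pages = MAX_EXTEND_PAGES;
        if (pages < DEFAULT_EXTEND_PAGES)
            pages = DEFAULT_EXTEND_PAGES;

        return pages;
    }

The point of the sketch is just that a burst of extensions ramps the
increment up quickly, while a table that has gone quiet falls back to a
read-ahead-sized default.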
Ken