Re: O_DIRECT use - Mailing list pgsql-hackers

From Matthew Kirkwood
Subject Re: O_DIRECT use
Msg-id Pine.LNX.4.33.0201050011500.16532-100000@sphinx.mythic-beasts.com
In response to Re: O_DIRECT use  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
On Fri, 4 Jan 2002, Bruce Momjian wrote:

> > >> For that matter, I would expect that O_DIRECT also defeats readahead,
> > >> so I'd fully expect it to be a loser for seqscans too.

> And this for FreeBSD 4.4:

>    The O_DIRECT flag has been added to open(2) and fcntl(2). Specifying this
>    flag for open files will attempt to minimize the cache effects of reading
>    and writing.

This seems rather vague.  Can any FreeBSD person here say
whether the semantics are any stronger?

>     http://www.ukuug.org/events/linux2001/papers/html/AArcangeli-o_direct.html
>
> These later ones seem to indicate there isn't read-ahead, meaning we
> would have to do our own prefetches.  Eck.  I am unclear if that is
> true on all OS's.

The Linux O_DIRECT semantics are intended to be stricter.
In essence, the kernel _will not cache_ data read from
or written to such a file or device.
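
To make the Linux behaviour concrete, here is a minimal sketch of an
uncached read.  The helper name direct_read and the 4096-byte alignment
are assumptions for illustration only: the real alignment rules depend
on the kernel version and the filesystem, and some filesystems reject
O_DIRECT outright (typically with EINVAL).

```c
#define _GNU_SOURCE          /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Assumed alignment; real code should query the device or filesystem. */
#define DIRECT_ALIGN 4096

/*
 * Hypothetical helper: read up to `len` bytes from the start of `path`
 * with O_DIRECT, so the kernel does not cache the data.  `len` must be a
 * non-zero multiple of DIRECT_ALIGN.  Returns the number of bytes read,
 * or -1 with errno set.
 */
ssize_t direct_read(const char *path, void *out, size_t len)
{
    assert(path != NULL && out != NULL);
#ifndef O_DIRECT
    /* O_DIRECT is not in POSIX; report it as unsupported here. */
    (void)len;
    errno = ENOTSUP;
    return -1;
#else
    if (len == 0 || len % DIRECT_ALIGN != 0) {
        errno = EINVAL;
        return -1;
    }

    /* The user buffer itself must be aligned, so malloc() is not enough. */
    void *buf = NULL;
    int rc = posix_memalign(&buf, DIRECT_ALIGN, len);
    if (rc != 0) {
        errno = rc;
        return -1;
    }

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        int open_errno = errno;
        free(buf);
        errno = open_errno;
        return -1;
    }

    /* No kernel read-ahead happens here: we get what we ask for, no more. */
    ssize_t n = read(fd, buf, len);
    int read_errno = errno;
    if (n > 0)
        memcpy(out, buf, (size_t)n);

    close(fd);
    free(buf);
    errno = read_errno;
    return n;
#endif
}
```

A sequential scan built on something like this would have to issue its
own large or overlapping reads to make up for the missing read-ahead.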

The point of this, incidentally, was to be able to run
things like Oracle Parallel Server and other shared-disk
setups.  Its use as an "I don't need this cached"
mechanism is secondary, and rather sub-optimal, as seen
here; you disable software read-ahead and introduce
coherence issues with non-O_DIRECT openers of the file.
(I'm not sure of the precise Linux semantics of this,
but it's probably fair to say that you may as well
consider them undefined.)

Linux 2.4 has "madvise", but unfortunately no matching
"fadvise".  A quick Google implied that FreeBSD is in
the same boat.
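
For what it's worth, the madvise route only helps if the file is
mmap()ed rather than read().  A rough sketch of what that looks like,
with a made-up helper name (sum_file_sequential) and a trivial byte sum
standing in for the real work; the hint is purely advisory, so the
kernel is free to ignore it:

```c
#define _GNU_SOURCE          /* makes madvise() and MADV_* visible on glibc */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `path` read-only, tell the kernel the mapping
 * will be read sequentially, and sum its bytes into *sum.
 * Returns 0 on success, -1 on failure (including an empty file, which
 * cannot be mapped).
 */
int sum_file_sequential(const char *path, unsigned long *sum)
{
    assert(path != NULL && sum != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);               /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return -1;

    /* Advisory hint: read ahead aggressively, pages may be freed early. */
    (void)madvise(p, len, MADV_SEQUENTIAL);

    unsigned long total = 0;
    for (size_t i = 0; i < len; i++)
        total += p[i];

    munmap(p, len);
    *sum = total;
    return 0;
}
```

Unlike O_DIRECT, this keeps the page cache and read-ahead in play and
has no coherence problem with other openers of the file.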

Matthew.


