Thread: O_DIRECT, or madvise and/or posix_fadvise

O_DIRECT, or madvise and/or posix_fadvise

From: markwkm@gmail.com
I caught this thread about O_DIRECT on kerneltrap.org: http://kerneltrap.org/node/7563

It sounds like there is much to be gained here in terms of reducing
the number of user/kernel space copies in the operating system.  I got
the impression that posix_fadvise in the Linux kernel isn't as good as
it could be.  I noticed in xlog.c that the use of posix_fadvise is
disabled.  Maybe it's time to do some more experimenting and working
with the Linux kernel developers.  Or perhaps there is another OS that
would be better to experiment with?
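
For anyone who wants to experiment, here is a rough sketch of what bypassing the page cache looks like from user space on Linux. It is not taken from PostgreSQL; the helper name write_block_direct and the 4096-byte BLOCK_SIZE are assumptions made for illustration, and the required alignment really depends on the device and filesystem. The sketch falls back to ordinary buffered I/O when the filesystem refuses O_DIRECT.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT on Linux/glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* no O_DIRECT here: degrade to buffered I/O */
#endif

#define BLOCK_SIZE 4096         /* assumed alignment; device dependent */

/*
 * Hypothetical helper: write one zero-padded block to "path", asking the
 * kernel to skip the page cache where the filesystem allows it.
 * Returns 0 on success, -1 on failure.
 */
int write_block_direct(const char *path, const void *data, size_t len)
{
    void *buf = NULL;
    int fd;
    int rc = -1;

    assert(len <= BLOCK_SIZE);

    /* O_DIRECT generally needs the buffer, offset and length aligned. */
    if (posix_memalign(&buf, BLOCK_SIZE, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, 0, BLOCK_SIZE);
    memcpy(buf, data, len);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL)
        /* Filesystem rejected O_DIRECT: retry through the page cache. */
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd >= 0) {
        if (write(fd, buf, BLOCK_SIZE) == (ssize_t) BLOCK_SIZE)
            rc = 0;
        if (close(fd) != 0)
            rc = -1;
    }

    free(buf);
    return rc;
}
```

The aligned buffer is what lets the kernel transfer the data straight from user memory instead of copying it into the page cache first.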

Not sure where to start but do people think this is worth taking a stab at?

Regards,
Mark


Re: O_DIRECT, or madvise and/or posix_fadvise

From: Martijn van Oosterhout
On Thu, Jan 11, 2007 at 02:35:13PM -0800, markwkm@gmail.com wrote:
> I caught this thread about O_DIRECT on kerneltrap.org:
>  http://kerneltrap.org/node/7563
>
> It sounds like there is much to be gained here in terms of reducing
> the number of user/kernel space copies in the operating system.  I got
> the impression that posix_fadvise in the Linux kernel isn't as good as
> it could be.  I noticed in xlog.c that the use of posix_fadvise is
> disabled.  Maybe it's time to do some more experimenting and working
> with the Linux kernel developers.  Or perhaps there is another OS that
> would be better to experiment with?

Postgres doesn't use O_DIRECT and probably never will. The system is
designed to use the system cache, not bypass it.

What recent discussions have highlighted is the need to more accurately
control the flow of data to disk. Apparently, kernels currently try to
hold data back much longer than is useful.

Not that I'm volunteering to deal with this.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: O_DIRECT, or madvise and/or posix_fadvise

From: markwkm@gmail.com
On 1/12/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Thu, Jan 11, 2007 at 02:35:13PM -0800, markwkm@gmail.com wrote:
> > I caught this thread about O_DIRECT on kerneltrap.org:
> >  http://kerneltrap.org/node/7563
> >
> > It sounds like there is much to be gained here in terms of reducing
> > the number of user/kernel space copies in the operating system.  I got
> > the impression that posix_fadvise in the Linux kernel isn't as good as
> > it could be.  I noticed in xlog.c that the use of posix_fadvise is
> > disabled.  Maybe it's time to do some more experimenting and working
> > with the Linux kernel developers.  Or perhaps there is another OS that
> > would be better to experiment with?
>
> Postgres doesn't use O_DIRECT and probably never will. The system is
> designed to use the system cache, not bypass it.
>
> What recent discussions have highlighted is the need to more accurately
> control the flow of data to disk. Apparently, kernels currently try to
> hold data back much longer than is useful.

Right, so my understanding is that PostgreSQL needs to tell the OS,
via posix_fadvise, how it wants the flow of data controlled, and it
sounds like the Linux folks believe their implementation of
posix_fadvise needs some work.
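
To make that concrete, a minimal sketch of passing such hints with posix_fadvise might look like the following. The helper names hint_sequential and hint_dontneed are made up for illustration and are not what xlog.c does. How much the kernel actually does with each hint is implementation dependent, which is the point under discussion.

```c
#define _POSIX_C_SOURCE 200809L   /* for posix_fadvise and fdatasync */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: tell the kernel the whole file will be read
 * sequentially (len == 0 means "to end of file").
 * posix_fadvise returns 0 or an error number; it does not set errno.
 */
int hint_sequential(int fd)
{
    return posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}

/*
 * Hypothetical helper: tell the kernel we are done with a byte range so
 * its cached pages may be dropped. Dirty pages generally cannot be
 * discarded, so the data is flushed first.
 * Returns 0 on success or an error number.
 */
int hint_dontneed(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len >= 0);

    if (fdatasync(fd) != 0)
        return errno;
    return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}
```

These calls are advisory only: a return value of 0 means the hint was accepted, not that the kernel changed its caching or writeback behaviour.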

> Not that I'm volunteering to deal with this.
>
> Have a nice day,

Regards,
Mark