Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Mel Gorman
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Msg-id 20140120142703.GS4963@suse.de
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Dave Chinner <david@fromorbit.com>)
List pgsql-hackers
On Mon, Jan 20, 2014 at 10:51:41AM +1100, Dave Chinner wrote:
> On Sun, Jan 19, 2014 at 03:37:37AM +0200, Marti Raudsepp wrote:
> > On Wed, Jan 15, 2014 at 5:34 AM, Jim Nasby <jim@nasby.net> wrote:
> > > it's very common to create temporary file data that will never, ever, ever
> > > actually NEED to hit disk. Where I work being able to tell the kernel to
> > > avoid flushing those files unless the kernel thinks it's got better things
> > > to do with that memory would be EXTREMELY valuable
> > 
> > Windows has the FILE_ATTRIBUTE_TEMPORARY flag for this purpose.
> > 
> > ISTR that there was discussion about implementing something analogous
> > in Linux when ext4 got delayed allocation support, but I don't think
> > it got anywhere and I can't find the discussion now. I think the
> > proposed interface was to create and then unlink the file immediately,
> > which serves as a hint that the application doesn't care about
> > persistence.
> 
> You're thinking about O_TMPFILE, which is for making temp files that
> can't be seen in the filesystem namespace, not for preventing them
> from being written to disk.
> 
> I don't really like the idea of overloading a namespace directive to
> have special writeback connotations. What we are getting into the
> realm of here is generic user controlled allocation and writeback
> policy...
> 

Such overloading would be unwelcome. FWIW, I assumed this would be an
fadvise thing: initially, a hint that controls writeback for the whole
inode rather than per fd context, and that ignores the offset and length
parameters. Granted, someone will probably throw a fit about adding a
Linux-specific flag to the fadvise64 syscall. POSIX_FADV_NOREUSE is
currently unimplemented, and it could be argued that it could be used to
flag temporary files that have a different writeback policy, but it's not
clear if that matches the original intent of the POSIX flag.
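
For illustration only, here is a rough userspace sketch of how an
application might combine the create-then-unlink pattern with such an
fadvise hint. The helper name open_scratch_file is hypothetical, and
reading POSIX_FADV_NOREUSE as "temporary data, relax writeback" is an
assumption taken from the speculation above, not existing kernel
behaviour. The call itself is a real one and is accepted today.

```python
import os
import tempfile


def open_scratch_file(directory=None):
    """Create an unnamed scratch file and pass an fadvise hint for it.

    Hypothetical helper: returns (fd, advised), where `advised` says
    whether the kernel accepted the posix_fadvise() call.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    # Create-then-unlink: the file has no name in the namespace and its
    # blocks are released once the last descriptor is closed.
    os.unlink(path)

    advised = False
    try:
        # offset=0 and len=0 cover the whole file. Treating NOREUSE as a
        # "temporary file" writeback hint is an ASSUMPTION for this sketch;
        # the kernel is free to ignore the advice entirely.
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_NOREUSE)
        advised = True
    except OSError:
        # Advice is best-effort; the file is still perfectly usable.
        pass
    return fd, advised
```

The caller then uses the descriptor like any other file and closes it
when done. Nothing here stops the kernel from writing the data back, so
this only shows where such a hint would sit in application code.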

> > Postgres is far from being the only application that wants this; many
> > people resort to tmpfs because of this:
> > https://lwn.net/Articles/499410/
> 
> Yes, we covered the possibility of using tmpfs much earlier in the
> thread, and came to the conclusion that temp files can be larger
> than memory so tmpfs isn't the solution here. :)
> 

And swap IO patterns blow chunks because people rarely want to touch
that area of the code with a 50-foot pole. It gets filed under "if you're
swapping, you already lost".

-- 
Mel Gorman
SUSE Labs


