Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Msg-id 52D56DE1.6070009@vmware.com
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 01/14/2014 06:08 PM, Tom Lane wrote:
> Trond Myklebust <trondmy@gmail.com> writes:
>> On Jan 14, 2014, at 10:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> "Don't be aggressive" isn't good enough.  The prohibition on early write
>>> has to be absolute, because writing a dirty page before we've done
>>> whatever else we need to do results in a corrupt database.  It has to
>>> be treated like a write barrier.
>
>> Then why are you dirtying the page at all? It makes no sense to tell the
>> kernel “we’re changing this page in the pagecache, but we don’t want you
>> to change it on disk”: that’s not consistent with the function of a page
>> cache.
>
> As things currently stand, we dirty the page in our internal buffers,
> and we don't write it to the kernel until we've written and fsync'd the
> WAL data that needs to get to disk first.  The discussion here is about
> whether we could somehow avoid double-buffering between our internal
> buffers and the kernel page cache.
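
For readers following along from the kernel side, the ordering Tom describes can be sketched roughly as below. This is only an illustration, not PostgreSQL's actual code: the function names, the file descriptors and the record format are all made up for the example.

```python
import os

def _write_all(fd, data, offset=None):
    """Write all of data, retrying on short writes (hypothetical helper)."""
    view = memoryview(data)
    while len(view) > 0:
        if offset is None:
            n = os.write(fd, view)           # append at the current position
        else:
            n = os.pwrite(fd, view, offset)  # positioned write
            offset += n
        view = view[n:]

def flush_page_wal_first(wal_fd, data_fd, wal_record, page, page_offset):
    """Hand a dirty page to the kernel only after its WAL record is durable."""
    # Step 1: write the WAL record and force it to stable storage.
    _write_all(wal_fd, wal_record)
    os.fsync(wal_fd)
    # Step 2: only now may the kernel see (and later write back) the page.
    _write_all(data_fd, page, page_offset)
```

The point is simply that the kernel never sees the modified page until step 1 has completed, which is why the page is kept dirty in a private buffer in the meantime.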

To be honest, I think the impact of double buffering in real-life 
applications is greatly exaggerated. If you follow the usual guideline 
and configure shared_buffers to 25% of available RAM, at worst you're 
wasting 25% of RAM to double buffering. That's significant, but it's not 
the end of the world, and it's a problem that can be compensated for by 
simply buying more RAM.
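
To put a number on that worst case, here is a back-of-the-envelope sketch. It assumes a deliberately simple model in which every page in shared_buffers might also sit in the OS page cache and nothing else competes for memory; the function name and the cap for settings above half of RAM are my own additions for illustration.

```python
GiB = 1024 ** 3

def worst_case_duplicate_bytes(ram_bytes, shared_buffers_bytes):
    """Upper bound on bytes cached twice (shared_buffers and OS page cache).

    Simplified model: at most every shared_buffers page is duplicated, and
    the duplicates cannot exceed the RAM left over for the OS page cache.
    """
    if not 0 <= shared_buffers_bytes <= ram_bytes:
        raise ValueError("shared_buffers must fit in RAM")
    return min(shared_buffers_bytes, ram_bytes - shared_buffers_bytes)

# With 64 GiB of RAM and shared_buffers at 25%, the bound is 16 GiB.
wasted = worst_case_duplicate_bytes(64 * GiB, 16 * GiB)
```

In practice the overlap is usually smaller than this bound, since the two caches don't hold exactly the same pages.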

Of course, if someone can come up with an easy way to solve that, that'd 
be great, but if it means giving up other advantages that we get from 
relying on the OS page cache, then -1 from me. The usual response to 
"why don't you just use O_DIRECT?" is that it'd require reimplementing a 
lot of I/O infrastructure, but that misses an IMHO more important point: it 
would require setting shared_buffers a lot higher to get the same level 
of performance you get today. That has a number of problems:

1. It becomes a lot more important to tune shared_buffers correctly. Set 
it too low, and you're not taking advantage of all the RAM available. 
Set it too high, and you'll start swapping, totally killing performance. 
I can already hear consultants rubbing their hands, waiting for the rush 
of customers that will need expert help to determine the optimal 
shared_buffers setting.

2. Memory spent on the buffer cache can't be used for other things. For 
example, an index build can temporarily allocate several gigabytes of 
memory; if that memory is allocated to the shared buffer cache, it can't 
be used for that purpose. Yeah, we could change that, and allow 
borrowing pages from the shared buffer cache for other purposes, but 
that means more work and more code.

3. Memory used for the shared buffer cache can't be used by other 
processes (without swapping). It becomes a lot harder to be a good 
citizen on a system that's not entirely dedicated to PostgreSQL.

So not only would we need to re-implement I/O infrastructure, we'd also 
need to make memory management a lot smarter and a lot more flexible. 
We'd need a lot more information on what else is running on the system 
and how badly they need memory.

> I personally think there is no chance of using mmap for that; the
> semantics of mmap are pretty much dictated by POSIX and they don't work
> for this.

Agreed. It would be possible to use mmap() for pages that are not 
modified, though. When you're not modifying, you could mmap() the data 
you need, and bypass the PostgreSQL buffer cache that way. The 
interaction with the buffer cache becomes complicated, because you 
couldn't use the buffer cache's locks etc., and some pages might have a 
newer version in the buffer cache than on disk, but it might be doable.
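
As a rough illustration of the read-only idea, something like the following would map a data file and pull out one page without going through a private buffer. It is a sketch only: the function name is hypothetical, the 8192-byte page size assumes the default block size, and it deliberately ignores the hard part mentioned above, namely checking whether the buffer cache holds a newer copy of the page.

```python
import mmap

PAGE_SIZE = 8192  # assumption: default PostgreSQL block size

def read_page_mmap(path, page_no, page_size=PAGE_SIZE):
    """Return a copy of one page, read through a read-only mapping."""
    with open(path, "rb") as f:
        # ACCESS_READ gives a read-only mapping, so this code can never
        # dirty the page in the kernel's page cache.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            start = page_no * page_size
            if start + page_size > len(m):
                raise ValueError("page is beyond the end of the file")
            return bytes(m[start:start + page_size])
```

A real implementation would presumably keep the mapping around rather than re-creating it for every page, and would need some way to coordinate with the buffer cache's locks.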

- Heikki


