Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Dave Chinner
Subject Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
Date
Msg-id 20140114222352.GF3431@dastard
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Kevin Grittner <kgrittn@ymail.com>)
Responses Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Jonathan Corbet <corbet@lwn.net>)
Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Robert Haas <robertmhaas@gmail.com>)
Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance  (Jeremy Harris <jgh@wizmail.org>)
List pgsql-hackers
On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote:
> Robert Haas <robertmhaas@gmail.com> wrote:
> > Jan Kara <jack@suse.cz> wrote:
> >
> >> Just to get some idea about the sizes - how large are the
> >> checkpoints we are talking about that cause IO stalls?
> >
> > Big.
> 
> To quantify that, in a production setting we were seeing pauses of
> up to two minutes with shared_buffers set to 8GB and default dirty
                                                       ^^^^^^^^^^^^^
> page settings for Linux, on a machine with 256GB RAM and 512MB
  ^^^^^^^^^^^^^

There's your problem.

By default, background writeback doesn't start until 10% of memory
is dirtied, and on your machine that's 25GB of RAM. That's way too
high for your workload.
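
As a rough check of that figure, the sketch below computes where background writeback would start for a given ratio. It is a simplification: it assumes the ratio applies to total RAM, whereas the kernel applies it to dirtyable memory, which is somewhat smaller. The function name is illustrative only.

```python
GIB = 1024 ** 3

def background_threshold_bytes(total_ram_bytes, background_ratio_pct=10):
    """Approximate dirty-data level at which background writeback starts.

    Simplifying assumption: the ratio is applied to total RAM.
    """
    return total_ram_bytes * background_ratio_pct // 100

threshold = background_threshold_bytes(256 * GIB)
print(f"{threshold / GIB:.1f} GiB")  # 25.6 GiB
```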

It appears to me that we are seeing large memory machines much more
commonly in data centers - a couple of years ago 256GB RAM was only
seen in supercomputers. Hence machines of this size are moving from
the "tweaking settings for supercomputers is OK" class to the
"tweaking settings for enterprise servers is not OK" class....

Perhaps what we need to do is deprecate dirty_ratio and
dirty_background_ratio as the default values and move to the
byte-based values as the defaults and cap them appropriately.  e.g.
10/20% of RAM for small machines down to a couple of GB for large
machines....
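
One way to read that proposal is sketched below. The 1 GiB background cap and 2 GiB hard-limit cap are placeholder assumptions standing in for "a couple of GB"; they are not values the kernel uses, and the function name is hypothetical.

```python
GIB = 1024 ** 3

def capped_dirty_limits(total_ram_bytes,
                        background_pct=10, dirty_pct=20,
                        background_cap=1 * GIB, dirty_cap=2 * GIB):
    """Return (dirty_background_bytes, dirty_bytes): percent of RAM, capped."""
    background = min(total_ram_bytes * background_pct // 100, background_cap)
    dirty = min(total_ram_bytes * dirty_pct // 100, dirty_cap)
    return background, dirty

# Small machines keep the percentage; large machines hit the caps.
for ram_gib in (4, 256):
    bg, hard = capped_dirty_limits(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM: background={bg / GIB:.2f} GiB, "
          f"limit={hard / GIB:.2f} GiB")
```

With these assumed caps, a 4 GiB machine still gets 10%/20% of RAM, while a 256 GiB machine is held to the fixed byte values.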

> non-volatile cache on the RAID controller.  To eliminate stalls we
> had to drop shared_buffers to 2GB (to limit how many dirty pages
> could build up out-of-sight from the OS), spread checkpoints to 90%
> of allowed time (almost no gap between finishing one checkpoint and
> starting the next) and crank up the background writer so that no
> dirty page sat unwritten in PostgreSQL shared_buffers for more than
> 4 seconds. Less aggressive pushing to the OS resulted in the
> avalanche of writes I previously described, with the corresponding
> I/O stalls.  We approached that incrementally, and that's the point
> where stalls stopped occurring.  We did not adjust the OS
> thresholds for writing dirty pages, although I know of others who
> have had to do so.

Essentially, changing dirty_background_bytes, dirty_bytes and
dirty_expire_centisecs to be much smaller should make the kernel
start writeback much sooner and so you shouldn't have to limit the
amount of buffers the application has to prevent major
fsync-triggered stalls...
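
A minimal sketch of what such a change might look like follows. The kernel exposes these settings as vm.dirty_background_bytes, vm.dirty_bytes and vm.dirty_expire_centisecs; the helper name and the numbers are illustrative assumptions, not recommendations from this thread.

```python
MIB = 1024 ** 2

def render_sysctl(background_bytes, dirty_bytes, expire_centisecs):
    """Format the three writeback settings as sysctl.conf-style lines."""
    if background_bytes >= dirty_bytes:
        raise ValueError("background threshold should be below the hard limit")
    settings = {
        "vm.dirty_background_bytes": background_bytes,
        "vm.dirty_bytes": dirty_bytes,
        "vm.dirty_expire_centisecs": expire_centisecs,
    }
    return "\n".join(f"{name} = {value}" for name, value in settings.items())

# Illustrative values only; tune against the actual storage and workload.
print(render_sysctl(256 * MIB, 1024 * MIB, 1000))
```

The output could be placed in a sysctl configuration file and the effect on checkpoint stalls measured before settling on final values.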

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


