Re: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17) - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
Msg-id 20140121163608.GB5325@momjian.us
In response to Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)  (Mel Gorman <mgorman@suse.de>)
Responses Re[2]: [HACKERS] Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance (summary v2 2014-1-17)
List pgsql-hackers
On Fri, Jan 17, 2014 at 04:31:48PM +0000, Mel Gorman wrote:
> NUMA Optimisations
> ------------------
> 
> The primary one that showed up was zone_reclaim_mode. Enabling that parameter
> is a disaster for many workloads and apparently Postgres is one. It might
> be time to revisit leaving that thing disabled by default and explicitly
> requiring that NUMA-aware workloads that are correctly partitioned enable it.
> Otherwise NUMA considerations are not that much of a concern right now.

Here is a blog post about our recommendation to disable zone_reclaim_mode:
http://frosty-postgres.blogspot.com/2012/08/postgresql-numa-and-zone-reclaim-mode.html
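
For anyone who wants to check a machine quickly, here is a minimal
sketch that reads the setting from /proc.  The sysctl path is the
standard Linux one; the function names and the "None means unknown"
convention are only assumptions made for this illustration.

```python
from pathlib import Path

# Standard Linux location of the sysctl. The helper names below are
# illustrative and not part of any Postgres or kernel tooling.
ZONE_RECLAIM_PATH = "/proc/sys/vm/zone_reclaim_mode"


def zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current zone_reclaim_mode as an int, or None if unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        # Not Linux, file missing, no permission, or unexpected content.
        return None


def zone_reclaim_enabled(path=ZONE_RECLAIM_PATH):
    """True if any reclaim bit is set, False if the value is 0, None if unknown."""
    mode = zone_reclaim_mode(path)
    return None if mode is None else mode != 0


if __name__ == "__main__":
    state = zone_reclaim_enabled()
    if state is None:
        print("zone_reclaim_mode could not be read on this system")
    elif state:
        print("zone_reclaim_mode is enabled; consider setting vm.zone_reclaim_mode = 0")
    else:
        print("zone_reclaim_mode is disabled")
```

The path is a parameter so the check can be pointed at a test file
instead of the live sysctl.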

> Direct IO, buffered IO, double buffering and wishlists
> ------------------------------------------------------
>    6. Only writeback pages if explicitly synced. Postgres has strict write
>       ordering requirements. In the words of Tom Lane -- "As things currently
>       stand, we dirty the page in our internal buffers, and we don't write
>       it to the kernel until we've written and fsync'd the WAL data that
>       needs to get to disk first". mmap() would avoid double buffering but
>       it has no control about the write ordering which is a show-stopper.
>       As Andres Freund described;

What was not explicitly stated here is that the Postgres design takes
advantage of the double-buffering "feature": it writes to an in-memory
copy of the page while an unmodified copy still exists in the kernel
cache or on disk.  In the case of a crash, we rely on the fact that the
disk page is unchanged.  Certainly any design that requires the kernel
to manage two different copies of the same page is going to be
confusing.
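
To make the ordering concrete, here is a toy sketch of the idea, not
the actual Postgres buffer manager.  The class name, the record layout,
and the single "WAL flushed" flag are all assumptions made for the
example; the real system tracks a WAL position per page rather than one
global flag.

```python
import os


class TinyBufferPool:
    """Toy model (hypothetical): one data file, one WAL file, private page copies."""

    PAGE_SIZE = 8192

    def __init__(self, data_path, wal_path):
        self.data_fd = os.open(data_path, os.O_RDWR | os.O_CREAT, 0o600)
        self.wal_fd = os.open(
            wal_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600
        )
        self.dirty = {}          # page number -> modified private copy
        self.wal_flushed = True  # True when no WAL record awaits fsync

    def modify_page(self, page_no, payload):
        # Describe the change in the WAL first (written, not yet durable).
        record = (
            page_no.to_bytes(4, "big") + len(payload).to_bytes(4, "big") + payload
        )
        os.write(self.wal_fd, record)
        self.wal_flushed = False
        # Change only the private copy; the kernel/disk copy of the data
        # page stays unmodified, which is what crash recovery relies on.
        self.dirty[page_no] = payload[: self.PAGE_SIZE].ljust(self.PAGE_SIZE, b"\0")

    def flush_wal(self):
        os.fsync(self.wal_fd)
        self.wal_flushed = True

    def write_back(self, page_no):
        # Write-ahead rule: the WAL must be fsync'd before the data page
        # is handed to the kernel.
        if not self.wal_flushed:
            self.flush_wal()
        os.pwrite(self.data_fd, self.dirty.pop(page_no), page_no * self.PAGE_SIZE)

    def read_kernel_copy(self, page_no):
        # What the kernel (page cache or disk) currently holds for this page.
        return os.pread(self.data_fd, self.PAGE_SIZE, page_no * self.PAGE_SIZE)

    def close(self):
        os.close(self.data_fd)
        os.close(self.wal_fd)
```

Between modify_page() and write_back() there are two versions of the
page: the modified one in private memory and the old one the kernel
knows about.  With mmap() the store would land directly in the kernel's
copy, and the kernel could write it out before the WAL is on disk.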

A larger question is how many of the things Postgres needs are also
needed by other applications.  I doubt Postgres is large enough to
warrant changes on its own.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +


