Re: Linux kernel impact on PostgreSQL performance - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Linux kernel impact on PostgreSQL performance
Msg-id 52D475B6.3020305@agliodbs.com
In response to Linux kernel impact on PostgreSQL performance  (Mel Gorman <mgorman@suse.de>)
Responses Re: Linux kernel impact on PostgreSQL performance
Re: [Lsf-pc] Linux kernel impact on PostgreSQL performance
List pgsql-hackers
On 01/13/2014 02:26 PM, Mel Gorman wrote:
> Really?
> 
> zone_reclaim_mode is often a complete disaster unless the workload is
> partitioned to fit within NUMA nodes. On older kernels enabling it would
> sometimes cause massive stalls. I'm actually very surprised to hear it
> fixes anything and would be interested in hearing more about what sort
> of circumstances would convince you to enable that thing.

So the problem with the default setting is that it pretty much isolates
all FS cache for PostgreSQL to whichever socket the postmaster is
running on, and makes memory on the other sockets unavailable for FS
cache.  This means that, for example, if you have two memory banks, then
only one of them is available for PostgreSQL filesystem caching ...
essentially cutting your available cache in half.

And however slow moving cached pages between memory banks is, it's an
order of magnitude faster than moving them from disk.  But this isn't
how the NUMA stuff is configured; it seems to assume that it's less
expensive to get pages from disk than to move them between banks, so
whatever you've got cached on the other bank, it flushes it to disk as
fast as possible.  I understand the goal was to keep memory usage local
to the processor each process is running on, but that includes an
implicit assumption that no individual process will ever want more than
one memory bank's worth of cache.

So disabling all of the NUMA optimizations is the way to go for any
workload I personally deal with.
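
For anyone who wants to check what a given box is actually set to, here
is a minimal sketch that reads vm.zone_reclaim_mode and decodes its
bits.  The /proc path and the bit meanings follow the kernel's sysctl
documentation; the helper names (describe_zone_reclaim,
read_zone_reclaim) are made up for illustration and are not part of any
PostgreSQL or kernel tooling.

```python
from pathlib import Path

# Real sysctl location on Linux; the helper names below are hypothetical.
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")

# Bit meanings as given in the kernel's vm sysctl documentation.
FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim(value):
    """Return the names of the bits set in a zone_reclaim_mode value."""
    return [name for bit, name in FLAGS.items() if value & bit]


def read_zone_reclaim(path=ZONE_RECLAIM_PATH):
    """Read the current setting; raises OSError where the file is absent."""
    return int(Path(path).read_text().strip())


if __name__ == "__main__":
    try:
        mode = read_zone_reclaim()
    except OSError:
        print("zone_reclaim_mode is not available on this system")
    else:
        active = describe_zone_reclaim(mode)
        print(f"vm.zone_reclaim_mode = {mode}: {', '.join(active) or 'disabled'}")
```

A value of 0 means zone reclaim is off, which is presumably the state
being recommended above; it can be set at runtime with
"sysctl -w vm.zone_reclaim_mode=0" (as root).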

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com