Re: MusicBrainz postgres performance issues - Mailing list pgsql-performance

From Andres Freund
Subject Re: MusicBrainz postgres performance issues
Date
Msg-id 20150315224756.GE29732@awork2.anarazel.de
In response to Re: MusicBrainz postgres performance issues  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re: MusicBrainz postgres performance issues  ("michael@sqlexec.com" <michael@sqlexec.com>)
Re: MusicBrainz postgres performance issues  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-performance
On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote:
> Here's the problem with a large shared_buffers on a machine that's
> getting pushed into swap. It starts to swap BUFFERs. Once buffers
> start getting swapped you're not just losing performance, that huge
> shared_buffers is now working against you because what you THINK are
> buffers in RAM to make things faster are in fact blocks on a hard
> drive being swapped in and out during reads. It's the exact opposite
> of fast. :)

IMNSHO that's tackling things from the wrong end. If 12GB of shared
buffers drive your 48GB dedicated OLTP postgres server into swapping out
actively used pages, the problem isn't the 12GB of shared buffers, but
that you require so much memory for other things. That needs to be
fixed.

But! We haven't even established that swapping is an actual problem
here. The ~2GB of swapped out memory could just as well be the Java RAID
controller management monstrosity or something similar. Those pages
won't ever be touched again, so the RAM they free up is better spent
buffering IO.

You can check what's actually swapped out using:
grep ^VmSwap /proc/[0-9]*/status | grep -v '[[:space:]]0 kB'
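
To see which programs own the swapped-out pages, the same /proc data can be combined with the process name and sorted by size. This is only a sketch: the function name swap_by_process and its optional directory argument are made up for illustration (the argument exists so the function can be pointed at a copy of /proc for testing), and it assumes a Linux /proc layout with Name:, Pid: and VmSwap: lines in each status file.

```shell
# Print "<swap kB>\t<pid>\t<name>" for every process with non-zero VmSwap,
# largest first. The optional first argument (hypothetical, for testing)
# replaces /proc as the directory to scan.
swap_by_process() {
    proc_root="${1:-/proc}"
    for f in "$proc_root"/[0-9]*/status; do
        # Processes can exit between the glob and the read; skip those.
        [ -r "$f" ] || continue
        awk '
            /^Name:/   { name = $2 }   # first word of the name only
            /^Pid:/    { pid  = $2 }
            /^VmSwap:/ { swap = $2 }   # value is in kB
            # Kernel threads have no VmSwap line, so swap stays 0 for them.
            END { if (swap > 0) printf "%d\t%s\t%s\n", swap, pid, name }
        ' "$f" 2>/dev/null
    done | sort -rn
}
```

Calling swap_by_process with no argument scans the live /proc. If the top entries are management daemons rather than postgres backends, the swap usage is most likely harmless.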

For swapping to be actually harmful you need to have pages that are
regularly swapped in. vmstat will tell: watch the si (swap-in) column
over time.
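
A rough way to turn that vmstat check into a number is to average the si and so columns over a sampling window. The sketch below reads vmstat output on stdin; the function name swapin_summary is hypothetical, and it assumes the usual procps vmstat layout where the column header row starts with "r". The first data row is skipped because vmstat reports averages since boot there.

```shell
# Read `vmstat <interval> <count>` output on stdin and print the mean
# swap-in (si) and swap-out (so) rates over the sampled rows.
swapin_summary() {
    awk '
        # Column header row: find si and so by name instead of by position.
        $1 == "r" {
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        # Data rows start with a number; the "procs ---memory---" banner
        # does not, so it is ignored.
        si && $1 ~ /^[0-9]+$/ {
            if (++rows == 1) next      # first row is the since-boot average
            sum_si += $si
            sum_so += $so
        }
        END {
            if (rows > 1)
                printf "si=%.1f so=%.1f\n", sum_si / (rows - 1), sum_so / (rows - 1)
        }
    '
}
```

For example, `vmstat 5 13 | swapin_summary` samples for about a minute. An si value that stays near zero means the swapped-out pages are not being touched, however large the swap usage looks.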

In a concurrent OLTP workload (~450 established connections do suggest
that) with a fair amount of data, keeping the hot data set in
shared_buffers can significantly reduce problems. Constantly searching
for victim buffers isn't a nice thing, and that will happen if your most
frequently used data doesn't fit into s_b. On the other hand, if your
data set is so large that even the hottest part doesn't fit into memory
(perhaps because there is no hottest part, as there's no locality at all),
a smaller shared_buffers can make things more efficient, because the
search for replacement buffers is cheaper with a smaller shared_buffers
setting.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

