Re: MusicBrainz postgres performance issues - Mailing list pgsql-performance

From Tomas Vondra
Subject Re: MusicBrainz postgres performance issues
Date
Msg-id 550618C2.3030802@2ndquadrant.com
In response to Re: MusicBrainz postgres performance issues  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: MusicBrainz postgres performance issues
List pgsql-performance
On 15.3.2015 23:47, Andres Freund wrote:
> On 2015-03-15 12:25:07 -0600, Scott Marlowe wrote:
>> Here's the problem with a large shared_buffers on a machine that's
>> getting pushed into swap. It starts to swap BUFFERs. Once buffers
>> start getting swapped you're not just losing performance, that huge
>> shared_buffers is now working against you because what you THINK are
>> buffers in RAM to make things faster are in fact blocks on a hard
>> drive being swapped in and out during reads. It's the exact opposite
>> of fast. :)
>
> IMNSHO that's tackling things from the wrong end. If 12GB of shared
> buffers drive your 48GB dedicated OLTP postgres server into swapping
> out actively used pages, the problem isn't the 12GB of shared
> buffers, but that you require so much memory for other things. That
> needs to be fixed.

I second this opinion.

As was already pointed out, 500 connections is rather insane
(assuming the machine does not have hundreds of cores).
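
A quick way to see how many of those connections are actually doing
something at any given moment (assuming 9.2+, where pg_stat_activity
has the "state" column):

    SELECT state, count(*) FROM pg_stat_activity GROUP BY state;

If the vast majority shows up as 'idle', the pools are simply
oversized.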

If there is memory pressure, it's most likely because many queries
are performing memory-expensive operations at the same time (it might
even be a bad estimate causing a hashagg to use much more than
work_mem).
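
Just to put a number on it (purely illustrative, I don't know the
actual work_mem here): with work_mem = 64MB and 500 backends each
running a single sort or hash, that's 500 x 64MB = ~31GB of potential
allocations on top of shared_buffers - and a single complex query can
use several work_mem-sized chunks at once.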


> But! We haven't even established that swapping is an actual problem
> here. The ~2GB of swapped out memory could just as well be the java raid
> controller management monstrosity or something similar. Those pages
> won't ever be used and thus can better be used to buffer IO.
>
> You can check what's actually swapped out using:
> grep ^VmSwap /proc/[0-9]*/status|grep -v '0 kB'
>
> For swapping to be actually harmful you need to have pages that are
> regularly swapped in. vmstat will tell.

I've already asked for vmstat logs, so let's wait.
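
(What I'll be looking for there is basically

    vmstat 5

and whether the si/so columns stay non-zero over a longer period -
occasional spikes are harmless, sustained swap-in during the load
peaks is not.)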

> In a concurrent OLTP workload (~450 established connections do
> suggest that) with a fair amount of data keeping the hot data set in
> shared_buffers can significantly reduce problems. Constantly
> searching for victim buffers isn't a nice thing, and that will happen
> if your most frequently used data doesn't fit into s_b. On the other
> hand, if your data set is so large that even the hottest part doesn't
> fit into memory (perhaps because there's no hottest part as there's
> no locality at all), a smaller shared buffers can make things more
> efficient, because the search for replacement buffers is cheaper with
> a smaller shared buffers setting.
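
FWIW, whether the hot part of the data set actually fits into
shared_buffers is something the pg_buffercache extension can show -
a rough sketch (after CREATE EXTENSION pg_buffercache):

    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = pg_relation_filenode(c.oid)
       AND b.reldatabase IN
           (0, (SELECT oid FROM pg_database
                 WHERE datname = current_database()))
     GROUP BY c.relname
     ORDER BY 2 DESC
     LIMIT 10;

Comparing those buffer counts with the actual relation sizes gives a
rough idea how much of the frequently used data is really cached.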

I've met many systems with max_connections values this high, and most
of those connections were idle, because each application server had
its own connection pool. So the connections sit idle 90% of the time,
but at peak all the application servers want to do stuff at the same
time, and it all goes KABOOOM! just like here.
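
The usual fix is a single shared pooler in front of the database
(pgbouncer or similar) instead of a separate pool on each application
server. A minimal sketch of such a pgbouncer config - the database
name and the sizes are made up and would need tuning for the actual
workload:

    [databases]
    musicbrainz = host=127.0.0.1 port=5432 dbname=musicbrainz

    [pgbouncer]
    pool_mode = transaction
    max_client_conn = 500
    default_pool_size = 40

The application servers keep their hundreds of client connections,
but the database only ever sees a few dozen active backends.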


--
Tomas Vondra                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

