Re: MusicBrainz postgres performance issues - Mailing list pgsql-performance

From Scott Marlowe
Subject Re: MusicBrainz postgres performance issues
Date
Msg-id CAOR=d=1Axcbj2VcUO=5tM1nv6RjuA0D+av5+Mt7RTztgTxaMnA@mail.gmail.com
Whole thread Raw
In response to Re: MusicBrainz postgres performance issues  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re: MusicBrainz postgres performance issues
Re: MusicBrainz postgres performance issues
List pgsql-performance
On Sun, Mar 15, 2015 at 10:43 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Sun, Mar 15, 2015 at 7:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2015-03-15 13:07:25 +0100, Robert Kaye wrote:
>>>
>>> > On Mar 15, 2015, at 12:13 PM, Josh Krupka <jkrupka@gmail.com> wrote:
>>> >
>>> > It sounds like you've hit the postgres basics, what about some of the linux check list items?
>>> >
>>> > what does free -m show on your db server?
>>>
>>>              total       used       free     shared    buffers     cached
>>> Mem:         48295      31673      16622          0          5      12670
>>> -/+ buffers/cache:      18997      29298
>>> Swap:        22852       2382      20470
>>
>> Could you post /proc/meminfo instead? That gives a fair bit more
>> information.
>>
>> Also:
>> * What hardware is this running on?
>> * Why do you need 500 connections (that are nearly all used) when you
>>   have a pgbouncer in front of the database? That's not going to be
>>   efficient.
>> * Do you have any data tracking the state connections are in?
>>   I.e. whether they're idle or not? The connections graph on you linked
>>   doesn't give that information?
>> * You're apparently not graphing CPU usage. How busy are the CPUs? How
>>   much time is spent in the kernel (i.e. system)?
>
> htop is a great tool for watching the CPU cores live. Red == kernel btw.
>
>> * Consider installing perf (linux-utils-$something) and doing a
>>   systemwide profile.
>>
>> 3.2 isn't the greatest kernel around, efficiency wise. At some point you
>> might want to upgrade to something newer. I've seen remarkable
>> differences around this.
>
> That is an understatement. Here's a nice article on why it's borked:
>
> http://www.databasesoup.com/2014/09/why-you-need-to-avoid-linux-kernel-32.html
>
> Had a 32 core machine with big RAID BBU and 512GB memory that was
> dying using 3.2 kernel. went to 3.11 and it went from a load of 20 to
> 40 to a load of 5.
>
>> You really should upgrade postgres to a newer major version one of these
>> days. Especially 9.2. can give you a remarkable improvement in
>> performance with many connections in a read mostly workload.
>
> Agreed. ubuntu 12.04 with kernel 3.11/3.13 with pg 9.2 has been a
> great improvement over debian squeeze and pg 8.4 that we were running
> at work until recently.
>
> As for the OP. if you've got swap activity causing issues when there's
> plenty of free space just TURN IT OFF.
>
> swapoff -a
>
> I do this on all my big memory servers that don't really need swap,
> esp when I was using hte 3.2 kernel which seems broken as regards swap
> on bigger memory machines.

OK I've now read your blog post. A few pointers I'd make.

shared_mem of 12G is almost always too large. I'd drop it down to ~1G or so.

64MB work mem AND max_connections = 500 is a recipe for disaster. No
db can actively process 500 queries at once without going kaboom, ad
having 64MB work_mem means it will go kaboom long before it reaches
500 active connections. Lower that and let pgbouncer handle the extra
connections for you.

Get some monitoring installed if you don't already have it so you can
track memory usage, cpu usage, disk usage etc. Zabbix or Nagios work
well. Without some kind of system monitoring you're missing half the
information you need to troubleshoot with.

Install iotop, sysstat, and htop. Configure sysstat to collect data so
you can use sar to see what the machine's been doing in the past few
days etc. Set it to 1 minute intervals in the /etc/cron.d/sysstat
file.

Do whatever you have to to get kernel 3.11 or greater on that machine
(or a new one). You don't have to upgrade pg just yet but the upgrade
of the kernel is essential.

Good luck. Let us know what you find and if you can get that machine
back on its feet.

--
To understand recursion, one must first understand recursion.


pgsql-performance by date:

Previous
From: "Joshua D. Drake"
Date:
Subject: Re: MusicBrainz postgres performance issues
Next
From: Scott Marlowe
Date:
Subject: Re: MusicBrainz postgres performance issues