Re: Context switch storm - Mailing list pgsql-performance

From Andreas Kostyrka
Subject Re: Context switch storm
Date
Msg-id 20061114101311.GN8410@andi-lap.la.revver.com
In response to Re: Context switch storm  (Cosimo Streppone <cosimo@streppone.it>)
Responses Re: Context switch storm
List pgsql-performance
* Cosimo Streppone <cosimo@streppone.it> [061114 10:52]:
> Richard Huxton wrote:
> >Cosimo Streppone wrote:
> >>Richard Huxton wrote:
> >>
> >>>>The average context switching for this server as vmstat shows is 1
> >>>>but when the problem occurs it goes to 250000.
> >>>
> >>I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server
> >upgrade from 8.0.1 - the most recent is 8.0.9 iirc
> >[...]
> >Are you seeing a jump in context-switching in top? You'll know when you
> >do - it's a *large* jump. That's the key diagnosis. Otherwise it might
> >simply be your configuration settings aren't ideal for that workload.
>
> Sorry for the delay.
>
> I have logged vmstat results for the last 3 days.
> Max context switches figure is 20500.
>
> If I understand correctly, this does not mean a "storm",
Nope, 20500 is an order of magnitude too low compared to the storms we
were experiencing.
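
For anyone logging vmstat the same way, here is a minimal sketch of how the
peak "cs" figure could be pulled out of a saved log and compared against a
storm threshold. The function names and the 100000 cutoff are assumptions
made for illustration (the thread only gives 20500 as "not a storm" and
250000 as a storm), and the sample rows are made up apart from those cs
values.

```python
def max_context_switches(vmstat_output):
    """Return the largest value in the 'cs' column of vmstat text, or None."""
    cs_index = None
    peak = None
    for line in vmstat_output.splitlines():
        fields = line.split()
        if "cs" in fields:
            # Column header row: remember where the context-switch column is.
            cs_index = fields.index("cs")
            continue
        if cs_index is None or len(fields) <= cs_index:
            continue
        if not fields[cs_index].isdigit():
            continue
        value = int(fields[cs_index])
        peak = value if peak is None else max(peak, value)
    return peak


def looks_like_storm(peak_cs, threshold=100_000):
    """Assumed cutoff: well above 20500, well below the 250000 seen in storms."""
    return peak_cs is not None and peak_cs >= threshold


# Illustrative sample only, not real measurements.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 3  0      0 100000  20000 900000    0    0    10    50 1200  9800 40 10 48  2
 12 0      0  98000  20000 901000    0    0    12    80 1500 20500 70 20  8  2
"""

if __name__ == "__main__":
    peak = max_context_switches(SAMPLE)
    print(peak, looks_like_storm(peak))
```

A peak in the low tens of thousands points at plain overload; the storms show
up as a jump far beyond that.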

> but only that the 2 Xeons are overloaded.
> Probably, I can do a good thing switching off the HyperThreading.
> I get something like 12/15 *real* concurrent processes hitting
> the server.

Actually, for the storms we had, both the number of concurrent processes
AND the workload are important:

many processes that all do different things => overloaded server
many processes that all do the same queries => storm.

Basically, it seems that PostgreSQL's locking implementation does not
get along well with the Xeon memory subsystem. Googling around might
provide more details.

>
> I must say I lowered "shared_buffers" to 8192, as it was before.
> I tried raising it to 16384, but I can't seem to find a relationship
> between shared_buffers and performance level for this server.
>
> >Well, the client I saw it with just bought a dual-opteron server and used
> >their quad-Xeon for something else. However, I do remember that 8.1 seemed
> >better than 7.4 before they switched. Part of that might just have been
> >better query-planning and other efficiencies though.
>
> An upgrade to 8.1 is definitely the way to go.
> Any 8.0 - 8.1 migration advice?
Simple, there are basically two ways:
a) you can take downtime: pg_dump + restore
b) you cannot take downtime: install Slony, install your new 8.1
server, replicate into it, switch over to the new server.
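
As a rough sketch of option a), the dump and restore can be done as one
pipeline from the old server into the new one. The helper below only builds
the command line; the host names and database name are hypothetical, and it
assumes the target database already exists on the 8.1 server and that the
pg_dump binary being run is the 8.1 one (using the newer pg_dump against the
older server is the usual recommendation).

```python
import shlex


def dump_restore_pipeline(old_host, new_host, dbname,
                          old_port=5432, new_port=5432):
    """Build a 'pg_dump | psql' shell pipeline as a single string."""
    # Plain-text SQL dump read from the old server.
    dump = ["pg_dump", "-h", old_host, "-p", str(old_port), dbname]
    # Replay the SQL into the (already created) database on the new server.
    restore = ["psql", "-h", new_host, "-p", str(new_port), "-d", dbname]
    return " | ".join(
        " ".join(shlex.quote(arg) for arg in cmd) for cmd in (dump, restore)
    )


if __name__ == "__main__":
    # Hypothetical hosts and database name, for illustration only.
    print(dump_restore_pipeline("old-db", "new-db", "appdb"))
```

Applications have to stay off the old server while this runs, which is
exactly the downtime that option b) avoids.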

If you can get new hardware for the 8.1 box, you get several benefits:
a) order Opterons. That doesn't solve the overload problem as such,
but these pesky cs storms seem to have gone away this way.
(that was basically the "free" advice from an external consultant,
which luckily matched my ideas about what the problem could be. Cheap
solution at $3k :) )
b) you can still use the older box as a read-only replica.
c) you've got a hot backup of your db.

Andreas
