Re: Context switch storm - Mailing list pgsql-performance
From | Richard Huxton |
---|---|
Subject | Re: Context switch storm |
Date | |
Msg-id | 454B4438.9080105@archonet.com |
In response to | Context switch storm (creimer@brturbo.com.br) |
Responses | Re: Context switch storm (Andreas Kostyrka <andreas@kostyrka.org>); Re: Context switch storm (Cosimo Streppone <cosimo@streppone.it>) |
List | pgsql-performance |
Cosimo Streppone wrote:
> Richard Huxton wrote:
>
>> creimer@brturbo.com.br wrote:
>>>
>>> The average context switching for this server as vmstat shows is 1
>>> but when the problem occurs it goes to 250000.
>>
>> You'll tend to see it when you have multiple clients and most queries
>> can use RAM rather than disk I/O. My understanding of what happens is
>> that PG requests data from RAM - it's not in cache so the process gets
>> suspended to wait. The next process does the same, with the same
>> result. You end up with lots of processes all fighting over what
>> data is in the cache and no-one gets much work done.
>
> Does this happen also with 8.0, or is specific to 8.1 ?

All versions suffer to a degree - they just push the old Xeon in the
wrong way. However, more recent versions *should* be better than older
versions. I believe some work was put in to prevent contention on
various locks, which should reduce context-switching across the board.

> I seem to have the same exact behaviour for an OLTP-loaded 8.0.1 server

Upgrade from 8.0.1 - the most recent is 8.0.9 iirc.

> when I raise `shared_buffers' from 8192 to 40000.
> I would expect an increase in tps/concurrent clients, but I see an average
> performance below a certain threshold of users, and when concurrent users
> get above that level, performance starts to drop, no matter what I do.

Are you seeing a jump in context-switching in vmstat? You'll know when
you do - it's a *large* jump. That's the key diagnosis. Otherwise it
might simply be that your configuration settings aren't ideal for that
workload.

> Server logs and io/vm statistics seem to indicate that there is little
> or no disk activity but machine loads increases to 7.0/8.0.
> After some minutes, the problem goes away, and performance returns
> to acceptable levels.

That sounds like it. Query time increases across the board as all the
clients fail to get any data back.

> When the load increases, *random* database queries show this "slowness",
> even if they are perfectly planned and indexed.
>
> Is there anything we can do?

Well, the client I saw it with just bought a dual-Opteron server and
used their quad-Xeon for something else. However, I do remember that 8.1
seemed better than 7.4 before they switched. Part of that might just
have been better query-planning and other efficiencies though.

--
   Richard Huxton
   Archonet Ltd
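
If you'd rather have a script flag the jump than watch vmstat by eye, here is a rough sketch, assuming a Linux box where /proc/stat has a cumulative "ctxt" line (the counter behind vmstat's cs column). The function names and the 10x threshold are made up for illustration, not taken from any PostgreSQL tool, so tune them against your own baseline.

```python
import time


def parse_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, seconds):
    """Turn two cumulative counter readings into a per-second rate."""
    return (after - before) / seconds


def is_storm(rate, baseline, factor=10.0):
    """Flag a rate far above the quiet-time baseline.

    The factor of 10 is an arbitrary assumption, not a known threshold.
    """
    return rate > baseline * factor


def sample(interval=1.0, path="/proc/stat"):
    """Measure context switches per second over one interval (Linux only)."""
    with open(path) as f:
        before = parse_ctxt(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_ctxt(f.read())
    return switches_per_second(before, after, interval)
```

Call sample() once while the server is quiet to get a baseline, then call it in a loop and log whenever is_storm(rate, baseline) is true. Lining those timestamps up with the slow-query log should show whether the slowness coincides with the context-switch jump.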