On Wed, May 23, 2012 at 5:29 AM, Andrzej Krawiec
<a.krawiec@focustelecom.pl> wrote:
> Cannot strace or gdb on a production system under heavy load (about 100
> transactions per second).
> It's in kernel space, not user space, so we are unable to do anything at this
> particular moment (sometimes even the ssh connection seems to hang for a
> while).
> We suspect neither autovacuum (although it was our primary suspect) nor a
> regular backend. It is system time. The question is: what's the reason for that?
> We've dug through system and postgres logs, cleared out most of the long
> query problems, idle in transaction, optimized queries, vacuumed, reindexed
> and such.
> For a while it seemed like the particular kernel version was causing the
> majority of the problems. We downgraded to 2.6.32-71.29.1.el6.x86_64 and those
> problems mostly went away. For a few days we had no incidents, but then it
> happened again.
perf can tell you about problems in kernel-space, but I'm not sure it
exists that far back.
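
If perf turns out not to be available, a cruder cross-check is to confirm
from /proc/stat that the time really is system time rather than user time
or I/O wait. Below is a minimal Python sketch of that idea, not something
from the original report: it reads the aggregate "cpu" line twice and prints
the share of each category over the interval. The function names
(parse_cpu_line, cpu_shares, sample_shares) are hypothetical, and the field
order is assumed to follow the layout documented in proc(5).

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat, per proc(5).
# Guest time is already counted inside "user", so it is left out here.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line):
    """Turn a 'cpu  u n s i ...' line into a dict of cumulative tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    values = [int(v) for v in parts[1:]]
    # zip() stops at the shorter sequence, so extra trailing fields are ignored.
    return dict(zip(FIELDS, values))


def read_cpu_times(path="/proc/stat"):
    """Read the aggregate cpu counters from /proc/stat (Linux only)."""
    with open(path) as f:
        for line in f:
            if line.startswith("cpu "):
                return parse_cpu_line(line)
    raise RuntimeError("no aggregate cpu line found in %s" % path)


def cpu_shares(before, after):
    """Percentage of elapsed ticks spent in each category between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in deltas}
    return {name: 100.0 * d / total for name, d in deltas.items()}


def sample_shares(interval=5.0):
    """Sample /proc/stat twice, `interval` seconds apart, and print the shares."""
    before = read_cpu_times()
    time.sleep(interval)
    after = read_cpu_times()
    for name, pct in sorted(cpu_shares(before, after).items()):
        print("%-8s %5.1f%%" % (name, pct))
```

Calling sample_shares() during one of the stalls should show a large
"system" share if the diagnosis above is right. It will not say which kernel
code path is responsible; that is what perf would add.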
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company