postgres processes spending most of their time in the kernel - Mailing list pgsql-general

From Jeffrey W. Baker
Subject postgres processes spending most of their time in the kernel
Msg-id Pine.LNX.4.33.0112281001050.23655-100000@windmill.gghcwest.com
Responses Re: postgres processes spending most of their time in the kernel
List pgsql-general
I have a moderately loaded postgres server running 7.2beta4 (I wanted to
try out the live vacuum) that turns out to spend the majority of its CPU
time in kernel land.  With only a handful of running processes, postgres
induces tens of thousands of context switches per second.  Practically the
only thing postgres does with all this CPU time is call semop() in a tight
loop.  Here is a snippet of strace output:

[pid 11410]      0.000064 <... semop resumed> ) = 0
[pid 11409]      0.000020 <... semop resumed> ) = 0
[pid 11410]      0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11409]      0.000027 semop(1179648, 0xbfffe488, 1 <unfinished ...>
[pid 11407]      0.000027 semop(1179648, 0xbfffe8b8, 1 <unfinished ...>
[pid 11409]      0.000022 <... semop resumed> ) = 0
[pid 11406]      0.000018 <... semop resumed> ) = 0
[pid 11409]      0.000023 semop(1179648, 0xbfffe468, 1 <unfinished ...>
[pid 11406]      0.000026 semop(1179648, 0xbfffe958, 1) = 0
[pid 11406]      0.000057 semop(1179648, 0xbfffe9f8, 1 <unfinished ...>
[pid 11408]      0.000037 <... semop resumed> ) = 0
[pid 11408]      0.000029 semop(1179648, 0xbfffe4d8, 1) = 0
[pid 11411]      0.000038 <... semop resumed> ) = 0
[pid 11408]      0.000023 semop(1179648, 0xbfffe4d8, 1 <unfinished ...>
[pid 11411]      0.000026 semop(1179648, 0xbfffe498, 1) = 0
[pid 11407]      0.000040 <... semop resumed> ) = 0
[pid 11411]      0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11407]      0.000027 semop(1179648, 0xbfffe8a8, 1) = 0
[pid 11410]      0.000038 <... semop resumed> ) = 0
[pid 11407]      0.000024 semop(1179648, 0xbfffe918, 1 <unfinished ...>
[pid 11410]      0.000026 semop(1179648, 0xbfffe618, 1) = 0
[pid 11410]      0.000058 semop(1179648, 0xbfffe6a8, 1 <unfinished ...>
[pid 11409]      0.000024 <... semop resumed> ) = 0
[pid 11409]      1.214166 semop(1179648, 0xbfffe428, 1) = 0
[pid 11406]      0.000063 <... semop resumed> ) = 0
[pid 11406]      0.000031 semop(1179648, 0xbfffe9f8, 1) = 0
[pid 11406]      0.000051 semop(1179648, 0xbfffe8f8, 1 <unfinished ...>
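
A trace like the one above is easier to reason about once it is tallied per process. The following is only a rough sketch in Python, assuming the strace output has been saved to a file in exactly the format shown; the function name summarize_semop and the regular expression are illustrative and not part of strace or postgres. It counts how many semop() calls each pid starts and how many of those are reported as "<unfinished ...>", which usually means the caller went to sleep on the semaphore before another traced process produced output.

```python
import re
from collections import Counter

# Matches the start of a semop() call in the format shown above, e.g.
#   [pid 11410]      0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
#   [pid 11406]      0.000026 semop(1179648, 0xbfffe958, 1) = 0
# "<... semop resumed>" lines are deliberately not matched, since they
# complete an earlier call rather than start a new one.
CALL_RE = re.compile(
    r"^\[pid\s+(\d+)\]\s+[\d.]+\s+semop\(.*?"
    r"(<unfinished \.\.\.>|\)\s*=\s*-?\d+)"
)

def summarize_semop(lines):
    """Return (calls, blocked): per-pid counts of semop() calls started
    and of calls that strace reported as unfinished."""
    calls = Counter()
    blocked = Counter()
    for line in lines:
        match = CALL_RE.match(line.strip())
        if match is None:
            continue
        pid = int(match.group(1))
        calls[pid] += 1
        if match.group(2).startswith("<unfinished"):
            blocked[pid] += 1
    return calls, blocked

if __name__ == "__main__":
    import sys
    # Hypothetical usage: python semop_summary.py strace.log
    with open(sys.argv[1]) as trace:
        calls, blocked = summarize_semop(trace)
    for pid in sorted(calls):
        print(pid, "calls:", calls[pid], "unfinished:", blocked[pid])
```

A high share of unfinished calls across all backends would be consistent with the processes repeatedly putting each other to sleep and waking each other up on the same semaphore set.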

Performance on this database kind of sucks.  Since there is little or no
block I/O, I assume this is because postgres is wasting its CPU time
slices on semaphore calls and context switches instead of useful work.
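
For anyone wanting to compare numbers, the context-switch rate can be measured directly. On Linux the kernel exposes a cumulative counter on the "ctxt" line of /proc/stat (the same figure vmstat reports per interval in its "cs" column). The sketch below, in Python, samples that counter twice and divides by the interval; the function names are illustrative, and the one-second interval is an arbitrary choice.

```python
import time

def parse_ctxt(stat_text):
    """Extract the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no ctxt line found")

def ctxt_rate(before, after, seconds):
    """Context switches per second between two counter samples."""
    return (after - before) / seconds

def sample_rate(interval=1.0, path="/proc/stat"):
    """Sample the system-wide counter twice, `interval` seconds apart."""
    with open(path) as stat:
        first = parse_ctxt(stat.read())
    time.sleep(interval)
    with open(path) as stat:
        second = parse_ctxt(stat.read())
    return ctxt_rate(first, second, interval)

if __name__ == "__main__":
    # Linux only: prints the system-wide rate, not a per-process figure.
    print("context switches/sec:", sample_rate())
```

The figure is system-wide, so it only points at postgres if the machine is otherwise mostly idle.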

Does anyone else see this?  Is there a config option to tune the locking
behavior?  Any other workarounds?

The machine is a 2-way x86 running Linux 2.4.  I brought this up on
linux-kernel and they don't seem to think it is the scheduler's problem.

-jwb

