Thread: postgres processes spending most of their time in the kernel

From: "Jeffrey W. Baker"
I have a moderately loaded postgres server running 7.2beta4 (I wanted to
try out the live vacuum) that turns out to spend the majority of its cpu
time in kernel land.  With only a handful of running processes, postgres
induces tens of thousands of context switches per second.  Practically the
only thing postgres does with all this CPU time is semop() in a tight
loop.  Here is a snippet of strace:

[pid 11410]      0.000064 <... semop resumed> ) = 0
[pid 11409]      0.000020 <... semop resumed> ) = 0
[pid 11410]      0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11409]      0.000027 semop(1179648, 0xbfffe488, 1 <unfinished ...>
[pid 11407]      0.000027 semop(1179648, 0xbfffe8b8, 1 <unfinished ...>
[pid 11409]      0.000022 <... semop resumed> ) = 0
[pid 11406]      0.000018 <... semop resumed> ) = 0
[pid 11409]      0.000023 semop(1179648, 0xbfffe468, 1 <unfinished ...>
[pid 11406]      0.000026 semop(1179648, 0xbfffe958, 1) = 0
[pid 11406]      0.000057 semop(1179648, 0xbfffe9f8, 1 <unfinished ...>
[pid 11408]      0.000037 <... semop resumed> ) = 0
[pid 11408]      0.000029 semop(1179648, 0xbfffe4d8, 1) = 0
[pid 11411]      0.000038 <... semop resumed> ) = 0
[pid 11408]      0.000023 semop(1179648, 0xbfffe4d8, 1 <unfinished ...>
[pid 11411]      0.000026 semop(1179648, 0xbfffe498, 1) = 0
[pid 11407]      0.000040 <... semop resumed> ) = 0
[pid 11411]      0.000024 semop(1179648, 0xbfffe658, 1 <unfinished ...>
[pid 11407]      0.000027 semop(1179648, 0xbfffe8a8, 1) = 0
[pid 11410]      0.000038 <... semop resumed> ) = 0
[pid 11407]      0.000024 semop(1179648, 0xbfffe918, 1 <unfinished ...>
[pid 11410]      0.000026 semop(1179648, 0xbfffe618, 1) = 0
[pid 11410]      0.000058 semop(1179648, 0xbfffe6a8, 1 <unfinished ...>
[pid 11409]      0.000024 <... semop resumed> ) = 0
[pid 11409]      1.214166 semop(1179648, 0xbfffe428, 1) = 0
[pid 11406]      0.000063 <... semop resumed> ) = 0
[pid 11406]      0.000031 semop(1179648, 0xbfffe9f8, 1) = 0
[pid 11406]      0.000051 semop(1179648, 0xbfffe8f8, 1 <unfinished ...>

Performance on this database kind of sucks.  Since there is little or no
block I/O, I assume this is because postgres is wasting its CPU
allocations.

Does anyone else see this?  Is there a config option to tune the locking
behavior?  Any other workarounds?

The machine is a 2-way x86 running Linux 2.4.  I brought this up on
linux-kernel and they don't seem to think it is the scheduler's problem.

-jwb


Re: postgres processes spending most of their time in the kernel

From: Tom Lane
"Jeffrey W. Baker" <jwbaker@acm.org> writes:
> I have a moderately loaded postgres server running 7.2beta4 (I wanted to
> try out the live vacuum) that turns out to spend the majority of its cpu
> time in kernel land.  With only a handful of running processes, postgres
> induces tens of thousands of context switches per second.  Practically the
> only thing postgres does with all this CPU time is semop() in a tight
> loop.

It sounds like you have a build that's using SysV semaphores in place of
test-and-set instructions.  That should not happen on x86 hardware,
since we have assembly TAS code for x86.  Please look at your port
header file (src/include/pg_config_os.h symlink) and
src/include/storage/s_lock.h to figure out why it's misbuilt.

            regards, tom lane
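
[Editorial sketch] The distinction Tom draws can be illustrated as follows: a test-and-set (TAS) spinlock is taken with one atomic instruction in user space, whereas every SysV semaphore operation goes through the semop() system call, which is what the strace above shows. This is a minimal sketch in C, assuming a GCC-compatible compiler that provides the __sync builtins. The names demo_slock_t, demo_tas, demo_lock and demo_unlock are hypothetical, and this is not the code found in src/include/storage/s_lock.h.

```c
#include <assert.h>

/* Hypothetical lock type for illustration; 0 = free, 1 = held. */
typedef volatile int demo_slock_t;

/*
 * Atomically store 1 into the lock and return the previous value.
 * A return of 0 means the caller acquired the lock.
 * Assumes the GCC __sync builtins are available.
 */
static int demo_tas(demo_slock_t *lock)
{
    return __sync_lock_test_and_set(lock, 1);
}

/* Spin in user space until the lock is acquired; no system call is made. */
static void demo_lock(demo_slock_t *lock)
{
    while (demo_tas(lock) != 0)
        ;   /* busy-wait */
}

/* Release the lock; it must currently be held. */
static void demo_unlock(demo_slock_t *lock)
{
    assert(*lock != 0);
    __sync_lock_release(lock);
}
```

A caller would declare `demo_slock_t l = 0;`, then bracket the critical section with `demo_lock(&l)` and `demo_unlock(&l)`. A production spinlock would normally also back off or sleep after repeated failures rather than spin indefinitely.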

Re: postgres processes spending most of their time in the

From: "Jeffrey W. Baker"

On Fri, 28 Dec 2001, Tom Lane wrote:

> "Jeffrey W. Baker" <jwbaker@acm.org> writes:
> > I have a moderately loaded postgres server running 7.2beta4 (I wanted to
> > try out the live vacuum) that turns out to spend the majority of its cpu
> > time in kernel land.  With only a handful of running processes, postgres
> > induces tens of thousands of context switches per second.  Practically the
> > only thing postgres does with all this CPU time is semop() in a tight
> > loop.
>
> It sounds like you have a build that's using SysV semaphores in place of
> test-and-set instructions.  That should not happen on x86 hardware,
> since we have assembly TAS code for x86.  Please look at your port
> header file (src/include/pg_config_os.h symlink) and
> src/include/storage/s_lock.h to figure out why it's misbuilt.

Well, it seems that one of __i386__ or __GNUC__ isn't set at compile time.
I'm using GCC on i386 so I don't see how that is possible.  It should be
safe for me to simply define these two things in pg_config.h, I suspect.

-jwb


Re: postgres processes spending most of their time in the kernel

From: Tom Lane
"Jeffrey W. Baker" <jwbaker@acm.org> writes:
>> It sounds like you have a build that's using SysV semaphores in place of
>> test-and-set instructions.  That should not happen on x86 hardware,
>> since we have assembly TAS code for x86.  Please look at your port
>> header file (src/include/pg_config_os.h symlink) and
>> src/include/storage/s_lock.h to figure out why it's misbuilt.

> Well, it seems that one of __i386__ or __GNUC__ isn't set at compile time.
> I'm using GCC on i386 so I don't see how that is possible.

I don't either.

> It should be
> safe for me to simply define these two things in pg_config.h, I suspect.

That is not a solution.  If it's broken for you then it's likely to be
broken for other people.  We need to figure out what went wrong and
provide a permanent fix.

What gcc version are you running, exactly, and what symbols does it
predefine?  (I seem to recall that there's a way to find that out,
though I'm not recalling how at the moment.  Anyone?)

            regards, tom lane

Re: postgres processes spending most of their time in the

From: Bruce Momjian
> What gcc version are you running, exactly, and what symbols does it
> predefine?  (I seem to recall that there's a way to find that out,
> though I'm not recalling how at the moment.  Anyone?)

pgsql/src/tools/ccsym shows compiler symbols.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
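
[Editorial sketch] One way to check the two symbols discussed in this thread is to compile a small probe with the same compiler and flags used for the postgres build. The C sketch below reports whether __i386__ and __GNUC__ are predefined, and whether a guard requiring both (as Jeffrey's reading of s_lock.h suggests) would be satisfied. The demo_* function names are hypothetical and not part of PostgreSQL. GCC can also list every predefined macro with `gcc -dM -E - < /dev/null`.

```c
#include <assert.h>
#include <stdio.h>

/* Returns 1 if the compiler predefines __i386__, else 0. */
static int demo_has_i386(void)
{
#if defined(__i386__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler predefines __GNUC__, else 0. */
static int demo_has_gnuc(void)
{
#if defined(__GNUC__)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 only if both symbols are defined (assumed guard condition). */
static int demo_uses_tas(void)
{
#if defined(__i386__) && defined(__GNUC__)
    return 1;
#else
    return 0;
#endif
}

/* Print the findings; the assert is a consistency check on the probes. */
static void demo_report(void)
{
    assert(demo_uses_tas() == (demo_has_i386() && demo_has_gnuc()));
    printf("__i386__ defined: %d\n", demo_has_i386());
    printf("__GNUC__ defined: %d\n", demo_has_gnuc());
    printf("guard satisfied:  %d\n", demo_uses_tas());
}
```

Calling `demo_report()` from `main` prints the three values. If the guard line shows 0 on a 32-bit x86 GCC build, the compiler invocation (flags or wrapper script) is the first thing to inspect.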