Thread: postmaster proc running CPU to 100% and spinning.

postmaster proc running CPU to 100% and spinning.

From
Andrew Kelly
Date:
Dear all,

my apologies right up front for what is probably going to be a poorly
stated plea. I'm such a novice I'm not even sure how to properly ask the
question.

I'll just dive in...

I've been recently forced to accept the migration of an application that
I don't yet even properly understand. The former host and
developer has been a 'hostile' partner in the migration and has not been
forthcoming during any part. I've had to prepare a platform with
incomplete configuration information, and work around what has turned
out to be purposely misleading information. It's been pretty much a
worst-case scenario from the get-go.
It's hard not to toss off a multi-page rant about this 'organisation',
but I'll just bite my tongue and move on because in spite of it all I've
pretty much got it all whooped.

In a nutshell, the application is a multi-byte encoded database in
postgresql 7.2.4 with a php 4.2.3 front end, hosted on a multiproc
RedHat 7.3 box.

We are about 99% certain we have succeeded in duplicating the platform
and installing the application. So far everything has worked wonderfully
with one single exception.
The application provides the option of performing a new search within
the result set of the previous search. This worked well in the
application on the previous host, but not on the new host. Now, when one
tries to search with the results of a previous search, the postmaster
process drives CPU usage to 100% and the application never returns a
result. Strangely enough, it doesn't max out both processors at the same
time, but the states do change. The maxed out CPU will suddenly drop
under 1% as the second CPU blows out to 100%.
Only kill seems to release the CPU.

Now, I realize there is not enough info here to debug this, but that is
mostly why I'm writing. I know nothing about this app and will have to
plod through the source to learn it. I'm also a cherry to postgresql (to
sql in general).
My stomach tells me this is a config issue with the host in general,
maybe a permissions issue in cache space or something, but I'm only
guessing. The databases were a straightforward dump at the old site and
restore on the new site, and nothing has changed in the source code, and
the fact that everything else runs perfectly seems to indicate it was a
good duplication.

Can anybody give me a hint on where to start looking and what tools to
use? I'm a flipping nervous wreck suddenly being responsible for
something I don't yet have mastery of.

Andy
--


Re: postmaster proc running CPU to 100% and spinning.

From
Stephan Szabo
Date:
On Mon, 22 Mar 2004, Andrew Kelly wrote:

> We are about 99% certain we have succeeded in duplicating the platform
> and installing the application. So far everything has worked wonderfully
> with one single exception.
> The application provides the option of performing a new search within
> the result set of the previous search. This worked well in the
> application on the previous host, but not on the new host. Now, when one
> tries to search with the results of a previous search, the postmaster
> process drives CPU usage to 100% and the application never returns a
> result. Strangely enough, it doesn't max out both processors at the same
> time, but the states do change. The maxed out CPU will suddenly drop
> under 1% as the second CPU blows out to 100%.
> Only kill seems to release the CPU.
>
> Now, I realize there is not enough info here to debug this, but that is
> mostly why I'm writing. I know nothing about this app and will have to
> plod through the source to learn it. I'm also a cherry to postgresql (to
> sql in general).
> My stomach tells me this is a config issue with the host in general,
> maybe a permissions issue in cache space or something, but I'm only
> guessing. The databases were a straightforward dump at the old site and
> restore on the new site, and nothing has changed in the source code, and
> the fact that everything else runs perfectly seems to indicate it was a
> good duplication.
>
> Can anybody give me a hint on where to start looking and what tools to
> use? I'm a flipping nervous wreck suddenly being responsible for

Well, the first thing to look at is to check the two postgresql.conf
files.  It's also possible that they were sending options on the command
line if they were really trying to be annoying, so if you have access to
the startup scripts they were using, there might be information there.
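
A minimal sketch of that comparison in Python, assuming you can get a copy of the old host's postgresql.conf next to the new one. The function names parse_conf and diff_conf are made up for illustration, and the parser is deliberately naive: it treats everything after a '#' as a comment, so a '#' inside a quoted value would be cut off.

```python
import sys


def parse_conf(text):
    """Return {name: value} for the active (uncommented) settings in a conf file."""
    settings = {}
    for raw in text.splitlines():
        # Drop comments; naive about '#' appearing inside quoted values.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        if "=" in line:
            name, value = line.split("=", 1)
        else:
            # Fall back to whitespace-separated "name value" lines.
            parts = line.split(None, 1)
            name, value = parts[0], (parts[1] if len(parts) > 1 else "")
        settings[name.strip().lower()] = value.strip().strip("'\"")
    return settings


def diff_conf(old, new):
    """Return {name: (old_value, new_value)} for settings that differ.

    None means the setting is not active in that file (server default applies).
    """
    return {
        name: (old.get(name), new.get(name))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as f_old, open(sys.argv[2]) as f_new:
        changes = diff_conf(parse_conf(f_old.read()), parse_conf(f_new.read()))
    for name, (old_value, new_value) in changes.items():
        print(f"{name}: old={old_value!r} new={new_value!r}")
```

Run it with the two file paths as arguments. It only sees what is in the files, so options passed on the postmaster command line would still have to be checked in the startup scripts by hand.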

I don't have a 7.2 box to check configuration options against, but there
should be something like log_statement which if set to true will log the
statements sent to the database.  You should make sure you're doing
something with the log output (either sending it to syslog or making sure
that you send stderr to a file somewhere).  That'll let you see what
statements are happening.
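
Once statements are being logged, the interesting ones are the last few before the backend starts spinning. Here is a rough Python sketch for pulling those out of a log file. The "query:" marker is an assumption about how the server prefixes logged statements, so check a few lines of your own log and adjust it. (On the 7.2 series I believe the option that turns this logging on is called debug_print_query rather than log_statement, but I can't verify that here.) The function name last_statements is made up for illustration.

```python
from collections import deque


def last_statements(log_lines, marker="query:", count=5):
    """Return the text after `marker` for the last `count` matching log lines.

    `marker` is an assumed prefix; statements that span several log lines
    are only captured up to the end of their first line.
    """
    recent = deque(maxlen=count)  # keeps only the newest `count` entries
    for line in log_lines:
        _, found, statement = line.partition(marker)
        if found:
            recent.append(statement.strip())
    return list(recent)
```

Something like last_statements(open("postgres.log"), count=3) right after reproducing the hang should show the query that never comes back; the log file name here is a placeholder.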

If it's consistently happening on a particular query, you can see
anything like NOTICEs that come up, and you can use EXPLAIN to get the query
plan for the query.  With that and the table schema of the affected tables
we can probably help a bit more.  (We'd normally ask for EXPLAIN ANALYZE,
but it actually runs the query, while a plain EXPLAIN does not, so if the
query is taking way too long, EXPLAIN ANALYZE will too.)
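
To turn a statement copied from the log into something you can paste into psql, a small helper like the following would do. It is only a sketch: explain_sql is a made-up name, and the table in the usage example is hypothetical. It defaults to plain EXPLAIN so that the runaway query is not executed again.

```python
def explain_sql(statement, analyze=False):
    """Prefix a logged SQL statement with EXPLAIN (or EXPLAIN ANALYZE).

    Plain EXPLAIN only asks the planner for the plan; EXPLAIN ANALYZE
    also executes the statement, so leave analyze=False for a query
    that never finishes.
    """
    body = statement.strip().rstrip(";").strip()
    if not body:
        raise ValueError("empty statement")
    prefix = "EXPLAIN ANALYZE" if analyze else "EXPLAIN"
    return f"{prefix} {body};"


# Hypothetical usage; "results" is a placeholder table name.
print(explain_sql("SELECT * FROM results WHERE id = 1;"))
```

The plan it produces, together with the schema of the tables involved, is what to post back to the list.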

Finally, 7.2.4 is fairly old. Once you have the thing working, you
might want to play with upgrading. :)