Ridiculous load - Mailing list pgsql-general

From Peter Haworth
Subject Ridiculous load
Msg-id PGM.20041209152340.29210.5148@edison.ioppublishing.com
Responses Re: Ridiculous load  (Wes <wespvp@syntegra.com>)
List pgsql-general
On Monday we upgraded one of our PostgreSQL instances from v7.2 to v7.4.

Yesterday the box (sabine) on which this runs became very unhappy:

It runs RHEL ES v3, kernel 2.4.21-20.ELsmp
It's generally a very stable box which runs a number of postgresql
instances.  But last night we hit very high load averages - 10+, vs the
normal 0-2.
The culprit appeared to be kswapd, which was using huge amounts of CPU.
I'd like to know why!

We resolved the problem last night via a reboot, and so far all is well,
but I am concerned that the problem may recur.

I wondered if you could help.  In particular the only recent change is the
postgresql upgrade above, so the timing seems suspect.

We used to run two postgresql instances, each with the following
command line:
/usr/bin/postmaster '-p' '5679' '-i' '-N' '256' '-B' '128000' '-o' '-S 64000'

As we only upgraded one database we split up the 2nd instance into two,
each with the following parameters:
/usr/bin/postmaster '-p' '5679' '-i' '-N' '256' '-B' '64000' '-o' '-S 32000'
(different ports, obviously)
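
For reference, here is a rough sketch of what those settings add up to.
It assumes the default 8 kB block size (so -B counts 8192-byte buffers),
that -S is the per-sort memory limit in kB, and that the instance which
was not split kept its original -B 128000.  Those assumptions, and the
helper names, are illustrative rather than taken from our configuration.

```python
# Assumption: default PostgreSQL block size of 8 kB (BLCKSZ = 8192 bytes).
BLOCK_SIZE = 8192
TOTAL_RAM_KB = 3857312  # "Mem: total" as reported by free on sabine


def shared_buffer_bytes(n_buffers, block_size=BLOCK_SIZE):
    """Bytes of shared memory implied by postmaster -B n_buffers."""
    return n_buffers * block_size


def sort_mem_bytes(sort_mem_kb, n_sorts):
    """Bytes used if n_sorts sorts each use the full -S allowance (kB)."""
    return sort_mem_kb * 1024 * n_sorts


# -B values: two instances before the split, three after.
# Assumption: the unsplit instance kept -B 128000.
old_layout = [128000, 128000]
new_layout = [128000, 64000, 64000]

old_total = sum(shared_buffer_bytes(b) for b in old_layout)
new_total = sum(shared_buffer_bytes(b) for b in new_layout)
ram_bytes = TOTAL_RAM_KB * 1024

MIB = 1024 * 1024
print("shared buffers before split: %.0f MiB" % (old_total / MIB))
print("shared buffers after split:  %.0f MiB" % (new_total / MIB))
print("fraction of physical RAM:    %.2f" % (new_total / ram_bytes))

# Hypothetical worst case: every one of the -N 256 backends of one
# instance runs a single sort at the full -S 64000 limit.
print("worst-case sort memory:      %.0f MiB" % (sort_mem_bytes(64000, 256) / MIB))
```

Under those assumptions the total shared buffer allocation is the same
before and after the split, and comes to a little over half of the
physical RAM on the box.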

The RAM on the box should be generous.  Right now (whilst it's behaving,
and all 3 instances are running) free gives the following output:
sabine% free
             total       used       free     shared    buffers     cached
Mem:       3857312    3834676      22636          0      50880    3339568
-/+ buffers/cache:     444228    3413084
Swap:      2096472     319460    1777012
sabine%
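
As a sanity check on those figures, a short sketch that parses the free
output above and recomputes the "-/+ buffers/cache" line from the "Mem:"
line.  It assumes free is reporting in kB (its default); the parse_free
helper is hypothetical and only written for this layout of output.

```python
FREE_OUTPUT = """\
             total       used       free     shared    buffers     cached
Mem:       3857312    3834676      22636          0      50880    3339568
-/+ buffers/cache:     444228    3413084
Swap:      2096472     319460    1777012
"""


def parse_free(text):
    """Map each row label of free's output to its list of integer columns."""
    rows = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # header line has no "label:" prefix
        rows[label.strip()] = [int(field) for field in rest.split()]
    return rows


rows = parse_free(FREE_OUTPUT)
total, used, free, shared, buffers, cached = rows["Mem"]

# Memory in use once buffers and page cache are excluded.
used_excl_cache = used - buffers - cached
# Memory that is free or reclaimable from buffers and page cache.
free_incl_cache = free + buffers + cached
swap_used = rows["Swap"][1]

print("used excluding buffers/cache: %d kB" % used_excl_cache)
print("free including buffers/cache: %d kB" % free_incl_cache)
print("cache share of total RAM:     %.2f" % (cached / total))
print("swap in use:                  %d kB" % swap_used)
```

The recomputed values match the "-/+ buffers/cache" line, so nearly all
of the "used" memory is page cache rather than process memory, yet about
312 MB of swap is in use at the same time.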

We plan to revert to two instances when we upgrade the remaining
databases next week.  Should we change the amount of RAM we allocate to
these with the move to v7.4?
Are there other settings we should be altering?

All this is assuming that it's a postgres problem, rather than something
else. Another leading candidate is this, which we are looking into:
  https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=132155
It would be good to eliminate all possible causes, though.


--
    Peter Haworth    pmh@edison.ioppublishing.com
"Of course, if you were to print(1..Inf), you'd have plenty of time to go and
 get a cup of coffee.  And even then (given the comparatively imminent heat
 death of the universe) that coffee would be really cold before the output
 was complete. So there will probably be a warning when you try to do that."
        -- Damian Conway in Exegesis 3
