Thread: postgres eating CPU on HP9000

postgres eating CPU on HP9000

From
Fabio Esposito
Date:
Hello fellow PostgreSQL users.

We've been working on this interesting issue for some time now, and we're
hoping that someone can help.

We've recently integrated postgres into an existing mature app.  Its a
time sensitive 24x7 system.  It runs on HP9000, a K370 Dual Processor
system.  Postgres is version 7.3.2.  Its spawned as a child from a parent
supervisory process, and they communicate to eachother via shared memory.

We preform 9-12K selects per hour
           6-8K inserts per hour (a few updates here as well)
           1-1.5K Deletes per hour.

It maintains 48hours of data, so its not a large database; roughly
<600mbs.  We do this by running a housekeeping program in a cron job.
It deletes all data older then 48hours, then vaccuum analyzes.  It will
also preform a reindex if the option is set before it vaccuum's.

Postgres initially worked wonderfully, fast and solid.  It
preformed complex joins in 0.01secs, and was able to keep up with our
message queue.  It stayed this way for almost a year during our
development.

Recently it started eating up the cpu, and cannot keepup with the system
like it used to.  The interesting thing here is that it still runs great
on an older system with less ram, one slower cpu, and an older disk.

We tried the following with no success:

running VACCUUM FULL
dropping all tables and staring anew
reinstalling postgres
tweaking kernel parameters (various combos)
tweaking postgres parameters (various combos)
a number of other ideas

A final note, we have our app on two systems ready for hot backup.  The
hot backup system is that older slower system that I mentioned before. The
two communicate with eachother via rpc's.

Any help anyone can give to steer us in the right direction would be much
appreciated.

Thanks again

Fabio E.



Just in case:

vmstat
         procs           memory                   page
faults       cpu
    r     b     w      avm    free   re   at    pi   po    fr   de    sr
in     sy    cs  us sy id
    1     0     0     7631  124955   30   31     1    0     0    0     1
566    964   138  25  2 73

top

System: prokyon                                       Tue Mar 23 19:12:54
2004
Load averages: 0.36, 0.33, 0.31
170 processes: 169 sleeping, 1 running
Cpu states:
CPU   LOAD   USER   NICE    SYS   IDLE  BLOCK  SWAIT   INTR   SSYS
 0    0.07   8.9%   0.0%   0.0%  91.1%   0.0%   0.0%   0.0%   0.0%
 1    0.72  71.3%   0.0%   1.0%  27.7%   0.0%   0.0%   0.0%   0.0%
 2    0.29  29.7%   1.0%   5.0%  64.4%   0.0%   0.0%   0.0%   0.0%
---   ----  -----  -----  -----  -----  -----  -----  -----  -----
avg   0.36  36.3%   1.0%   2.0%  60.8%   0.0%   0.0%   0.0%   0.0%

Memory: 33180K (22268K) real, 38868K (28840K) virtual, 499708K free  Page#
1/17

CPU TTY   PID USERNAME PRI NI   SIZE    RES STATE    TIME %WCPU  %CPU
COMMAND
 0 pty/ttyp1 18631 am       154 20  6096K  2412K sleep    3:17 93.84 93.68
postg
 0 rroot 18622 am       154 20  1888K  1192K sleep    0:01  0.78  0.78
amcodecon


ipcs

IPC status from /dev/kmem as of Tue Mar 23 19:19:19 2004
T      ID     KEY        MODE        OWNER     GROUP
Message Queues:
q       0 0x3c180239 -Rrw--w--w-      root      root
q       1 0x3e180239 --rw-r--r--      root      root
Shared Memory:
m       0 0x2f100002 --rw-------      root       sys
m       1 0x4118020d --rw-rw-rw-      root      root
m       2 0x4e0c0002 --rw-rw-rw-      root      root
m       3 0x4114006c --rw-rw-rw-      root      root
m       4 0x4118387e --rw-rw-rw-        am        am
m    3805 0x0052e2c1 --rw-------      postgres   postgres
m    8606 0x0c6629c9 --rw-r-----      root       sys
m     407 0x06347849 --rw-rw-rw-      root       sys
Semaphores:
s       0 0x2f100002 --ra-ra-ra-      root       sys
s       1 0x4118020d --ra-ra-ra-      root      root
s       2 0x4e0c0002 --ra-ra-ra-      root      root
s       3 0x4114006c --ra-ra-ra-      root      root
s       4 0x00446f6e --ra-r--r--      root      root
s       5 0x00446f6d --ra-r--r--      root      root
s       6 0x01090522 --ra-r--r--      root      root
s       7 0x61142e7c --ra-ra-ra-      root      root
s       8 0x73142e7c --ra-ra-ra-      root      root
s       9 0x70142e7c --ra-ra-ra-      root      root
s      10 0x69142e7c --ra-ra-ra-      root      root
s      11 0x75142e7c --ra-ra-ra-      root      root
s      12 0x63142e7c --ra-ra-ra-      root      root
s      13 0x64142e7c --ra-ra-ra-      root      root
s      14 0x66142e7c --ra-ra-ra-      root      root
s      15 0x6c142e7c --ra-ra-ra-      root      root
s    1168 0x0052e2c1 --ra-------      postgres  postgres
s     401 0x0052e2c2 --ra-------      postgres  postgres
s     402 0x0052e2c3 --ra-------      postgres  postgres



Re: postgres eating CPU on HP9000

From
Josh Berkus
Date:
Fabio,

> Postgres initially worked wonderfully, fast and solid.  It
> preformed complex joins in 0.01secs, and was able to keep up with our
> message queue.  It stayed this way for almost a year during our
> development.
>
> Recently it started eating up the cpu, and cannot keepup with the system
> like it used to.  The interesting thing here is that it still runs great
> on an older system with less ram, one slower cpu, and an older disk.

This really points to a maintenance problem.  How often do you run VACUUM
ANALYZE?   You have a very high rate of data turnover, and should need to
VACUUM frequently.

Also, what's you max_fsm_pages setting.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: postgres eating CPU on HP9000

From
"scott.marlowe"
Date:
On Fri, 26 Mar 2004, Fabio Esposito wrote:

>
> On Fri, 26 Mar 2004, scott.marlowe wrote:
>
> > > It maintains 48hours of data, so its not a large database; roughly
> > > <600mbs.  We do this by running a housekeeping program in a cron job.
> > > It deletes all data older then 48hours, then vaccuum analyzes.  It will
> > > also preform a reindex if the option is set before it vaccuum's.
> > >
> > This almost sounds like a problem (fixed in 7.4 I believe) where some
> > system catalog indexes would get huge over time, and couldn't be vacuumed
> > or reindexed while the database was up in multi-user mode.
> >
> > I'll defer to Tom or Bruce or somebody to say if my guess is even close...
> >
> We haven't tried 7.4,  I will experiment with it next week,  I hope it
> will be that simple.

In the meantime, a simple dump - reload into a test box running your
current version may provide some insight.  If it fixes the problem, then
you likely do have some kind of issue with index / table growth that isn't
being addressed by vacuuming.


Re: postgres eating CPU on HP9000

From
Josh Berkus
Date:
Fabio,

> I'll have to get back to you on that, but I'm 90% sure its the default out
> of the box.

Raise it, a lot.   Perhaps to 30,000 or 50,000.   VACUUM VERBOSE ANALYZE
should show you how many data pages are being reclaimed between vacuums.

Because of your very high rate of updates and deletes, you need to hold a lot
of data pages open.

You would also benefit a great deal by upgrading to 7.4.   7.3 will require
you to to REINDEXes several times a day with your current setup; 7.4 will
not.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: postgres eating CPU on HP9000

From
Tom Lane
Date:
Fabio Esposito <nfesposi@sourceweave.net> writes:
>> Did you start from a fresh initdb, or just drop and recreate user
>> tables?  I'm wondering about index bloat on the system tables ...

> I don't think I re initdb it, just dropped.  We did try a reindex command
> in the interactive editor, with no success.

Reindex of what?  I'd suggest looking to see the actual sizes of all the
indexes on system tables.  If my guess is right, some of them may be way
out of line (like larger than their associated tables).

            regards, tom lane

Re: postgres eating CPU on HP9000

From
"scott.marlowe"
Date:
On Tue, 23 Mar 2004, Fabio Esposito wrote:

>
> Hello fellow PostgreSQL users.
>
> We've been working on this interesting issue for some time now, and we're
> hoping that someone can help.
>
> We've recently integrated postgres into an existing mature app.  Its a
> time sensitive 24x7 system.  It runs on HP9000, a K370 Dual Processor
> system.  Postgres is version 7.3.2.  Its spawned as a child from a parent
> supervisory process, and they communicate to eachother via shared memory.
>
> We preform 9-12K selects per hour
>            6-8K inserts per hour (a few updates here as well)
>            1-1.5K Deletes per hour.
>
> It maintains 48hours of data, so its not a large database; roughly
> <600mbs.  We do this by running a housekeeping program in a cron job.
> It deletes all data older then 48hours, then vaccuum analyzes.  It will
> also preform a reindex if the option is set before it vaccuum's.
>
> Postgres initially worked wonderfully, fast and solid.  It
> preformed complex joins in 0.01secs, and was able to keep up with our
> message queue.  It stayed this way for almost a year during our
> development.
>
> Recently it started eating up the cpu, and cannot keepup with the system
> like it used to.  The interesting thing here is that it still runs great
> on an older system with less ram, one slower cpu, and an older disk.
>
> We tried the following with no success:
>
> running VACCUUM FULL
> dropping all tables and staring anew
> reinstalling postgres
> tweaking kernel parameters (various combos)
> tweaking postgres parameters (various combos)
> a number of other ideas

This almost sounds like a problem (fixed in 7.4 I believe) where some
system catalog indexes would get huge over time, and couldn't be vacuumed
or reindexed while the database was up in multi-user mode.

I'll defer to Tom or Bruce or somebody to say if my guess is even close...


Re: postgres eating CPU on HP9000

From
Tom Lane
Date:
Fabio Esposito <nfesposi@sourceweave.net> writes:
> We've recently integrated postgres into an existing mature app.  Its a
> time sensitive 24x7 system.  It runs on HP9000, a K370 Dual Processor
> system.  Postgres is version 7.3.2.  Its spawned as a child from a parent
> supervisory process, and they communicate to eachother via shared memory.

You would be well advised to update to 7.3.6, though I'm not sure if any
of the post-7.3.2 fixes have anything to do with your speed problem.

> Recently it started eating up the cpu, and cannot keepup with the system
> like it used to.  The interesting thing here is that it still runs great
> on an older system with less ram, one slower cpu, and an older disk.

> We tried the following with no success:

> running VACCUUM FULL
> dropping all tables and staring anew

Did you start from a fresh initdb, or just drop and recreate user
tables?  I'm wondering about index bloat on the system tables ...

            regards, tom lane

Re: postgres eating CPU on HP9000

From
"Marcus Andree S. Magalhaes"
Date:
<snip>

We are experiencing exactly the same problem here, and we use 7.4 on
Linux/i386 SMP (2 processors). Our databases does even more access:
about 30k selects per hour, 10k updates and inserts per hour

Vacuum analyze is done daily.

We migrated our database to a new server. Initially, everything was fine,
and pretty fast. In a week or so, Vacuum performance is pretty slow. What
was done in 15 minutes now takes 2 hours. Postgres is consuming a lot of
CPU power and, when the system is in peak period, it's even worse.

Sure, we have a large database. 3 tables have more than 10M records, but
more or less suddenly, we're having a heavy performance prejudice.

>
> This almost sounds like a problem (fixed in 7.4 I believe) where some
> system catalog indexes would get huge over time, and couldn't be
> vacuumed  or reindexed while the database was up in multi-user mode.
>
> I'll defer to Tom or Bruce or somebody to say if my guess is even
> close...
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)




Re: postgres eating CPU on HP9000

From
Josh Berkus
Date:
Marcus,

> We are experiencing exactly the same problem here, and we use 7.4 on
> Linux/i386 SMP (2 processors). Our databases does even more access:
> about 30k selects per hour, 10k updates and inserts per hour
>
> Vacuum analyze is done daily.

What is your max_fsm_pages setting?     If you are getting 10,000 updates per
hour, daily VACUUM ANALYZE may not be enough.

Also do you run VACUUM ANALYZE as a superuser, or as a regular user?

--
Josh Berkus
Aglio Database Solutions
San Francisco

Re: postgres eating CPU on HP9000

From
"Marcus Andree S. Magalhaes"
Date:
> Marcus,
>
>> We are experiencing exactly the same problem here, and we use 7.4 on
>> Linux/i386 SMP (2 processors). Our databases does even more access:
>> about 30k selects per hour, 10k updates and inserts per hour
>>
>> Vacuum analyze is done daily.
>
> What is your max_fsm_pages setting?     If you are getting 10,000
> updates per  hour, daily VACUUM ANALYZE may not be enough.
>

max_fsm_pages is set to 500000

> Also do you run VACUUM ANALYZE as a superuser, or as a regular user?
>

As a regular user (database owner). Is thery any difference when vacuuming
as a super user?

> --
> Josh Berkus
> Aglio Database Solutions
> San Francisco
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if
> your
>       joining column's datatypes do not match




Re: postgres eating CPU on HP9000

From
Fabio Esposito
Date:


On Fri, 26 Mar 2004, Josh Berkus wrote:

> Fabio,
>
> > Recently it started eating up the cpu, and cannot keepup with the system
> > like it used to.  The interesting thing here is that it still runs great
> > on an older system with less ram, one slower cpu, and an older disk.
>
> This really points to a maintenance problem.  How often do you run VACUUM
> ANALYZE?   You have a very high rate of data turnover, and should need to
> VACUUM frequently.
>
> Also, what's you max_fsm_pages setting.
>

We run VACUUM ANALYZE after we remove about 1000 rows every hour on the
halh hour.  Our max_fsm_pages is set to 10000

Thanks again

Fabio


Re: postgres eating CPU on HP9000

From
Tom Lane
Date:
"Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br> writes:
>> Also do you run VACUUM ANALYZE as a superuser, or as a regular user?

> As a regular user (database owner). Is thery any difference when vacuuming
> as a super user?

That's your problem.  A regular user won't have permissions to vacuum
any tables but his own ... in particular, not the system tables.

            regards, tom lane

Re: postgres eating CPU on HP9000

From
Fabio Esposito
Date:
I'm sorry all, when you say regular user as opposed to superuser are you
talking about the user that postgres is installed and running as?  Should
this be done as the os's root?

Fabio

On Mon, 29 Mar 2004, Tom Lane wrote:

> "Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br> writes:
> >> Also do you run VACUUM ANALYZE as a superuser, or as a regular user?
>
> > As a regular user (database owner). Is thery any difference when vacuuming
> > as a super user?
>
> That's your problem.  A regular user won't have permissions to vacuum
> any tables but his own ... in particular, not the system tables.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings
>


Re: postgres eating CPU on HP9000

From
Mark Lubratt
Date:
On Mar 29, 2004, at 9:36 AM, Tom Lane wrote:

> "Marcus Andree S. Magalhaes" <marcus.magalhaes@vlinfo.com.br> writes:
>>> Also do you run VACUUM ANALYZE as a superuser, or as a regular user?
>
>> As a regular user (database owner). Is thery any difference when
>> vacuuming
>> as a super user?
>
> That's your problem.  A regular user won't have permissions to vacuum
> any tables but his own ... in particular, not the system tables.
>
>             regards, tom lane

If I vacuum as the superuser, are the system tables automatically
vacuumed?  Or, does using -a from the vacuumdb command accomplish this?
  Or, is there something else I have to specify on the vacuumdb command
line?

Thanks!
Mark


Re: postgres eating CPU on HP9000

From
Bruno Wolff III
Date:
On Mon, Mar 29, 2004 at 12:00:16 -0500,
  Fabio Esposito <nfesposi@sourceweave.net> wrote:
>
> I'm sorry all, when you say regular user as opposed to superuser are you
> talking about the user that postgres is installed and running as?  Should
> this be done as the os's root?

The os user used for creating the cluster with initdb is a superuser.
Any accounts created with the permission to create more users are also
superusers.

Re: postgres eating CPU on HP9000

From
Josh Berkus
Date:
Fabio,

> We run VACUUM ANALYZE after we remove about 1000 rows every hour on the
> halh hour.  Our max_fsm_pages is set to 10000

Have you checked how long these vacuums take?   If they are starting to
overlap, that would explain your high CPU usage and poor performance.    You
might want to consider raising FSM_pages and vacuuming less frequently.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: postgres eating CPU on HP9000

From
Fabio Esposito
Date:

The Vacuum's don't take too long, 10 minutes at most.  I can tell from ps
-ef | grep and top that its the selects/inserts/updates from the postgres
related to our app that take all that time up.  If we rerun initdb and
reload the data, it works great for about two days, then goes bad again.

We are in the process of trying out 7.4.2 right now, just waiting on the
reload of pg_dump.

Fabio

> Fabio,
>
> > We run VACUUM ANALYZE after we remove about 1000 rows every hour on the
> > halh hour.  Our max_fsm_pages is set to 10000
>
> Have you checked how long these vacuums take?   If they are starting to
> overlap, that would explain your high CPU usage and poor performance.    You
> might want to consider raising FSM_pages and vacuuming less frequently.
>
> --
> -Josh Berkus
>  Aglio Database Solutions
>  San Francisco
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>


Re: postgres eating CPU on HP9000

From
Josh Berkus
Date:
Fabio,

> The Vacuum's don't take too long, 10 minutes at most.  I can tell from ps
> -ef | grep and top that its the selects/inserts/updates from the postgres
> related to our app that take all that time up.  If we rerun initdb and
> reload the data, it works great for about two days, then goes bad again.
>
> We are in the process of trying out 7.4.2 right now, just waiting on the
> reload of pg_dump.

Well, test running VACUUM ANALYZE as the "postgres" superuser and see if that
fixes the issue.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco