Re: Still problems with memory swapping and server load - Mailing list pgsql-general

From Curt Sampson
Subject Re: Still problems with memory swapping and server load
Msg-id Pine.NEB.4.43.0206262344350.1069-100000@angelic.cynic.net
In response to Still problems with memory swapping and server load  ("Markus Wollny" <Markus.Wollny@computec.de>)
Responses Re: Still problems with memory swapping and server load  (Alvar Freude <alvar@a-blast.org>)
List pgsql-general
On Wed, 26 Jun 2002, Markus Wollny wrote:

> the same machine and database: 1GB RAM, 4xPIII550Xeon, dumpall.sql is
> ~300MB (see "[GENERAL] Urgent: Tuning strategies?"). It all starts with
> a humble 8MB swap being used (I expect that's just the empty swap with
> nothing in it but some system overhead). Then after a short time, memory
> usage climbs slow but continuously until it hits physical RAM ceiling
> and starts using swap - with not very nice results for the database.
> Swap sometimes amounts to 200MB or more.

Also use "vmstat", "systat vmstat" or whatever your system's
equivalent is to see just how much swapping you're doing. I wouldn't
be surprised to see some unused programs being pushed out to swap
as you do a lot of I/O, but if you're pushing stuff out to swap
and bringing it back in on a regular basis, you've still got
problems.

Also, remember, your OS may consider reading a program binary when
you run a program to be "page in" activity, so don't get too worried
about that, unless you also see page out activity.
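
As a rough illustration of the check described above, here is a minimal
Python sketch that reads Linux-style "vmstat" output and flags intervals
where pages are being swapped both in and out. The column names "si" and
"so" are what Linux vmstat prints; other systems label these differently.
The sample numbers, the function names and the three-interval threshold
are all made up for the example, not taken from the machine in question.

```python
# Hypothetical vmstat output; the numbers are invented for illustration.
SAMPLE = """
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0   8192   6136   7176 120000    0    0    10    20  100  200 10  5 85  0  0
 3  1 204800   5900   7100  90000  120  340   800   900  400  900 60 20  5 15  0
 4  2 206000   6000   7000  88000   96  410   750   950  420  880 55 25  5 15  0
 2  1 207500   6100   7050  87000  150  280   820   870  410  910 58 22  6 14  0
"""


def swap_activity(vmstat_text):
    """Return a list of (si, so) pairs, one per sample row."""
    rows = [line.split() for line in vmstat_text.strip().splitlines()]
    # The column-name row is the one that contains both "si" and "so".
    header = next(row for row in rows if "si" in row and "so" in row)
    si_idx, so_idx = header.index("si"), header.index("so")
    samples = []
    for row in rows:
        # Data rows have one numeric field per column name.
        if len(row) == len(header) and all(field.isdigit() for field in row):
            samples.append((int(row[si_idx]), int(row[so_idx])))
    return samples


def is_thrashing(samples, min_intervals=3):
    """True if at least min_intervals samples show swap-in AND swap-out."""
    busy = [s for s in samples if s[0] > 0 and s[1] > 0]
    return len(busy) >= min_intervals


if __name__ == "__main__":
    print(is_thrashing(swap_activity(SAMPLE)))
```

Requiring both columns to be non-zero reflects the point above: page-in
activity on its own may just be program binaries being read.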

> max_connections = 128
> shared_buffers = 32768
> sort_mem = 8192 (16384 or 32768 didn't help either)
> wal_files = 32
> wal_buffers = 32
> fsync = false

That looks good. Nothing should be using terribly much memory now.
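
To put the quoted settings into concrete units, here is a small
arithmetic sketch in Python. It assumes the default 8 kB block size
(a custom build could differ) and that sort_mem is given in kilobytes
and applies per sort operation; the variable names are only for
illustration.

```python
BLOCK_KB = 8  # assumption: PostgreSQL's default block size of 8 kB

# Values copied from the quoted configuration.
settings = {
    "max_connections": 128,
    "shared_buffers": 32768,  # counted in blocks
    "sort_mem": 8192,         # counted in kB, per sort operation
    "wal_buffers": 32,        # counted in blocks
}

shared_buffers_mb = settings["shared_buffers"] * BLOCK_KB / 1024
wal_buffers_kb = settings["wal_buffers"] * BLOCK_KB
sort_mem_mb = settings["sort_mem"] / 1024

print(f"shared_buffers: {shared_buffers_mb:.0f} MB, shared by all backends")
print(f"wal_buffers:    {wal_buffers_kb} kB")
print(f"sort_mem:       {sort_mem_mb:.0f} MB per sort operation")
```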

> Mem:  1029400K av, 1023264K used,    6136K free,    0K shrd,    7176K buff

Ok, with only 7176K allocated to buffers, you've definitely got
some programs eating up your RAM, I'd say. Looking at your postmasters
below, sorted by size:

>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
>  6837 postgres   9   0  251M 251M  250M S     9.8 25.0   0:37 postmaster
>  6894 postgres   9   0  247M 247M  246M S     2.1 24.6   0:27 postmaster
>  6848 postgres  16   0  247M 247M  246M R    93.6 24.6   4:06 postmaster
>  6852 postgres   9   0  227M 227M  226M R     7.8 22.6   0:12 postmaster
>  6903 postgres  12   0  204M 204M  203M R    10.2 20.3   0:27 postmaster
>  6911 postgres   9   0 66840  65M 65728 S    19.4  6.4   0:01 postmaster
>  6845 postgres   9   0 52344  51M 50916 S     3.6  5.0   0:09 postmaster
>  6874 postgres   9   0 49408  48M 43168 S    19.8  4.7   3:57 postmaster
>  6875 postgres  11   0 41564  40M 35324 R    18.7  4.0   3:31 postmaster
>  6834 postgres   9   0 25456  24M 24356 S     3.0  2.4   0:26 postmaster
>  6889 postgres   9   0 24844  24M 23632 S    15.8  2.4   0:17 postmaster
>  6893 postgres   9   0 18396  17M 17332 S     0.1  1.7   0:07 postmaster
>  6838 postgres   9   0 18364  17M 17304 S     5.6  1.7   0:04 postmaster
>  6904 postgres   9   0 16604  16M 15528 S     1.0  1.6   0:13 postmaster
>  6907 postgres   9   0 16020  15M 14992 S     1.8  1.5   0:03 postmaster
>  6897 postgres   9   0 14988  14M 13948 S     6.0  1.4   0:01 postmaster
>  6926 postgres   9   0 14572  14M 13756 S    23.8  1.4   0:13 postmaster
>  6920 postgres  10   0 14296  13M 13476 R    21.1  1.3   0:13 postmaster
>  6927 postgres  10   0 14148  13M 13328 R    17.4  1.3   0:12 postmaster
>  6928 postgres   9   0 13836  13M 13016 S    25.8  1.3   0:13 postmaster
>  6917 postgres   9   0  9108 9104  8204 R    19.4  0.8   0:13 postmaster
>  6916 postgres   9   0  8940 8936  8020 S     0.1  0.8   0:08 postmaster
>  4799 root       9   0  1820 1444  1300 S     0.1  0.1   0:07 sshd
>  6934 root      16   0   976  976   732 R     8.0  0.0   0:07 top
>  5929 postgres  15   0   940  884   668 R     8.9  0.0   8:23 top

Some of your backends are getting pretty darn big. I wonder what
they're doing? It can't be sort memory at this point. But as you
can see, those five 200-250MB backends are killing you.

If you can figure out what they're doing, and put a stop to that
memory usage, that would help you. Alternatively, perhaps just
dropping another 1-2 GB of RAM in the machine would fix your problem.

Also, for this kind of thing, it's better to provide a "ps aux" or
"ps -ef" than a top, unless you're sure that that display above is
all of the processes.
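
For what it's worth, a full "ps aux" listing is also easy to sort by
resident size with a few lines of Python. This is only a sketch: it
assumes the usual "ps aux" header with RSS, PID and COMMAND columns,
and the sample rows and the function name top_by_rss are invented for
the example.

```python
# Hypothetical "ps aux" output; the rows are invented for illustration.
SAMPLE_PS = """
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
postgres  6837  9.8 25.0 257024 257024 ?        S    10:01   0:37 postmaster
root      4799  0.1  0.1   1820   1444 ?        S    09:00   0:07 sshd
postgres  6916  0.1  0.8   8940   8936 ?        S    10:05   0:08 postgres: idle
"""


def top_by_rss(ps_aux_text, n=5):
    """Return the n largest processes as (rss_kb, pid, command) tuples."""
    lines = ps_aux_text.strip().splitlines()
    header = lines[0].split()
    rss_idx = header.index("RSS")
    pid_idx = header.index("PID")
    cmd_idx = header.index("COMMAND")
    procs = []
    for line in lines[1:]:
        # Limit the split so a command containing spaces stays in one field.
        fields = line.split(None, cmd_idx)
        procs.append((int(fields[rss_idx]), int(fields[pid_idx]), fields[cmd_idx]))
    procs.sort(reverse=True)
    return procs[:n]


if __name__ == "__main__":
    for rss_kb, pid, command in top_by_rss(SAMPLE_PS, n=2):
        print(f"{rss_kb:>8} kB  pid {pid}  {command}")
```

On a live system the text could come from
subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout
instead of the sample string.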

cjs
--
Curt Sampson  <cjs@cynic.net>   +81 90 7737 2974   http://www.netbsd.org
    Don't you know, in this new Dark Age, we're all light.  --XTC



