Re: One source of constant annoyance identified - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: One source of constant annoyance identified
Date
Msg-id 20020704010022.A8722@svana.org
Whole thread Raw
In response to Re: One source of constant annoyance identified  ("Markus Wollny" <Markus.Wollny@computec.de>)
Responses Re: One source of constant annoyance identified
List pgsql-general
On Wed, Jul 03, 2002 at 03:23:39PM +0200, Markus Wollny wrote:
> Okay, now it's even more annoying...
>
> We just upgraded the server from 1GB to 2GB of RAM because it kept
> swapping out about 300MB.

Well, you've solved the swapping problem.

>   PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
>  1652 postgres   9   0  469M 469M  467M S     0.0 23.3   2:21 postmaster
>  2041 postgres   9   0  431M 431M  429M S     1.2 21.4   1:32 postmaster
>  1588 postgres   9   0  411M 411M  410M S    15.6 20.4   1:25 postmaster
>  1597 postgres   9   0  289M 289M  287M S     0.1 14.3   0:31 postmaster

That %CPU refers to the average CPU over the lifetime of the process. For
example, that first process has done nearly 2 and a half minutes of work.
That's either a lot of queries or one really big one. Now I can imagine
that many queries mapping in all of shared memory. Note that RSS is around
the size of the shared memory you have. So all those 460M processes seem to
be all the same memory, all shared.

What exactly is the problem here? You're not swapping.

> And still the major problem in finding the cause of all this is to
> identify the query which causes these huge backends. At the moment I
> tend to suspect that there certainly are several queries which need a
> bit of RAM, not that much however, and the backend fails to free up the
> memory used by processed queries. What puzzles me here is that all these
> 400MB+ backend-processes are marked as idle (ps-output) respectively
> sleeping, so they don't actually do much. But how come a sleeping
> process grabs 20% of 2GB of RAM and 15% of processing time?

My theory is that it's not using RAM at all, it's simply mapping the shared
memory in, which inflates the RSS. If you reduce the amount of shared
memory, does the RSS of the processes go down too?

> Another fact that hints at backends just eating memory without reason is
> that when I stop the database, processes keep lingering - kill just
> doesn't help, I have to kill -9 the last of the lot to get red of them
> before restarting the database:

Never kill -9. Recipie for disaster. If you want those processes to die,
perhaps you should stop the processes that are accesses the DB. They're the
ones that are idle. Are you using persistant connections at all? If so, how
long for?

Note that pg_ctl has various stop modes: smart, fast and immediate. Make
you're using the one you want.

>  1689 pts/0    S      0:00 postgres: postgres phppgadmin 212.123.109.25
> idle
>  1690 pts/0    S      0:00 postgres: postgres pszone 212.123.109.25 idle
>
>  1691 pts/0    S      0:00 postgres: postgres saturn 212.123.109.25 idle
>
>  1693 pts/0    S      0:00 postgres: postgres template1 212.123.109.25
> idle
>  1780 pts/0    S      0:00 postgres: postgres template1 212.123.108.149
> idle
>  1781 pts/0    S      0:00 postgres: postgres template1 212.123.108.149
> idle
>  1784 pts/0    S      0:00 postgres: postgres pcgames 212.123.108.149
> idle

Note that these are not the processes you listed above. The process times
don't match. You also have quite a lot of connections to template1.

>   3:11pm  up  2:03,  3 users,  load average: 3.24, 1.65, 1.81

Look at that load average. You have a whole CPU idle. CPU power is not your
limitation. No swap so that's not the problem. You really need to work out
what is slow.

> Which isn't quite as bad anymore - although it cannot be the optimum
> performance for this machine, I can't imagine.

So is it actually slow or are you looking for some ethereal "faster".

> Now what on earth can I do to get this DB running, and when I say
> "running" I don't mean "sort of crawling uphills"? Are there any

Look. 90% of performance improvement comes from modifying queries. Tweaking
the config settings can really only account for so much. There is nothing in
any of your messages where you have provided anything that we can use to
help you. There are no optimal settings, they depend entirely on what your
queries are.

> OS-patches or environment-settings I need to consider? Are there known
> memory-leaks? I just doesn't seem to matter whichever settings I take in
> postgresql.conf, the database just eats up any available memory in any
> case. I cannot use it for production purposes this way. I am quite
> dispaired right now - and I am definitely running out of time. Which
> bits of linux (glibc, kernel, whatnot) does postgresql 7.2.1 need in
> which version to run smoothly? As I mentioned before, we don't use any
> "fancy" features of the database like foreign keys, triggers (except
> one) or whatever, it's just basic functionality that seems to fall down
> on us...

By the way, you do notice the little figure in your top output labeled
"cached" that is around 1.7GB. I do beleive that means the kernel is using
that much memory for disk cache. You are not running out of memory by any
stretch of the imagination (unless one of your queries is using a lot, but
it appears to be getting freed at the end).

> I am willing to try almost anything - but I need to squeeze more
> performance out of this thing and I need some hints on which tools to
> use to identify the problem. And unfortunately I need all this as soon
> as possible... Hints and help are very, very much appreciated. Thank
> you!

Work out where your bottleneck is. It's not memory and it doesn't appear to
be CPU. With that much cache I can't imagine it's disk transfer rate either.
So the only explanation is the individual queries.

As for hints, the straight forward method is the best. Find whatever action
is the slowest and profile it. How much in the front end, how much in the
database, how much in client latency. Only once you understand where the
time is going can you do any meaningful optimisation. Tweaking config
settings will not help you here.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> There are 10 kinds of people in the world, those that can do binary
> arithmetic and those that can't.



pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: One source of constant annoyance identified
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Why is index disregarded when querying a timestamp?