Thread: Is IDLE session really idle?

Is IDLE session really idle?

From

Igor Polishchuk

Date:

15 June 2009, 17:22:57

Hello everybody!

I have many app servers using connection pooling. At any time, there are about 1000 total connection to the database from all the app servers; however, only few random connections are active.
The application often executes relatively big sorts. The work_mem size is 32MB, and eventually many sessions have a chance to run a sort and allocate a big sort area. I see hundreds of postgres processes with DATA segment > 15MB.
Eventually, it consumes all the available memory. Most of this memory is allocated to the sessions that are idle. I cannot change the connection pooling on the application side, and the big sorts cannot be eliminated. I need a solution on the DB side.

I see two ways to solve the problem.:

Reducing the work_mem to something like 10MB will cap the total memory consumption. However, the memory will still be allocated to the mostly idle sessions. Also, reduced work_mem will cause increased disk I/O, which is already high.
We can kill the idle sessions periodically. This will free up a big chunk of memory already allocated to the sessions. The application will gradually reestablish the connections, and the new sessions will start with small memory foot-print.

The question is, how safe it is to kill an idle session? If a session just became idle in pg_stats_activity, is it possible that it is still returning data to the client, or doing some other useful work?

In case it makes a difference: The Postgres version is 8.2.4 on Linux Suse 10.

Thank you.

Re: Is IDLE session really idle?

From

Tom Lane

Date:

15 June 2009, 17:32:07

Igor Polishchuk <ipolishchuk@hi5.com> writes:
> The application often executes relatively big sorts. The work_mem size is
> 32MB, and eventually many sessions have a chance to run a sort and allocate
> a big  sort area. I see hundreds of postgres processes with DATA segment >
> 15MB.
> Eventually, it consumes all the available memory. Most of this memory is
> allocated to the sessions that are idle.  I cannot change the connection
> pooling on the application side, and the big sorts cannot be eliminated. I
> need a solution on the DB side.

Find out why the workspace isn't being released back to the OS.  On some
old versions of Unix, not releasing data space is the norm, but that is
not true on any remotely modern version of Linux.

> In case it makes a difference: The Postgres version is 8.2.4 on Linux Suse
> 10.

... which apparently is what you are running, which makes me question
whether you are really seeing any such effect at all.  Are you sure what
you are measuring isn't just top's inclination to report only a portion
of shared memory as having been used by the process?

I strongly suspect that what you are seeing is an illusion caused by
your measurement tools.

            regards, tom lane

Re: Is IDLE session really idle?

From

Tino Schwarze

Date:

15 June 2009, 17:33:18

Hi Igor,

On Mon, Jun 15, 2009 at 01:15:30PM -0700, Igor Polishchuk wrote:

> 2. We can kill the idle sessions periodically. This will free up a big chunk
> of memory already allocated to the sessions. The application will gradually
> reestablish the connections, and the  new sessions will start with small
> memory foot-print.

Don't do that. There is a race condition - if the application just
starts using the connection a millisecond after you thought it was idle
and issued the kill command, bad things might happen, depending on the
robustness of the application. It might have already validated the
connection and started the first query.

> The question is, how safe it is to kill an idle session? If a session just
> became idle in pg_stats_activity, is it possible that it is still returning
> data to the client, or doing some other useful work?

I recently figured out (on PostgreSQL 8.2.x) that an IDLE session might
still be returning query results. I wrote an application to dump whole
tables to flat files. It does a SELECT * FROM table, then streams to the
file. Access is via JDBC, results are retrieved row by row via
ResultSet.next(). I noticed that in pg_stat_activity a session switched
between IDLE and "SELECT * FROM table" command while data was being
retrieved.

HTH,

Tino.

--
"What we nourish flourishes." - "Was wir nähren erblüht."

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de

Re: Is IDLE session really idle?

From

Michael Monnerie

Date:

15 June 2009, 17:36:01

On Montag 15 Juni 2009 Igor Polishchuk wrote:
> I see two ways to solve the problem.:

Why not just increase
vm.swappiness = 100
in /etc/sysctl.conf, and/or use ramzswap? That will swap out unneeded
memory very fast, and free it for use by others.

There was an article on lwn.net recently:
http://lwn.net/SubscriberLink/334649/76bbe60a81cfc7f5/
( Compcache: in-memory compressed swapping )
If installed it on several servers, and my desktop. quite nice:
Compresses 2GB to 512MB on my desktop, or 494MB to 186MB on a server.

Even if staying with "normal" swap, increasing the swappiness can
quickly free memory, but swap I/O might slow things down, depending on
your setup.

mfg zmi
--
// Michael Monnerie, Ing.BSc    -----      http://it-management.at
// Tel: 0660 / 415 65 31                      .network.your.ideas.
// PGP Key:         "curl -s http://zmi.at/zmi.asc | gpg --import"
// Fingerprint: AC19 F9D5 36ED CD8A EF38  500E CE14 91F7 1C12 09B4
// Keyserver: wwwkeys.eu.pgp.net                  Key-ID: 1C1209B4

Re: Is IDLE session really idle?

From

Scott Marlowe

Date:

15 June 2009, 18:13:05

On Mon, Jun 15, 2009 at 2:15 PM, Igor Polishchuk<ipolishchuk@hi5.com> wrote:
> Hello everybody!
>
> I have many app servers using connection pooling. At any time, there are
> about 1000 total connection to the database from all the app servers;
>  however, only few random connections are active.
> The application often executes relatively big sorts. The work_mem size is
> 32MB, and eventually many sessions have a chance to run a sort and allocate
> a big  sort area. I see hundreds of postgres processes with DATA segment >
> 15MB.
> Eventually, it consumes all the available memory. Most of this memory is
> allocated to the sessions that are idle.  I cannot change the connection
> pooling on the application side, and the big sorts cannot be eliminated. I
> need a solution on the DB side.

Could you post the commands and output you used to determine this?  I
agree with Tom that you might be mis-measuring memory usage.

You do not mention your shared_memory size or total memory size.  If
you've got say 16G of ram and 12 Gig of shared_memory, then it's quite
possible having touched all of shared_memory a pgsql backend will show
10 or 12 Gigs memory used.  It's not actually independently using that
much, but some people freak when they see it the first time.

Re: Is IDLE session really idle?

From

Igor Polishchuk

Date:

15 June 2009, 19:25:21

Thank you guys for your responses.

I'm using top to look for the memory consumption by the postgres processes.
Inside the top, I'm using the interactive sorting command F and choose the
sort by Data segment size. The exact sequence is:
1. launch top
2. Press 'F', the list of available for sorting fields appears on the
screen, including "s: DATA       = Data+Stack size (kb)"
3. press "s"
4. Top now shows the list of processes sorted by the field DATA

Here how the sorted top screen looks in the end:

top - 15:14:37 up 1 day, 23:20,  3 users,  load average: 5.64, 5.18, 5.10
Tasks: 819 total,   2 running, 817 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.4%us,  0.8%sy,  0.0%ni, 75.4%id, 22.1%wa,  0.0%hi,  0.3%si,
0.0%st
Mem:  32876676k total, 18455704k used, 14420972k free,   129856k buffers
Swap:  2104504k total,     2720k used,  2101784k free, 15038240k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  DATA COMMAND
 4392 hyperic   16   0  410m  73m 9672 S    0  0.2 136:17.29 363m java
29487 postgres  16   0  108m  83m 2680 S    0  0.3   0:01.10  81m view
  803 postgres  16   0 2300m 180m 163m S    0  0.6   0:01.04  23m postgres
 1808 postgres  16   0 2300m 164m 147m S    0  0.5   0:01.03  23m postgres
  577 postgres  16   0 2298m 166m 150m S    0  0.5   0:00.87  22m postgres
  568 postgres  16   0 2298m 141m 126m S    0  0.4   0:00.63  22m postgres
 1506 postgres  16   0 2298m 139m 124m S    0  0.4   0:00.81  22m postgres
  362 postgres  16   0 2292m 128m 115m S    0  0.4   0:00.66  16m postgres
 7674 postgres  15   0 2288m  29m  20m S    0  0.1   0:00.10  13m postgres
 7238 postgres  16   0 2289m  61m  52m S    0  0.2   0:00.23  12m postgres
 7440 postgres  16   0 2288m  51m  42m S    0  0.2   0:00.18  12m postgres
 7248 postgres  16   0 2288m  52m  44m S    0  0.2   0:00.17  12m postgres
 7336 postgres  16   0 2288m  59m  50m S    0  0.2   0:00.20  12m postgres
 7246 postgres  16   0 2288m  52m  44m S    0  0.2   0:00.12  12m postgres
 6913 postgres  16   0 2288m  59m  51m S    0  0.2   0:00.22  12m postgres
 7013 postgres  16   0 2288m  51m  43m S    0  0.2   0:00.10  12m postgres
 7288 postgres  16   0 2288m  48m  39m S    0  0.2   0:00.16  12m postgres
 7327 postgres  16   0 2288m  53m  44m S    0  0.2   0:00.16  12m postgres
 7070 postgres  16   0 2288m  50m  42m S    0  0.2   0:00.16  12m postgres
 7543 postgres  15   0 2288m  47m  39m S    0  0.1   0:00.13  11m postgres
........

Also, in vmstat, I see the gradual reduction in size of the cache memory.
Apparently, the Linux cache gets gradually dismissed by the postgres
processes memory areas. Eventually, the database just hangs and the host
becomes unresponsive for about 15 minutes till the sessions die out.
So, I believe it is not just my misinterpretation of the metrics.

Isn't it true, that work memory once allocated for a  session does not get
deallocated till the sessions is closed? It was my impression, anyway.
So, eventually enough sessions get big work memory allocated to starve the
Linux out of memory.

My physical memory size is 32GB,
Shared_buffers = 2GB



On 6/15/09 2:13 PM, "Scott Marlowe" <scott.marlowe@gmail.com> wrote:

> On Mon, Jun 15, 2009 at 2:15 PM, Igor Polishchuk<ipolishchuk@hi5.com> wrote:
>> Hello everybody!
>>
>> I have many app servers using connection pooling. At any time, there are
>> about 1000 total connection to the database from all the app servers;
>>  however, only few random connections are active.
>> The application often executes relatively big sorts. The work_mem size is
>> 32MB, and eventually many sessions have a chance to run a sort and allocate
>> a big  sort area. I see hundreds of postgres processes with DATA segment >
>> 15MB.
>> Eventually, it consumes all the available memory. Most of this memory is
>> allocated to the sessions that are idle.  I cannot change the connection
>> pooling on the application side, and the big sorts cannot be eliminated. I
>> need a solution on the DB side.
>
> Could you post the commands and output you used to determine this?  I
> agree with Tom that you might be mis-measuring memory usage.
>
> You do not mention your shared_memory size or total memory size.  If
> you've got say 16G of ram and 12 Gig of shared_memory, then it's quite
> possible having touched all of shared_memory a pgsql backend will show
> 10 or 12 Gigs memory used.  It's not actually independently using that
> much, but some people freak when they see it the first time.

Re: Is IDLE session really idle?

From

Scott Marlowe

Date:

15 June 2009, 19:44:22

On Mon, Jun 15, 2009 at 4:25 PM, Igor Polishchuk<ipolishchuk@hi5.com> wrote:
> Thank you guys for your responses.
>
> I'm using top to look for the memory consumption by the postgres processes.
> Inside the top, I'm using the interactive sorting command F and choose the
> sort by Data segment size. The exact sequence is:
> 1. launch top
> 2. Press 'F', the list of available for sorting fields appears on the
> screen, including "s: DATA       = Data+Stack size (kb)"
> 3. press "s"
> 4. Top now shows the list of processes sorted by the field DATA
>
> Here how the sorted top screen looks in the end:
>
> top - 15:14:37 up 1 day, 23:20,  3 users,  load average: 5.64, 5.18, 5.10
> Tasks: 819 total,   2 running, 817 sleeping,   0 stopped,   0 zombie
> Cpu(s):  1.4%us,  0.8%sy,  0.0%ni, 75.4%id, 22.1%wa,  0.0%hi,  0.3%si,
> 0.0%st
> Mem:  32876676k total, 18455704k used, 14420972k free,   129856k buffers
> Swap:  2104504k total,     2720k used,  2101784k free, 15038240k cached

Out of 32G, you've got 14G free, 18G used of which 15G is cache.  Looks good.

>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  DATA COMMAND
>  4392 hyperic   16   0  410m  73m 9672 S    0  0.2 136:17.29 363m java
> 29487 postgres  16   0  108m  83m 2680 S    0  0.3   0:01.10  81m view
>  803 postgres  16   0 2300m 180m 163m S    0  0.6   0:01.04  23m postgres
>  1808 postgres  16   0 2300m 164m 147m S    0  0.5   0:01.03  23m postgres
>  577 postgres  16   0 2298m 166m 150m S    0  0.5   0:00.87  22m postgres
>  568 postgres  16   0 2298m 141m 126m S    0  0.4   0:00.63  22m postgres
>  1506 postgres  16   0 2298m 139m 124m S    0  0.4   0:00.81  22m postgres
>  362 postgres  16   0 2292m 128m 115m S    0  0.4   0:00.66  16m postgres
>  7674 postgres  15   0 2288m  29m  20m S    0  0.1   0:00.10  13m postgres
>  7238 postgres  16   0 2289m  61m  52m S    0  0.2   0:00.23  12m postgres
>  7440 postgres  16   0 2288m  51m  42m S    0  0.2   0:00.18  12m postgres
>  7248 postgres  16   0 2288m  52m  44m S    0  0.2   0:00.17  12m postgres
>  7336 postgres  16   0 2288m  59m  50m S    0  0.2   0:00.20  12m postgres
>  7246 postgres  16   0 2288m  52m  44m S    0  0.2   0:00.12  12m postgres
>  6913 postgres  16   0 2288m  59m  51m S    0  0.2   0:00.22  12m postgres
>  7013 postgres  16   0 2288m  51m  43m S    0  0.2   0:00.10  12m postgres
>  7288 postgres  16   0 2288m  48m  39m S    0  0.2   0:00.16  12m postgres
>  7327 postgres  16   0 2288m  53m  44m S    0  0.2   0:00.16  12m postgres
>  7070 postgres  16   0 2288m  50m  42m S    0  0.2   0:00.16  12m postgres
>  7543 postgres  15   0 2288m  47m  39m S    0  0.1   0:00.13  11m postgres

VIRT of 2.2G is pretty normal, considering it likely includes all of
shared_buffers ever touched.  Is this machine now at idle?

> Also, in vmstat, I see the gradual reduction in size of the cache memory.
> Apparently, the Linux cache gets gradually dismissed by the postgres
> processes memory areas.

Well, this top doesn't show that.  As long as you've got 14G or so
free the linux kernel won't be discarding cache.

> Eventually, the database just hangs and the host
> becomes unresponsive for about 15 minutes till the sessions die out.
> So, I believe it is not just my misinterpretation of the metrics.

Do you have output of top and vmstat when this is happening?

> Isn't it true, that work memory once allocated for a  session does not get
> deallocated till the sessions is closed?

By session do you mean transaction, or the life of the connection?

> It was my impression, anyway.
> So, eventually enough sessions get big work memory allocated to starve the
> Linux out of memory.

But we've seen no evidence of this happening in vmstat or top.  It
would be really handy to see what they're saying when this is
happening.

> My physical memory size is 32GB,
> Shared_buffers = 2GB

Re: Is IDLE session really idle?

From

Tom Lane

Date:

15 June 2009, 19:48:14

"Scott Whitney" <swhitney@journyx.com> writes:
> Sorry to bother you. Can you route me towards any information on this
> statement?

> " top's inclination to report only a portion
> of shared memory as having been used by the process?"

On many platforms, top shows each backend process as having "used" only
as many pages of PG's shared memory segment as that specific process has
actually touched in its lifespan.  So the normal behavior is that the
reported memory size gradually rises from just-private-space to
private-space-plus-all-of-shared-memory as the process randomly happens
to have need to touch each of the shared buffer slots.  This is on top
of any actual increase in the process's own (private) memory space.

I suspect the other guy was seeing this effect more than any real
long-term increase in private space.  But of course there's not
enough evidence in what he posted to prove it one way or the other.

Some versions of "top" distinguish shared and private memory usage,
which really helps.  But a lot don't, or people may not know which
column to pay attention to.

> I _thought_ I had seen such behavior in the past, but I've never found (not
> particularly looked) for such information. Top seems to lie to me on a
> fairly-regular basis, and not just via PG...

This particular effect only applies to applications that use large
chunks of shared memory.  There might be some other misleading things :-(

            regards, tom lane

Re: Is IDLE session really idle?

From

Tom Lane

Date:

15 June 2009, 20:19:21

Igor Polishchuk <ipolishchuk@hi5.com> writes:
> ... Eventually, the database just hangs and the host
> becomes unresponsive for about 15 minutes till the sessions die out.

Well, that's a problem all right, but I'm entirely unconvinced by your
hypothesis about what's causing it.  Have you tried watching "vmstat"
(particularly the swap rate) while things go downhill?  Do any signs
of distress start to appear in the postmaster log or other system logs?
Do other services on the box start to fail, and if so what are they
complaining about?

> Isn't it true, that work memory once allocated for a  session does not get
> deallocated till the sessions is closed?

No.

            regards, tom lane

Re: Is IDLE session really idle?

From

Dimitri Fontaine

Date:

22 June 2009, 10:53:55

Hi,

Tom Lane <tgl@sss.pgh.pa.us> writes:
> "Scott Whitney" <swhitney@journyx.com> writes:
>> I _thought_ I had seen such behavior in the past, but I've never found (not
>> particularly looked) for such information. Top seems to lie to me on a
>> fairly-regular basis, and not just via PG...
>
> This particular effect only applies to applications that use large
> chunks of shared memory.  There might be some other misleading things
> :-(

You might appreciate this reading (which talks about using exmap to get
reliable figures, but it seems to relate to shared objects and it's
unclear whether exmap will account correctly for IPC shared memory):

  http://ktown.kde.org/~seli/memory/desktop_benchmark.html

  The tool used to measure memory usage was Exmap - the only tool for
  measuring memory usage that I've ever found to be actually useful (I
  think I've already blogged about it ;) ). Its so-called effective
  memory usage numbers try to account for things like dividing shared
  libraries among all the processes using them, unlike tools like top
  that just report the numbers they find in /proc and nobody really
  knows how to interpret them. In other words, if you use things like
  top or free for precise measuring of memory usage, you're
  crazy. Nevertheless, for the crazy ones, I used also free alongside
  with Exmap, just for the fun of it, numbers from free will follow in
  parentheses. They should not be considered to be useful though.

Regards,
--
dim