Thread: Server side backend permanent session memory usage?
Hi,
There seems to be an unbounded growth of memory usage by the backend postgres process representing a “permanent” session in our system.
The SIZE/RES values reported by the "top" utility seem to exceed what I would expect given the memory and buffer tuning parameters in postgresql.conf. (Though I find a theoretical per-session maximum difficult to nail down from the documentation.)
I have seen other reports of session backend memory usage on Stack Overflow, where the workaround has amounted to periodically dropping and re-establishing the pool's connections. Otherwise, over time (many days) the connections' backends collectively keep accumulating memory, swap space starts coming into play, and system performance takes a big hit.
I have implemented the session-refresh concept and begun testing for improvement, but at least wanted to report this and get some feedback on my apparent lack of understanding of what is going on under the hood here.
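The refresh is driven from our pooler, but a server-side sketch of the same idea could look like the following (the role name and age threshold are placeholders, and it assumes the pooler transparently re-establishes dropped connections):

    -- Sketch only: terminate long-lived backends so the pooler reconnects them.
    -- 'voip_app' and the one-day threshold are hypothetical placeholders.
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE usename = 'voip_app'
      AND backend_start < now() - interval '1 day';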
Some system notes:
FreeBSD 10.x
Postgres 9.3.5 and 9.5 systems
Hardware: 2 GB / 2-core and 4 GB / 4-core machines
There are no memory related errors reported in system logs or postgres logs.
There are no outstanding uncommitted transactions (per pgAdmin III views).
The offered load is a very regular set of queries distributed across 5-10 "permanent" sessions serving a VoIP softswitch.
I have played around with adjusting work_mem and shared_buffers up and down with no impact on the observed memory growth (a quick way to read these settings back is shown after these notes).
On initial session connection, the "top" utility reports on the order of 220M/38000K (SIZE/RES) for the connection's backend.
After 4 days of operation the same session shows 516M/300M.
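For reference, the settings I have been varying can be read back from any session; a minimal check along these lines:

    -- The two parameters I have been adjusting, plus other memory-related knobs.
    SELECT name, setting, unit
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'work_mem', 'temp_buffers', 'maintenance_work_mem');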
Thank you for any insights or suggestions on how to better analyze what is transpiring.
Best Regards
Dave Day
"Day, David" <dday@redcom.com> writes: > There seems to be an unbounded growth of memory usage by the backend postgres process representing a "permanent" sessionin our system. It's hard to evaluate this report with so little information, but there are at least three possible explanations: 1. Many people misunderstand "top"'s output and believe they are seeing memory bloat when they aren't. This happens because top only charges pages of shared memory to a process after the process has first physically touched those pages. So a PG process's claimed use of shared memory will gradually grow from nothing to the whole shared-memory area, as it has occasion to make use of different shared buffers, lock table entries, etc. You can correct for this by subtracting the SHR (shared) column from the process's reported size, but people often fail to. (Note: it's possible that FreeBSD's implementation doesn't suffer from this problem, but the issue definitely exists on e.g. Linux.) 2. If, over time, the queries issued to the process touch many different tables (I'm talking thousands of tables), or execute large numbers of distinct plpgsql functions, etc, then you will get bloating of the internal caches that hold copies of that catalog data. PG generally operates on the principle that cached is better than not cached, so it doesn't try to limit the size of those caches; but in some installations that can cause problems. If this is your situation, then indeed restarting the sessions periodically may be necessary. 3. Or you might have found an actual memory leak. PG's memory usage conventions are such that true leaks that persist across transactions are pretty rare ... but I won't say it doesn't happen. If you've eliminated point 1 and want to try to look into the other theories, you could do this: attach to a recently-started session with gdb, and execute call MemoryContextStats(TopMemoryContext) quit This will cause a memory usage map to get dumped to stderr (hopefully you are starting the postmaster in such a way that that gets captured to a log file rather than sent to /dev/null). Save that. Wait until you see bloat, reattach and repeat, compare the memory maps. Let us know what you see. If possible, compare maps taken at points where the session is idle and waiting for input. regards, tom lane
Hi,

I've been knocking heads with this issue for a while without updating this thread. I have implemented a refresh-connection behavior on our pooler/client side, which seems to be a successful work-around for the memory-loss indications on the backend side.

With that "solution" in my pocket, and at the suggestion/hunch of one of our developers, I ran this experiment. We have a server-side function written in plpgsql that is called by the client-side application. That function does work that includes invoking a server-side plpython2u function. If I repeatedly call this server-side logic/function outside our app, from a psql-invoked connection, and monitor memory on the backend side, I see top reporting a continual growth trend after each burst of invocations.

The premise was that the repeated invocation of the python interpreter, and/or the python function's use of a "subprocess" method, does not clean up correctly, and that this accounts for the memory growth and system performance degradation.

FreeBSD 10.1, postgres 9.5.3 or postgres 9.3.11, running on a VM, no relevant problems indicated in log files.

I could attempt to create a self-contained example for the pg community to explore, or preferably can attach gdb to the backend for those who would like to suggest some particular details that might confirm or rule out this theory.

Thanks

Dave Day

--------

Tom,

Thank you for that very considered answer. I will put that information to use over the next couple of days and get back to the gen-list with my findings.

I hear what you are saying about item 1. I suspect it is at least an indicator of the degradation if not the actual cause of it. The only thing I have had to restart historically to recover normal operation is the pooling agent. (And the pooler's/client's memory stats are pretty stable with run time.)

There are only about 120 tables in two schemas that could be accessed by these sessions, and I suspect what they are actually repeatedly accessing is a very small subset of that, 20-30 tables and perhaps 30-40 functions, which perhaps makes item 2 unlikely.

Item 3 - should be doable to get these results - might have some information by tomorrow.

Regards

Dave Day
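A self-contained example would presumably have roughly the following shape; the function names and the command run via subprocess are hypothetical stand-ins for what our application actually does:

    CREATE EXTENSION IF NOT EXISTS plpython2u;  -- requires the plpython2u package to be installed

    -- Hypothetical stand-in for our plpython2u helper: runs an external
    -- command via subprocess and returns its output.
    CREATE OR REPLACE FUNCTION py_run_cmd() RETURNS text AS $$
        import subprocess
        return subprocess.check_output(['/bin/date'])
    $$ LANGUAGE plpython2u;

    -- Hypothetical stand-in for the plpgsql wrapper the application calls.
    CREATE OR REPLACE FUNCTION app_do_work() RETURNS void AS $$
    BEGIN
        PERFORM py_run_cmd();
    END;
    $$ LANGUAGE plpgsql;

    -- A burst of invocations from a psql session; watch the backend with top.
    SELECT app_do_work() FROM generate_series(1, 10000);

If a backend's memory keeps climbing across repeated bursts of that shape, that would point at the plpython/subprocess path rather than our plpgsql code.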
"Day, David" <dday@redcom.com> writes: > We have a server side function written in plpgsql called by the client side application. That fx does work that includes > Invoking a server side plpython2u function. > If I repeat calling this severer side logic/function outside our app from a a psql invoked connection and I monitor > the memory growth on the backend side I see top reporting a continual growth trend after each burst of invocations. > The premise was that the issue of the repeated invocation of the python interpreter and/or the python function use ofa "subprocess" method > does not cleanup correctly and this accounts for the memory growth and system performance degradation. > FreeBSD 10.1, postgres 9.5.3 or postgres 9.3.11, running on VM, no relevant problems indicated in log files. It'd be very useful to find out whether you can reproduce the leakage in 9.6beta (any beta release will do). There are a couple of significant fixes in plpython that might be relevant --- I'm looking at these in particular: https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=8c75ad436 https://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=1d2fe56e4 We did not back-patch those changes because they seemed rather invasive/risky compared to the value for average users. If you still see misbehavior in 9.6, it'd be worth trying to extract a self-contained example. regards, tom lane
Tom,

I would venture that the first item you listed probably explains our observations. I plan to write up an internal trouble report on our end, so that when we start using 9.6 we will disable the connection-refresh workaround, re-check the issue, and report back at that time. (Not soon, unfortunately.)

I guess then it is a word of caution in the meantime to users of 9.5 and earlier versions who use the python extension and have long-lived connections.

Thanks so much for your assistance.

Best Regards

Dave Day