Thread: PostgreSQL processes use large amount of private memory on Windows

PostgreSQL processes use large amount of private memory on Windows

From: Øystein Kolsrud

I have a product that uses PostgreSQL (v9.6) as a database on Windows, and we often observe that the "shared working set" memory reported for the individual connection processes is quite high (around 200 MB), while the "private working set" is relatively low (around 5 MB). I assume this is because the processes are using the shared buffers of Postgres, and in these cases everything works fine, as the total memory consumption on the system is quite low. But in some deployments we have observed that the "private working set" is around 200 MB as well, and in these cases the server quickly runs out of memory when the number of connections rises. I have never been able to reproduce this behavior myself.

So my question is: When does a postgres process forked for a connection use private memory instead of shared, and what can I do to avoid this?

The reference documentation (https://www.postgresql.org/docs/9.6/runtime-config-resource.html) describes the settings affecting memory usage, including the "shared_buffers" setting, which I can see has a clear impact on the "shared working set" utilization, but I find no mention of potential "private working set" consumers. This page also gives clues about performance tuning, but again there is no mention of private memory: https://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

All settings typically referred to (like "work_mem" and "temp_buffers") are at their default values, and I have a very hard time seeing how those settings could add up to memory on the order of 200 MB. The "max_connections" setting is quite high (>1000), but there are nowhere near that many connections active in the system. The only way I have been able to trigger growth of processes in my tests is to extract large amounts of data, but when I do that, it is always the "shared working set" that grows; the "private working set" remains stable at around 5 MB.
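
For reference, the effective values can be double-checked with something like this (just a sketch against the pg_settings view; nothing special):

    SELECT name, setting, unit, source
    FROM pg_settings
    WHERE name IN ('shared_buffers', 'work_mem', 'temp_buffers',
                   'maintenance_work_mem', 'max_connections');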

Best regards, Øystein Kolsrud

Re: PostgreSQL processes use large amount of private memory on Windows

From: Thomas Kellerer
Øystein Kolsrud wrote on 17.09.2020 at 11:19:
> All settings typically referred to (like "work_mem" and
> "temp_buffers") are at default settings, and I have a very hard time
> seeing how those settings could add up to memory in the magnitude of
> 200MB. The "max_connections" settings is quite high (>1000), but
> there are nowhere near that many connections active in the system.
> The only way I have been able to trigger growth of processes in my
> tests is to extract large amounts of data, but when I do that, it is
> always the "shared working set" that grows. "private working set"
> remains stable at around 5 MB.


It's not clear to me how many open (vs. "active") connections you have.
An "active" connection is one that is currently _running_ a statement,
an "open" is one that is "idle", i.e. not doing anything.

To my knowledge, a single backend process (=connection) will not release
the memory it once allocated. The only way to free that memory is to
physically close the connection.

So if all of your > 1000 connections are currently idle/open (but not "active"),
they will still consume the memory they allocated _when_ they were
active.

With that high a number of connections, it isn't really surprising that
you run out of memory at some point.
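
If it helps, something along these lines (a sketch against
pg_stat_activity, which is available in 9.6) shows how many backends
are merely open versus actually active:

    -- one row per backend; 'idle' means open but not running anything
    SELECT state, count(*)
    FROM pg_stat_activity
    GROUP BY state
    ORDER BY count(*) DESC;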




Re: PostgreSQL processes use large amount of private memory on Windows

From: Tom Lane
Øystein Kolsrud <kolsrud@gmail.com> writes:
> So my question is: When does a postgres process forked for a connection use
> private memory instead of shared, and what can I do to avoid this?

The only significant long-term consumption of private memory is for
caches.  There are catalog caches, which can get large if the session
accesses a whole lot of database objects (e.g., thousands of different
tables).  Some of the PLs maintain caches with parsed versions of any
function that's been executed.  (An ex-employer of mine had a lot of
trouble in that regard, because they had hundreds of thousands of lines
worth of plpgsql functions.)  There isn't any user-accessible knob for
limiting the size of those caches.  If you have a problem of that sort,
about the only way to mitigate it is to use fewer backends so that the
total memory consumption stays in bounds, or redesign your application.
In some cases it might help to restart your sessions when they get too
big, but that seems like at best a band-aid.
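
For what it's worth, on PostgreSQL 14 and later (so not on the 9.6
server discussed here) the per-backend memory contexts can be
inspected directly, which makes this kind of cache growth visible.
A sketch:

    -- largest memory contexts of the current backend; the catalog
    -- caches live under CacheMemoryContext
    SELECT name, parent, total_bytes
    FROM pg_backend_memory_contexts
    ORDER BY total_bytes DESC
    LIMIT 10;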

            regards, tom lane



Re: PostgreSQL processes use large amount of private memory on Windows

From: "Peter J. Holzer"
On 2020-09-17 11:19:07 +0200, Øystein Kolsrud wrote:
> I have a product that uses PostgreSQL (v9.6) as database on Windows, and we
> often observe that the "shared working set" memory reported for the individual
> connection processes is quite high (around 200MB), but the "private working
> set" is relatively low (around 5 MB). I assume this is due to the processes
> utilizing the shared buffers of Postgres, and in these cases everything works
> fine as the total memory consumption on the system is quite low. But for some
> deployments we have observed that the "private working set" is around 200 MB as
> well, and in these cases the server quickly runs out of memory when the number
> of connections rises. I have never been able to reproduce this behavior myself.
[...]
> All settings typically referred to (like "work_mem" and "temp_buffers") are at
> default settings, and I have a very hard time seeing how those settings could
> add up to memory in the magnitude of 200MB.

The work_mem setting is per operation (e.g. sort, hash join, etc.), so a
complex query may use a multiple of it. However, the default is just 4 MB,
so a query would have to be very complex to use 50 times as much. Also,
that memory is freed after the query (though "freed" does not
necessarily mean "returned to the OS"; it can just mean "marked as
available for reuse" - that depends on the C library and the OS).
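
The per-operation behaviour is easy to see with a self-contained
sketch like this; each sort or hash node in the plan reports its own
memory usage, and each is allowed up to work_mem on its own:

    SET work_mem = '4MB';
    EXPLAIN ANALYZE
    SELECT g.i, count(*)
    FROM generate_series(1, 100000) AS g(i)
    JOIN generate_series(1, 100000) AS h(i) ON g.i = h.i
    GROUP BY g.i
    ORDER BY count(*) DESC;
    -- look for "Sort Method: ... Memory: ..." and "Memory Usage: ..."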

What I would do:

* Set log_statement to all (warning: that can be a lot of log messages.
  It can also be a privacy/security hazard, depending on who has access
  to the server and how sensitive queries are).
* Frequently (at least once per minute) record the size of all postgres
  processes. Send an alert if one of them is "too large".
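
For the first point, the logging can be enabled without editing
postgresql.conf by hand (a sketch, assuming superuser access; the
per-process sizes still have to be sampled at the OS level, e.g. with
Task Manager or a scheduled script):

    ALTER SYSTEM SET log_statement = 'all';   -- can be very verbose
    SELECT pg_reload_conf();
    -- revert later with: ALTER SYSTEM RESET log_statement;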

This should give you a good idea what the processes were doing at the
time they allocated that memory, so that you can reproduce the problem.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"

Re: PostgreSQL processes use large amount of private memory on Windows

From: raf
On Thu, Sep 17, 2020 at 10:06:07AM -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Øystein Kolsrud <kolsrud@gmail.com> writes:
> > So my question is: When does a postgres process forked for a connection use
> > private memory instead of shared, and what can I do to avoid this?
> 
> The only significant long-term consumption of private memory is for
> caches.  There are catalog caches, which can get large if the session
> accesses a whole lot of database objects (e.g., thousands of different
> tables).  Some of the PLs maintain caches with parsed versions of any
> function that's been executed.  (An ex-employer of mine had a lot of
> trouble in that regard, because they had hundreds of thousands of lines
> worth of plpgsql functions.)  There isn't any user-accessible knob for
> limiting the size of those caches.  If you have a problem of that sort,
> about the only way to mitigate it is to use fewer backends so that the
> total memory consumption stays in bounds, or redesign your application.
> In some cases it might help to restart your sessions when they get too
> big, but that seems like at best a band-aid.
> 
>             regards, tom lane

For what it's worth, I have 171305 lines of plpgsql/sql
functions and it hasn't caused any problem on a server
with 4GB RAM - though with only a small number of
long-lived connections.

cheers,
raf




Re: PostgreSQL processes use large amount of private memory on Windows

From: Øystein Kolsrud
On Thu, Sep 17, 2020 at 7:27 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
> What I would do:
>
> * Set log_statement to all (warning: that can be a lot of log messages.
>   It can also be a privacy/security hazard, depending on who has access
>   to the server and how sensitive queries are).
> * Frequently (at least once per minute) record the size of all postgres
>   processes. Send an alert if one of them is "too large".
>
> This should give you a good idea what the processes were doing at the
> time they allocated that memory, so that you can reproduce the problem.

In fact, I have already tracked the memory behavior over time, recording memory utilization every 5 minutes. The consumption exhibits a surprisingly linear growth of about 400 MB per hour. And since I can't see any sharp jumps in memory usage, I can't really identify any single query as the culprit. It seems more likely that a large number of small queries are causing the memory to accumulate over time.

And again, the real mystery here is that we don't see this behavior in most deployments of the product, which all utilize the database in the same way. This leads me to think that there is some configuration on the server side that is somehow affecting memory allocation. It's almost as if the shared memory allocation between processes is not working, and every process is holding its own copy of the shared memory instead of actually sharing it.

--
Best regards, Øystein Kolsrud


Re: PostgreSQL processes use large amount of private memory on Windows

From: Chris Sterritt
On 17/09/2020 15:06, Tom Lane wrote:
> Øystein Kolsrud <kolsrud@gmail.com> writes:
>> So my question is: When does a postgres process forked for a connection use
>> private memory instead of shared, and what can I do to avoid this?
> The only significant long-term consumption of private memory is for
> caches.  There are catalog caches, which can get large if the session
> accesses a whole lot of database objects (e.g., thousands of different
> tables).  Some of the PLs maintain caches with parsed versions of any
> function that's been executed.  (An ex-employer of mine had a lot of
> trouble in that regard, because they had hundreds of thousands of lines
> worth of plpgsql functions.)  There isn't any user-accessible knob for
> limiting the size of those caches.  If you have a problem of that sort,
> about the only way to mitigate it is to use fewer backends so that the
> total memory consumption stays in bounds, or redesign your application.
> In some cases it might help to restart your sessions when they get too
> big, but that seems like at best a band-aid.
>
>             regards, tom lane
>
>
Would executing DISCARD ALL release the PL cache?

Regards, Chris Sterritt