Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1 - Mailing list pgsql-general

From Stephan Knauss
Subject Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1
Date
Msg-id 6ea52e56-b401-7716-f592-a8fdc98df667@stephans-server.de
In response to Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Debugging leaking memory in Postgresql 13.2/Postgis 3.1  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On 30.03.2021 20:46, Tom Lane wrote:
> Stephan Knauss <pgsql@stephans-server.de> writes:
>> The wiki suggested to dump MemoryContext states for more details, but
>> something strange happens when attaching gdb. It seems that the process
>> is immediately killed and I can no longer dump such details.
> (I think the -v option is the one that matters on Linux, not -d
> as you might guess).  The idea here is that the backends would
> get an actual ENOMEM failure from malloc() before reaching the
> point where the kernel's OOM-kill behavior takes over.  Given
> that, they'd dump memory maps to stderr of their own accord,
> and you could maybe get some insight as to what's leaking.
> This'd also reduce the severity of the problem when it does
> happen.

Hello Tom, the output below looks similar to the OOM output you
expected. Can you give a hint how to interpret the results?

I had a backend which had a larger amount of memory allocated already.
So I gave "gcore -a" a try.

Contrary to the advertised behavior, the process did not continue to
run, but at least I got a core file. This is probably because gcore just
calls gdb attach, which somehow triggers a SIGKILL of all backends.

At 4.2GB in size, it hopefully contains most of the relevant memory
structures. Without a running process I still cannot call
MemoryContextStats(), but I found a gdb macro which claims to decode the
memory structure post mortem:

https://www.cybertec-postgresql.com/en/checking-per-memory-context-memory-consumption/


This gave me the memory structure shown in the gdb session at the end of
this mail.

How should it be interpreted? The sizes appear to be in bytes, since the
macro computes them from pointer differences. But the numbers look a bit
small, given that I had a backend with roughly 6GB RSS memory.

I thought it might print the overall size of a context and then,
indented, the memory of its children, but the numbers indicate this is
not the case: a parent can show a smaller size than its child:

   CachedPlanSource: 67840
    unnamed prepared statement: 261920
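
Reading the macro, sum_context_blocks only walks the blocks of the one
context it is given, so each line appears to be that context's own
allocation, with children listed separately rather than included. Under
that reading the lines have to be added up before comparing them with
RSS. A minimal sketch in Python, assuming the output is available as text
in the "name: bytes" format shown in this mail; parse_dump, total_bytes
and bytes_by_name are hypothetical helper names:

```python
import re
from collections import defaultdict

# One dump line: leading spaces (nesting depth), context name, size in bytes.
LINE_RE = re.compile(r"^( *)(.+?): (\d+)$")

def parse_dump(text):
    """Return (depth, name, size) tuples for every line that looks like a context."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            rows.append((len(m.group(1)), m.group(2), int(m.group(3))))
    return rows

def total_bytes(rows):
    """Sum the per-context sizes; only valid if each line excludes its children."""
    return sum(size for _, _, size in rows)

def bytes_by_name(rows):
    """Group sizes by context name, e.g. all 'index info' entries together."""
    totals = defaultdict(int)
    for _, name, size in rows:
        totals[name] += size
    return dict(totals)

# A few lines copied from the dump in this mail.
sample = """TopMemoryContext: 109528
  CacheMemoryContext: 1302944
   CachedPlanSource: 67840
    unnamed prepared statement: 261920
   index info: 1824
   index info: 3872
"""
rows = parse_dump(sample)
print(total_bytes(rows))
print(bytes_by_name(rows)["index info"])
```

Added up this way, the full dump seems to come to only a few megabytes,
which would suggest that the bulk of the roughly 6GB RSS is not held in
these memory contexts.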

So how should I read it, and is there any indication why I have a
constantly increasing memory footprint? Does anything here show where
multiple gigabytes are allocated?



root@0ec98d20bda2:/# gdb /usr/lib/postgresql/13/bin/postgres core.154218 <gdb-context
GNU gdb (Debian 8.2.1-2+b3) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
     <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/lib/postgresql/13/bin/postgres...Reading symbols from /usr/lib/debug/.build-id/31/ae2853776500091d313e76cf679017e697884b.debug...done.
done.

warning: core file may not match specified executable file.
[New LWP 154218]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `postgres: osm gis 172.20.0.3(51894) idle'.
#0  0x00007fc01cfa07b7 in epoll_wait (epfd=4, events=0x55f403584080, maxevents=maxevents@entry=1, timeout=timeout@entry=-1) at ../sysdeps/unix/sysv/linux/epoll_wait.c:30
30      ../sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) >>>> > > >>>(gdb) (gdb) >>>> > > >>>>> > > >>(gdb) (gdb)
TopMemoryContext: 109528
  dynahash: 7968
  HandleParallelMessages: 7968
  dynahash: 7968
  dynahash: 7968
  dynahash: 7968
  dynahash: 24392
  dynahash: 24352
  RowDescriptionContext: 24352
  MessageContext: 7968
  dynahash: 7968
  dynahash: 32544
  TransactionAbortContext: 32544
  dynahash: 7968
  TopPortalContext: 7968
  dynahash: 16160
  CacheMemoryContext: 1302944
   CachedPlan: 138016
   CachedPlanSource: 67840
    unnamed prepared statement: 261920
   index info: 1824
   index info: 1824
   index info: 3872
   index info: 1824
   index info: 1824
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 1824
   index info: 3872
   relation rules: 32544
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 3872
   relation rules: 24352
   index info: 3872
   index info: 3872
   index info: 1824
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 1824
   index info: 3872
   index info: 1824
   index info: 3872
   relation rules: 32544
   index info: 1824
   index info: 2848
   index info: 1824
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 3872
   index info: 1824
   index info: 3872
   index info: 1824
   index info: 1824
   relation rules: 32544
   index info: 1824
   index info: 2848
   index info: 1824
   index info: 800
   index info: 1824
   index info: 800
   index info: 800
   index info: 2848
   index info: 1824
   index info: 800
   index info: 800
   index info: 800
   index info: 2848
   index info: 1824
   index info: 1824
   index info: 2848
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 800
   index info: 800
   index info: 800
   index info: 2848
   index info: 2848
   index info: 1824
   index info: 1824
   index info: 800
   index info: 800
   index info: 2848
   index info: 800
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 800
   index info: 2848
   index info: 2848
   index info: 2848
   index info: 800
   index info: 800
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 1824
   index info: 2848
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 2848
   index info: 800
   index info: 1824
   index info: 800
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 800
   index info: 1824
   index info: 2848
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 1824
   index info: 1824
  WAL record construction: 49544
  dynahash: 7968
  MdSmgr: 7968
  dynahash: 16160
  dynahash: 103896
  ErrorContext: 7968
(gdb) quit
root@0ec98d20bda2:/# cat gdb-context
define sum_context_blocks
set $context = $arg0
set $block = ((AllocSet) $context)->blocks
set $size = 0
while ($block)
set $size = $size + (((AllocBlock) $block)->endptr - ((char *) $block))
set $block = ((AllocBlock) $block)->next
end
printf "%s: %d\n",((MemoryContext)$context)->name, $size
end

define walk_contexts
set $parent_$arg0 = ($arg1)
set $indent_$arg0 = ($arg0)
set $i_$arg0 = $indent_$arg0
while ($i_$arg0)
printf " "
set $i_$arg0 = $i_$arg0 - 1
end
sum_context_blocks $parent_$arg0
set $child_$arg0 = ((MemoryContext) $parent_$arg0)->firstchild
set $indent_$arg0 = $indent_$arg0 + 1
while ($child_$arg0)
walk_contexts $indent_$arg0 $child_$arg0
set $child_$arg0 = ((MemoryContext) $child_$arg0)->nextchild
end
end

walk_contexts 0 TopMemoryContext
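
For a view where each line also includes its children, the same
depth-first walk can carry the totals back up the tree. A sketch in
Python on a toy tree built from two lines of the dump above; the Context
class and walk function are hypothetical stand-ins that only mirror the
firstchild/nextchild traversal of the macro and do not read a core file:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Context:
    """Stand-in for a MemoryContext: a name, its own block bytes, and children."""
    name: str
    own_bytes: int
    children: List["Context"] = field(default_factory=list)

def walk(ctx, indent=0, out=None):
    """Depth-first walk like walk_contexts, but also return the subtree total."""
    if out is None:
        out = []
    slot = len(out)
    out.append(None)  # reserve the line so a parent is listed before its children
    total = ctx.own_bytes
    for child in ctx.children:
        total += walk(child, indent + 1, out)[0]
    out[slot] = f"{' ' * indent}{ctx.name}: own={ctx.own_bytes} total={total}"
    return total, out

# Sizes taken from the CachedPlanSource lines in the dump.
tree = Context("CachedPlanSource", 67840,
               [Context("unnamed prepared statement", 261920)])
total, lines = walk(tree)
print("\n".join(lines))
```

With both numbers side by side, a parent whose own size is smaller than
its child's no longer looks contradictory.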




