On Fri, Nov 30, 2018 at 08:20:49PM +0100, Jakub Glapa wrote:
> In the last few days of monitoring, no segfault occurred, but the
> dsa_allocate error did.
> I'm starting to doubt if the segfault I've found in dmesg was actually
> related.
The dmesg output looks like a real crash, not just OOM. You can hopefully find
the timestamps of the segfaults in /var/log/syslog and compare them with the
postgres logs, if those go back far enough. All the postgres processes except
the parent would've been restarted at that time.
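Something along these lines might help line up the timestamps. It's only a sketch: the function names are ones I made up, and the postgres log path in the usage note is an assumption (on Debian it's typically somewhere under /var/log/postgresql/).

```shell
# List kernel segfault messages that mention postgres in a syslog-format file.
# Usage: find_segfaults [logfile]   (defaults to /var/log/syslog)
find_segfaults() {
    grep -i 'segfault' "${1:-/var/log/syslog}" | grep -i 'postgres'
}

# List the lines the postmaster writes when a child dies from a signal and
# the remaining backends get restarted.
# Usage: find_crash_restarts postgres-logfile
find_crash_restarts() {
    grep -E 'terminated by signal|all server processes terminated' "$1"
}
```

For example, run find_segfaults with no argument, then find_crash_restarts on the postgres log covering the same day, and check whether the times agree.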
> I've grepped the postgres log for dsa_allocate:
> Why do the messages occur sometimes as FATAL and sometimes as ERROR?
I believe it depends on whether it happens in a parallel worker or in the
leader. You may get more log detail if you enable CSV logging (although
unfortunately, as I recall, it doesn't indicate that a process is a parallel
worker).
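To see how the two severities are distributed, a rough tally like this might be enough. It's a sketch that assumes the plain-text (stderr-style) log format, where the severity appears as "ERROR:" or "FATAL:" on the same line as the message; the function name is just one I picked.

```shell
# Count dsa_allocate messages per severity in a plain-text postgres log.
# Usage: count_dsa_severity postgres-logfile
# Prints one "SEVERITY COUNT" pair per line, sorted by severity.
count_dsa_severity() {
    awk '/dsa_allocate/ {
             # Pick out the severity tag; RLENGTH - 1 drops the trailing colon.
             if (match($0, /(ERROR|FATAL):/))
                 n[substr($0, RSTART, RLENGTH - 1)]++
         }
         END { for (k in n) print k, n[k] }' "$1" | sort
}
```

If the log_line_prefix includes %p, the PID on each line can then be compared against the leader's PID to check the worker-versus-leader guess.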
You could force it to dump core if you recompile postgres with an abort() added
at the point of failure (see patch below).
You could build a .deb by running dpkg-buildpackage -rfakeroot or similar (I
haven't done this in a while), or you could compile, install, and launch
debugging binaries from your homedir (or similar).
You'd want to compile the same version (git checkout REL_10_6) and with the
proper configure flags, perhaps starting with:
./configure --with-libxml --with-libxslt --enable-debug --prefix=$HOME/src/postgresql.bin --enable-cassert && time make && make install
If you have extensions installed, be careful to check that they still work
with the rebuilt binaries.
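For launching the rebuilt binaries with core files allowed, something like the following may work. It's only a sketch: PGDATA_DEBUG, port 5433, the log file name and the DRY_RUN switch are placeholders of mine, and the data directory has to be one that this build can actually use (same major version, compatible configure options). Only one server can run on a given data directory at a time.

```shell
# Sketch: start the self-compiled server with core dumps enabled.
# Assumptions: the install prefix from the configure line above, and a
# PGDATA_DEBUG directory that is usable by this build.
PGBIN="${PGBIN:-$HOME/src/postgresql.bin/bin}"
PGDATA_DEBUG="${PGDATA_DEBUG:-$HOME/src/pgdata.debug}"

# With DRY_RUN set, only print the command instead of running it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$@"
    else
        "$@"
    fi
}

start_debug_server() {
    # pg_ctl -c tries to lift the core file size limit for the server it starts.
    run "$PGBIN/pg_ctl" -c -D "$PGDATA_DEBUG" -l "$PGDATA_DEBUG/debug.log" \
        -o "-p 5433" start
}
```

Running it once with DRY_RUN=1 shows the command before anything is started. Where the core file ends up depends on kernel.core_pattern; once you have one, open it with gdb against $PGBIN/postgres and run bt.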
Justin
--- a/src/backend/utils/mmgr/dsa.c
+++ b/src/backend/utils/mmgr/dsa.c
@@ -727,4 +727,7 @@ dsa_allocate_extended(dsa_area *area, size_t size, int flags)
 		if (!FreePageManagerGet(segment_map->fpm, npages, &first_page))
+		{
+			abort();	/* dump core; placed first because elog(FATAL) never returns */
 			elog(FATAL,
 				 "dsa_allocate could not find %zu free pages", npages);
+		}