Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket - Mailing list pgsql-hackers

From Asim R P
Subject Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket
Date
Msg-id CANXE4Tevcwdx-exHGPq22CzDk6KJhfJQfON4wzKj2rNmYdyxpg@mail.gmail.com
In response to Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket (Andres Freund <andres@anarazel.de>)
Responses Re: [HACKERS] possible self-deadlock window after bad ProcessStartupPacket (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
On Thu, Jun 22, 2017 at 10:50 AM, Andres Freund <andres@anarazel.de> wrote:
>
> Or, probably more robust: Simply _exit(2) without further ado, and rely
> on postmaster to output an appropriate error message. Arguably it's not
> actually useful to see hundreds of "WARNING: terminating connection because of
> crash of another server process" messages in the log anyway.
>

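To support using _exit(2) in the *quickdie() handlers, I would like to
share another stack trace that shows a self-deadlock.  In this case, the
WAL writer process received SIGQUIT while it was already handling an
earlier SIGQUIT.  The first wal_quickdie() was inside exit(), running
exit handlers that call free(), when the second signal arrived.  The
second wal_quickdie() called exit() again and blocked in free() on the
malloc arena lock, which the interrupted free() in the same process
presumably still held.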

#0  __lll_lock_wait_private () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:95
#1  0x00007f0bf04db2bd in _int_free (av=0x7f0bf081fb20 <main_arena>, p=0x1557e60, have_lock=0) at malloc.c:3962
#2  0x00007f0bf04df53c in __GI___libc_free (mem=mem@entry=0x1557e70) at malloc.c:2968
#3  0x00007f0bf0495025 in __run_exit_handlers (status=2, listp=0x7f0bf081f5f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:91
#4  0x00007f0bf0495045 in __GI_exit (status=<optimized out>) at exit.c:104
#5  0x0000000000843994 in wal_quickdie ()
#6  <signal handler called>
#7  0x00007f0bf04db014 in _int_free (av=0x7f0bf081fb20 <main_arena>, p=<optimized out>, have_lock=0) at malloc.c:4014
#8  0x00007f0bf04df53c in __GI___libc_free (mem=<optimized out>) at malloc.c:2968
#9  0x00007f0bebf8b2ba in ?? () from /usr/lib/x86_64-linux-gnu/libtasn1.so.6
#10 0x00007f0bebf8c4ba in asn1_delete_structure2 () from /usr/lib/x86_64-linux-gnu/libtasn1.so.6
#11 0x00007f0beec24738 in ?? () from /usr/lib/x86_64-linux-gnu/libgnutls.so.30
#12 0x00007f0bf3bb6de7 in _dl_fini () at dl-fini.c:235
#13 0x00007f0bf0494ff8 in __run_exit_handlers (status=2, listp=0x7f0bf081f5f8 <__exit_funcs>, run_list_atexit=run_list_atexit@entry=true) at exit.c:82
#14 0x00007f0bf0495045 in __GI_exit (status=<optimized out>) at exit.c:104
#15 0x0000000000843994 in wal_quickdie ()
#16 <signal handler called>
#17 0x00007f0bf05585b3 in __select_nocancel () at ../sysdeps/unix/syscall-template.S:84
#18 0x0000000000b7c5da in pg_usleep ()
#19 0x0000000000843c4a in WalWriterMain ()
#20 0x000000000059ac47 in AuxiliaryProcessMain ()
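
For illustration only, here is a minimal sketch of the "_exit(2) without
further ado" idea from the quoted text. It is not PostgreSQL's actual
handler code: the names sketch_quickdie and sketch_quickdie_exit_status
are hypothetical, and the fork/raise harness exists only so the effect
can be observed from a parent process. The point it tries to show is that
_exit() skips atexit() callbacks and library destructors, so the handler
never reaches free() the way exit() does in the trace above.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical SIGQUIT handler (an assumption, not PostgreSQL source).
 * It calls only _exit(), which is async-signal-safe and terminates the
 * process without running exit handlers, so it cannot block on a malloc
 * lock held by the code that the signal interrupted.
 */
static void
sketch_quickdie(int signo)
{
    (void) signo;
    _exit(2);
}

/*
 * Fork a child that installs the handler and sends itself SIGQUIT.
 * Returns the child's exit status as seen by the parent, or -1 on error.
 */
int
sketch_quickdie_exit_status(void)
{
    pid_t       pid = fork();
    int         status;

    if (pid < 0)
        return -1;

    if (pid == 0)
    {
        struct sigaction sa;
        sigset_t    set;

        memset(&sa, 0, sizeof(sa));
        sa.sa_handler = sketch_quickdie;
        sigemptyset(&sa.sa_mask);
        if (sigaction(SIGQUIT, &sa, NULL) != 0)
            _exit(1);

        /* Make sure SIGQUIT is not blocked by an inherited signal mask. */
        sigemptyset(&set);
        sigaddset(&set, SIGQUIT);
        sigprocmask(SIG_UNBLOCK, &set, NULL);

        raise(SIGQUIT);
        _exit(3);               /* reached only if the handler did not run */
    }

    if (waitpid(pid, &status, 0) != pid)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling sketch_quickdie_exit_status() from a small main() should report
status 2 from the child. A second SIGQUIT arriving during such a handler
has nothing to deadlock on, since no locks are taken on the way out.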
