Rethinking placement of latch self-pipe initialization - Mailing list pgsql-hackers

From Tom Lane
Subject Rethinking placement of latch self-pipe initialization
Msg-id 14045.1349630865@sss.pgh.pa.us
Responses Re: Rethinking placement of latch self-pipe initialization  (Amit Kapila <amit.kapila@huawei.com>)
Re: Rethinking placement of latch self-pipe initialization  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Rethinking placement of latch self-pipe initialization  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Sean Chittenden recently reported that 9.2 can crash after logging
"FATAL: pipe() failed" if the kernel is short of file descriptors:
http://archives.postgresql.org/pgsql-general/2012-10/msg00202.php

The only match to that error text is in initSelfPipe().  What I believe
is happening is that InitProcess calls OwnLatch, which calls
initSelfPipe, and the latter fails.  The postmaster then treats the exit
as a backend crash, because we have armed the dead-man switch but not yet
set up on_shmem_exit(ProcKill), which would disarm it.
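
To make the ordering problem described above concrete, here is a toy model in C. It is only a sketch: simulate_backend_startup, ExitVerdict, and the boolean flags are hypothetical names made up for illustration, and the control flow is a simplification of the sequence described in this message, not PostgreSQL source.

```c
#include <stdbool.h>

/*
 * Toy model only: these names and this control flow are illustrative
 * assumptions, not PostgreSQL source.
 */
typedef enum
{
    EXIT_CLEAN,
    EXIT_TREATED_AS_CRASH
} ExitVerdict;

/*
 * Model one backend startup.  pipe_ok simulates whether pipe() succeeds
 * inside OwnLatch -> initSelfPipe.
 */
ExitVerdict
simulate_backend_startup(bool pipe_ok)
{
    bool dead_man_armed = false;
    bool prockill_registered = false;

    /* The dead-man switch is armed early in startup. */
    dead_man_armed = true;

    /*
     * The cleanup callback is registered only if the latch step succeeds;
     * a FATAL exit from pipe() failure skips the registration.
     */
    if (pipe_ok)
        prockill_registered = true;

    /* At process exit, only a registered callback disarms the switch. */
    if (prockill_registered)
        dead_man_armed = false;

    return dead_man_armed ? EXIT_TREATED_AS_CRASH : EXIT_CLEAN;
}
```

In this model, a failure between arming the switch and registering the callback is indistinguishable from a crash, which matches the reported symptom.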

It's possible we could fix this by changing the order of operations
in InitProcess and OwnLatch, but it'd be tricky, not least because
ProcKill calls DisownLatch which asserts that OwnLatch was done.

What I think would be a better idea is to fix things so that OwnLatch
cannot fail except as a result of internal logic errors, by splitting
out the acquisition of the self-pipe into a separate function named, say,
InitializeLatchSupport.  The question then becomes where to put the
InitializeLatchSupport calls.  My first thought is to put them near the
signal-setup stanzas for the various processes (i.e., the pqsignal calls),
similarly to what we did recently for initialization of timeout support.
However, there might be a better idea.
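
A minimal sketch of the proposed split might look like the following. This is not the actual patch: the Latch struct here is a simplified stand-in, selfpipe_fds is a hypothetical variable, and returning bool from InitializeLatchSupport is an assumption made so the example is easy to exercise. The point is only that the step needing a kernel resource (pipe()) happens once, early, and OwnLatch is left with nothing but assertions.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Simplified stand-in for illustration; not the real PostgreSQL Latch. */
typedef struct Latch
{
    bool  is_set;
    pid_t owner_pid;        /* 0 means unowned */
} Latch;

/* Hypothetical per-process self-pipe: [0] is read end, [1] is write end. */
static int selfpipe_fds[2] = {-1, -1};

/*
 * Acquire the self-pipe.  This is the only step that can fail for
 * environmental reasons (e.g. the kernel is short of file descriptors),
 * so it is meant to be called early in process startup.
 * Returns false on failure (an assumption of this sketch).
 */
bool
InitializeLatchSupport(void)
{
    int pipefd[2];

    if (selfpipe_fds[0] != -1)
        return true;        /* already initialized in this process */

    if (pipe(pipefd) < 0)
        return false;

    /* Make both ends non-blocking, as is usual for a self-pipe. */
    if (fcntl(pipefd[0], F_SETFL, O_NONBLOCK) < 0 ||
        fcntl(pipefd[1], F_SETFL, O_NONBLOCK) < 0)
    {
        close(pipefd[0]);
        close(pipefd[1]);
        return false;
    }

    selfpipe_fds[0] = pipefd[0];
    selfpipe_fds[1] = pipefd[1];
    return true;
}

/* Takes ownership; can now fail only on internal logic errors. */
void
OwnLatch(Latch *latch)
{
    assert(selfpipe_fds[0] >= 0);   /* InitializeLatchSupport must have run */
    assert(latch->owner_pid == 0);  /* latch must not already be owned */
    latch->owner_pid = getpid();
}

void
DisownLatch(Latch *latch)
{
    assert(latch->owner_pid == getpid());
    latch->owner_pid = 0;
}
```

With this shape, a caller would invoke InitializeLatchSupport once near its signal setup and handle failure there, before anything like the dead-man switch is armed; later OwnLatch and DisownLatch calls then touch no kernel resources.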

Comments?
        regards, tom lane


