
Re: Re: [SQL] PostgreSQL crashes on me :( - Mailing list pgsql-hackers

From	Ian Lance Taylor
Subject	Re: Re: [SQL] PostgreSQL crashes on me :(
Date	December 18, 2000 12:33:40
Msg-id	20001218173340.29569.qmail@daffy.airs.com
In response to	Re: [SQL] PostgreSQL crashes on me :( (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: Re: [SQL] PostgreSQL crashes on me :(
List	pgsql-hackers


   Date: Sun, 17 Dec 2000 22:47:55 -0500
   From: Tom Lane <tgl@sss.pgh.pa.us>

   BUT: I think there's a race condition here, at least on systems
   where errno is not saved and restored around a signal handler.
   Consider the following scenario:

      Postmaster is waiting at the select() --- its normal state.

      Postmaster receives a SIGCHLD signal due to backend exit, so
      it goes off and does the reaper() thing.  On return from
      reaper() the system arranges to return EINTR error from
      the select().

      Before control can reach the "if (errno..." test, another
      SIGCHLD comes in.  reaper() is invoked again and does its
      thing.

   The normal exit condition from reaper() will be errno == ECHILD,
   because that's what the waitpid() or wait3() call will return after
   all children are dealt with.  If the signal-handling mechanism allows
   that to be returned to the mainline code, we have a failure.

   Can any FreeBSD hackers comment on the plausibility of this theory?

I'm not a FreeBSD hacker, but I do know how the BSD kernel works
unless FreeBSD has changed things.  The important facts are:

1) The kernel only delivers signals when a process moves from kernel
   mode to user mode, after a system call or an interrupt (including a
   timer interrupt).

2) The errno variable is set in user space after the process has
   returned to user mode.

Therefore, the scenario you describe is possible, but only if there
happens to be both a timer interrupt and a SIGCHLD signal within a
couple of instructions after the select returns.

(I suppose that a page fault instead of a timer interrupt could have
the same effect as well, although a page fault here seems quite
unlikely unless the system is extremely overloaded.)

   A quick-and-dirty workaround would be to save and restore errno in
   reaper() and the other postmaster signal handlers.  It might be
   a better idea in the long run to avoid doing system calls in the
   signal handlers --- but that would take a more substantial rewrite.

Ideally, signal handlers should not make system calls.  However, if
this is impossible, then signal handlers must save and restore errno.

Ian
