Thread: Re: [ANNOUNCEMENT] Updated: cygrunsrv-0.94-1
On Thu, Jul 12, 2001 at 11:45:39AM -0400, Jason Tishler wrote: > Please post your patch -- I've been waiting with bated breath since you > first mentioned it. :,) I will be quite willing to champion it (or a > better solution, if one can be devised) for inclusion into PostgreSQL CVS. OK, here's the patch. It's crude, simply causing the SIGHUP handlers (one for postmaster and one for postgres) to return immediately without triggering the normal config-file-reloading response. BTW, while testing this patch again after a 'cvs update' I notice that I've got more 'postgres' processes running than usual. In the idle state I expect to have just one process named 'postgres' running; it's actually the postmaster and has cygrunsrv as its parent. But now I've got two other postgres processes, one a child of the postmaster and the other a child of the first. I don't remember seeing this before. I don't have any client programs running. Stopping and restarting the service results in the same situation. Starting a client causes a new postgres child process of the postmaster to run while the client is connected, as usual. I'm puzzled about those two other processes. -- Fred Yankowski fred@OntoSys.com tel: +1.630.879.1312 Principal Consultant www.OntoSys.com fax: +1.630.879.1370 OntoSys, Inc 38W242 Deerpath Rd, Batavia, IL 60510, USA -- Index: src/backend/postmaster/postmaster.c =================================================================== RCS file: /home/projects/pgsql/cvsroot/pgsql/src/backend/postmaster/postmaster.c,v retrieving revision 1.229 diff -u -p -r1.229 postmaster.c --- src/backend/postmaster/postmaster.c 2001/06/29 16:05:57 1.229 +++ src/backend/postmaster/postmaster.c 2001/07/12 16:34:29 @@ -1284,6 +1284,18 @@ SIGHUP_handler(SIGNAL_ARGS) int save_errno = errno; PG_SETMASK(&BlockSig); +#ifdef __CYGWIN__ + /* + * Ignore SIGHUP since Cygwin sends a SIGHUP when NT is being + * shut down. Attempting to handle this signal as normal + * seems to block the concurrent activity to shutdown the + * postmaster. + */ + if (DebugLvl) + postmaster_error("ignoring caught SIGHUP in postmaster"); + errno = save_errno; + return; +#endif if (Shutdown <= SmartShutdown) { Index: src/backend/tcop/postgres.c =================================================================== RCS file: /home/projects/pgsql/cvsroot/pgsql/src/backend/tcop/postgres.c,v retrieving revision 1.227 diff -u -p -r1.227 postgres.c --- src/backend/tcop/postgres.c 2001/06/29 16:05:56 1.227 +++ src/backend/tcop/postgres.c 2001/07/12 16:34:31 @@ -1022,6 +1022,10 @@ FloatExceptionHandler(SIGNAL_ARGS) static void SigHupHandler(SIGNAL_ARGS) { +#ifdef __CYGWIN__ + elog(DEBUG, "ignoring caught SIGHUP in postgres\n"); + return; +#endif got_SIGHUP = true; }
Fred, On Thu, Jul 12, 2001 at 11:42:08AM -0500, Fred Yankowski wrote: > On Thu, Jul 12, 2001 at 11:45:39AM -0400, Jason Tishler wrote: > > Please post your patch -- I've been waiting with bated breath since you > > first mentioned it. :,) I will be quite willing to champion it (or a > > better solution, if one can be devised) for inclusion into PostgreSQL CVS. > > OK, here's the patch. It's crude, simply causing the SIGHUP handlers > (one for postmaster and one for postgres) to return immediately > without triggering the normal config-file-reloading response. What about trying to tackle this from another point of view? I'm not sure if this is doable or acceptable, but what about adding logic to the Cygwin DLL so that it does not send SIGHUP (to itself) when the process is running under cygrunsrv? If this is deemed accepted, does anyone have any ideas on what is the best way to determine when running under cygrunsrv? Thanks, Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com
On Mon, Jul 16, 2001 at 10:04:17AM -0400, Jason Tishler wrote: > Fred, > > On Thu, Jul 12, 2001 at 11:42:08AM -0500, Fred Yankowski wrote: > > On Thu, Jul 12, 2001 at 11:45:39AM -0400, Jason Tishler wrote: > > > Please post your patch -- I've been waiting with bated breath since you > > > first mentioned it. :,) I will be quite willing to champion it (or a > > > better solution, if one can be devised) for inclusion into PostgreSQL CVS. > > > > OK, here's the patch. It's crude, simply causing the SIGHUP handlers > > (one for postmaster and one for postgres) to return immediately > > without triggering the normal config-file-reloading response. > > What about trying to tackle this from another point of view? I'm not > sure if this is doable or acceptable, but what about adding logic to the > Cygwin DLL so that it does not send SIGHUP (to itself) when the process is > running under cygrunsrv? Hmmm, sounds like an ugly hack to me... Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Developer mailto:cygwin@cygwin.com Red Hat, Inc.
Corrina, On Mon, Jul 16, 2001 at 05:19:08PM +0200, Corinna Vinschen wrote: > On Mon, Jul 16, 2001 at 10:04:17AM -0400, Jason Tishler wrote: > > What about trying to tackle this from another point of view? I'm not > > sure if this is doable or acceptable, but what about adding logic to the > > Cygwin DLL so that it does not send SIGHUP (to itself) when the process is > > running under cygrunsrv? > > Hmmm, sounds like an ugly hack to me... Which is why I couched the above with "acceptable." However, there are other Unix daemons (e.g., inetd) that will respond to SIGHUP in a similar manner. Is modifying all of them, instead of just the Cygwin DLL, better? Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com
On Mon, Jul 16, 2001 at 11:34:29AM -0400, Jason Tishler wrote: > Corrina, > > On Mon, Jul 16, 2001 at 05:19:08PM +0200, Corinna Vinschen wrote: > > On Mon, Jul 16, 2001 at 10:04:17AM -0400, Jason Tishler wrote: > > > What about trying to tackle this from another point of view? I'm not > > > sure if this is doable or acceptable, but what about adding logic to the > > > Cygwin DLL so that it does not send SIGHUP (to itself) when the process is > > > running under cygrunsrv? > > > > Hmmm, sounds like an ugly hack to me... > > Which is why I couched the above with "acceptable." However, there are > other Unix daemons (e.g., inetd) that will respond to SIGHUP in a similar > manner. Is modifying all of them, instead of just the Cygwin DLL, better? That's not what I meant. I just don't like a solution which checks for a specific situation which might change in future due to reasons we don't know yet. Would perhaps changing the general behaviour of Cygwin help? For example when changing the runlevel on a Linux system is requested, init(8) sends a SIGTERM to processes which aren't defined on the new runlevel. Which is a similar situation, IMO. Perhaps changing Cygwin from sending SIGHUP to sending SIGTERM makes any sense? Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Developer mailto:cygwin@cygwin.com Red Hat, Inc.
On Mon, Jul 16, 2001 at 10:04:17AM -0400, Jason Tishler wrote: > What about trying to tackle this from another point of view? I'm not > sure if this is doable or acceptable, but what about adding logic to the > Cygwin DLL so that it does not send SIGHUP (to itself) when the process is > running under cygrunsrv? If this is deemed accepted, does anyone have any > ideas on what is the best way to determine when running under cygrunsrv? That solution would be fine by me. In fact, I would prefer it since PostgreSQL would then not require _any_ source code changes to allow operation as a win32 service. I wonder if Cygwin should generate SIGHUP in this situation for any process, not just cygrunsrv. The ctrl_c_handler() function in exceptions.cc sends SIGHUP to its own process when handling the CTRL_CLOSE_EVENT and CTRL_SHUTDOWN_EVENT signals from the system. According to MSDN [1], CTRL_SHUTDOWN_EVENT will only be received by service processes. Unix-based daemons don't expect to be associated with a terminal (except for debugging modes of operation) and, if they handle SIGHUP at all, use SIGHUP as a manually-invoked signal to trigger administrative operations, not a signal indicating system shutdown. So what good is it for Cygwin to send SIGHUP in the event of CTRL_SHUTDOWN_EVENT? I don't have time to rebuild and test the Cygwin DLL right now, but what I'm suggesting is represented by the attached patch. [1] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/hh/winbase/conchar_5zz9.asp -- Fred Yankowski fred@OntoSys.com tel: +1.630.879.1312 Principal Consultant www.OntoSys.com fax: +1.630.879.1370 OntoSys, Inc 38W242 Deerpath Rd, Batavia, IL 60510, USA -- Index: exceptions.cc =================================================================== RCS file: /cvs/src/src/winsup/cygwin/exceptions.cc,v retrieving revision 1.91 diff -u -p -r1.91 exceptions.cc --- exceptions.cc 2001/06/28 02:19:57 1.91 +++ exceptions.cc 2001/07/16 16:22:11 @@ -900,7 +900,8 @@ ctrl_c_handler (DWORD type) for each Cygwin process window that's open when the computer is shut down or console window is closed. */ { - sig_send (NULL, SIGHUP); + if (type == CTRL_CLOSE_EVENT) + sig_send (NULL, SIGHUP); return FALSE; } tty_min *t = cygwin_shared->tty.get_tty (myself->ctty);
On Mon, Jul 16, 2001 at 06:27:27PM +0200, Corinna Vinschen wrote: > For example when changing the runlevel on a Linux system is requested, > init(8) sends a SIGTERM to processes which aren't defined on the new > runlevel. Which is a similar situation, IMO. Perhaps changing Cygwin > from sending SIGHUP to sending SIGTERM makes any sense? Sending SIGTERM rather than SIGHUP does seem more appropriate for this case in general. However, it might not work well for PostgreSQL. PostgreSQL has three modes of shutdown (that I know of): SIGTERM triggers a "smart" shutdown mode that will wait for clients to disconnect first; SIGINT triggers "fast" shutdown that aborts current transactions and shuts down cleanly very quickly; SIGQUIT triggers "immediate" shutdown that quits with minimal attempt to clean up first, leading to recovery on the next restart. Unfortunately for our case, the SIGTERM mode is not appropriate for system shutdown because we would block until interactive clients all happen to disconnect, which is likely to cause the win32 system to just kill the postgresql processes after it waits the maximum allowed time for services to shutdown. Cygrunsrv can send an arbitrary signal to the managed process in the event of system shutdown, and I configure it to send SIGINT for PostgreSQL to trigger smart shutdown. The problem, as I see it, is that there is a race condition and if we have Cygwin send SIGTERM rather than SIGHUP then the PostgreSQL processes may get that before the SIGINT sent by cygrunsrv and will embark on the smart shutdown path. I believe (but, I realize, I'm not sure) that PostgreSQL will continue with the "smart" shutdown even on later receipt of SIGINT. A Unix system would typically give daemon processes a chance to shutdown cleanly between run-levels through the use of /etc/init.d scripts (or the like), before hammering them with a signal. Cygrunsrv's --shutdown option gives us a limited capability similar to those init.d scripts, but unfortunately doesn't get the same priority in time that the scripts get. -- Fred Yankowski fred@OntoSys.com tel: +1.630.879.1312 Principal Consultant www.OntoSys.com fax: +1.630.879.1370 OntoSys, Inc 38W242 Deerpath Rd, Batavia, IL 60510, USA
Corinna, On Mon, Jul 16, 2001 at 06:27:27PM +0200, Corinna Vinschen wrote: > Perhaps changing Cygwin from sending SIGHUP to sending SIGTERM makes > any sense? Yes, I believe that this is the way to go -- at least for PostgreSQL... On Mon, Jul 16, 2001 at 11:51:00AM -0500, Fred Yankowski wrote: > Sending SIGTERM rather than SIGHUP does seem more appropriate for this > case in general. However, it might not work well for PostgreSQL. I appear to have empirical evidence that indicates that PostgreSQL can tolerate receiving SIGTERM and/or SIGINT during an NT shutdown. My procedure is as follows: 1. I applied the attached patch, rebuilt my Cygwin DLL, and installed it. 2. I installed postmaster under cygrunsrv as follows: $ cygrunsrv --install postmaster --path /usr/bin/postmaster \ --args "-D /usr/share/postgresql/data -i" --dep ipc-daemon \ --termsig INT --user 'bhmco\jt' --shutdown 3. I connected to this postmaster via psql running on another machine. 4. I restarted this NT box. The following are the messages displayed during shutdown and startup: Smart Shutdown request at Wed Jul 18 14:00:32 2001 [1] FATAL 1: This connection has been terminated by the administrator. [2] DEBUG: shutting down Fast Shutdown request at Wed Jul 18 14:00:33 2001 [3] DEBUG: database system is shut down DEBUG: database system was shut down at 2001-07-18 14:00:35 [restart occurs here] DEBUG: CheckPoint record at (0, 63186000) DEBUG: Redo record at (0, 63186000); Undo record at (0, 0); Shutdown TRUE DEBUG: NextTransactionId: 11662; NextOid: 414368 DEBUG: database system is in production state [4] Message [1] is due to ctrl_c_handler() sending a SIGTERM instead of SIGHUP to the postmaster process. Message [2] is due to ctrl_c_handler() sending a SIGTERM to the backend postgres process that is serving the only connection (from psql) which causes it to terminate. Note that normally it is postmaster that sends this signal (not some other process). Message [3] is due to cygrunsrv responding to the NT shutdown message and in turn sending a SIGINT to postmaster. Note that this seems to indicate that a Fast Shutdown can interrupt and supersede a Smart one. Message [4] indicates that PostgreSQL was able to restart without any manual intervention. Although the above is not quite how PostgreSQL shutdowns on other platforms, it is very close and seems to work. Should I submit the attached patch (with ChangeLog) to cygwin-patches for consideration? Thanks, Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com
Attachment
On Wed, Jul 18, 2001 at 03:11:42PM -0400, Jason Tishler wrote: > Smart Shutdown request at Wed Jul 18 14:00:32 2001 [1] > FATAL 1: This connection has been terminated by the administrator. [2] > DEBUG: shutting down > Fast Shutdown request at Wed Jul 18 14:00:33 2001 [3] > DEBUG: database system is shut down > DEBUG: database system was shut down at 2001-07-18 14:00:35 > > [restart occurs here] > > DEBUG: CheckPoint record at (0, 63186000) > DEBUG: Redo record at (0, 63186000); Undo record at (0, 0); Shutdown TRUE > DEBUG: NextTransactionId: 11662; NextOid: 414368 > DEBUG: database system is in production state [4] > > Message [1] is due to ctrl_c_handler() sending a SIGTERM instead of > SIGHUP to the postmaster process. Message [2] is due to ctrl_c_handler() > sending a SIGTERM to the backend postgres process that is serving the > only connection (from psql) which causes it to terminate. Note that > normally it is postmaster that sends this signal (not some other process). > Message [3] is due to cygrunsrv responding to the NT shutdown message > and in turn sending a SIGINT to postmaster. Note that this seems to > indicate that a Fast Shutdown can interrupt and supersede a Smart one. > Message [4] indicates that PostgreSQL was able to restart without any > manual intervention. > > Although the above is not quite how PostgreSQL shutdowns on other > platforms, it is very close and seems to work. > > Should I submit the attached patch (with ChangeLog) to cygwin-patches > for consideration? Yep! I would like Chris "Mister Signal" Faylor to review the patch. Thanks for investigating the situation, Corinna -- Corinna Vinschen Please, send mails regarding Cygwin to Cygwin Developer mailto:cygwin@cygwin.com Red Hat, Inc.
On Wed, Jul 18, 2001 at 03:11:42PM -0400, Jason Tishler wrote: > I appear to have empirical evidence that indicates that PostgreSQL > can tolerate receiving SIGTERM and/or SIGINT during an NT shutdown. ... > Note that this seems to > indicate that a Fast Shutdown can interrupt and supersede a Smart one. > Message [4] indicates that PostgreSQL was able to restart without any > manual intervention. Excellent! Your experimental results trump my hunch, big time. > Should I submit the attached patch (with ChangeLog) to cygwin-patches > for consideration? Yes, please. This patch would seem to provide the last piece needed (along with cygrunsrv) to make PostgreSQL robust as an NT/Win2000 service. And with no changes whatsoever to PostgreSQL code -- bonus! -- Fred Yankowski fred@OntoSys.com tel: +1.630.879.1312 Principal Consultant www.OntoSys.com fax: +1.630.879.1370 OntoSys, Inc 38W242 Deerpath Rd, Batavia, IL 60510, USA
The power just returned here which started me thinking about this. To restart pg I needed to delete the .pid. Is it correct that cygrunsrv solution will not change that? I have a applictation on a collection of laptops running NT 5 which are vulnerable to people pulling the plug and the like. To protect the restart I was thinking about a startup .bat to delete the .pid. Is this still necessary? ----- Original Message ----- From: "Fred Yankowski" <fred@ontosys.com> Newsgroups: comp.databases.postgresql.ports.cygwin Sent: Thursday, July 19, 2001 8:54 AM Subject: Re: [ANNOUNCEMENT] Updated: cygrunsrv-0.94-1 > On Wed, Jul 18, 2001 at 03:11:42PM -0400, Jason Tishler wrote: > > I appear to have empirical evidence that indicates that PostgreSQL > > can tolerate receiving SIGTERM and/or SIGINT during an NT shutdown. > ... > > Note that this seems to > > indicate that a Fast Shutdown can interrupt and supersede a Smart one. > > Message [4] indicates that PostgreSQL was able to restart without any > > manual intervention. > > Excellent! Your experimental results trump my hunch, big time. > > > Should I submit the attached patch (with ChangeLog) to cygwin-patches > > for consideration? > > Yes, please. This patch would seem to provide the last piece needed > (along with cygrunsrv) to make PostgreSQL robust as an NT/Win2000 > service. And with no changes whatsoever to PostgreSQL code -- bonus! > > -- > Fred Yankowski fred@OntoSys.com tel: +1.630.879.1312 > Principal Consultant www.OntoSys.com fax: +1.630.879.1370 > OntoSys, Inc 38W242 Deerpath Rd, Batavia, IL 60510, USA > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly
Don, On Thu, Jul 19, 2001 at 03:42:54PM -0700, Don Saxton wrote: > The power just returned here which started me thinking about this. To > restart pg I needed to delete the .pid. Is it correct that cygrunsrv > solution will not change that? Yes. > I have a applictation on a collection of laptops running NT 5 which are > vulnerable to people pulling the plug and the like. To protect the restart I > was thinking about a startup .bat to delete the .pid. Is this still > necessary? Yes. When computes go bang (due to a power failure), bad things happen. Unless, of course, you submit the "--ups" patch to cygrunsrv -- this should fix everything. :,) Jason -- Jason Tishler Director, Software Engineering Phone: 732.264.8770 x235 Dot Hill Systems Corp. Fax: 732.264.8798 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com Hazlet, NJ 07730 USA WWW: http://www.dothill.com
because the laptops won't have ups, it will have to be the --oops option. ----- Original Message ----- From: "Jason Tishler" <jason@tishler.net> To: "Don Saxton" <euphobot@pacbell.net> Cc: <pgsql-cygwin@postgresql.org> Sent: Monday, July 23, 2001 7:26 AM Subject: Re: [CYGWIN] Re: [ANNOUNCEMENT] Updated: cygrunsrv-0.94-1 > Don, > > On Thu, Jul 19, 2001 at 03:42:54PM -0700, Don Saxton wrote: > > The power just returned here which started me thinking about this. To > > restart pg I needed to delete the .pid. Is it correct that cygrunsrv > > solution will not change that? > > Yes. > > > I have a applictation on a collection of laptops running NT 5 which are > > vulnerable to people pulling the plug and the like. To protect the restart I > > was thinking about a startup .bat to delete the .pid. Is this still > > necessary? > > Yes. > > When computes go bang (due to a power failure), bad things happen. > Unless, of course, you submit the "--ups" patch to cygrunsrv -- this > should fix everything. :,) > > Jason > > -- > Jason Tishler > Director, Software Engineering Phone: 732.264.8770 x235 > Dot Hill Systems Corp. Fax: 732.264.8798 > 82 Bethany Road, Suite 7 Email: Jason.Tishler@dothill.com > Hazlet, NJ 07730 USA WWW: http://www.dothill.com