Re: We should Axe /contrib/start-scripts - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: We should Axe /contrib/start-scripts |
Date | |
Msg-id | 24413.1251329569@sss.pgh.pa.us Whole thread Raw |
In response to | Re: We should Axe /contrib/start-scripts (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: We should Axe /contrib/start-scripts
|
List | pgsql-hackers |
I wrote: > "Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes: >> Thanks Andrew, Alvaro, and Chander. You've given me some thoughts to >> toss around. Of course, any of these is going to be somewhat more >> complex than using [ pg_ctl -w ] > Yeah. I wonder if we shouldn't expend a bit more effort to make that > way bulletproof. As I mentioned the other day, if there were a way for > pg_ctl to pass down its parent's PID then we could have the postmaster > exclude that as a false match, and then using pg_ctl would be just as > safe as invoking the postmaster directly. > The two ways I can see to do that are to add a command line switch > to the postmaster, or to pass the PID as an environment variable, > say "PG_GRANDPARENT_PID". The latter is a bit uglier but it would > require touching much less code (and documentation). Attached is a simple patch that uses the environment-variable approach. This is a whole lot more self-contained than what would be needed to pass the PID as an explicit switch, so I'm inclined to do it this way. You could argue that a bad guy could confuse matters by intentionally passing an existing postmaster's PID in this variable --- but someone with the ability to launch the postmaster can probably also remove an existing lockfile, so I don't think there's a credible security risk. Any objections? regards, tom lane Index: src/backend/utils/init/miscinit.c =================================================================== RCS file: /cvsroot/pgsql/src/backend/utils/init/miscinit.c,v retrieving revision 1.176 diff -c -r1.176 miscinit.c *** src/backend/utils/init/miscinit.c 12 Aug 2009 20:53:30 -0000 1.176 --- src/backend/utils/init/miscinit.c 26 Aug 2009 23:24:56 -0000 *************** *** 683,689 **** int len; int encoded_pid; pid_t other_pid; ! pid_t my_pid = getpid(); /* * We need a loop here because of race conditions. But don't loop forever --- 683,728 ---- int len; int encoded_pid; pid_t other_pid; ! pid_t my_pid, ! my_p_pid, ! my_gp_pid; ! const char *envvar; ! ! /* ! * If the PID in the lockfile is our own PID or our parent's or ! * grandparent's PID, then the file must be stale (probably left over from ! * a previous system boot cycle). We need to check this because of the ! * likelihood that a reboot will assign exactly the same PID as we had in ! * the previous reboot, or one that's only one or two counts larger and ! * hence the lockfile's PID now refers to an ancestor shell process. We ! * allow pg_ctl to pass down its parent shell PID (our grandparent PID) ! * via the environment variable PG_GRANDPARENT_PID; this is so that ! * launching the postmaster via pg_ctl can be just as reliable as ! * launching it directly. There is no provision for detecting ! * further-removed ancestor processes, but if the init script is written ! * carefully then all but the immediate parent shell will be root-owned ! * processes and so the kill test will fail with EPERM. Note that we ! * cannot get a false negative this way, because an existing postmaster ! * would surely never launch a competing postmaster or pg_ctl process ! * directly. ! */ ! my_pid = getpid(); ! ! #ifndef WIN32 ! my_p_pid = getppid(); ! #else ! /* ! * Windows hasn't got getppid(), but doesn't need it since it's not ! * using real kill() either... ! */ ! my_p_pid = 0; ! #endif ! ! envvar = getenv("PG_GRANDPARENT_PID"); ! if (envvar) ! my_gp_pid = atoi(envvar); ! else ! my_gp_pid = 0; /* * We need a loop here because of race conditions. But don't loop forever *************** *** 745,761 **** /* * Check to see if the other process still exists * ! * If the PID in the lockfile is our own PID or our parent's PID, then ! * the file must be stale (probably left over from a previous system ! * boot cycle). We need this test because of the likelihood that a ! * reboot will assign exactly the same PID as we had in the previous ! * reboot. Also, if there is just one more process launch in this ! * reboot than in the previous one, the lockfile might mention our ! * parent's PID. We can reject that since we'd never be launched ! * directly by a competing postmaster. We can't detect grandparent ! * processes unfortunately, but if the init script is written ! * carefully then all but the immediate parent shell will be ! * root-owned processes and so the kill test will fail with EPERM. * * We can treat the EPERM-error case as okay because that error * implies that the existing process has a different userid than we --- 784,794 ---- /* * Check to see if the other process still exists * ! * Per discussion above, my_pid, my_p_pid, and my_gp_pid can be ! * ignored as false matches. ! * ! * Normally kill() will fail with ESRCH if the given PID doesn't ! * exist. * * We can treat the EPERM-error case as okay because that error * implies that the existing process has a different userid than we *************** *** 772,789 **** * Unix socket file belonging to an instance of Postgres being run by * someone else, at least on machines where /tmp hasn't got a * stickybit.) - * - * Windows hasn't got getppid(), but doesn't need it since it's not - * using real kill() either... - * - * Normally kill() will fail with ESRCH if the given PID doesn't - * exist. */ ! if (other_pid != my_pid ! #ifndef WIN32 ! && other_pid != getppid() ! #endif ! ) { if (kill(other_pid, 0) == 0 || (errno != ESRCH && errno != EPERM)) --- 805,813 ---- * Unix socket file belonging to an instance of Postgres being run by * someone else, at least on machines where /tmp hasn't got a * stickybit.) */ ! if (other_pid != my_pid && other_pid != my_p_pid && ! other_pid != my_gp_pid) { if (kill(other_pid, 0) == 0 || (errno != ESRCH && errno != EPERM)) Index: src/bin/pg_ctl/pg_ctl.c =================================================================== RCS file: /cvsroot/pgsql/src/bin/pg_ctl/pg_ctl.c,v retrieving revision 1.111 diff -c -r1.111 pg_ctl.c *** src/bin/pg_ctl/pg_ctl.c 11 Jun 2009 14:49:07 -0000 1.111 --- src/bin/pg_ctl/pg_ctl.c 26 Aug 2009 23:24:56 -0000 *************** *** 672,677 **** --- 672,692 ---- unlimit_core_size(); #endif + /* + * If possible, tell the postmaster our parent shell's PID (see the + * comments in CreateLockFile() for motivation). Windows hasn't got + * getppid() unfortunately. + */ + #ifndef WIN32 + { + static char env_var[32]; + + snprintf(env_var, sizeof(env_var), "PG_GRANDPARENT_PID=%d", + (int) getppid()); + putenv(env_var); + } + #endif + exitcode = start_postmaster(); if (exitcode != 0) {
pgsql-hackers by date: