Thread: pgsql8b5 not launching on OSX system start; otherwise OK
hi all,

i've a new install of pgsql8b5 running on OSX 10.3.6.

i can readily start it from the command line with:

    sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf </dev/null >>/var/devlogs/postgres.log &"

after which it behaves as i'd expect =)

however, if i place an identical startup string in my OSX's StartupItem for pgsql & reboot, pgsql does not start on boot. immediately after, i can launch ... but not on system start.

i've turned debugging (debug5, i think i got 'em all ...) on, and my "/var/devlogs/postgres.log" after startup only shows:

    LOG: logger shutting down
    DEBUG: proc_exit(0)
    DEBUG: shmem_exit(0)
    DEBUG: exit(0)

system & kernel logs show nothing of obvious consequence ...

any suggestions as to how to track down the no-start-on-startup problem?

thx!

richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h > 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf > </dev/null >>/var/devlogs/postgres.log &" Hmm, isn't this letting postmaster stderr disappear into the bit bucket? Try adding "2>&1" after the ">>/var/devlogs/postgres.log" so you can see if anything interesting shows up. regards, tom lane
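The effect of the "2>&1" suggested above can be seen without a postmaster at all. The following is only a sketch: `noisy` is a hypothetical stand-in for any program that writes to both stdout and stderr, and the log files live in a throwaway temporary directory rather than /var/devlogs.

```shell
#!/bin/sh
# Hypothetical stand-in for the postmaster: one line to stdout, one to stderr.
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

log_dir=$(mktemp -d)

# Only stdout is appended to the log. stderr is sent to /dev/null here
# purely to keep the demo quiet; the point is that it never reaches the log.
noisy >>"$log_dir/stdout_only.log" 2>/dev/null

# stdout and stderr both reach the log. "2>&1" has to come after the ">>"
# redirection, because it duplicates whatever stdout points at right then.
noisy >>"$log_dir/both.log" 2>&1

echo "logs written to $log_dir"
```

Comparing the two files shows that only `both.log` contains the stderr line, which is where an early startup complaint from the postmaster would land.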
hi tom, -- On Thursday, December 2, 2004 12:33:48 PM PST -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: > OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: >> sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster -n -i -h >> 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf >> </dev/null >>/var/devlogs/postgres.log &" > > Hmm, isn't this letting postmaster stderr disappear into the bit bucket? entirely possible, and probably probable. (it actually was 'in there' at one point, per the distro's included startup script ... damn that copy-n-paste!) > Try adding "2>&1" after the ">>/var/devlogs/postgres.log" so you can see > if anything interesting shows up. ok, did that, and 'simplified' my cmd as much as possible ... here's the exact c/p from my current script: sudo -u testuser sh -c "/usr/local/pgsql/bin/postmaster -i -h 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf &" >>/var/devlogs/postgres.log 2>&1 which i've tried to make 'as similar as possible' to the distro's example script: sudo -u $PGUSER sh -c "${DAEMON} -D '${PGDATA}' &" >>$PGLOG 2>&1 given my additions of: -n do not reinitialize shared memory after abnormal exit -i enable TCP/IP connections -h HOSTNAME host name or IP address to listen on , and the spec'd config file, mine, all in all, _looks_ ok to me. 
with the aforementioned startup string, here's the tail from my '/var/devlogs/postgres.log' immediately after a reboot, b4 starting postmaster from the cmd line:

    LOCATION: PostmasterMain, postmaster.c:644
    DEBUG: 00000: -----------------------------------------
    LOCATION: PostmasterMain, postmaster.c:646
    DEBUG: 00000: invoking IpcMemoryCreate(size=2547712)
    LOCATION: CreateSharedMemoryAndSemaphores, ipci.c:87
    DEBUG: 00000: max_safe_fds = 917, usable_fds = 951, already_open = 73
    LOCATION: set_max_safe_fds, fd.c:360
    LOG: 00000: logger shutting down
    LOCATION: SysLoggerMain, syslogger.c:361
    DEBUG: 00000: proc_exit(0)
    LOCATION: proc_exit, ipc.c:95
    DEBUG: 00000: shmem_exit(0)
    LOCATION: shmem_exit, ipc.c:126
    DEBUG: 00000: exit(0)
    LOCATION: proc_exit, ipc.c:113

whereas the output starting *successfully* by executing the startup script from the cmd line is just:

    LOCATION: PostmasterMain, postmaster.c:644
    DEBUG: 00000: -----------------------------------------
    LOCATION: PostmasterMain, postmaster.c:646
    DEBUG: 00000: invoking IpcMemoryCreate(size=2547712)
    LOCATION: CreateSharedMemoryAndSemaphores, ipci.c:87
    DEBUG: 00000: max_safe_fds = 917, usable_fds = 951, already_open = 73
    LOCATION: set_max_safe_fds, fd.c:360

note, of course, _no_ 'proc exit'.

thoughts?

richard
On Thu, Dec 02, 2004 at 12:43:57PM -0800, OpenMacNews wrote: > given my additions of: > > -n do not reinitialize shared memory after abnormal exit > -i enable TCP/IP connections > -h HOSTNAME host name or IP address to listen on Why don't you use postgresql.conf for this, rather than modifying the start script? -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "No necesitamos banderas No reconocemos fronteras" (Jorge González)
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > LOG: 00000: logger shutting down > LOCATION: SysLoggerMain, syslogger.c:361 I should have twigged to that before --- if you're running the syslogger, then nothing except very early startup messages is going to go to stderr. Look in wherever you told it to put the log output. regards, tom lane
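One way to find "wherever you told it to put the log output" is to read the setting back out of the config file. This is a rough sketch only: `conf_value` is a hypothetical helper, and it assumes the setting is written as `name = 'value'` with single quotes and an equals sign, which is narrower than what postgresql.conf actually accepts.

```shell
#!/bin/sh
# Print the value of a single-quoted string setting from a
# postgresql.conf-style file. Commented-out lines are ignored, and if the
# setting appears more than once the last occurrence is printed.
# Usage: conf_value NAME FILE
conf_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*'\([^']*\)'.*/\1/p" "$2" | tail -n 1
}
```

With the config path used in this thread, `conf_value log_directory /etc/pgsql/postgresql.conf` would print the directory to look in, and `conf_value log_filename ...` the file name pattern.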
hi tom, >> LOG: 00000: logger shutting down >> LOCATION: SysLoggerMain, syslogger.c:361 > > I should have twigged to that before --- if you're running the syslogger, > then nothing except very early startup messages is going to go to > stderr. Look in wherever you told it to put the log output. i thought i was, in that the startup script was 'dumping' to /var/devlogs/postgres.log. also, given my logging section from my conf file: ###################### ## ERROR REPORTING AND LOGGING # log_destination = 'stderr' # relevant when logging to stderr: redirect_stderr = true log_directory = '/var/devlogs' log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # relevant when logging to syslog: syslog_facility = 'LOCAL0' syslog_ident = 'postgres' client_min_messages = debug5 log_min_messages =debug5 log_error_verbosity = verbose log_min_error_statement = debug5 there's been no trace of any output to any 'postgresql-%Y-%m-%d_%H%M%S.log' files. while stumbling around, though, i noticed that after an un-successful startup (i.e., no pgsql launched), there, nonetheless, WAS a pgsql pid file in my process dir. odd ... so i deleted it, rebooted, and - voila! pgsql is up & running ... and there are now dated log files, as well. despite being able to start/stop pgsql from cmd line at will, *something* in my system is not removing the pid file. although i've seen nothing pid-related in my logs, preceding my startup file launch cmd with a pid check/delete: if [ -f /var/run/postgresql.pid ]; then rm -rf /var/run/postgresql.pid fi (launch cmd) seems to have done the trick. i can now reboot w/ pgsql launch on start without fail. so, (a) i'll now hunt-n-destroy why i'm having a lingering pid file lying around, and why a restart-launch chokes on an existing pid, but not a cmd-line launch? (b) i might suggest that such a check be placed in the example startup script for safety's sake ... although you'd have to check for the defined pid path+file, of course. thx! 
for your guidance =) cheers, richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > although i've seen nothing pid-related in my logs, preceding my startup file > launch cmd with a pid check/delete: > if [ -f /var/run/postgresql.pid ]; then > rm -rf /var/run/postgresql.pid > fi > (launch cmd) > seems to have done the trick. i can now reboot w/ pgsql launch on start > without fail. In that case it's a problem in your launch script. The postmaster doesn't even know that such a file exists; it keeps its lock file in the data directory. regards, tom lane
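The lock file mentioned above is postmaster.pid in the data directory, and its first line is the postmaster's PID. A minimal sketch for checking whether that file is left over from a dead process follows; `lockfile_status` is a hypothetical helper, not part of PostgreSQL, and it assumes it is run as the same user the postmaster runs as.

```shell
#!/bin/sh
# Report the state of the postmaster lock file in a data directory.
# Prints "absent", "stale" or "live".
# Run it as the postmaster's user: "kill -0" also fails for a live
# process owned by someone else, which would be reported as "stale".
# Usage: lockfile_status DATADIR
lockfile_status() {
    lockfile="$1/postmaster.pid"
    if [ ! -f "$lockfile" ]; then
        echo "absent"
        return 0
    fi
    # The first line of postmaster.pid holds the postmaster's PID.
    pid=$(head -n 1 "$lockfile")
    # Signal 0 delivers nothing; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "live"
    else
        echo "stale"
    fi
}
```

For the setup in this thread the call would be `lockfile_status /var/data/pgsql`. Note that this says nothing about /var/run/postgresql.pid, which is a separate file.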
hi tom, > In that case it's a problem in your launch script. The postmaster > doesn't even know that such a file exists; it keeps its lock file > in the data directory. well, hmmmm. the launch script is currently simplified (for testing) to just the pid-checking-if-stmt + the single line launch cmd. there's honestly not much left to have a problem with ... note that my cmd line refers to the conf file, which has the external PID id'd in it: external_pid_file = '/var/run/postgresql.pid' i've set it up to be (eventually) watched by a watchdog app ... so, wouldn't (a) the postmaster know abt the PID file, and (b) check for its existence? or am i misunderstanding the purpose/use of the external pid? cheers, richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes:
> note that my cmd line refers to the conf file, which has the external
> PID id'd in it:
> external_pid_file = '/var/run/postgresql.pid'

Oh, now you tell us ;-)

Still, I'm not sure what could be the problem. The only code that reacts to that setting is in postmaster.c:

    /*
     * Write the external PID file if requested
     */
    if (external_pid_file)
    {
        FILE *fpidfile = fopen(external_pid_file, "w");

        if (fpidfile)
        {
            fprintf(fpidfile, "%d\n", MyProcPid);
            fclose(fpidfile);
            /* Should we remove the pid file on postmaster exit? */
        }
        else
            write_stderr("%s: could not write external PID file \"%s\": %s\n",
                         progname, external_pid_file, strerror(errno));
    }

I suppose that the fopen might have failed (maybe the original pid file wasn't writable by the postmaster??), but why wouldn't it have printed an error message and kept going?

regards, tom lane
hi,

>> note that my cmd line refers to the conf file, which has the external
>> PID id'd in it:
>
>> external_pid_file = '/var/run/postgresql.pid'
> Oh, now you tell us ;-)

heh. sorry -- just thought it was SOP. in case you haven't noticed, i'm at that 'wunnerful' ramp-up stage that i dunno what i dunno ... or ... er ... or know what i should know ... or somesuch ... =8-D

> write_stderr("%s: could not write external PID file \"%s\": %s\n",
> progname, external_pid_file, strerror(errno));
> }

simple enuf ...

> I suppose that the fopen might have failed (maybe the original pid file
> wasn't writable by the postmaster??),

just checked -- looks ok. PID is properly 'owned & operated' by the postmaster superuser defined in the launch command

> but why wouldn't it have printed an error message and kept going?

that's the rub. i'd expect to see it in the logs, as well.

i just did a simple experiment.

    disable PIDfile check/delete in startup script
    stop postgres
    delete PIDfile (if still there)
    reboot
    ---> postgres launches OK
    verify PIDfile exists ... it does
    ---> can start/stop pgsql at will @ cmd line
    stop postgres
    touch PIDfile (if _not_ there)
    reboot
    --> NO launch, nothing in the logs
    verify PIDfile exists ... it does
    ---> can start/stop pgsql at will @ cmd line
    reboot
    --> still NO launch, nothing in the logs
    verify PIDfile still exists ... it does
    ---> can start/stop pgsql at will @ cmd line
    stop postgres
    delete PIDfile
    reboot
    --> back to normal

all reproducible.

imho, it's acting like the cmd line launch is working with a different PID file ... somethin's wonky.

so,

(1) i have a workaround for the moment via the script check (couldn't hurt, really, to add the check to the startup script ...)

(2) since i've been appropriately mangling my system while getting this all running, i think it may be time for a wipe-n-reinstall ... who knows what i've done to myself? as you've mentioned, i wonder if i've an odd permission on a process or log dir somewhere ...

cheers,

richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > stop postgres > touch PIDfile (if _not_ there) > reboot > --> NO launch, nothing in the logs > verify PIDfile exists ... it does But who is it owned by, and with what permissions? If you do the "touch" as some other user than the postmaster runs as, it's very plausible the postmaster can't write the file. (That doesn't yet explain why it goes south afterward, but first we need to understand the conditions that make it fail.) regards, tom lane
hi, > But who is it owned by, and with what permissions? same owner as postmaster, 0644 or 0600 > If you do the "touch" as some other user than the postmaster runs as, it's very > plausible the postmaster can't write the file. (That doesn't yet > explain why it goes south afterward, but first we need to understand > the conditions that make it fail.) yup. agreed. postmaster launched as 'testuser', pidfile touched as: sudo -u testuser touch /var/run/postgresql.pid resulting in: -rw-r--r-- 1 testuser testuser 4 Dec 2 14:07 postgresql.pid fwiw, i've got a clean build under way on another box: pgsql, prereqs and dir hierarchy will all be 'fresh'. we'll see if it's me (betcha! there's been a LOT going on on _this_ box ... more than pgsql) or the code ... richard
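The ownership and permission questions above can be checked mechanically from the postmaster user's side. The sketch below is an illustration only: `check_pidfile` is a hypothetical helper, and it tests just two things, whether an existing file is writable and whether the containing directory allows files to be created or deleted. It has to be run as the postmaster's user to mean anything, since root passes these tests regardless of the mode bits.

```shell
#!/bin/sh
# Check whether the current user could overwrite, create or delete a PID file.
# Prints one line per finding and returns non-zero if any check fails.
# Usage: check_pidfile /path/to/file.pid
check_pidfile() {
    file="$1"
    dir=$(dirname "$file")
    rc=0
    # An existing file must be writable for fopen(..., "w") to succeed.
    if [ -e "$file" ] && [ ! -w "$file" ]; then
        echo "not writable: $file"
        rc=1
    fi
    # Creating or deleting the file needs write permission on the directory.
    if [ ! -w "$dir" ]; then
        echo "cannot create or delete files in: $dir"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "ok: $file"
    fi
    return "$rc"
}
```

In this thread that would mean saving the function in a script that ends with `check_pidfile /var/run/postgresql.pid` and running that script through `sudo -u testuser sh`.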
(From someone else who doesn't know what doesn't know, ... :-/) > sudo -u testuser sh -c "nohup /usr/local/pgsql/bin/postmaster [...] ... > >> note that my cmd line refers to the conf file, which has the external > >> PID id'd in it: > > > >> external_pid_file = '/var/run/postgresql.pid' > ... > just checked -- looks ok. PID is properly 'owned & operated' by the postmaster > superuser defined in the launch command Who owns /var/run? What group? Does testuser have permission to delete files there? (May need to add testuser to the wheel or admin group?) Another thought, try su -c instead of sudo? (See warning on first line. It's been a while since I've mucked that deep in the Mac OS X configurations, and my box is still on 10.2, so I'm probably just blowing smoke.) -- Joel Rees <rees@ddcom.co.jp> digitcom, inc. 株式会社デジコム Kobe, Japan +81-78-672-8800 ** <http://www.ddcom.co.jp> **
hi joel, >> just checked -- looks ok. PID is properly 'owned & operated' by the >> postmaster superuser defined in the launch command > > Who owns /var/run? What group? Does testuser have permission to delete > files there? (May need to add testuser to the wheel or admin group?) good points =) already done, tho ... % ls -ald /var/run drwxrwxr-x 29 root daemon 986 Dec 2 20:53 /var/run % niutil -read / /groups/daemon name: daemon gid: 1 passwd: * users: root testuser > Another thought, try su -c instead of sudo? afaik, shouldn't make a diff, as testuser is in /etc/sudoers ... thx! > Kobe, Japan <--- *there's* the beef ... :p cheers, richard
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > i've a new install of pgsql8b5 running on OSX 10.3.6. > ... > however, if i place an identical startup string in my OSX's StartupItem for > pgsql & reboot, pgsql does not start on boot. I was trying to reproduce this on my own machine, but couldn't get out of the starting gate. I put an executable shell script into "/System Folder/Startup Items", but I couldn't see any evidence that the system paid any attention to it at all. Exactly what are you doing to tell OSX to run a bit of shell script at boot time? regards, tom lane
>> i've a new install of pgsql8b5 running on OSX 10.3.6.
>> ...
>> however, if i place an identical startup string in my OSX's
>> StartupItem for pgsql & reboot, pgsql does not start on boot.
>
> I was trying to reproduce this on my own machine, but couldn't get out
> of the starting gate. I put an executable shell script into
> "/System Folder/Startup Items", but I couldn't see any evidence that
> the system paid any attention to it at all. Exactly what are you doing
> to tell OSX to run a bit of shell script at boot time?

You basically put formatted files in a directory under /Library/StartupItems

One file should contain scripts for starting, stopping and restarting the service, the other should contain some generic stuff in a "Plist" file / XML file.

I use the startup package from Liyanage (site down alas)

sudo systemstarter -help will give you some info on how to test this without rebooting

HTH, Philippe Schmid
hi tom,

>> however, if i place an identical startup string in my OSX's StartupItem for
>> pgsql & reboot, pgsql does not start on boot.
>
> I was trying to reproduce this on my own machine, but couldn't get out
> of the starting gate. I put an executable shell script into
> "/System Folder/Startup Items", but I couldn't see any evidence that the
> system paid any attention to it at all. Exactly what are you doing to
> tell OSX to run a bit of shell script at boot time?

wrong location/folder name ... 'System Folder' is an OS9 construct (you haven't installed OS9 and OSX on the same partition, now, have you? tsk, tsk ... ;-) )

OSX userland startup scripts need to go into /Library/StartupItems/SCRIPTNAME

(System startup scripts go in '/System/Library/StartupItems/SCRIPTNAME' but we users are supposed to stay out o' there. BUT, you should always check to make sure your userland script isn't conflicting with an Apple-installed flavor in /System/...)

in the SCRIPTNAME dir you need two files:

(1) 'SCRIPTNAME', containing your script
(2) 'StartupParameters.plist', a text or XML-formatted parameter file

perms/ownership should be:

    chown -R root:wheel /Library/StartupItems/SCRIPTNAME
    chmod 755 /Library/StartupItems/SCRIPTNAME
    chmod 755 /Library/StartupItems/SCRIPTNAME/SCRIPTNAME
    chmod 644 /Library/StartupItems/SCRIPTNAME/StartupParameters.plist

fyi: here's an O'Reilly blurb with way more info than you want to know ...

<http://www.macdevcenter.com/pub/a/mac/2003/10/21/startup.html>

HTH!

richard
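The layout described above can be tried out in a scratch directory before anything is copied into /Library/StartupItems. This is a sketch under stated assumptions: "MyService" is a placeholder name, the script body is a do-nothing stub, and the parameter file only borrows keys that appear elsewhere in this thread.

```shell
#!/bin/sh
# Build a StartupItem skeleton in a staging directory (not in /Library).
# "MyService" is a placeholder; the directory and the script share the name.
name="MyService"
stage=$(mktemp -d)
item="$stage/StartupItems/$name"
mkdir -p "$item"

# (1) The script, named after its directory. It is only written here,
# never run, so /etc/rc.common does not need to exist on this machine.
cat >"$item/$name" <<'EOF'
#!/bin/sh
. /etc/rc.common
StartService ()   { ConsoleMessage "Starting MyService"; }
StopService ()    { ConsoleMessage "Stopping MyService"; }
RestartService () { StopService; StartService; }
RunService "$1"
EOF

# (2) The parameter file, in the old-style text plist format.
cat >"$item/StartupParameters.plist" <<'EOF'
{
  Description = "MyService";
  Provides = ("MyService");
  Requires = ("Disks");
  OrderPreference = "Late";
}
EOF

chmod 755 "$item" "$item/$name"
chmod 644 "$item/StartupParameters.plist"

# Once copied into place, ownership would still need to be set as root:
#   chown -R root:wheel /Library/StartupItems/MyService
echo "skeleton created in $item"
```

Staging first keeps a half-finished item out of the boot sequence; the chown step is left as a comment because it needs root.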
OpenMacNews <pgsql-general.20.openmacnews@spamgourmet.com> writes: > fyi: here's an O'Reilly blurb with way more info than you want to know ... > <http://www.macdevcenter.com/pub/a/mac/2003/10/21/startup.html> After eyeballing that, I think I have no hope of reproducing your test conditions unless you show me the exact script and property list files you used. In particular, I was wondering if the problem could be related to launching the postmaster in advance of some system service it needs; without seeing the Requires/Uses specs you gave, there's no way to know what might have happened. BTW, that page also references this Apple document saying that StartupItems are being obsoleted: http://developer.apple.com/documentation/macosx/Conceptual/BPSystemStartup/Concepts/BootProcess.html#//apple_ref/doc/uid/20002130/CJBBICAB However, they should still work as of 10.3.*, so that's just an interesting tidbit for the future. regards, tom lane
hi tom,

> After eyeballing that, I think I have no hope of reproducing your test
> conditions unless you show me the exact script and property list files
> you used.

certainly easy enuf. though *i'm* not certain *i* have hope of reproducing much after today's shenanigans ... jeeesh!

fyi, the latest versions (as you might suspect, they've been rather dynamic of late ...) are:

% vi /Library/StartupItems/PostgreSQL/PostgreSQL
------------------------------------------
#!/bin/sh

. /etc/rc.common

StartService ()
{
    if [ "${POSTGRESQL:=-NO-}" = "-YES-" ]; then
        ConsoleMessage "Starting PgSQL"
        if [ -f /var/run/postgresql.pid ]; then
            ConsoleMessage "clearing PgSQL PIDfile"
            rm -f /var/run/postgresql.pid
        fi
        sudo -u testuser sh -c "/usr/local/pgsql/bin/postmaster -n -i -h 10.0.0.6 -D /var/data/pgsql -c config_file=/etc/pgsql/postgresql.conf &" >>/var/devlogs/postgres.log 2>&1
    fi
}

StopService ()
{
    ConsoleMessage "Stopping PgSQL"
    sudo -u testuser /usr/local/pgsql/bin/pg_ctl stop -D /var/data/pgsql -s -m fast
}

RestartService ()
{
    if [ "${POSTGRESQL:=-NO-}" = "-YES-" ]; then
        ConsoleMessage "Restarting PgSQL"
        sudo -u testuser /usr/local/pgsql/bin/pg_ctl restart -D /var/data/pgsql -s -m fast
    else
        StopService
    fi
}

RunService "$1"
------------------------------------------

and,

% vi /Library/StartupItems/PostgreSQL/StartupParameters.plist
------------------------------------------
{
  Description = "PgSQL DatabaseServer";
  Provides = ("PgSQL", "DatabaseServer");
  Requires = ("Disks", "Resolver");
  Uses = ("NFS", "NetworkTime");
  OrderPreference = "Late";
  Messages = {
    start = "Starting PgSQL";
    stop = "Stopping PgSQL";
  };
}
------------------------------------------

where, of course

% vi /etc/hostconfig
------------------------------------------
+++ POSTGRESQL=-YES-
------------------------------------------

> In particular, I was wondering if the problem could be
> related to launching the postmaster in advance of some system service it
> needs; without seeing the Requires/Uses specs you gave, there's no way
> to know what might have happened.

i've had that issue in the past ... primarily related to partitions with DIRs symlinked elsewhere not spinning up fast enuf. long story short, i took care of it (chats on the Apple kernel board) and it hasn't been an issue since ...

> BTW, that page also references this Apple document saying that
> StartupItems are being obsoleted:
> http://developer.apple.com/documentation/macosx/Conceptual/BPSystemStartup/Concepts/BootProcess.html#//apple_ref/doc/uid/20002130/CJBBICAB
> However, they should still work as of 10.3.*, so that's just an
> interesting tidbit for the future.

yeah, yeah ;-) everything under xinetd, eventually ... one thing at a time -- i'm running outa beer here!

cheers,

richard