Thread: Found this in the server log on MAC OSX
To all, I restarted postgres this morning to make a configuration file change take effect and when I looked at the serverlog to make sure things were ok I found the prior shutdown message(*in bold*). This shut down was not initiated by me or anyone else who has access to the machine. Any ideas as to what might be causing this? OSX Server 10.3.2 running on a dual 2.ghz G5 with 8GB memory attached to a XRaid 3.5TB system via Fibre channel. We are running 7.4.1. The LOG: unexpected EOF on client connection messages are from our JDBC implementation, they are benign. Thanks. --sean LOG: unexpected EOF on client connection *LOG: received smart shutdown request * *FATAL: the database system is shutting down FATAL: the database system is shutting down LOG: server process (PID 4691) was terminated by signal 9 LOG: terminating any other active server processes LOG: statistics collector process (PID 361) was terminated by signal 9 LOG: all server processes terminated; reinitializing LOG: database system was interrupted at 2004-02-23 07:50:15 EST LOG: checkpoint record is at D0/B2FD5AB4 LOG: redo record is at D0/B2FD5AB4; undo record is at 0/0; shutdown FALSE LOG: next transaction ID: 31473487; next OID: 3103140043 LOG: database system was not properly shut down; automatic recovery in progress LOG: record with zero length at D0/B2FD5AF4 LOG: redo is not required LOG: database system is ready* LOG: shutting down LOG: database system is shut down LOG: database system was shut down at 2004-02-23 08:12:28 EST LOG: checkpoint record is at D0/B2FD5B34 LOG: redo record is at D0/B2FD5B34; undo record is at 0/0; shutdown TRUE LOG: next transaction ID: 31473487; next OID: 3103140043 LOG: database system is ready
On Monday February 23 2004 8:43, Sean Shanny wrote: > > I restarted postgres this morning to make a configuration file change > take effect and when I looked at the serverlog to make sure things were > ok I found the prior shutdown message(*in bold*). This shut down was > not initiated by me or anyone else who has access to the machine. Any > ideas as to what might be causing this? > > LOG: unexpected EOF on client connection > *LOG: received smart shutdown request * > *FATAL: the database system is shutting down > FATAL: the database system is shutting down > LOG: server process (PID 4691) was terminated by signal 9 > LOG: terminating any other active server processes > LOG: statistics collector process (PID 361) was terminated by signal 9 Looks like it got a SIGTERM, which is what you might get if someone shutdown OSX (osx pls gurus correct me). Maybe your machine rebooted? Logging pids and timestamps might help tell the tale next time ...
Ed, If I continue to see the behavior I will turn the logging stuff on as you suggested. The machine uptime is show 3+ days so it did not reboot. Thanks. --sean Ed L. wrote: >On Monday February 23 2004 8:43, Sean Shanny wrote: > > >>I restarted postgres this morning to make a configuration file change >>take effect and when I looked at the serverlog to make sure things were >>ok I found the prior shutdown message(*in bold*). This shut down was >>not initiated by me or anyone else who has access to the machine. Any >>ideas as to what might be causing this? >> >>LOG: unexpected EOF on client connection >>*LOG: received smart shutdown request * >>*FATAL: the database system is shutting down >>FATAL: the database system is shutting down >>LOG: server process (PID 4691) was terminated by signal 9 >>LOG: terminating any other active server processes >>LOG: statistics collector process (PID 361) was terminated by signal 9 >> >> > >Looks like it got a SIGTERM, which is what you might get if someone shutdown >OSX (osx pls gurus correct me). Maybe your machine rebooted? Logging >pids and timestamps might help tell the tale next time ... > > > > >
"Ed L." <pgsql@bluepolka.net> writes: > On Monday February 23 2004 8:43, Sean Shanny wrote: >> *LOG: received smart shutdown request * >> *FATAL: the database system is shutting down >> FATAL: the database system is shutting down >> LOG: server process (PID 4691) was terminated by signal 9 >> LOG: terminating any other active server processes >> LOG: statistics collector process (PID 361) was terminated by signal 9 > Looks like it got a SIGTERM, which is what you might get if someone shutdown > OSX (osx pls gurus correct me). Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic mechanism in OS X that would issue SIGKILL against a Postgres backend. Certainly Postgres itself would not. Some Linux kernels issue SIGKILL to get out of out-of-memory situations, but I believe OS X to be better behaved than that. My private opinion is that Sean is mistaken and that the above trace shows someone manually (and not very competently) shutting down the database. First they tried a SIGTERM against the postmaster (or equivalently "pg_ctl stop") and after getting tired of waiting for the clients to shut down, they did SIGKILL against the backends. It's too bad we have no timestamps in this log, as the intervals between the above-recorded entries would be very revealing. regards, tom lane
On Monday February 23 2004 10:50, Tom Lane wrote: > "Ed L." <pgsql@bluepolka.net> writes: > > On Monday February 23 2004 8:43, Sean Shanny wrote: > >> *LOG: received smart shutdown request * > >> *FATAL: the database system is shutting down > >> FATAL: the database system is shutting down > >> LOG: server process (PID 4691) was terminated by signal 9 > >> LOG: terminating any other active server processes > >> LOG: statistics collector process (PID 361) was terminated by signal > >> 9 > > > > Looks like it got a SIGTERM, which is what you might get if someone > > shutdown OSX (osx pls gurus correct me). > > Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic > mechanism in OS X that would issue SIGKILL against a Postgres backend. > Certainly Postgres itself would not. Some Linux kernels issue SIGKILL > to get out of out-of-memory situations, but I believe OS X to be better > behaved than that. Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request an indication of a SIGTERM? I'm just speculating about what happened, but isn't that what you'd see during a system shutdown? The kernel sending SIGTERMs?
To all, It looks like it is the fault of the shutdown script <said Sean with sheepish look on my face> We have to take a look a the scripts, found them on the web, to figure out what they are doing to cause this. Sorry for the clutter in your mailboxes. --sean waiting for postmaster to shut down................................................................ failed pg_ctl: postmaster does not shut down /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 3:35.75: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres:: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 192.168.34.19: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: ??: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.56: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres:: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: tripmaster: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 192.168.35.16: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: idle: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: in: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: transaction: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:00.71: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres:: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: buffer: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: std-: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: S: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: 0:03.78: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: postgres:: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: stats: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: collector: no such pid /Library/StartupItems/PostgreSQL/PostgreSQL: line 61: kill: process: no such pid Tom Lane wrote: >"Ed L." <pgsql@bluepolka.net> writes: > > >>On Monday February 23 2004 8:43, Sean Shanny wrote: >> >> >>>*LOG: received smart shutdown request * >>>*FATAL: the database system is shutting down >>>FATAL: the database system is shutting down >>>LOG: server process (PID 4691) was terminated by signal 9 >>>LOG: terminating any other active server processes >>>LOG: statistics collector process (PID 361) was terminated by signal 9 >>> >>> > > > >>Looks like it got a SIGTERM, which is what you might get if someone shutdown >>OSX (osx pls gurus correct me). >> >> > >Uh, no, signal 9 is SIGKILL not SIGTERM. I'm not aware of any automatic >mechanism in OS X that would issue SIGKILL against a Postgres backend. >Certainly Postgres itself would not. Some Linux kernels issue SIGKILL >to get out of out-of-memory situations, but I believe OS X to be better >behaved than that. > >My private opinion is that Sean is mistaken and that the above trace >shows someone manually (and not very competently) shutting down the >database. First they tried a SIGTERM against the postmaster (or >equivalently "pg_ctl stop") and after getting tired of waiting for >the clients to shut down, they did SIGKILL against the backends. >It's too bad we have no timestamps in this log, as the intervals >between the above-recorded entries would be very revealing. > > regards, tom lane > >---------------------------(end of broadcast)--------------------------- >TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > > >
"Ed L." <pgsql@bluepolka.net> writes: > Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request > an indication of a SIGTERM? I'm just speculating about what happened, but > isn't that what you'd see during a system shutdown? The kernel sending > SIGTERMs? Yes, the trace is sort of consistent with the idea of a system shutdown: you'd see SIGTERMs issued, followed some time later by SIGKILL. I thought Sean had said that the machine did not shut down during this interval, and so mentally eliminated that theory --- but based on his latest comment I guess that is what happened after all. So that does leave me with a question: why didn't it work more cleanly? Our signal responses are designed around the assumption that during shutdown the kernel will send SIGTERM to *all* the Postgres processes. Backends interpret that as an immediate shutdown and should exit quickly enough to avoid getting SIGKILL'd later. It looks like either the postmaster was sent SIGTERM but the backends weren't, or the interval between SIGTERM and SIGKILL was unreasonably short. I don't think I believe the latter; the last time I checked this on Darwin, it seemed to be using the traditional 20-second grace period. Another question: if that was a shutdown we were looking at, how did the postmaster live long enough to record the final log lines? It shoulda gotten SIGKILL'd at the same time as its children. In short, there's something pretty odd about the way these signals are being passed around. It looks something like a standard system shutdown sequence, but not enough like it. regards, tom lane
Tom, Let me clarify.... I was meant shutdown in the context of issuing a stop against postgres not shutting down the OS. Sorry if I am confusing things. The scripts we are using to issue start, stop etc for postgres seem to be causing the issue. I changed the config to use timestamps in the log and the act of stopping and starting the server caused the same error to occur. :-( From the scripts we are using: StartService () { if [ "${POSTGRES:=-YES-}" = "-YES-" ]; then ConsoleMessage "Starting PostgreSQL database server" su - postgres -c '/usr/local/pgsql/bin/pg_ctl start -D /usr/local/pgsql/data -l /usr/local/pgsql/data/logfile -o -i' fi } StopService() { ConsoleMessage "Stopping PostgreSQL database services" /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data x=`/bin/ps axc | /usr/bin/grep postgres` if /bin/test "$x" then set $x kill -9 $x fi } Thanks. --sean Tom Lane wrote: >"Ed L." <pgsql@bluepolka.net> writes: > > >>Uh, no, I didn't say signal 9 is SIGTERM. Isn't a "smart" shutdown request >>an indication of a SIGTERM? I'm just speculating about what happened, but >>isn't that what you'd see during a system shutdown? The kernel sending >>SIGTERMs? >> >> > >Yes, the trace is sort of consistent with the idea of a system shutdown: >you'd see SIGTERMs issued, followed some time later by SIGKILL. >I thought Sean had said that the machine did not shut down during this >interval, and so mentally eliminated that theory --- but based on his >latest comment I guess that is what happened after all. > >So that does leave me with a question: why didn't it work more cleanly? >Our signal responses are designed around the assumption that during >shutdown the kernel will send SIGTERM to *all* the Postgres processes. >Backends interpret that as an immediate shutdown and should exit quickly >enough to avoid getting SIGKILL'd later. It looks like either the >postmaster was sent SIGTERM but the backends weren't, or the interval >between SIGTERM and SIGKILL was unreasonably short. I don't think I >believe the latter; the last time I checked this on Darwin, it seemed to >be using the traditional 20-second grace period. > >Another question: if that was a shutdown we were looking at, how did the >postmaster live long enough to record the final log lines? It shoulda >gotten SIGKILL'd at the same time as its children. > >In short, there's something pretty odd about the way these signals are >being passed around. It looks something like a standard system shutdown >sequence, but not enough like it. > > regards, tom lane > > >
Sean Shanny <shannyconsulting@earthlink.net> writes: > StopService() > { > ConsoleMessage "Stopping PostgreSQL database services" > /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data > x=`/bin/ps axc | /usr/bin/grep postgres` > if /bin/test "$x" > then > set $x > kill -9 $x > fi > } Gack. Whoever wrote this should be shot. Take out everything after the "pg_ctl stop" line. You might want to add "-m immediate" to the pg_ctl command as a less dangerous way of encouraging a fast shutdown... regards, tom lane
Tom Lane wrote: >>StopService() >>{ >> ConsoleMessage "Stopping PostgreSQL database services" >> /usr/local/pgsql/bin/pg_ctl stop -D /usr/local/pgsql/data >> x=`/bin/ps axc | /usr/bin/grep postgres` >> if /bin/test "$x" >> then >> set $x >> kill -9 $x >> fi >>} > > Gack. Whoever wrote this should be shot. Take out everything after the > "pg_ctl stop" line. You might want to add "-m immediate" to the pg_ctl > command as a less dangerous way of encouraging a fast shutdown... BTW, I remember vaguely that the Cpanel 3 scripts (Cpanel is a webhosting control panel product) used to do something like this for every daemon... the horror... -- dave