Thread: Postgres not starting at boot(FreeBSD) - startup script not releasing

Try this on for size...   recently during a reboot (first in about 3 months for
this particular server) our entire rc.d directory failed to start...  after some
hacking of the rc file to output some helpful debugging, it was apparent that the
010.pgsql.sh script in /usr/local/etc/rc.d was timing out and causing any
directives thereafter not to be processed.

Running the script manually as root starts the postmaster but doesn't return you
to the command prompt. ^C and checking the errlog shows

Waiting for postmaster starting up..DEBUG:  Data Base System is starting up at
Sat Mar  9 17:05:45 2002
DEBUG:  Data Base System was shut down at Sat Mar  9 17:05:39 2002
DEBUG:  Data Base System is in production state at Sat Mar  9 17:05:45 2002
Fast Shutdown request at Sat Mar  9 17:05:48 2002
DEBUG:  Data Base System shutting down at Sat Mar  9 17:05:48 2002
DEBUG:  Data Base System shut down at Sat Mar  9 17:05:48 2002

Can force it to return to the command prompt by adding a "&" and a double <cr>

web1# /usr/local/etc/rc.d/010.pgsql.sh start &
[1] 4635
web1#
[1]  + Suspended (tty output)        /usr/local/etc/rc.d/010.pgsql.sh start
web1#

and postgres stays up and frees the terminal.  Output in errlog for this is...

Waiting for postmaster starting up..DEBUG:  Data Base System is starting up at
Sat Mar  9 17:07:21 2002
DEBUG:  Data Base System was shut down at Sat Mar  9 17:05:48 2002
DEBUG:  Data Base System is in production state at Sat Mar  9 17:07:21 2002

No idea what could be causing the script not to function as it is the EXACT same
script as on the other servers we are operating (did a diff just to be sure)

In the interim we removed the script from the startup dir...   any ideas as to
why this is occurring?

Installed from port, left the port startup script as is... listed below.
Appreciate any feedback/comments.

Dave

# $FreeBSD: ports/databases/postgresql7/files/pgsql.sh.tmpl,v 1.9 2000/12/11 03:22:07 steve Exp $
#
# For postmaster startup options, edit $PGDATA/postmaster.opts.default
# Preinstalled options are -i -o "-F"

case $1 in
start)
    [ -d /usr/local/pgsql/lib ] && /sbin/ldconfig -m /usr/local/pgsql/lib
    [ -x /usr/local/pgsql/bin/pg_ctl ] && {
        su -l pgsql -c \
            'exec /usr/local/pgsql/bin/pg_ctl -w start > /usr/local/pgsql/errlog 2>&1'
        echo -n ' pgsql'
    }
    ;;

stop)
    [ -x /usr/local/pgsql/bin/pg_ctl ] && {
        exec su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl -w -m fast stop'
    }
    ;;

status)
    [ -x /usr/local/pgsql/bin/pg_ctl ] && {
        exec su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl status'
    }
    ;;

*)
    echo "usage: `basename $0` {start|stop|status}" >&2
    exit 64
    ;;
esac


"Dave" <dave@hawk-systems.com> writes:
> DEBUG:  Data Base System is starting up at Sat Mar  9 17:05:45 2002
> DEBUG:  Data Base System was shut down at Sat Mar  9 17:05:39 2002
> DEBUG:  Data Base System is in production state at Sat Mar  9 17:05:45 2002
> Fast Shutdown request at Sat Mar  9 17:05:48 2002
> DEBUG:  Data Base System shutting down at Sat Mar  9 17:05:48 2002
> DEBUG:  Data Base System shut down at Sat Mar  9 17:05:48 2002

It looks like something is hitting the postmaster with a SIGINT signal
as soon as it starts.  Got any idea what might be doing that?  It's
not pg_ctl, for sure (unless the "something" is firing your init
script with a 'stop' option).  In any case I think you should be looking
for outside agencies, not a problem directly in this init script.

            regards, tom lane

Sorry,  should point out that the stop is the result of hitting ^C after
running the script manually.  The script runs...  postgres starts, but
from reading the startup script, it is waiting for the pid file to appear before
reporting success...  and it isn't doing this.  Or at least it isn't exiting and
leaving the postmaster running.  It just sits there... thus the ^C to regain the
terminal.
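
For what it's worth, the "Fast Shutdown request" line fits the ^C: the
terminal sends SIGINT to the foreground job, and the postmaster documents
SIGINT as its fast-shutdown signal (SIGTERM is smart, SIGQUIT is immediate).
A minimal sketch of that mapping, with a function name made up purely for
illustration:

```shell
#!/bin/sh
# shutdown_signal MODE: print the signal the postmaster acts on for a given
# pg_ctl shutdown mode. The function name is hypothetical; only the
# mode-to-signal mapping comes from the postmaster documentation.
shutdown_signal() {
    case $1 in
        smart)     echo TERM ;;   # wait for clients to disconnect
        fast)      echo INT ;;    # abort open transactions, then stop
        immediate) echo QUIT ;;   # stop without a clean shutdown
        *)         echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

So a "pg_ctl -m fast stop" and a stray ^C that reaches the postmaster leave
the same "Fast Shutdown request" entry in the log.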

opening two terminals, I can run the start script, and while the first terminal
is sitting there waiting for the script to release control, move to the second
terminal and view the results...  postmaster running fine, pid file there, all
normal.

if I execute the script with the & behind it, it allows everything through after
entering another <cr> which from what I can see suspends the session which then
clears normally.   (making sense?)

Confused still as to the cause or how to rectify.

Dave

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Sunday, March 10, 2002 11:22 AM
>To: Dave
>Cc: pgsql-admin@postgresql.org
>Subject: Re: [ADMIN] Postgres not starting at boot(FreeBSD) - startup
>script not releasing


hold the farm...

>>> Try this on for size...   recently during a reboot (first in about 3 months for
>>> this particular server) our entire rc.d directory failed to start...  after some
>>> hacking of the rc file to output some helpful debugging, it was apparent that the
>>> 010.pgsql.sh script in /usr/local/etc/rc.d was timing out and causing any
>>> directives thereafter not to be processed.
>>
>>have you tried manually doing "pg_ctl restart" to see if any problems
>>pop-up? Maybe it is not a script error, but some other issue with the db
>>server.

did the following,  stopped the server totally...  then ran the following.

web5# su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl start'
postmaster successfully started up.
web5# DEBUG:  Data Base System is starting up at Sun Mar 10 14:32:46 2002
DEBUG:  Data Base System was shut down at Sun Mar 10 14:32:04 2002
DEBUG:  Data Base System is in production state at Sun Mar 10 14:32:46 2002

web5#
web5# su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl restart'
Smart Shutdown request at Sun Mar 10 14:33:25 2002
Waiting for postmaster shutting down..................................The Data
Base System is shutting down
..........The Data Base System is shutting down
...The Data Base System is shutting down
....The Data Base System is shutting down
...The Data Base System is shutting down
.........pg_ctl: postmaster does not shut down
web5# The Data Base System is shutting down
The Data Base System is shutting down
The Data Base System is shutting down
The Data Base System is shutting down

    Hmmm...  check that it's still running...

web5# ps -aux | grep pgsql
pgsql  81016  0.0  0.1   628  452  p0  I     2:32PM   0:00.00 /bin/sh /usr/loca
pgsql  81018  0.0  0.3  4080 2404  p0  I     2:32PM   0:00.03 /usr/local/pgsql/
pgsql  81082  0.0  0.4  4508 3008  p0  I     2:33PM   0:00.03 /usr/local/pgsql/
pgsql  81083  0.0  0.4  4556 3364  p0  I     2:33PM   0:00.06 /usr/local/pgsql/
web5#

    ok, let's try and use the rc.d script...

web5# /usr/local/etc/rc.d/010* stop
Fast Shutdown request at Sun Mar 10 14:37:28 2002
Aborting any active transaction...
Waiting for postmaster shutting down..FATAL 1:  The system is shutting down
FATAL 1:  The system is shutting down
NOTICE:  AbortTransaction and not in in-progress state
.NOTICE:  AbortTransaction and not in in-progress state
DEBUG:  Data Base System shutting down at Sun Mar 10 14:37:28 2002
DEBUG:  Data Base System shut down at Sun Mar 10 14:37:28 2002
done.
postmaster successfully shut down.
web5#

    That's interesting,  perhaps pg_ctl is hosed?

web5# ps -aux | grep pgsql
web5#


Ideas?

Dave


Re: Postgres not starting at boot(FreeBSD) - startup script

From: Dmitry Morozovsky
On Sun, 10 Mar 2002, Dave wrote:

I use the following lines (at /usr/local/etc/rc.d/pgsql.sh)

-- 8< --
#!/bin/sh
# Local startup script for PostgreSQL; with no argument it defaults to "start".
PGBIN=/usr/local/pgsql/bin

cmd="$1"
: ${cmd:=start}

case $cmd in
start)
    [ -d /usr/local/pgsql/lib ] && /sbin/ldconfig -m /usr/local/pgsql/lib
    [ -x ${PGBIN}/pg_ctl ] && {
        echo -n 'pgsql '
        # No -w, so pg_ctl returns at once instead of waiting for the server.
        # PGDATA is quoted so that an unset value fails the -d test.
        su -l pgsql -c \
            '[ -d "${PGDATA}" ] && exec /usr/local/pgsql/bin/pg_ctl start -s -l ~pgsql/log/errlog'
    }
    ;;

stop)
    [ -x ${PGBIN}/pg_ctl ] && {
        echo -n 'pgsql '
        su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl stop -s -m fast'
    }
    ;;

status)
    [ -x ${PGBIN}/pg_ctl ] && {
        exec su -l pgsql -c 'exec /usr/local/pgsql/bin/pg_ctl status'
    }
    ;;

*)
    echo "usage: `basename $0` {start|stop|status}" >&2
    exit 64
    ;;
esac

-- 8< --



Sincerely,
D.Marck                                   [DM5020, DM268-RIPE, DM3-RIPN]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------


Re: Postgres not starting at boot(FreeBSD) - startup script not releasing

From: "Matthew D. Fuller"
On Sun, Mar 10, 2002 at 09:11:11AM -0500 I heard the voice of
Dave, and lo! it spake thus:
> Try this on for size...   recently during a reboot (first in about 3 months for
> this particular server) our entire rc.d directory failed to start...  after some
> hacking of the rc file to output some helpful debuggin, it was apparent that the
> 010.pgsql.sh script in /usr/local/etc/rc.d was timing out and causing any
> directives thereafter not to be processed.

At a guess, you've set it up to not automatically trust local users, so
the default option that 'waits' for the server to come up (and "waits"
by having psql try to connect as the postgres user) sits for a long, long
time waiting for somebody to give it the password it now requires.

I find that rather annoying, and miss it every time, until the rc script
hangs.  Check the options and figure out which one it is you have to take
out, I can't recall offhand.



--
Matthew Fuller     (MF4839)     |    fullermd@over-yonder.net
Unix Systems Administrator      |    fullermd@futuresouth.com
Specializing in FreeBSD         |    http://www.over-yonder.net/

"The only reason I'm burning my candle at both ends, is because I
      haven't figured out how to light the middle yet"

Bingo!  Dumb move.  Switched everything to password authentication a few months
back,  never had occasion to restart after that.  Will work on tweaking the
pg_hba.conf

Thanks Matthew...  if you are ever in Toronto, I owe you a beer.

Dave
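
A quick way to spot the entries in question, as a rough sketch only: it
assumes the 7.1/7.2-era pg_hba.conf layout where a Unix-socket line reads
"local DATABASE AUTH_METHOD", so the method is the third field (layouts with
an extra user column will not match).  The function name and the set of
methods treated as password-style are assumptions made for illustration.

```shell
#!/bin/sh
# pw_local_entries FILE: print the "local" (Unix-socket) lines of a
# pg_hba.conf-style file whose auth method asks for a password.
# Assumed layout: local  DATABASE  AUTH_METHOD  [AUTH_OPTION]
# Comment lines are skipped because their first field is not "local".
pw_local_entries() {
    awk '$1 == "local" && ($3 == "password" || $3 == "crypt" || $3 == "md5") {
             print NR ": " $0
         }' "$1"
}
```

Something like pw_local_entries "$PGDATA/pg_hba.conf" would then list the line
numbers to look at; any hit means a socket connection from a boot script can
end up waiting for a password.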



Re: Postgres not starting at boot(FreeBSD) - startup script not releasing < solved

From: "Matthew D. Fuller"
On Sun, Mar 10, 2002 at 06:11:21PM -0500 I heard the voice of
Dave, and lo! it spake thus:
> Bingo!  Dumb move.  Dropped everything to password a few months back,  never had
> the occasion to restart after that.  Will work on tweaking the pg_hba.conf

FWIW (after a quick glance at the default script and the manpage), "-w"
is the pg_ctl option that makes it wait.  I just take it out; it only
takes PG a few seconds to initialize, so it's ready to go long before
something would need to connect to it.

It could also be said that having -w implemented as invoking psql to try
to connect as the DB superuser assuming no password is a rather
inappropriate way of going about it, but that's another can of worms.
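
If some wait is still wanted at boot, one alternative to -w is a bounded poll
that cannot hang the rc sequence.  The sketch below is not what pg_ctl does; it
simply polls for a file (for instance $PGDATA/postmaster.pid) and gives up
after a fixed number of tries.  The function name and the default of 30 tries
are invented here, and a pid file only shows that the postmaster was launched,
not that it accepts connections yet.

```shell
#!/bin/sh
# wait_for_file PATH [TRIES]: poll once a second until PATH exists.
# Gives up after TRIES attempts (default 30, an arbitrary choice).
# Returns 0 if the file showed up, 1 if the tries ran out.
wait_for_file() {
    file=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        [ -e "$file" ] && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

In an rc script this would follow a plain "pg_ctl start" (no -w), so the worst
case is a fixed delay rather than a boot that never finishes.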




Re: Postgres not starting at boot(FreeBSD) - startup

From: "Chad R. Larson"
At 02:54 AM 3/11/2002 , Matthew D. Fuller wrote:
>It could also be said that having -w implemented as invoking psql to try
>to connect as the DB superuser assuming no password is a rather
>inappropriate way of going about it, but that's another can of worms.

It doesn't wait for the PID file to be created (at least, not on our 7.1.2
systems).  It attempts to connect to a database using psql, and loops until
that connection is successful.  Which it won't be if you've got a
password,  because the script will wait for some entity to type the
password, and hang.

My fix here was a hack in pg_ctl, right at the bottom where the script is
looping on a psql attempt to connect to a database to prove the system is
up.  I added a "-h localhost" to the psql invocation to force a TCP
connection, and then used "ident" instead of password for the authorization.
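
Roughly, a check of that kind might look like the sketch below.  This is not
the pg_ctl source, just a hypothetical stand-alone probe: PSQL is a variable
invented here so the command can be swapped out, and -l (list databases) is
used only as a cheap query.  It stays non-interactive only if the pg_hba.conf
entry matching 127.0.0.1 does not ask for a password, for example an ident
entry as in the fix above.

```shell
#!/bin/sh
# probe_server: succeed if psql can connect over TCP to localhost.
# PSQL is a hypothetical override for the psql binary (defaults to "psql").
# -h localhost forces a TCP connection instead of the Unix socket, so the
# "host" lines of pg_hba.conf apply rather than the "local" ones.
probe_server() {
    "${PSQL:-psql}" -h localhost -l >/dev/null 2>&1
}
```

The exit status can drive a retry loop; the loop should still be bounded so
that a misconfigured pg_hba.conf cannot stall the boot again.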


         -crl
--
Chad R. Larson (CRL22)    chad@eldocomp.com
   Eldorado Computing, Inc.   602-604-3100
      5353 North 16th Street, Suite 400
        Phoenix, Arizona  85016-3228