Thread: postmaster.pid disappeared

postmaster.pid disappeared

From
Junaili Lie
Date:
Hi,
I was redirected to this mailing list when I asked questions on IRC. I
hope this is the right list.
I am running PostgreSQL 7.4.8 on Solaris 10 (and I compiled and
installed Slony). Every time I try to reload the configuration
using pg_ctl reload -D $PGDATA, it deletes postmaster.pid and
doesn't create a new one. So, after a reload, the only way I can restart
the server is to kill -9 it and then start it again. I checked the
log; nothing is meaningful except the last line:
LOG:  received SIGHUP, reloading configuration files
Does anybody have any idea what is going on?

I also noticed that pg_ctl stop $PGDATA -m fast (and -m smart) takes
forever. When I run ps -ef, I see several <defunct> processes. I
have to kill -9 almost every time to shut down the server.

Thank you in advance,

J


Re: postmaster.pid disappeared

From
Josh Berkus
Date:
Junaili,

> I am running postgresql 7.4.8 on solaris 10 (and I compile and
> installed slony). Everytime I am trying to reload the configuration
> using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> didn't create a new one. So, after reload, the only way I can restart
> the server is by kill -9 and then start the server again. I check the
> log, nothing is meaningful except the last line:
> LOG:  received SIGHUP, reloading configuration files
> I am wondering if anybody has any idea?

Hmmm ... you didn't answer my question on IRC: are you using an alternate
database location defined in postgresql.conf?

--
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco


Re: postmaster.pid disappeared

From
Tom Lane
Date:
Junaili Lie <junaili@gmail.com> writes:
> I am running postgresql 7.4.8 on solaris 10 (and I compile and
> installed slony). Everytime I am trying to reload the configuration
> using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> didn't create a new one.

That's very strange.  The pg_ctl script itself doesn't delete
the postmaster.pid file under any circumstances (unless maybe
you are using a locally modified version?), and the postmaster
shouldn't delete it either unless exiting.  Can you determine
exactly where the unlink call is coming from?  strace or local
equivalent may help.
        regards, tom lane
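
For readers following along: strace (truss is the usual Solaris
counterpart) is the direct way to catch the unlink. As a complementary
check, here is a minimal Python sketch, not anything shipped with
PostgreSQL, that reads the PID from postmaster.pid and tests whether
that process still exists. It assumes the PID sits on the first line of
the file; the helper names read_postmaster_pid and process_exists are
made up for illustration.

```python
import os
from pathlib import Path


def read_postmaster_pid(data_dir):
    """Return the PID stored in data_dir/postmaster.pid, or None if unavailable."""
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        lines = pid_file.read_text().splitlines()
    except FileNotFoundError:
        return None
    if not lines:
        return None
    # Assumption: the first line of the file holds the postmaster's PID.
    return int(lines[0].strip())


def process_exists(pid):
    """Report whether the process table still has an entry for pid."""
    if pid <= 0:
        raise ValueError("expected a positive PID")
    try:
        os.kill(pid, 0)  # signal 0 performs the existence check only
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

Noting the PID before pg_ctl reload and calling process_exists on it
afterwards would help tell two cases apart: the file is gone but the
old PID is still alive (something other than a normal exit removed the
file), or both are gone (the postmaster exited).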


Re: postmaster.pid disappeared

From
Josh Berkus
Date:
Folks,

> > > I am running postgresql 7.4.8 on solaris 10 (and I compile and
> > > installed slony). Everytime I am trying to reload the configuration
> > > using pg_ctl reload -D $PGDATA, it deleted the postmaster.pid and
> > > didn't create a new one. So, after reload, the only way I can restart
> > > the server is by kill -9 and then start the server again. I check the
> > > log, nothing is meaningful except the last line:
> > > LOG: received SIGHUP, reloading configuration files
> > > I am wondering if anybody has any idea?

Looking at his report, what's happening is that the postmaster is shutting 
down, but the other backends are not ... they're hanging around as zombies.   
Not sure why, but I'm chatting with him on IRC.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco


Re: postmaster.pid disappeared

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Looking at his report, what's happening is that the postmaster is shutting 
> down, but the other backends are not ... they're hanging around as
> zombies.

The zombies couldn't be dead backends if the postmaster has gone away:
in every Unix I know, a zombie process disappears instantly if its
parent dies (since the only reason for a zombie in the first place
is to hold the process' exit status until the parent reads it with
wait()).

> Not sure why, but I'm chatting with him on IRC.

Find out what the parent process of the zombies actually is.
        regards, tom lane
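
The point about wait() can be reproduced in a few lines. The following
is a small Python sketch for a Unix-like system, with a hypothetical
function name (zombie_demo): it forks a child that exits immediately,
confirms that the child's process-table entry is still present before
the parent waits, then collects the exit status and confirms that the
entry is gone.

```python
import os


def zombie_demo(exit_code=7):
    """Fork a child that exits at once; return what the parent observes."""
    read_end, write_end = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: terminate immediately, skipping Python-level cleanup.
        os._exit(exit_code)

    os.close(write_end)
    # Returns at end-of-file, once the exiting child has closed its copy
    # of the pipe's write end.
    os.read(read_end, 1)
    os.close(read_end)

    # Signal 0 only checks for existence. It succeeds here because the
    # child's entry is kept until the parent collects the exit status.
    os.kill(pid, 0)
    existed_before_wait = True

    _, status = os.waitpid(pid, 0)  # reap the child

    try:
        os.kill(pid, 0)
        exists_after_wait = True
    except ProcessLookupError:
        exists_after_wait = False

    return existed_before_wait, os.WEXITSTATUS(status), exists_after_wait
```

Calling zombie_demo() should return (True, 7, False): the entry exists
until the parent waits, the exit status survives in the meantime, and
the entry disappears once it has been read.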


Re: postmaster.pid disappeared

From
Josh Berkus
Date:
Tom,

> The zombies couldn't be dead backends if the postmaster has gone away:
> in every Unix I know, a zombie process disappears instantly if its
> parent dies (since the only reason for a zombie in the first place
> is to hold the process' exit status until the parent reads it with
> wait()).

yeah, I think I spoke too soon.  What it looks like is that pg_ctl is 
reporting success while actually failing to shut down the postmaster.   
Solaris makes it a little hard to read; parent-process relationships aren't 
as clear as they are in Linux.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco
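
One way to make the parent-process relationships easier to read is to
ask ps for the parent PID explicitly and keep only the zombie rows. The
Python sketch below is an illustration, not a tested Solaris recipe: it
assumes a ps that accepts -e -o pid,ppid,s,comm and reports defunct
processes with state Z, and the function names are hypothetical.

```python
import subprocess


def find_zombies(ps_output):
    """Parse `ps -e -o pid,ppid,s,comm` output into (pid, ppid, command) tuples."""
    zombies = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 3)
        if len(fields) < 3:
            continue
        pid, ppid, state = fields[:3]
        if state.startswith("Z"):
            command = fields[3] if len(fields) > 3 else ""
            zombies.append((int(pid), int(ppid), command))
    return zombies


def list_zombies():
    """Run ps and return the zombie processes it currently reports."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid,ppid,s,comm"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_zombies(result.stdout)
```

Each tuple returned by list_zombies() carries the zombie's PID and its
parent's PID, so the parent can then be looked up directly.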


Re: postmaster.pid disappeared

From
Junaili Lie
Date:
Hi,
Thank you all for the responses.
I should probably have mentioned that postgres is managed by SMF, which
is the service management facility in Solaris 10.
I asked our sysadmin to take postgres out of SMF management, and
he did that. But right now he is having a problem: the system
cannot start because of some mounting problems.
I will report back on any progress.
In the meantime, any ideas, suggestions, or things I can do to
provide more information will be greatly appreciated.
Thanks,


J


Re: postmaster.pid disappeared

From
Junaili Lie
Date:
Tom,
I am not sure how to determine where the unlink call comes from.
Can you provide more information or instructions?

In my case pg_ctl reload -D /usr/local/pgsql deleted
postmaster.pid without creating a new one. I am not sure whether this
is normal.

J


Re: postmaster.pid disappeared

From
Junaili Lie
Date:
Hi,
I reinstalled PostgreSQL 7.4.6 instead of 7.4.8 (still on Solaris 10)
and did not put PostgreSQL under SMF management, and
it has worked fine so far. I should also mention that I configured
PostgreSQL 7.4.6 with the --enable-thread-safety option; I don't know
whether that has anything to do with this issue.
Thanks for all the help,

J
