Thread: Re: sick DB - ??

Re: sick DB - ??

From
Pete Leonard
Date:
As a followup - the line from top:

1641 postgres   105   0  2684K  1384K CPU1   0   8:26 99.02% 99.02% postgres

As you can see, it's barely taking up any RAM - the process is going nuts
right off the bat..

On Wed, 18 Jul 2001, Pete Leonard wrote:

>
> Postgres 7.1.2, FreeBSD 3.4
>
> Box got sick, had to bounce it.  Postgres wasn't brought down in a
> graceful fashion..
>
> restart didn't bring the DB back properly, so as the postgres user, did
> the following:
>
> /usr/local/pgsql/bin/postmaster -d5 start
>
> it dumps the initial environment variables, and then returns nothing.  CPU
> is pegged at 100%.  No reporting, no information as to what's happening.
>
> Solutions?  Is the DB corrupted badly?  Where do I go from here?
>
> thanks,
>
>     --pete
>
>
>


Re: sick DB - ??

From
Pete Leonard
Date:
Followup ^2 -

The reason this happened was that for whatever reason (we're still
investigating), /tmp was writeable only by root.

I only noticed this when using initdb to create a new data directory.

postmaster offered no suggestion that there was a problem here, even when
running at -d5.

chmod 777 /tmp fixed everything.

my best guess (I don't know how postmaster is operating, I didn't run any
of the system-level diagnostic tools to check) is that if postmaster fails
on opening a pipe/tmpfile, rather than check the error properly, it
changes the filename and tries again ad infinitum?  Perhaps printing some
error code (especially at debug level 5) would help?

thanks,

    --pete




Re: Re: sick DB - ??

From
Tom Lane
Date:
Pete Leonard <pete@hero.com> writes:
>> restart didn't bring the DB back properly, so as the postgres user, did
>> the following:
>> /usr/local/pgsql/bin/postmaster -d5 start
>> it dumps the initial environment variables, and then returns nothing.  CPU
>> is pegged at 100%.  No reporting, no information as to what's happening.

This is kind of a random guess, but we recently noticed that 7.1 has a
bug whereby the postmaster can go into an infinite loop at startup if
the $PGDATA directory is not writable.  Check permissions.  It might
also be a good idea to remove the old postmaster.pid file by hand.

            regards, tom lane
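
The permission check suggested above can be sketched in Python. This is a
minimal illustration only: the function name startup_dir_problems is a
hypothetical helper, not part of PostgreSQL, and it assumes that the data
directory and /tmp are the only two places that need to be writable.

```python
import os

def startup_dir_problems(pgdata, tmpdir="/tmp"):
    """Return a list of problems that could block postmaster startup.

    Hypothetical helper: it only checks that each directory exists and that
    the current user may create files in it (write + search permission).
    """
    problems = []
    for path in (pgdata, tmpdir):
        if not os.path.isdir(path):
            problems.append(f"{path}: not a directory")
        elif not os.access(path, os.W_OK | os.X_OK):
            # os.access tests permissions using the real uid
            problems.append(f"{path}: not writable by uid {os.getuid()}")
    # A postmaster.pid left over from an unclean shutdown is worth a look too.
    pidfile = os.path.join(pgdata, "postmaster.pid")
    if os.path.exists(pidfile):
        problems.append(f"{pidfile}: exists (possibly stale)")
    return problems
```

Run it as the postgres user with your own $PGDATA as the first argument. An
empty list means none of these particular checks failed; it does not prove
the database itself is healthy.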

Re: Re: sick DB - ??

From
Mike Castle
Date:
On Wed, Jul 18, 2001 at 09:36:38AM -0700, Pete Leonard wrote:
> chmod 777 /tmp fixed everything.


That should be 1777.

mrc
--
     Mike Castle      dalgoda@ix.netcom.com      www.netcom.com/~dalgoda/
    We are all of us living in the shadow of Manhattan.  -- Watchmen
fatal ("You are in a maze of twisty compiler features, all different"); -- gcc
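
For context: mode 1777 is 777 plus the sticky bit, which on a shared
directory such as /tmp stops users from deleting or renaming files they do
not own. A small Python sketch for checking a directory's mode follows; the
function name is an illustrative assumption.

```python
import os
import stat

def is_sticky_world_writable(path):
    """True if path has at least rwxrwxrwx plus the sticky bit (mode 1777)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o1777) == 0o1777
```

Calling is_sticky_world_writable("/tmp") after "chmod 777 /tmp" would be
expected to return False, and True after "chmod 1777 /tmp".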

Re: Re: sick DB - ??

From
Tom Lane
Date:
Pete Leonard <pete@hero.com> writes:
> The reason this happened was that for whatever reason (we're still
> investigating), /tmp was writeable only by root.

Ah.  Hadn't thought about it before, but the infinite-loop-on-
nonwritable-$PGDATA bug would also trigger for nonwritable /tmp.
(The bug was actually in CreateLockFile, which is used both to
create a lockfile in $PGDATA and one in /tmp.  Sigh.)

This is fixed in current sources.  If we were going to do a 7.1.3
then I'd backpatch the fix into the REL7_1 branch, but at this point
I suspect there won't be a 7.1.3 --- we'll probably go into 7.2 beta
in another five or six weeks, so there's not much point.

            regards, tom lane
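
The failure mode discussed above can be illustrated with a short Python
sketch. This is not the PostgreSQL CreateLockFile code (which is C), and it
is not the actual fix; it is an assumed simplification of the general idea:
only "file already exists" is worth retrying, any other error (such as
permission denied on the directory) is reported at once, and the number of
retries is bounded. The names create_lock_file, _owner_alive and max_tries
are hypothetical.

```python
import os

def _owner_alive(path):
    """Best-effort check whether the PID recorded in the lock file exists."""
    try:
        with open(path) as f:
            pid = int(f.readline())
        if pid <= 0:
            return False
        os.kill(pid, 0)  # signal 0 only tests for existence
    except (ValueError, ProcessLookupError):
        return False  # unparsable contents or no such process
    except PermissionError:
        return True  # cannot inspect it, so conservatively assume it is live
    return True

def create_lock_file(path, max_tries=3):
    """Create `path` exclusively and write our PID into it.

    Only "already exists" is treated as retryable (after removing a lock
    whose owner is gone). Every other OSError propagates immediately
    instead of looping.
    """
    for _ in range(max_tries):
        try:
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
        except FileExistsError:
            if _owner_alive(path):
                raise RuntimeError(f"lock file {path} is held by a live process")
            os.unlink(path)  # stale lock: remove it and try again
            continue
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        return
    raise RuntimeError(f"could not create {path} after {max_tries} tries")
```

With this structure, a non-writable directory surfaces as a PermissionError
on the first attempt rather than as a process spinning at 100% CPU.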