Thread: DB Cluster hanging

DB Cluster hanging

From
Nigel Bishop
Date:
Hi,

I'm hoping that someone will be able to answer this query:

Last night at 3am our PostgreSQL DB cluster hung.  At the time data
was being loaded.  The parameter log_statement_stats in the
postgresql.conf file was set to true.  This was churning out data into
the logfile, which was switching every 10MB.  Eventually the partition
where the logfiles are written filled up - fair enough - this had
been going on since about 5.30pm the previous evening and the logfiles
were being generated at the rate of 4 or 5 a minute.  The partition was
cleared of old logs and I expected the DB to spring into life, but no,
it just sat there.  I could not connect with psql, or use pg_ctl to
shut down the cluster.
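
For a sense of scale, the figures above can be turned into a rough total. This is only a back-of-envelope sketch: it assumes "10Mb" means 10 megabytes per file and that the 4 to 5 files a minute rate held steady from 17:30 until 03:00, neither of which is stated outright in the post.

```python
# Back-of-envelope log volume from the figures quoted in the message.
# Assumptions (not stated in the post): each file is 10 megabytes, and the
# 4-5 files/minute rate was steady from 17:30 until the hang at 03:00.

FILE_SIZE_MB = 10
MINUTES = int(9.5 * 60)  # 17:30 -> 03:00 is 9.5 hours, i.e. 570 minutes


def total_log_mb(files_per_minute: int) -> int:
    """Total megabytes written at a steady rate of files_per_minute."""
    return files_per_minute * FILE_SIZE_MB * MINUTES


low, high = total_log_mb(4), total_log_mb(5)
print(f"roughly {low / 1000:.1f} to {high / 1000:.1f} GB of log output")
```

Under those assumptions the stats logging would have produced somewhere in the region of 23 to 29 GB overnight.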

Eventually I had to issue a kill -9 on the postmaster, set
log_statement_stats to false and restart the cluster.  It recovered
itself and the data load carried on.

My question is, is this normal behaviour when the logfile destination
fills up?  There was nothing in the logfile being used at the time of
the hang, just stats data.

PG version 8.0.3 with archived WAL logs enabled

O/S version RH ES4 with 2 CPUs & 2GB RAM

Thanks very much for any input.


Regards,

Nigel Bishop

Re: DB Cluster hanging

From
Tom Lane
Date:
Nigel Bishop <nigel.bishop@gmail.com> writes:
> Last night at 3am our Postgresql DB cluster hung.  At the time data
> was being loaded.  The parameter log_statement_stats in the
> postgresql.conf file was set to true.  This was churning out data into
> the logfile which was switching every 10Mb.  Eventually the partition
> where the logfiles are written to filled up - fair enough - this had
> been going on since about 5.30pm the previous evening and the logfiles
> were being generated at the rate of 4/5 a minute.  The partition was
> cleared of old logs and I expected the DB to spring in to life, but no
> it just sat there.  I could not connect with psql or pg_ctl to
> shutdown the cluster.

How was the logging being done exactly?  syslog?  redirect_stderr?
If redirect_stderr, this could well be a PG bug --- I kinda doubt
anyone has tested that scenario.

            regards, tom lane

Re: DB Cluster hanging

From
"codeWarrior"
Date:
Are you possibly running logrotate on the PostgreSQL server logs? Logrotate
is usually set to signal (SIGHUP) the log owner (in this case, PostgreSQL)
after rotating the log file...

You normally can't just delete the logfiles and expect PostgreSQL to
continue wherever you left it... you normally need to pg_ctl reload or
pg_ctl restart after dinking with the log files...
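
The general effect described here can be seen in a small sketch. This is a generic POSIX illustration, not PostgreSQL's logging code: the `ReopenableLog` class and `demo` function are hypothetical names made up for the example. A process that holds a log file open keeps writing to the old, now-unlinked file after it is deleted, and the path only reappears once the process reopens it, which is the job a reload signal usually triggers.

```python
import os
import tempfile


class ReopenableLog:
    """Append-only log file that can be reopened on request,
    for example from a SIGHUP handler (hypothetical helper)."""

    def __init__(self, path):
        self.path = path
        self._fh = open(path, "a")

    def write(self, line):
        self._fh.write(line + "\n")
        self._fh.flush()

    def reopen(self):
        # Drop the old descriptor and open the path again, creating it if needed.
        self._fh.close()
        self._fh = open(self.path, "a")

    def close(self):
        self._fh.close()


def demo():
    """Return (path exists after delete + write, path exists after reopen + write)."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "server.log")
        log = ReopenableLog(path)
        log.write("before delete")
        os.remove(path)            # an admin clearing out log files
        log.write("lost")          # lands in the unlinked file, not at `path`
        exists_before = os.path.exists(path)
        log.reopen()               # what a reload would normally cause
        log.write("after reopen")
        exists_after = os.path.exists(path)
        log.close()
        return exists_before, exists_after


if __name__ == "__main__":
    print(demo())
```

On a POSIX system `demo()` returns `(False, True)`: the write after the delete does not recreate the file, and the write after the reopen does.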

"Nigel Bishop" <nigel.bishop@gmail.com> wrote in message
news:7e974e490510130822q3a853039n4c96991179deae5a@mail.gmail.com...
> Hi,
>
> I'm hoping that someone will be able to answer this query:
>
> Last night at 3am our Postgresql DB cluster hung.  At the time data
> was being loaded.  The parameter log_statement_stats in the
> postgresql.conf file was set to true.  This was churning out data into
> the logfile which was switching every 10Mb.  Eventually the partition
> where the logfiles are written to filled up - fair enough - this had
> been going on since about 5.30pm the previous evening and the logfiles
> were being generated at the rate of 4/5 a minute.  The partition was
> cleared of old logs and I expected the DB to spring in to life, but no
> it just sat there.  I could not connect with psql or pg_ctl to
> shutdown the cluster.
>
> Eventually I had to issue a kill -9 on the postmaster, set
> log_statement_stats to false and restarted the cluster, It recovered
> itself and the data load carried on.
>
> My question is, is this normal behaviour when the logfile destination
> fills up?  There was nothing in the logfile being used at the time of
> the hang, just stats data.
>
> PG version 8.0.3 with archived WAL logs enabled
>
> O/S version RH ES4 with 2 CPUs & 2Gb RAM
>
> Thanks very much for any input.
>
>
> Regards,
>
> Nigel Bishop
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend
>



Re: DB Cluster hanging

From
Scott Marlowe
Date:
If you can't / won't / don't use syslog and want your logs rotated
anyway, look at apache's log rotator.  It works a charm, and rotates
every night at midnight the way I use it.  Quite simple to set up and
very reliable.

On Thu, 2005-10-13 at 10:48, codeWarrior wrote:
> Are you possibly running logrotate on the postgreSQL server logs ? Logrotate
> is usually set to signal (SIGHUP) the log owner (in this case -- postgreSQL)
> after rotating the log file...
>
> You normally can't just delete the logfiles and expect postgreSQL to
> continue wherever you left it... you normally need to pg_ctl reload or
> pg_ctl restart after dinking with the log files...
>
> "Nigel Bishop" <nigel.bishop@gmail.com> wrote in message
> news:7e974e490510130822q3a853039n4c96991179deae5a@mail.gmail.com...
> > Hi,
> >
> > I'm hoping that someone will be able to answer this query:
> >
> > Last night at 3am our Postgresql DB cluster hung.  At the time data
> > was being loaded.  The parameter log_statement_stats in the
> > postgresql.conf file was set to true.  This was churning out data into
> > the logfile which was switching every 10Mb.  Eventually the partition
> > where the logfiles are written to filled up - fair enough - this had
> > been going on since about 5.30pm the previous evening and the logfiles
> > were being generated at the rate of 4/5 a minute.  The partition was
> > cleared of old logs and I expected the DB to spring in to life, but no
> > it just sat there.  I could not connect with psql or pg_ctl to
> > shutdown the cluster.
> >
> > Eventually I had to issue a kill -9 on the postmaster, set
> > log_statement_stats to false and restarted the cluster, It recovered
> > itself and the data load carried on.
> >
> > My question is, is this normal behaviour when the logfile destination
> > fills up?  There was nothing in the logfile being used at the time of
> > the hang, just stats data.
> >
> > PG version 8.0.3 with archived WAL logs enabled
> >
> > O/S version RH ES4 with 2 CPUs & 2Gb RAM
> >
> > Thanks very much for any input.
> >
> >
> > Regards,
> >
> > Nigel Bishop
> >
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Have you searched our list archives?
>
>                http://archives.postgresql.org

Re: DB Cluster hanging

From
"Nigel Bishop"
Date:
Hi

Hmm...  a possible bug, eh?  I'll make sure that the log destination
doesn't fill again.

This is what I have in the postgresql.conf file:

log_destination = 'stderr'
redirect_stderr = true
log_directory = '/opt/postgres/admin/log'
log_filename = 'PG-%Y-%m-%d_%H%M%S.log'
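
One way to keep an eye on the log destination is a small free-space check run from cron or a monitoring agent. The following is a minimal sketch, not an established tool: the 10% threshold and both function names are assumptions made for illustration, and only the directory path comes from the settings above.

```python
import shutil

LOG_DIRECTORY = "/opt/postgres/admin/log"  # log_directory from the config above
MIN_FREE_FRACTION = 0.10                   # hypothetical alert threshold (10% free)


def is_low(total_bytes, free_bytes, min_free_fraction=MIN_FREE_FRACTION):
    """True when free space is below the given fraction of the filesystem size."""
    return free_bytes < total_bytes * min_free_fraction


def log_partition_is_low(path=LOG_DIRECTORY):
    """Check the filesystem that holds `path` using shutil.disk_usage."""
    usage = shutil.disk_usage(path)
    return is_low(usage.total, usage.free)
```

A scheduled job could call `log_partition_is_low()` and raise an alert when it returns True, well before the partition actually fills.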


Thanks,
Nigel



-----Original Message-----
From: pgsql-admin-owner@postgresql.org
[mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Tom Lane
Sent: 13 October 2005 17:12
To: Nigel Bishop
Cc: pgsql-admin@postgresql.org
Subject: Re: [ADMIN] DB Cluster hanging

Nigel Bishop <nigel.bishop@gmail.com> writes:
> Last night at 3am our Postgresql DB cluster hung.  At the time data
> was being loaded.  The parameter log_statement_stats in the
> postgresql.conf file was set to true.  This was churning out data into
> the logfile which was switching every 10Mb.  Eventually the partition
> where the logfiles are written to filled up - fair enough - this had
> been going on since about 5.30pm the previous evening and the logfiles
> were being generated at the rate of 4/5 a minute.  The partition was
> cleared of old logs and I expected the DB to spring in to life, but no
> it just sat there.  I could not connect with psql or pg_ctl to
> shutdown the cluster.

How was the logging being done exactly?  syslog?  redirect_stderr?
If redirect_stderr, this could well be a PG bug --- I kinda doubt
anyone has tested that scenario.

            regards, tom lane



Communications on or through ioko's computer systems may be monitored or recorded to secure effective system operation
and for other lawful purposes.

Unless otherwise agreed expressly in writing, this communication is to be treated as confidential and the information
in it may not be used or disclosed except for the purpose for which it has been sent. If you have reason to believe that
you are not the intended recipient of this communication, please contact the sender immediately. No employee is
authorised to conclude any binding agreement on behalf of ioko with another party by e-mail without prior express
written confirmation.

ioko365 Ltd.  VAT reg 656 2443 31. Reg no 3048367. All rights reserved.

Re: DB Cluster hanging

From
Tom Lane
Date:
"Nigel Bishop" <Nigel.Bishop@ioko.com> writes:
> Hmm...  a possible bug eh!  I'll make sure that the log destination
> doesn't fill again

> This is what I have in the postgresql.conf file:

> log_destination = 'stderr'
> redirect_stderr = true

I tried to reproduce the problem, without any success.  What I did:
* set up a small loopback filesystem, so I didn't have to actually fill
  my whole disk;
* point Postgres logging into the loopback filesystem;
* deliberately fill the filesystem.

I didn't see any lockup.  The syslogger subprocess started bleating on
the original postmaster stderr:
    could not write to log file: No space left on device
    could not write to log file: No space left on device
but it dropped the messages rather than hanging up, and the rest
of the database sailed on just fine.  When I freed up space on the
loopback filesystem, logging resumed without any problem.
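
The behaviour observed here (complain, drop the message, carry on) can be sketched in a few lines. This is an illustration of the pattern only, not the syslogger's actual code: `write_or_drop` and its parameters are hypothetical names, and the complaint text simply mirrors the messages quoted above.

```python
import errno
import sys


def write_or_drop(write, message, complain=sys.stderr.write):
    """Try to write `message` with the `write` callable.

    If the device is full (ENOSPC), report it through `complain` and drop
    the message instead of blocking or retrying. Returns True if the
    message was written, False if it was dropped. Other errors propagate.
    """
    try:
        write(message)
        return True
    except OSError as exc:
        if exc.errno != errno.ENOSPC:
            raise
        complain(f"could not write to log file: {exc.strerror}\n")
        return False
```

Because a failed write is simply discarded, the caller never waits on the full filesystem, and later writes succeed again as soon as space is freed.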

So I dunno what went wrong for you.  System-specific issue maybe?
I tried this on Fedora Core 4 with CVS-tip Postgres (but the syslogger
code hasn't changed materially since 8.0.2).

Also, are you sure the postmaster is 8.0.3?  There was an
infinite-recursion-on-error bug in syslogger in 8.0 and 8.0.1.

            regards, tom lane