Thread: DB Cluster hanging
Hi,

I'm hoping that someone will be able to answer this query:

Last night at 3am our PostgreSQL DB cluster hung. At the time data was being loaded. The parameter log_statement_stats in the postgresql.conf file was set to true. This was churning out data into the logfile, which was switching every 10 MB. Eventually the partition where the logfiles are written filled up (fair enough): this had been going on since about 5.30pm the previous evening and the logfiles were being generated at the rate of 4 to 5 a minute. The partition was cleared of old logs and I expected the DB to spring into life, but no, it just sat there. I could not connect with psql, or use pg_ctl to shut down the cluster.

Eventually I had to issue a kill -9 on the postmaster, set log_statement_stats to false and restart the cluster. It recovered itself and the data load carried on.

My question is: is this normal behaviour when the logfile destination fills up? There was nothing in the logfile being used at the time of the hang, just stats data.

PG version: 8.0.3 with archived WAL logs enabled
O/S version: RH ES4 with 2 CPUs & 2 GB RAM

Thanks very much for any input.

Regards,
Nigel Bishop
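For a sense of scale, the figures in the post above (10 MB per file, 4 to 5 files a minute, from about 5.30pm to 3am) can be turned into a rough volume estimate. This is only a back-of-the-envelope sketch: the 9.5-hour window and the constant rate are assumptions read off the post, and the names below are illustrative.

```python
# Rough estimate of how much log data was written, using only the
# approximate figures quoted in the post. A constant rate is assumed.
FILE_SIZE_MB = 10              # log file switched every 10 MB
FILES_PER_MINUTE = (4, 5)      # "4/5 a minute"
MINUTES = int(9.5 * 60)        # roughly 5.30pm to 3am


def estimate_gb(files_per_minute: int) -> float:
    """Return the approximate log volume in GB (1 GB = 1024 MB)."""
    return files_per_minute * FILE_SIZE_MB * MINUTES / 1024


low, high = (estimate_gb(rate) for rate in FILES_PER_MINUTE)
print(f"approx. {low:.1f} to {high:.1f} GB of logs")
```

Under those assumptions the load would have produced somewhere around 22 to 28 GB of statistics output before the partition filled.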
Nigel Bishop <nigel.bishop@gmail.com> writes:
> Last night at 3am our Postgresql DB cluster hung. At the time data
> was being loaded. The parameter log_statement_stats in the
> postgresql.conf file was set to true. This was churning out data into
> the logfile which was switching every 10Mb. Eventually the partition
> where the logfiles are written to filled up - fair enough - this had
> been going on since about 5.30pm the previous evening and the logfiles
> were being generated at the rate of 4/5 a minute. The partition was
> cleared of old logs and I expected the DB to spring in to life, but no
> it just sat there. I could not connect with psql or pg_ctl to
> shutdown the cluster.

How was the logging being done exactly? syslog? redirect_stderr?

If redirect_stderr, this could well be a PG bug --- I kinda doubt anyone has tested that scenario.

regards, tom lane
Are you possibly running logrotate on the PostgreSQL server logs? Logrotate is usually set to signal (SIGHUP) the log owner (in this case, PostgreSQL) after rotating the log file...

You normally can't just delete the logfiles and expect PostgreSQL to continue wherever you left it... you normally need to pg_ctl reload or pg_ctl restart after dinking with the log files...
If you can't / won't / don't use syslog and want your logs rotated anyway, look at Apache's log rotator. It works a charm, and rotates every night at midnight the way I use it. Quite simple to set up and very reliable.

On Thu, 2005-10-13 at 10:48, codeWarrior wrote:
> Are you possibly running logrotate on the postgreSQL server logs ? Logrotate
> is usually set to signal (SIGHUP) the log owner (in this case -- postgreSQL)
> after rotating the log file...
>
> You normally can't just delete the logfiles and expect postgreSQL to
> continue wherever you left it... you normally need to pg_ctl reload or
> pg_ctl restart after dinking with the log files...
Hi

Hmm... a possible bug eh! I'll make sure that the log destination doesn't fill again.

This is what I have in the postgresql.conf file:

log_destination = 'stderr'
redirect_stderr = true
log_directory = '/opt/postgres/admin/log'
log_filename = 'PG-%Y-%m-%d_%H%M%S.log'

Thanks,
Nigel
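One way to make sure the log destination does not fill again is to watch the partition and raise an alarm before it is full. The following is a minimal sketch, not something taken from the thread: the function names and the 90% threshold are hypothetical, and in practice the path would be the log_directory from postgresql.conf (here '/opt/postgres/admin/log').

```python
import shutil


def used_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return usage.used / usage.total


def is_nearly_full(path: str, threshold: float = 0.9) -> bool:
    """Return True when usage is at or above `threshold` (assumed 90% by default)."""
    return used_fraction(path) >= threshold


if __name__ == "__main__":
    # "." is only a stand-in so the sketch runs anywhere; point it at the
    # real log directory when using it for monitoring.
    log_dir = "."
    print(f"{log_dir}: {used_fraction(log_dir):.0%} used, "
          f"nearly full: {is_nearly_full(log_dir)}")
```

Run from cron every few minutes, a check like this would have flagged the problem hours before the partition filled.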
"Nigel Bishop" <Nigel.Bishop@ioko.com> writes:
> Hmm... a possible bug eh! I'll make sure that the log destination
> doesn't fill again
> This is what I have in the postgresql.conf file:
> log_destination = 'stderr'
> redirect_stderr = true

I tried to reproduce the problem, without any success. What I did:

* set up a small loopback filesystem, so I didn't have to actually fill my whole disk;
* point Postgres logging into the loopback filesystem;
* deliberately fill the filesystem.

I didn't see any lockup. The syslogger subprocess started bleating on the original postmaster stderr:

could not write to log file: No space left on device
could not write to log file: No space left on device

but it dropped the messages rather than hanging up, and the rest of the database sailed on just fine. When I freed up space on the loopback filesystem, logging resumed without any problem.

So I dunno what went wrong for you. System-specific issue maybe? I tried this on Fedora Core 4 with CVS-tip Postgres (but the syslogger code hasn't changed materially since 8.0.2).

Also, are you sure the postmaster is 8.0.3? There was an infinite-recursion-on-error bug in syslogger in 8.0 and 8.0.1.

regards, tom lane
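The behaviour described above (complain on stderr, drop the message, carry on) can be illustrated with a small sketch. This is not PostgreSQL's syslogger code; `write_or_drop` is a hypothetical helper that only shows the general pattern of treating a failed log write as non-fatal.

```python
import os
import sys


def write_or_drop(fd: int, message: bytes) -> bool:
    """Try to write `message` to `fd`; on failure, report on stderr and drop it.

    Returns True if the write call succeeded, False if the message was dropped.
    Partial writes are not retried in this sketch.
    """
    try:
        os.write(fd, message)
        return True
    except OSError as exc:
        # For a full disk, exc.strerror is "No space left on device".
        sys.stderr.write(f"could not write to log file: {exc.strerror}\n")
        return False
```

The point of the design is that the caller never blocks or aborts because the log destination is unavailable; it loses log lines instead, and picks up again once writes start succeeding.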