Thread: Postgres has stopped logging
This is a really strange one. We are running PostgreSQL 8.3.7 on a Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8). On Wednesday I was reading about the upcoming change in the default value of default_statistics_target from 10 to 100 - we had never changed this value, so I bumped it up a bit and also changed the value of log_autovacuum_min_duration so that I could monitor the results. Yesterday I tweaked the settings a bit more and continued to monitor the logs. About two hours later, the log file stopped growing. At first I didn't fully realize what had happened, I thought there was just low activity. Today I checked again and realized there had been no logs since yesterday afternoon. I made another minor adjustment to the config file and issued pg_ctl reload, to see if I could force something to show up in the log - nothing. The time stamp on the latest log file is still from yesterday. The log location is under the /var partition, the df command shows usage at 3%, so this is not being caused by a full partition. I am baffled as to what I can try next, other than a server restart, which I am reluctant to do since this is a production server. The server is working normally, as far as I can tell, apart from this. I would be most grateful if anyone can suggest the next step for troubleshooting this issue. Thank you. Lewis Kapell
Yes, the last message in /var/log/messages was just 15 minutes ago. Scott Whitney wrote: > I had a similar problem, but it wasn't just PG. Are you sure that syslog is > logging? You have _anything_ in /var/log/messages? > > -----Original Message----- > From: pgsql-admin-owner@postgresql.org > [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Lewis Kapell > Sent: Friday, June 05, 2009 10:44 AM > To: pgsql-admin@postgresql.org > Subject: [ADMIN] Postgres has stopped logging > > This is a really strange one. We are running PostgreSQL 8.3.7 on a > Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8). > > On Wednesday I was reading about the upcoming change in the default > value of default_statistics_target from 10 to 100 - we had never changed > this value, so I bumped it up a bit and also changed the value of > log_autovacuum_min_duration so that I could monitor the results. > Yesterday I tweaked the settings a bit more and continued to monitor the > logs. About two hours later, the log file stopped growing. > > At first I didn't fully realize what had happened, I thought there was > just low activity. Today I checked again and realized there had been no > logs since yesterday afternoon. I made another minor adjustment to the > config file and issued pg_ctl reload, to see if I could force something > to show up in the log - nothing. The time stamp on the latest log file > is still from yesterday. > > The log location is under the /var partition, the df command shows usage > at 3%, so this is not being caused by a full partition. I am baffled as > to what I can try next, other than a server restart, which I am > reluctant to do since this is a production server. > > The server is working normally, as far as I can tell, apart from this. > > I would be most grateful if anyone can suggest the next step for > troubleshooting this issue. > > Thank you. > > Lewis Kapell > >
Lewis Kapell <lkapell@setonhome.org> writes: > This is a really strange one. We are running PostgreSQL 8.3.7 on a > Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8). > On Wednesday I was reading about the upcoming change in the default > value of default_statistics_target from 10 to 100 - we had never changed > this value, so I bumped it up a bit and also changed the value of > log_autovacuum_min_duration so that I could monitor the results. > Yesterday I tweaked the settings a bit more and continued to monitor the > logs. About two hours later, the log file stopped growing. Are you watching the right log file? I believe most Fedora packagings of PG are set up to rotate log output among multiple files. In any case it's hard to diagnose this without knowing what logging settings you're using. regards, tom lane
Sorry, a couple of things that I intended to mention in my initial post - I must have forgotten. 1) We (actually I) compiled Postgres from source. 2) The relevant section of the config file looks like this: #log_destination = 'stderr' logging_collector = on log_directory = '/var/log/postgres' log_filename = 'postgresql-%Y-%m-%d.log' #log_truncate_on_rotation = off log_rotation_age = 7d #log_rotation_size = 10MB 3) If I issue ls -l I see the following (last few lines only): 2009-06-03 19:55 postgresql-2009-05-28.log 2009-06-03 23:32 postgresql-2009-06-03.log 2009-06-04 15:46 postgresql-2009-06-04.log Thank you, Lewis Kapell Computer Operations Seton Home Study School ------------------------------------------------------- For a free subscription to the Seton Home Study School monthly electronic newsletter, send an e-mail to newsletter-sub@setonhome.org ------------------------------------------------------- Tom Lane wrote: > Lewis Kapell <lkapell@setonhome.org> writes: >> This is a really strange one. We are running PostgreSQL 8.3.7 on a >> Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8). > >> On Wednesday I was reading about the upcoming change in the default >> value of default_statistics_target from 10 to 100 - we had never changed >> this value, so I bumped it up a bit and also changed the value of >> log_autovacuum_min_duration so that I could monitor the results. >> Yesterday I tweaked the settings a bit more and continued to monitor the >> logs. About two hours later, the log file stopped growing. > > Are you watching the right log file? I believe most Fedora packagings > of PG are set up to rotate log output among multiple files. In any > case it's hard to diagnose this without knowing what logging settings > you're using. > > regards, tom lane
Lewis Kapell <lkapell@setonhome.org> writes: > 2) The relevant section of the config file looks like this: > #log_destination = 'stderr' > logging_collector = on > log_directory = '/var/log/postgres' > log_filename = 'postgresql-%Y-%m-%d.log' > #log_truncate_on_rotation = off > log_rotation_age = 7d > #log_rotation_size = 10MB Those look reasonably sane. Have you checked that the logger subprocess is still running? The postmaster is supposed to restart it if it dies, but I wouldn't be astonished if something went wrong in that ... regards, tom lane
If I search the output of ps -ax for entries containing both "postgres:" and "process" I get the following: 2259 ? Ss 0:03 postgres: logger process 2261 ? Ss 0:29 postgres: writer process 2262 ? Ss 0:17 postgres: wal writer process 2263 ? Ss 0:10 postgres: autovacuum launcher process 2264 ? Ss 4:51 postgres: stats collector process thanks Lewis Kapell Tom Lane wrote: > Lewis Kapell <lkapell@setonhome.org> writes: >> 2) The relevant section of the config file looks like this: > >> #log_destination = 'stderr' >> logging_collector = on >> log_directory = '/var/log/postgres' >> log_filename = 'postgresql-%Y-%m-%d.log' >> #log_truncate_on_rotation = off >> log_rotation_age = 7d >> #log_rotation_size = 10MB > > Those look reasonably sane. Have you checked that the logger > subprocess is still running? The postmaster is supposed to restart > it if it dies, but I wouldn't be astonished if something went wrong > in that ... > > regards, tom lane
Lewis Kapell <lkapell@setonhome.org> writes: > If I search the output of ps -ax for entries containing both "postgres:" > and "process" I get the following: > 2259 ? Ss 0:03 postgres: logger process Well, that looks reasonable. You might try strace'ing that process while you do something that's certain to provoke a log entry (maybe "SELECT 1/0;" in a psql session). The only likely problem that I can think of at this point is that SELinux might think the process shouldn't be allowed to write in /var/log/postgres/, but I don't know why that would have just suddenly started to be a problem after working before. Unless maybe someone did a system-wide restorecon or some such. regards, tom lane
Using strace, I was able to determine that the logger process was attempting to output its usual messages when an invalid SQL query was issued. As an experiment, I changed the value of log_directory in postgresql.conf (to a directory which I had created in a different partition than /var) and reloaded - issued a command which would guarantee an error - voila, a new log file was created in that directory. I then changed log_directory to its original value, reloaded, issued an invalid query - and a new log file was created in the proper location. Everything seems back to normal. So my problem is solved, although I wish I had some idea what had caused it in the first place. Tom Lane wrote: > Lewis Kapell <lkapell@setonhome.org> writes: >> If I search the output of ps -ax for entries containing both "postgres:" >> and "process" I get the following: > >> 2259 ? Ss 0:03 postgres: logger process > > Well, that looks reasonable. You might try strace'ing that process > while you do something that's certain to provoke a log entry > (maybe "SELECT 1/0;" in a psql session). > > The only likely problem that I can think of at this point is that > SELinux might think the process shouldn't be allowed to write in > /var/log/postgres/, but I don't know why that would have just suddenly > started to be a problem after working before. Unless maybe someone > did a system-wide restorecon or some such. > > regards, tom lane
Lewis Kapell <lkapell@setonhome.org> wrote: > So my problem is solved, although I wish I had some idea what had > caused it in the first place. Just a wild shot in the dark -- does it not behave well if the file already exists? I noticed that your filename includes date but not time. Perhaps time healed this wound by moving on to a new date? -Kevin
The filename includes date, but not time, because that is how the relevant options are set up in the configuration file. I could have specified that the file name also include hour, minute and second, but that should make no difference. And anyway, there have now been three additional messages written to the log file since it was first created. So the fact that the file already exists doesn't seem to be the root of the problem. I can only guess that there was some glitch in the operating system I/O functions. By changing the log directory, I must have caused the logger process to obtain a new file handle, or something along those lines. Thank you, Lewis Kapell Computer Operations Seton Home Study School ------------------------------------------------------- For a free subscription to the Seton Home Study School monthly electronic newsletter, send an e-mail to newsletter-sub@setonhome.org ------------------------------------------------------- Kevin Grittner wrote: > Lewis Kapell <lkapell@setonhome.org> wrote: > >> So my problem is solved, although I wish I had some idea what had >> caused it in the first place. > > Just a wild shot in the dark -- does it not behave well if the file > already exists? I noticed that your filename includes date but not > time. Perhaps time healed this wound by moving on to a new date? > > -Kevin
Lewis Kapell <lkapell@setonhome.org> wrote: > And anyway, there have now been three additional messages written to > the log file since it was first created. As long as it continues to log after you start PostgreSQL with the file already in existence, I guess that's not it. You did try a stop and a start, and ensure logging still works? > By changing the log directory, I must have caused the logger > process to obtain a new file handle, or something along those lines. Perhaps you somehow deleted the file while PostgreSQL was still running? It would still be linked to the old copy, and would happily write to it even though you couldn't see it.... Just brainstorming.... -Kevin
This is a production server, so I'm not going to stop & restart unless it's really necessary. Interesting idea about the old log file being deleted. I could still see the file listed in the directory, though, and could read/edit it. Thanks for sharing the ideas. Lewis Kevin Grittner wrote: > Lewis Kapell <lkapell@setonhome.org> wrote: > >> And anyway, there have now been three additional messages written to >> the log file since it was first created. > > As long as it continues to log after you start PostgreSQL with the > file already in existence, I guess that's not it. You did try a stop > and a start, and ensure logging still works? > >> By changing the log directory, I must have caused the logger >> process to obtain a new file handle, or something along those lines. > > Perhaps you somehow deleted the file while PostgreSQL was still > running? It would still be linked to the old copy, and would happily > write to it even though you couldn't see it.... > > Just brainstorming.... > > -Kevin
Lewis Kapell <lkapell@setonhome.org> wrote: > Interesting idea about the old log file being deleted. I could > still see the file listed in the directory, though, I guess I got confused on that point because the last log file you listed was for the 4th and all of your posts were on the 5th. I thought maybe you were looking at the one from the 4th while PostgreSQL was logging to a deleted file from the 5th. > and could read/edit it. Editing? Like in saving over the top of it? -Kevin