Thread: Postgres has stopped logging

Postgres has stopped logging

From
Lewis Kapell
Date:
This is a really strange one.  We are running PostgreSQL 8.3.7 on a
Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8).

On Wednesday I was reading about the upcoming change in the default
value of default_statistics_target from 10 to 100 - we had never changed
this value, so I bumped it up a bit and also changed the value of
log_autovacuum_min_duration so that I could monitor the results.
Yesterday I tweaked the settings a bit more and continued to monitor the
logs.  About two hours later, the log file stopped growing.

At first I didn't fully realize what had happened, I thought there was
just low activity.  Today I checked again and realized there had been no
logs since yesterday afternoon.  I made another minor adjustment to the
config file and issued pg_ctl reload, to see if I could force something
to show up in the log - nothing.  The time stamp on the latest log file
is still from yesterday.

The log location is under the /var partition, the df command shows usage
at 3%, so this is not being caused by a full partition.  I am baffled as
to what I can try next, other than a server restart, which I am
reluctant to do since this is a production server.

The server is working normally, as far as I can tell, apart from this.

I would be most grateful if anyone can suggest the next step for
troubleshooting this issue.

Thank you.

Lewis Kapell


Re: Postgres has stopped logging

From
Lewis Kapell
Date:
Yes, the last message in /var/log/messages was just 15 minutes ago.


Scott Whitney wrote:
> I had a similar problem, but it wasn't just PG. Are you sure that syslog is
> logging? You have _anything_ in /var/log/messages?
>
> -----Original Message-----
> From: pgsql-admin-owner@postgresql.org
> [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Lewis Kapell
> Sent: Friday, June 05, 2009 10:44 AM
> To: pgsql-admin@postgresql.org
> Subject: [ADMIN] Postgres has stopped logging
>
> This is a really strange one.  We are running PostgreSQL 8.3.7 on a
> Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8).
>
> On Wednesday I was reading about the upcoming change in the default
> value of default_statistics_target from 10 to 100 - we had never changed
> this value, so I bumped it up a bit and also changed the value of
> log_autovacuum_min_duration so that I could monitor the results.
> Yesterday I tweaked the settings a bit more and continued to monitor the
> logs.  About two hours later, the log file stopped growing.
>
> At first I didn't fully realize what had happened, I thought there was
> just low activity.  Today I checked again and realized there had been no
> logs since yesterday afternoon.  I made another minor adjustment to the
> config file and issued pg_ctl reload, to see if I could force something
> to show up in the log - nothing.  The time stamp on the latest log file
> is still from yesterday.
>
> The log location is under the /var partition, the df command shows usage
> at 3%, so this is not being caused by a full partition.  I am baffled as
> to what I can try next, other than a server restart, which I am
> reluctant to do since this is a production server.
>
> The server is working normally, as far as I can tell, apart from this.
>
> I would be most grateful if anyone can suggest the next step for
> troubleshooting this issue.
>
> Thank you.
>
> Lewis Kapell
>
>


Re: Postgres has stopped logging

From
Tom Lane
Date:
Lewis Kapell <lkapell@setonhome.org> writes:
> This is a really strange one.  We are running PostgreSQL 8.3.7 on a
> Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8).

> On Wednesday I was reading about the upcoming change in the default
> value of default_statistics_target from 10 to 100 - we had never changed
> this value, so I bumped it up a bit and also changed the value of
> log_autovacuum_min_duration so that I could monitor the results.
> Yesterday I tweaked the settings a bit more and continued to monitor the
> logs.  About two hours later, the log file stopped growing.

Are you watching the right log file?  I believe most Fedora packagings
of PG are set up to rotate log output among multiple files.  In any
case it's hard to diagnose this without knowing what logging settings
you're using.

            regards, tom lane

Re: Postgres has stopped logging

From
Lewis Kapell
Date:
Sorry, a couple of things that I intended to mention in my initial post
- I must have forgotten.

1) We (actually I) compiled Postgres from source.

2) The relevant section of the config file looks like this:

#log_destination = 'stderr'
logging_collector = on
log_directory = '/var/log/postgres'
log_filename = 'postgresql-%Y-%m-%d.log'
#log_truncate_on_rotation = off
log_rotation_age = 7d
#log_rotation_size = 10MB

3) If I issue ls -l I see the following (last few lines only):

2009-06-03 19:55 postgresql-2009-05-28.log
2009-06-03 23:32 postgresql-2009-06-03.log
2009-06-04 15:46 postgresql-2009-06-04.log

Thank you,

Lewis Kapell
Computer Operations
Seton Home Study School

-------------------------------------------------------
For a free subscription to the Seton Home Study School monthly
electronic newsletter, send an e-mail to newsletter-sub@setonhome.org
-------------------------------------------------------


Tom Lane wrote:
> Lewis Kapell <lkapell@setonhome.org> writes:
>> This is a really strange one.  We are running PostgreSQL 8.3.7 on a
>> Fedora Core 8 system (uname -a gives: 2.6.23.1-42.fc8).
>
>> On Wednesday I was reading about the upcoming change in the default
>> value of default_statistics_target from 10 to 100 - we had never changed
>> this value, so I bumped it up a bit and also changed the value of
>> log_autovacuum_min_duration so that I could monitor the results.
>> Yesterday I tweaked the settings a bit more and continued to monitor the
>> logs.  About two hours later, the log file stopped growing.
>
> Are you watching the right log file?  I believe most Fedora packagings
> of PG are set up to rotate log output among multiple files.  In any
> case it's hard to diagnose this without knowing what logging settings
> you're using.
>
>             regards, tom lane


Re: Postgres has stopped logging

From
Tom Lane
Date:
Lewis Kapell <lkapell@setonhome.org> writes:
> 2) The relevant section of the config file looks like this:

> #log_destination = 'stderr'
> logging_collector = on
> log_directory = '/var/log/postgres'
> log_filename = 'postgresql-%Y-%m-%d.log'
> #log_truncate_on_rotation = off
> log_rotation_age = 7d
> #log_rotation_size = 10MB

Those look reasonably sane.  Have you checked that the logger
subprocess is still running?  The postmaster is supposed to restart
it if it dies, but I wouldn't be astonished if something went wrong
in that ...

            regards, tom lane

Re: Postgres has stopped logging

From
Lewis Kapell
Date:
If I search the output of ps -ax for entries containing both "postgres:"
and "process" I get the following:

  2259 ?        Ss     0:03 postgres: logger process
  2261 ?        Ss     0:29 postgres: writer process
  2262 ?        Ss     0:17 postgres: wal writer process
  2263 ?        Ss     0:10 postgres: autovacuum launcher process
  2264 ?        Ss     4:51 postgres: stats collector process

thanks
Lewis Kapell


Tom Lane wrote:
> Lewis Kapell <lkapell@setonhome.org> writes:
>> 2) The relevant section of the config file looks like this:
>
>> #log_destination = 'stderr'
>> logging_collector = on
>> log_directory = '/var/log/postgres'
>> log_filename = 'postgresql-%Y-%m-%d.log'
>> #log_truncate_on_rotation = off
>> log_rotation_age = 7d
>> #log_rotation_size = 10MB
>
> Those look reasonably sane.  Have you checked that the logger
> subprocess is still running?  The postmaster is supposed to restart
> it if it dies, but I wouldn't be astonished if something went wrong
> in that ...
>
>             regards, tom lane


Re: Postgres has stopped logging

From
Tom Lane
Date:
Lewis Kapell <lkapell@setonhome.org> writes:
> If I search the output of ps -ax for entries containing both "postgres:"
> and "process" I get the following:

>   2259 ?        Ss     0:03 postgres: logger process

Well, that looks reasonable.  You might try strace'ing that process
while you do something that's certain to provoke a log entry
(maybe "SELECT 1/0;" in a psql session).

The only likely problem that I can think of at this point is that
SELinux might think the process shouldn't be allowed to write in
/var/log/postgres/, but I don't know why that would have just suddenly
started to be a problem after working before.  Unless maybe someone
did a system-wide restorecon or some such.

            regards, tom lane

Re: Postgres has stopped logging

From
Lewis Kapell
Date:
Using strace, I was able to determine that the logger process was
attempting to output its usual messages when an invalid SQL query was
issued.

As an experiment, I changed the value of log_directory in
postgresql.conf (to a directory which I had created in a different
partition than /var) and reloaded - issued a command which would
guarantee an error - voila, a new log file was created in that directory.

I then changed log_directory to its original value, reloaded, issued an
invalid query - and a new log file was created in the proper location.
Everything seems back to normal.

So my problem is solved, although I wish I had some idea what had caused
it in the first place.



Tom Lane wrote:
> Lewis Kapell <lkapell@setonhome.org> writes:
>> If I search the output of ps -ax for entries containing both "postgres:"
>> and "process" I get the following:
>
>>   2259 ?        Ss     0:03 postgres: logger process
>
> Well, that looks reasonable.  You might try strace'ing that process
> while you do something that's certain to provoke a log entry
> (maybe "SELECT 1/0;" in a psql session).
>
> The only likely problem that I can think of at this point is that
> SELinux might think the process shouldn't be allowed to write in
> /var/log/postgres/, but I don't know why that would have just suddenly
> started to be a problem after working before.  Unless maybe someone
> did a system-wide restorecon or some such.
>
>             regards, tom lane


Re: Postgres has stopped logging

From
"Kevin Grittner"
Date:
Lewis Kapell <lkapell@setonhome.org> wrote:

> So my problem is solved, although I wish I had some idea what had
> caused it in the first place.

Just a wild shot in the dark -- does it not behave well if the file
already exists?  I noticed that your filename includes date but not
time.  Perhaps time healed this wound by moving on to a new date?

-Kevin

Re: Postgres has stopped logging

From
Lewis Kapell
Date:
The filename includes date, but not time, because that is how the
relevant options are set up in the configuration file.  I could have
specified that the file name also include hour, minute and second, but
that should make no difference.

And anyway, there have now been three additional messages written to the
log file since it was first created.  So the fact that the file already
exists doesn't seem to be the root of the problem.

I can only guess that there was some glitch in the operating system I/O
functions.  By changing the log directory, I must have caused the logger
process to obtain a new file handle, or something along those lines.

Thank you,

Lewis Kapell
Computer Operations
Seton Home Study School

-------------------------------------------------------
For a free subscription to the Seton Home Study School monthly
electronic newsletter, send an e-mail to newsletter-sub@setonhome.org
-------------------------------------------------------


Kevin Grittner wrote:
> Lewis Kapell <lkapell@setonhome.org> wrote:
>
>> So my problem is solved, although I wish I had some idea what had
>> caused it in the first place.
>
> Just a wild shot in the dark -- does it not behave well if the file
> already exists?  I noticed that your filename includes date but not
> time.  Perhaps time healed this wound by moving on to a new date?
>
> -Kevin


Re: Postgres has stopped logging

From
"Kevin Grittner"
Date:
Lewis Kapell <lkapell@setonhome.org> wrote:

> And anyway, there have now been three additional messages written to
> the log file since it was first created.

As long as it continues to log after you start PostgreSQL with the
file already in existence, I guess that's not it.  You did try a stop
and a start, and ensure logging still works?

> By changing the log directory, I must have caused the logger
> process to obtain a new file handle, or something along those lines.

Perhaps you somehow deleted the file while PostgreSQL was still
running?  It would still be linked to the old copy, and would happily
write to it even though you couldn't see it....

Just brainstorming....

-Kevin

Re: Postgres has stopped logging

From
Lewis Kapell
Date:
This is a production server, so I'm not going to stop & restart unless
it's really necessary.

Interesting idea about the old log file being deleted.  I could still
see the file listed in the directory, though, and could read/edit it.

Thanks for sharing the ideas.

Lewis

Kevin Grittner wrote:
> Lewis Kapell <lkapell@setonhome.org> wrote:
>
>> And anyway, there have now been three additional messages written to
>> the log file since it was first created.
>
> As long as it continues to log after you start PostgreSQL with the
> file already in existence, I guess that's not it.  You did try a stop
> and a start, and ensure logging still works?
>
>> By changing the log directory, I must have caused the logger
>> process to obtain a new file handle, or something along those lines.
>
> Perhaps you somehow deleted the file while PostgreSQL was still
> running?  It would still be linked to the old copy, and would happily
> write to it even though you couldn't see it....
>
> Just brainstorming....
>
> -Kevin


Re: Postgres has stopped logging

From
"Kevin Grittner"
Date:
Lewis Kapell <lkapell@setonhome.org> wrote:

> Interesting idea about the old log file being deleted.  I could
> still see the file listed in the directory, though,

I guess I got confused on that point because the last log file you
listed was for the 4th and all of your posts were on the 5th.  I
thought maybe you were looking at the one from the 4th while
PostgreSQL was logging to a deleted file from the 5th.

>  and could read/edit it.

Editing?  Like in saving over the top of it?

-Kevin