Thread: log_truncate_on_rotation=on is not truncating

log_truncate_on_rotation=on is not truncating

From

Chan Pham

Date:

20 June 2012, 15:24:00

We discovered that on our servers where we do daily shutdown and startup
for a cold backup, the pg_log logs are not truncated when it switches to
the new day, when log_truncation_on_rotation=on. The shutdown/startup of
postgresql happens daily at 16:00 system time. Also observed that only when
the shutdown event happen at 16:00 pm, it is only when the log is rotated
for the current day, and the event message is appended to last week's
postgreql-%a.log.
On our other servers where there is not a daily shutdown/startup, the
postgresql-%a.log is rotated to the new day at 00:00 hour and is truncated
of all last week's content. We first noticed this only our standby servers
where we do daily cold backup and restart postgresql, and no other
activities. Then I was able to reproduce this issue on a standalone server
by scheduling a daily shutdown/restart.
*
*
*Here are settings and log details:*
*
*
Postgresql is shutdown/startup via service:
   /sbin/service postgresql-9.1 stop/start  -- calls pg_ctl

OS versions:
 centos-release-notes-5.5-0
  and
 centos-release-6-2.el6.centos.7.x86_64

PostgreSQL version:
   PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit
   and
   PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC)
4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit


Postgresql.conf settings for pg_logs related:
   logging_collector=on
   log_directory='pg_log'
   log_filename='postgresql-%a.log'
   log_truncate_on_rotation=on
   log_rotation_age=1d
   log_rotation_size=0

   timezone = 'UTC'

Service timezone:
>date
Tue Jun 19 11:25:56 PDT 2012

Content of a postgresql-Tue.log in its entirety. Note this is a standby
slave so there are not much activities other than shutdown/restart
performed for a cold backup. The *bold *date is just to de-mark the current
day where the entries were appended. The entries of 2012-06-19 was an
intentional syntax error to get the an entry for the current day.

2012-06-12 16:00:02 PDT   23761 4fd67fc4.5cd1 LOG:  received fast shutdown
request
2012-06-12 16:00:02 PDT   23761 4fd67fc4.5cd1 LOG:  aborting any active
transactions
2012-06-12 16:00:02 PDT   23800 4fd67fc8.5cf8 FATAL:  terminating
walreceiver process due to administrator command
2012-06-12 16:00:02 PDT   23765 4fd67fc5.5cd5 LOG:  shutting down
2012-06-12 16:00:02 PDT   23765 4fd67fc5.5cd5 LOG:  database system is shut
down
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  database system was
shut down in recovery at 2012-06-12 16:00:02 PDT
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  entering standby mode
2012-06-12 16:31:08 PDT   30458 4fd7d13c.76fa LOG:  redo starts at
6B/D3E1EA70
2012-06-12 16:31:10 PDT   30458 4fd7d13c.76fa LOG:  consistent recovery
state reached at 6B/DB8FDFE8
2012-06-12 16:31:10 PDT   30458 4fd7d13c.76fa LOG:  unexpected pageaddr
69/D68FE000 in log file 107, segment 219, offset 9428992
2012-06-12 16:31:10 PDT   30454 4fd7d13a.76f6 LOG:  database system is
ready to accept read only connections
2012-06-12 16:31:11 PDT   30494 4fd7d13e.771e LOG:  streaming replication
successfully connected to primary
*2012-06-19 *11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea ERROR:
 syntax error at or near ":" at character 17
*2012-06-19 *11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea STATEMENT:
 select version():

Listing of pg_log directory. You can see here that the last entry of each
day (file timestamp) is when postgresql is restarted at 16:30. Except June
19, I just force a syntax error to create an entry for today, June 19:

-rw------- 1 postgres postgres 1403 Jun 13 16:30 postgresql-Wed.log
-rw------- 1 postgres postgres 2868 Jun 14 16:31 postgresql-Thu.log
-rw------- 1 postgres postgres 2418 Jun 15 16:31 postgresql-Fri.log
-rw------- 1 postgres postgres 2222 Jun 16 16:31 postgresql-Sat.log
-rw------- 1 postgres postgres 2208 Jun 17 16:30 postgresql-Sun.log
-rw------- 1 postgres postgres 2832 Jun 18 16:31 postgresql-Mon.log
-rw------- 1 postgres postgres 1355 Jun 19 11:22 postgresql-Tue.log

Here's normal behavior on a server that does NOT get restarted daily. The
logs are all rotated and truncated at 00:00 as expected:
$ pwd
/data/postgres/9.1/pg_log
ls -ltr
total 4
-rw------- 1 postgres postgres    0 Jun 13 00:00 postgresql-Wed.log
-rw------- 1 postgres postgres    0 Jun 14 00:00 postgresql-Thu.log
-rw------- 1 postgres postgres 1563 Jun 15 14:12 postgresql-Fri.log
-rw------- 1 postgres postgres    0 Jun 16 00:00 postgresql-Sat.log
-rw------- 1 postgres postgres    0 Jun 17 00:00 postgresql-Sun.log
-rw------- 1 postgres postgres    0 Jun 18 00:00 postgresql-Mon.log
-rw------- 1 postgres postgres    0 Jun 19 00:00 postgresql-Tue.log
--
*



Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com

<http://www.iovation.com/eml-msg/>
*

Re: log_truncate_on_rotation=on is not truncating

From

Viswanatham Kiran Kumar

Date:

21 June 2012, 10:04:21

Hi Chan,

    I think this is configuration issue.

    This issue is happening because log files will always append when
server is restarted. What I can see from your configuration,
log_rotation_age=3D1d (1 day) which is greater than daily schedule
shutdown/restart time. So this is the reason why log files are growing.

    You can change the log_rotation_age=3D8h (i.e. any time less than the
schedule shutdown/restart time) as log file name=20
Is postgreql-%a.log this is sufficient to preserve till last week's log, as
at the time of rotation after 8 hours if the day is not changed then log
file name will be same as previous then truncation will not happen.

=A0
Thanks & Regards,
Kiran Kumar
=A0=A0=A0

--------------------------
VISWANATHAM=A0 KIRAN KUMAR
SENIOR TECHNICAL LEADER.
HUAWEI TECHNOLOGIES INDIA PVT. LTD.
EXTN: 7560; LEVEL-7 C & D TOWER
DIAMOND DISTRICT, OLD AIRPORT ROAD
BANGALORE-08
MOBILE: +91-9945115987
=A0


From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org] On Behalf Of Chan Pham
Sent: Wednesday, June 20, 2012 11:54 PM
To: pgsql-bugs@postgresql.org
Subject: [BUGS] log_truncate_on_rotation=3Don is not truncating

We discovered that on our servers where we do daily shutdown and startup for
a cold backup, the pg_log logs are not truncated when it switches to the new
day, when log_truncation_on_rotation=3Don. The shutdown/startup of postgres=
ql
happens daily at 16:00 system time. Also observed that only when the
shutdown event happen at 16:00 pm, it is only when the log is rotated for
the current day, and the event message is appended to last week's
postgreql-%a.log.=A0
On our other servers where there is not a daily shutdown/startup, the
postgresql-%a.log is rotated to the new day at 00:00 hour and is truncated
of all last week's content. We first noticed this only our standby servers
where we do daily cold backup and restart postgresql, and no other
activities. Then I was able to reproduce this issue on a standalone server
by scheduling a daily shutdown/restart.

Here are settings and log details:

Postgresql is shutdown/startup via service:
=A0 =A0/sbin/service postgresql-9.1 stop/start =A0-- calls pg_ctl

OS versions:
=A0centos-release-notes-5.5-0
=A0 and
=A0centos-release-6-2.el6.centos.7.x86_64

PostgreSQL version:
=A0 =A0PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) =
4.4.6
20110731 (Red Hat 4.4.6-3), 64-bit
=A0 =A0and
=A0 =A0PostgreSQL 9.1.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) =
4.4.6
20110731 (Red Hat 4.4.6-3), 64-bit


Postgresql.conf settings for pg_logs related:
=A0 =A0logging_collector=3Don
=A0 =A0log_directory=3D'pg_log'
=A0 =A0log_filename=3D'postgresql-%a.log'
=A0 =A0log_truncate_on_rotation=3Don
=A0 =A0log_rotation_age=3D1d
=A0 =A0log_rotation_size=3D0

=A0 =A0timezone =3D 'UTC'

Service timezone: =A0
>date
Tue Jun 19 11:25:56 PDT 2012

Content of a postgresql-Tue.log in its entirety. Note this is a standby
slave so there are not much activities other than shutdown/restart performed
for a cold backup. The bold date is just to de-mark the current day where
the entries were appended. The entries of 2012-06-19 was an intentional
syntax error to get the an entry for the current day.

2012-06-12 16:00:02 PDT =A0 23761 4fd67fc4.5cd1 LOG: =A0received fast shutd=
own
request
2012-06-12 16:00:02 PDT =A0 23761 4fd67fc4.5cd1 LOG: =A0aborting any active
transactions
2012-06-12 16:00:02 PDT =A0 23800 4fd67fc8.5cf8 FATAL: =A0terminating
walreceiver process due to administrator command
2012-06-12 16:00:02 PDT =A0 23765 4fd67fc5.5cd5 LOG: =A0shutting down
2012-06-12 16:00:02 PDT =A0 23765 4fd67fc5.5cd5 LOG: =A0database system is =
shut
down
2012-06-12 16:31:08 PDT =A0 30458 4fd7d13c.76fa LOG: =A0database system was=
 shut
down in recovery at 2012-06-12 16:00:02 PDT
2012-06-12 16:31:08 PDT =A0 30458 4fd7d13c.76fa LOG: =A0entering standby mo=
de
2012-06-12 16:31:08 PDT =A0 30458 4fd7d13c.76fa LOG: =A0redo starts at
6B/D3E1EA70
2012-06-12 16:31:10 PDT =A0 30458 4fd7d13c.76fa LOG: =A0consistent recovery
state reached at 6B/DB8FDFE8
2012-06-12 16:31:10 PDT =A0 30458 4fd7d13c.76fa LOG: =A0unexpected pageaddr
69/D68FE000 in log file 107, segment 219, offset 9428992
2012-06-12 16:31:10 PDT =A0 30454 4fd7d13a.76f6 LOG: =A0database system is =
ready
to accept read only connections
2012-06-12 16:31:11 PDT =A0 30494 4fd7d13e.771e LOG: =A0streaming replicati=
on
successfully connected to primary
2012-06-19 11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea ERROR: =A0synt=
ax
error at or near ":" at character 17
2012-06-19 11:22:04 PDT postgres [local] 30698 4fe0c34c.77ea STATEMENT:
=A0select version():

Listing of pg_log directory. You can see here that the last entry of each
day (file timestamp) is when postgresql is restarted at 16:30. Except June
19, I just force a syntax error to create an entry for today, June 19:

-rw------- 1 postgres postgres 1403 Jun 13 16:30 postgresql-Wed.log
-rw------- 1 postgres postgres 2868 Jun 14 16:31 postgresql-Thu.log
-rw------- 1 postgres postgres 2418 Jun 15 16:31 postgresql-Fri.log
-rw------- 1 postgres postgres 2222 Jun 16 16:31 postgresql-Sat.log
-rw------- 1 postgres postgres 2208 Jun 17 16:30 postgresql-Sun.log
-rw------- 1 postgres postgres 2832 Jun 18 16:31 postgresql-Mon.log
-rw------- 1 postgres postgres 1355 Jun 19 11:22 postgresql-Tue.log

Here's normal behavior on a server that does NOT get restarted daily. The
logs are all rotated and truncated at 00:00 as expected:
$ pwd
/data/postgres/9.1/pg_log
ls -ltr
total 4
-rw------- 1 postgres postgres =A0 =A00 Jun 13 00:00 postgresql-Wed.log
-rw------- 1 postgres postgres =A0 =A00 Jun 14 00:00 postgresql-Thu.log
-rw------- 1 postgres postgres 1563 Jun 15 14:12 postgresql-Fri.log
-rw------- 1 postgres postgres =A0 =A00 Jun 16 00:00 postgresql-Sat.log
-rw------- 1 postgres postgres =A0 =A00 Jun 17 00:00 postgresql-Sun.log
-rw------- 1 postgres postgres =A0 =A00 Jun 18 00:00 postgresql-Mon.log
-rw------- 1 postgres postgres =A0 =A00 Jun 19 00:00 postgresql-Tue.log
--=20



Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com

Re: log_truncate_on_rotation=on is not truncating

From

Tom Lane

Date:

16 July 2012, 01:54:44

[ just got around to looking at this bug report, sorry for the delay ]

Viswanatham Kiran Kumar <viswanatham.kirankumar@huawei.com> writes:
> Hi Chan,
>     I think this is configuration issue.

>     This issue is happening because log files will always append when
> server is restarted. What I can see from your configuration,
> log_rotation_age=1d (1 day) which is greater than daily schedule
> shutdown/restart time. So this is the reason why log files are growing.

Yeah, I think that's part of it, but it does still seem like a bug.
The relevant bit of the code is

    /*
     * The requirements here are to choose the next time > now that is a
     * "multiple" of the log rotation interval.  "Multiple" can be interpreted
     * fairly loosely.  In this version we align to log_timezone rather than
     * GMT.
     */
    rotinterval = Log_RotationAge * SECS_PER_MINUTE;    /* convert to seconds */
    now = (pg_time_t) time(NULL);
    tm = pg_localtime(&now, log_timezone);
    now += tm->tm_gmtoff;
    now -= now % rotinterval;
    now += rotinterval;
    now -= tm->tm_gmtoff;
    next_rotation_time = now;

Looking at this, it seems like it will choose a next_rotation_time that
is more than rotinterval beyond "now" if you have a log_timezone for
which advancing by gmtoff moves past the next natural rotinterval
multiple.  So if the server gets shut down before that time elapses,
it never will think it should rotate the logfile.

Chan, could you say exactly what log_timezone setting is in use on the
servers that are acting funny?

            regards, tom lane

Re: log_truncate_on_rotation=on is not truncating

From

Tom Lane

Date:

16 July 2012, 02:43:42

I wrote:
> The relevant bit of the code is

>     /*
>      * The requirements here are to choose the next time > now that is a
>      * "multiple" of the log rotation interval.  "Multiple" can be interpreted
>      * fairly loosely.  In this version we align to log_timezone rather than
>      * GMT.
>      */

Oh, no, scratch that --- there's nothing wrong with that calculation.
Actually I think the issue is with the logic that decides whether to
truncate or append during rotation:

     * Decide whether to overwrite or append.  We can overwrite if (a)
     * Log_truncate_on_rotation is set, (b) the rotation was triggered by
     * elapsed time and not something else, and (c) the computed file name is
     * different from what we were previously logging into.
     *
     * Note: during the first rotation after forking off from the postmaster,
     * last_file_name will be NULL.  (We don't bother to set it in the
     * postmaster because it ain't gonna work in the EXEC_BACKEND case.) So we
     * will always append in that situation, even though truncating would
     * usually be safe.

This dates from around 2004, and at the time we did not think it was
very important if the first rotation opportunity failed to truncate.
Of course, if you restart the postmaster so often that it never gets to
the second rotation opportunity, that theory falls down.  But I guess
it's worth asking why you think it's a good plan to reboot so often.
Most people running Postgres are appalled by the idea that they might
have to have any downtime at all.

            regards, tom lane

Re: log_truncate_on_rotation=on is not truncating

From

Chan Pham

Date:

17 July 2012, 15:54:14

Hello Tom and all,

Thank you for following up on this.

1. log_timezone:

admin_datamart=# show log_timezone;
 log_timezone
--------------
 US/Pacific
(1 row)

I have been playing around with changing log_rotation_age = 7h, +/- 1 hour
to see if I can get the elapse timing right. The shutdown/startup time is
still at 1600/18:30. At 8h, it was still appending. But recently, I've
change it to 7h. Some day the log will truncate and some day it didn't. I'm
guessing if its based on elapsed time since startup for truncate to happen
or not, then it make sense that truncate didn't always happen because
startup time can varies (some day tar runs longer).

2. To answer your question as why we are restarting daily is that these
servers are hot standby and we are taking cold backups from there. The plan
is to do hot backup (now that we are on 9.1 and can pause recovery) but we
just haven't gotten time to implement the hot backup yet.

Thanks,

Chan

On Sun, Jul 15, 2012 at 10:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > The relevant bit of the code is
>
> >     /*
> >      * The requirements here are to choose the next time > now that is a
> >      * "multiple" of the log rotation interval.  "Multiple" can be
> interpreted
> >      * fairly loosely.  In this version we align to log_timezone rather
> than
> >      * GMT.
> >      */
>
> Oh, no, scratch that --- there's nothing wrong with that calculation.
> Actually I think the issue is with the logic that decides whether to
> truncate or append during rotation:
>
>      * Decide whether to overwrite or append.  We can overwrite if (a)
>      * Log_truncate_on_rotation is set, (b) the rotation was triggered by
>      * elapsed time and not something else, and (c) the computed file name
> is
>      * different from what we were previously logging into.
>      *
>      * Note: during the first rotation after forking off from the
> postmaster,
>      * last_file_name will be NULL.  (We don't bother to set it in the
>      * postmaster because it ain't gonna work in the EXEC_BACKEND case.)
> So we
>      * will always append in that situation, even though truncating would
>      * usually be safe.
>
> This dates from around 2004, and at the time we did not think it was
> very important if the first rotation opportunity failed to truncate.
> Of course, if you restart the postmaster so often that it never gets to
> the second rotation opportunity, that theory falls down.  But I guess
> it's worth asking why you think it's a good plan to reboot so often.
> Most people running Postgres are appalled by the idea that they might
> have to have any downtime at all.
>
>                         regards, tom lane
>

--
*

Chan Pham
Database Administrator

Direct: 503.943.6773
Fax: 503.224.1581 // AIM: ioChanny // Twitter: iovation
www.iovation.com

<http://www.iovation.com/eml-msg/>
*