Thread: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From
l@lrowe.co.uk
Date:
The following bug has been logged on the website:

Bug reference:      13685
Logged by:          Laurence Rwoe
Email address:      l@lrowe.co.uk
PostgreSQL version: 9.4.5
Operating system:   Mac OS X 10.10
Description:

I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I
set ``wal_level hot_standby``:

$ initdb testpg
$ cat << 'EOF' >> testpg/postgresql.conf
> wal_level = hot_standby
> archive_mode = on
> archive_timeout = 10
> checkpoint_timeout = 1h
> archive_command = 'echo $(date) archive %p'
> log_checkpoints = on
> EOF
$ postgres -D testpg
LOG:  database system was shut down at 2015-10-13 11:58:45 PDT
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
Tue Oct 13 12:00:47 PDT 2015 archive pg_xlog/000000010000000000000001
Tue Oct 13 12:00:57 PDT 2015 archive pg_xlog/000000010000000000000002
LOG:  checkpoint starting: xlog
Tue Oct 13 12:01:07 PDT 2015 archive pg_xlog/000000010000000000000003
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s)
added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.026 s;
sync files=0, longest=0.000 s, average=0.000 s
Tue Oct 13 12:01:17 PDT 2015 archive pg_xlog/000000010000000000000004
Tue Oct 13 12:01:27 PDT 2015 archive pg_xlog/000000010000000000000005
LOG:  checkpoint starting: xlog
Tue Oct 13 12:01:38 PDT 2015 archive pg_xlog/000000010000000000000006
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s)
added, 0 removed, 3 recycled; write=0.000 s, sync=0.000 s, total=0.027 s;
sync files=0, longest=0.000 s, average=0.000 s
Tue Oct 13 12:01:48 PDT 2015 archive pg_xlog/000000010000000000000007
Tue Oct 13 12:01:58 PDT 2015 archive pg_xlog/000000010000000000000008
LOG:  checkpoint starting: xlog
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s)
added, 0 removed, 3 recycled; write=0.000 s, sync=0.000 s, total=0.001 s;
sync files=0, longest=0.000 s, average=0.000 s


At ``wal_level archive`` I only see archiving every checkpoint_timeout (that
it archives every checkpoint_timeout is a known limitation, see
http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com):

$ initdb testpg
$ cat << 'EOF' >> testpg/postgresql.conf
> wal_level = archive
> archive_mode = on
> archive_timeout = 10
> checkpoint_timeout = 60
> archive_command = 'echo $(date) archive %p'
> log_checkpoints = on
> EOF
# lrowe@Laurences-iMac ~/scratch
$ postgres -D testpg
LOG:  database system was shut down at 2015-10-13 12:25:38 PDT
LOG:  MultiXact member wraparound protections are now enabled
LOG:  autovacuum launcher started
LOG:  database system is ready to accept connections
Tue Oct 13 12:25:49 PDT 2015 archive pg_xlog/000000010000000000000001
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s)
added, 0 removed, 0 recycled; write=0.000 s, sync=0.000 s, total=0.029 s;
sync files=0, longest=0.000 s, average=0.000 s
Tue Oct 13 12:26:39 PDT 2015 archive pg_xlog/000000010000000000000002
LOG:  checkpoint starting: time
LOG:  checkpoint complete: wrote 0 buffers (0.0%); 0 transaction log file(s)
added, 0 removed, 1 recycled; write=0.000 s, sync=0.000 s, total=0.030 s;
sync files=0, longest=0.000 s, average=0.000 s
Tue Oct 13 12:27:39 PDT 2015 archive pg_xlog/000000010000000000000003

I think this additional archiving at wal_level hot_standby is a bug.


On Sat, Oct 17, 2015 at 5:30 AM,  <l@lrowe.co.uk> wrote:
> I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I
> set ``wal_level hot_standby``.
> At ``wal_level archive`` I only see archiving every checkpoint_timeout (that
> it archives every checkpoint_timeout is a known limitation, see
> http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com):
> I think this additional archiving at wal_level hot_standby is a bug.

Agreed. There is indeed a difference between the way 9.3 and 9.4 behave. When wal_level = hot_standby, with 9.4 a segment will be archived depending on archive_timeout as you mention, and that's not the case of 9.3. There is definitely a regression here: we should not archive a segment if there is no activity.

If I look at the contents of the segments with 9.4 when there is no activity, I am seeing that actually a record XLOG_RUNNING_XACTS is generated all the time after switching a segment, leading to the archiving of this segment because server thinks that there is new data, and actually there is, so the segment will be archived... Here is for example the output of pg_xlogdump in this case:
$ pg_xlogdump 000000010000000000000018
rmgr: Standby     len (rec/tot):     24/    56, tx:          0, lsn: 0/18000028, prev 0/17000060, bkp: 0000, desc: running xacts: nextXid 1001 latestCompletedXid 1000 oldestRunningXid 1001
rmgr: XLOG        len (rec/tot):      0/    32, tx:          0, lsn: 0/18000060, prev 0/18000028, bkp: 0000, desc: xlog switch
[end of records for this segment]

A little bit of debugging is directing me to the bgwriter process, LogStandbySnapshot() being called by BackgroundWriterMain@bgwriter.c, generating those WAL records even if a system is idle. I am adding Robert and Andres in CC, as this is caused by commit ed46758 which is a new thing of 9.4.

I think that a simple idea would be to not call LogStandbySnapshot() when we are still at the beginning of a new segment. We know that the first page of a WAL segment has a size of SizeOfXLogLongPHD, so just having a check on that sounds enough to me. Per se the patch attached that should be applied down to 9.4. This fixes the regression reported by Laurence for me.
Regards,
--
Michael
Attachment
On Sat, Oct 17, 2015 at 11:10 PM, Michael Paquier
<michael.paquier@gmail.com>wrote:

> I think that a simple idea would be to not call LogStandbySnapshot() when
> we are still at the beginning of a new segment. We know that the first page
> of a WAL segment has a size of SizeOfXLogLongPHD, so just having a check on
> that sounds enough to me. Per se the patch attached that should be applied
> down to 9.4. This fixes the regression reported by Laurence for me.
>

This bug fix is registered in next CF so as we do not lose track of it;
https://commitfest.postgresql.org/7/398/
--
Michael

Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby

From
Michael Paquier
Date:
On Tue, Oct 20, 2015 at 2:21 AM, Michael Paquier wrote:
> On Sat, Oct 17, 2015 at 11:10 PM, Michael Paquier wrote:
>> I think that a simple idea would be to not call LogStandbySnapshot() when
>> we are still at the beginning of a new segment. We know that the first page
>> of a WAL segment has a size of SizeOfXLogLongPHD, so just having a check on
>> that sounds enough to me. Per se the patch attached that should be applied
>> down to 9.4. This fixes the regression reported by Laurence for me.
>
> This bug fix is registered in next CF so as we do not lose track of it;
> https://commitfest.postgresql.org/7/398/

Andres has mentioned me during Postgres Europe that instead of looking
at if the last inserted record was at the beginning of a new segment,
we had better check if some progress has been done since the last
checkpoint that happened. Attached is a patch adding some logic to
track the LSN position of the last checkpoint record created, and log
the standby activity only if some progress has been done since. It
seems that we had better be sure that the last checkpoint record is
neither the last inserted record, nor the one previously inserted
before logging the standby activity, as a XLOG_STANDBY_LOCK record may
be added in between.
Regards,
--
Michael

Attachment