Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Msg-id CAB7nPqTJtH6YuFbfuPZ2YyN7gP4i8hpRV0U33YPPC1icxcS60Q@mail.gmail.com
In response to BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (l@lrowe.co.uk)
Responses Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs


On Sat, Oct 17, 2015 at 5:30 AM,  <l@lrowe.co.uk> wrote:
> I'm seeing Postgres 9.4.5 archive while idle every archive_timeout when I
> set ``wal_level hot_standby``.
> At ``wal_level archive`` I only see archiving every checkpoint_timeout (that
> it archives every checkpoint_timeout is a known limitation, see
> http://www.postgresql.org/message-id/1407389876762-5813999.post@n5.nabble.com):
> I think this additional archiving at wal_level hot_standby is a bug.

Agreed. There is indeed a difference between the way 9.3 and 9.4 behave. With wal_level = hot_standby, 9.4 archives a segment every archive_timeout as you mention, which is not the case with 9.3. There is definitely a regression here: we should not archive a segment if there is no activity.

If I look at the contents of the segments with 9.4 when there is no activity, I see that an XLOG_RUNNING_XACTS record is generated every time after a segment switch. The server then considers that the segment contains new data (and it actually does), so the segment gets archived. Here is for example the output of pg_xlogdump in this case:
$ pg_xlogdump 000000010000000000000018
rmgr: Standby     len (rec/tot):     24/    56, tx:          0, lsn: 0/18000028, prev 0/17000060, bkp: 0000, desc: running xacts: nextXid 1001 latestCompletedXid 1000 oldestRunningXid 1001
rmgr: XLOG        len (rec/tot):      0/    32, tx:          0, lsn: 0/18000060, prev 0/18000028, bkp: 0000, desc: xlog switch
[end of records for this segment]

A little bit of debugging points to the bgwriter process: LogStandbySnapshot() is called by BackgroundWriterMain() in bgwriter.c and generates those WAL records even if the system is idle. I am adding Robert and Andres in CC, as this is caused by commit ed46758, which is new in 9.4.

I think that a simple idea would be to not call LogStandbySnapshot() when we are still at the beginning of a new segment. We know that the first page of a WAL segment begins with a long page header of size SizeOfXLogLongPHD, so just having a check on that sounds enough to me. Hence the attached patch, which should be applied down to 9.4. This fixes the regression reported by Laurence for me.
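
To illustrate the kind of check I mean, here is a rough standalone sketch. It is not the attached patch and it does not use the backend's real macros: the names are hypothetical, the 16MB segment size is the assumed default, and the 40-byte header size is only inferred from the first record sitting at 0/18000028 in the dump above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default WAL segment size of 16MB. */
#define SKETCH_WAL_SEGMENT_SIZE   (UINT64_C(16) * 1024 * 1024)

/*
 * Assumption: stand-in for SizeOfXLogLongPHD. The value 40 (0x28) is
 * inferred from the pg_xlogdump output, where the first record of the
 * segment starts at 0/18000028.
 */
#define SKETCH_LONG_PAGE_HEADER   UINT64_C(40)

/*
 * Hypothetical helper: returns true when the WAL insert position has not
 * moved past the long page header of the current segment, that is, when
 * nothing has been written to this segment yet.
 */
static bool
sketch_at_segment_start(uint64_t insert_lsn)
{
    uint64_t offset = insert_lsn % SKETCH_WAL_SEGMENT_SIZE;

    return offset <= SKETCH_LONG_PAGE_HEADER;
}

/*
 * Hypothetical helper: the bgwriter would log a standby snapshot only if
 * the current segment already contains some records.
 */
static bool
sketch_should_log_standby_snapshot(uint64_t insert_lsn)
{
    return !sketch_at_segment_start(insert_lsn);
}
```

With the positions from the dump, an insert position of 0/18000028 is treated as the start of a segment, so no snapshot would be logged there, while 0/18000060 is past the header and would allow one.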
Regards,
--
Michael
