BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves) - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
Date
Msg-id 20140918180734.361021e1@erg
Whole thread Raw
In response to Re: .ready files appearing on slaves  (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses Re: BUG: *FF WALs under 9.2 (WAS: .ready files appearing on slaves)
List pgsql-hackers
Hi hackers,

We spend some time with Guillaume Lelarge studying this issue.

CreateRestartPoint() calls RemoveOldXlogFiles() to drop/recycle old WALs. This
one is calling XLogArchiveCheckDone() to check if the given WAL can be dropped.
As our slave has "archive_mode" & "archive_command" set, XLogArchivingActive()
returns true, forcing XLogArchiveCheckDone() to look for a ".done" file. If the
corresponding ".done" file doesn't exist, XLogArchiveCheckDone() calls
XLogArchiveNotify() to actually create the ".ready" file.


Now, on a standby, we are supposed to create the ".done" files by calling
XLogArchiveForceDone() either after a successful restore_command or when the
streaming replication finishes and flushes a WAL. It seems like the root cause
of this trouble: maybe a race condition where XLogArchiveForceDone() would not
be called for a WAL. As the ".done" file is never created, we end up calling
XLogArchiveNotify() to create a ".ready" and this WAL will never be archived by
the slave, sitting in the pg_xlog dir.

But worst, we experience this under two 9.2.8 clusters and realized
this version is not supposed to have 256 segments per WAL, finishing a WAL to
the *FE one. I'm quoting back the history of my previous mail as we clearly
have some *FF files on the slave side!

> >   000000010000040E000000FF
> >   0000000100000414000000DA
> >   000000010000046E000000FF
> >   0000000100000470000000FF
> >   00000001000004850000000F
> >   000000010000048500000010
> >   [...normal WAL sequence...]
> >   000000010000048500000052

Note that it seems to happen only during streaming replication, not in pure Log
Shipping replication. We added the debug message on the master side to make
sure it never archive *FF file

We kept the WAL files and log files for further analysis. How can we help
regarding this issue?

Regards,


On Mon, 15 Sep 2014 17:37:24 +0200
Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:

> Hi hackers,
> 
> An issue that seems related to this has been posted on pgsql-admin. See:
> 
>   http://www.postgresql.org/message-id/CAAS3tyLnXaYDZ0+zhXLPdVtOvHQOvR+jSPhp30o8kvWqQs0Tqw@mail.gmail.com
> 
> How can we help on this issue?
> 
> Cheers,
> 
> On Thu, 4 Sep 2014 17:50:36 +0200
> Jehan-Guillaume de Rorthais <jgdr@dalibo.com> wrote:
> 
> > Hi hackers,
> > 
> > Since few months, we occasionally see .ready files appearing on some slave
> >  instances from various context. The two I have in mind are under 9.2.x.
> > 
> > I tried to investigate a bit. These .ready files are created when a WAL file
> > from pg_xlog has no corresponding file in pg_xlog/archive_status. I could
> > easily experience this by deleting such a file: it is created again at the
> > next restartpoint or checkpoint received from the master.
> > 
> > Looking at the WAL in pg_xlog folder corresponding to these .ready files,
> > they are all much older than the current WAL "cycle" in both mtime and name
> > logic sequence. As instance on one of these box we have currently 6 of
> > those "ghost" WALs:
> > 
> >   0000000200001E53000000FF
> >   0000000200001F18000000FF
> >   0000000200002047000000FF
> >   00000002000020BF000000FF
> >   0000000200002140000000FF
> >   0000000200002370000000FF
> >   000000020000255D000000A8
> >   000000020000255D000000A9
> >   [...normal WAL sequence...]
> >   000000020000255E0000009D
> > 
> > And on another box:
> > 
> >   000000010000040E000000FF
> >   0000000100000414000000DA
> >   000000010000046E000000FF
> >   0000000100000470000000FF
> >   00000001000004850000000F
> >   000000010000048500000010
> >   [...normal WAL sequence...]
> >   000000010000048500000052
> > 
> > So it seems for some reasons, these old WALs were "forgotten" by the
> > restartpoint mechanism when they should have been recylced/deleted. 
> > 
> > For one of these servers, I could correlate this with some brutal
> > disconnection of the streaming replication appearing in its logs. But there
> > was no known SR disconnection on the second one.
> > 
> > Any idea about this weird behaviour? What can we do to help you investigate
> > further?
> > 
> > Regards,
> 
> 
> 



-- 
Jehan-Guillaume de Rorthais
Dalibo
http://www.dalibo.com



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Options OUTPUT_PLUGIN_* controlling format are confusing (Was: Misleading error message in logical decoding)
Next
From: Josh Berkus
Date:
Subject: Re: jsonb format is pessimal for toast compression