Thread: WAL archive is lost

WAL archive is lost

From

"matsumura.ryo@fujitsu.com"

Date:

22 November 2019, 05:31:55

Hi all

I found a situation in which a WAL archive file is missing even though no WAL segment file is lost.
This causes archive recovery to fail. Is this behavior a bug?

example:

  WAL segment files
  000000010000000000000001
  000000010000000000000002
  000000010000000000000003

  Archive files
  000000010000000000000001
  000000010000000000000003

  Archive file 000000010000000000000002 is missing, but the WAL segment
  files are contiguous. Recovery from the archive (i.e. PITR) stops at
  the end of 000000010000000000000001.
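
  A gap like the one above can be detected from the file names alone. The
  following is a minimal sketch, not PostgreSQL code: the helper names
  (parse_wal_name, format_wal_name, find_first_gap) are hypothetical, and it
  assumes the default 16MB segment size (256 segments per log id), a single
  timeline, and names already sorted in ascending order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: default 16MB segments, i.e. 0x100000000 / 16MB = 256 per log id. */
#define SEGMENTS_PER_XLOGID 256u

/* Parse a 24-hex-digit WAL file name (timeline, log id, segment).
 * Returns 0 on success, -1 if the name is malformed. */
static int
parse_wal_name(const char *name, uint32_t *tli, uint64_t *segno)
{
    unsigned int t, log, seg;

    if (strlen(name) != 24 || strspn(name, "0123456789ABCDEF") != 24)
        return -1;
    if (sscanf(name, "%8X%8X%8X", &t, &log, &seg) != 3)
        return -1;
    *tli = t;
    *segno = (uint64_t) log * SEGMENTS_PER_XLOGID + seg;
    return 0;
}

/* Inverse of parse_wal_name(); buf must hold at least 25 bytes. */
static void
format_wal_name(uint32_t tli, uint64_t segno, char *buf)
{
    snprintf(buf, 25, "%08X%08X%08X",
             (unsigned int) tli,
             (unsigned int) (segno / SEGMENTS_PER_XLOGID),
             (unsigned int) (segno % SEGMENTS_PER_XLOGID));
}

/* Walk the sorted names and report the first missing segment number.
 * Returns 1 and sets *missing if a gap exists, 0 if the list is contiguous,
 * and -1 on malformed, unsorted, or multi-timeline input. */
static int
find_first_gap(const char *const *names, size_t n, uint64_t *missing)
{
    uint32_t tli = 0, prev_tli = 0;
    uint64_t segno = 0, prev = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (parse_wal_name(names[i], &tli, &segno) != 0)
            return -1;
        if (i > 0)
        {
            if (tli != prev_tli || segno <= prev)
                return -1;      /* timeline switches and unsorted input not handled */
            if (segno != prev + 1)
            {
                *missing = prev + 1;
                return 1;
            }
        }
        prev = segno;
        prev_tli = tli;
    }
    return 0;
}
```

  Given the two archive file names above, find_first_gap() reports segment
  number 2, which format_wal_name() renders as 000000010000000000000002 for
  timeline 1.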

How to reproduce:
- Set up replication (primary and standby).
- Set [archive_mode = always] on the standby.
- WAL receiver exits (e.g. because the primary goes down)
  after it writes the last record of some WAL segment file but
  before it notifies the archiver about that segment file (i.e. creates the .ready file).

Even if the WAL receiver restarts, the archiver is never notified about
that WAL segment file.


Regards
Ryo Matsumura

Re: WAL archive is lost

From

Tomas Vondra

Date:

22 November 2019, 19:44:40

On Fri, Nov 22, 2019 at 05:31:55AM +0000, matsumura.ryo@fujitsu.com wrote:
>Hi all
>
>I found a situation in which a WAL archive file is missing even though no WAL segment file is lost.
>This causes archive recovery to fail. Is this behavior a bug?
>
>example:
>
>  WAL segment files
>  000000010000000000000001
>  000000010000000000000002
>  000000010000000000000003
>
>  Archive files
>  000000010000000000000001
>  000000010000000000000003
>
>  Archive file 000000010000000000000002 is missing, but the WAL segment
>  files are contiguous. Recovery from the archive (i.e. PITR) stops at
>  the end of 000000010000000000000001.
>
>How to reproduce:
>- Set up replication (primary and standby).
>- Set [archive_mode = always] on the standby.
>- WAL receiver exits (e.g. because the primary goes down)
>  after it writes the last record of some WAL segment file but
>  before it notifies the archiver about that segment file (i.e. creates the .ready file).
>
>Even if the WAL receiver restarts, the archiver is never notified about
>that WAL segment file.
>

That does indeed seem like a bug. We should certainly archive all WAL
segments, irrespective of primary shutdowns/restarts/whatever. I guess
we should make sure the archiver is properly notified before the exit.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: WAL archive is lost

From

Jeff Janes

Date:

23 November 2019, 14:10:35

On Fri, Nov 22, 2019 at 8:04 AM matsumura.ryo@fujitsu.com <matsumura.ryo@fujitsu.com> wrote:

> Hi all
>
> I found a situation in which a WAL archive file is missing even though no WAL segment file is lost.
> This causes archive recovery to fail. Is this behavior a bug?
>
> example:
>
>   WAL segment files
>   000000010000000000000001
>   000000010000000000000002
>   000000010000000000000003
>
>   Archive files
>   000000010000000000000001
>   000000010000000000000003
>
>   Archive file 000000010000000000000002 is missing, but the WAL segment
>   files are contiguous. Recovery from the archive (i.e. PITR) stops at
>   the end of 000000010000000000000001.

Will it not archive 000000010000000000000002 eventually, like at the conclusion of the next restartpoint? Or does it get recycled/removed without ever being archived? Or does it just hang out forever in pg_wal?

> How to reproduce:
> - Set up replication (primary and standby).
> - Set [archive_mode = always] on the standby.
> - WAL receiver exits (e.g. because the primary goes down)
>   after it writes the last record of some WAL segment file but
>   before it notifies the archiver about that segment file (i.e. creates the .ready file).

Do you have a trick for reliably achieving this last step?

Cheers,

Jeff

RE: WAL archive is lost

From

"matsumura.ryo@fujitsu.com"

Date:

29 November 2019, 01:44:39

Tomas-san and Jeff-san

I'm very sorry for my slow response.

Tomas-san wrote:
> That does indeed seem like a bug. We should certainly archive all WAL
> segments, irrespective of primary shutdowns/restarts/whatever.

I think so, too.

Tomas-san wrote:
> I guess we should make sure the archiver is properly notified before
> the exit.

Just an idea:
if walrcv_receive() (libpqrcv_receive) returned an error value when a
socket error occurs, instead of reporting ERROR, walreceiver could take
the endofwal path, which calls XLogArchiveNotify() at the end of
walreceiver's outer loop.

            XLogArchiveNotify(xlogfname);
    }
    recvFile = -1;

    elog(DEBUG1, "walreceiver ended streaming and awaits new instructions");
    Wal

Jeff-san wrote:
> Will it not archive 000000010000000000000002 eventually, like at the
> conclusion of the next restartpoint?  or does it get recycled/removed
> without ever being archived?  Or does it just hang out forever in pg_wal?

000000010000000000000002 hangs out in pg_wal forever.
It will never be archived, recycled, or removed.

I found that even if archive_mode is not set to 'always',
it will never be recycled or removed.
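
One way to look for such a stranded segment is to compare the segment files
in pg_wal with the status files in pg_wal/archive_status. The following is
only a rough sketch: the helper names are hypothetical, and it assumes that
each status file is named <segment>.ready or <segment>.done. Note that the
segment currently being written is also expected to have neither marker, so
a hit is a hint to investigate, not a proof.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if `status` is exactly "<seg><suffix>", e.g. "<seg>.ready". */
static bool
status_matches(const char *status, const char *seg, const char *suffix)
{
    size_t seglen = strlen(seg);

    return strncmp(status, seg, seglen) == 0 &&
           strcmp(status + seglen, suffix) == 0;
}

/* Store in `out` (capacity out_cap) every segment that has neither a
 * .ready nor a .done status file. Returns the number of entries stored. */
static size_t
segments_without_status(const char *const *segs, size_t nsegs,
                        const char *const *status, size_t nstatus,
                        const char **out, size_t out_cap)
{
    size_t found = 0;

    for (size_t i = 0; i < nsegs; i++)
    {
        bool marked = false;

        for (size_t j = 0; j < nstatus && !marked; j++)
            marked = status_matches(status[j], segs[i], ".ready") ||
                     status_matches(status[j], segs[i], ".done");

        if (!marked && found < out_cap)
            out[found++] = segs[i];
    }
    return found;
}
```

With the three segment files from the example and status files only for
000000010000000000000001 and 000000010000000000000003, the function returns
000000010000000000000002.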

Jeff-san wrote:
> Do you have a trick for reliably achieving this last step?

If possible, stop walsender just after it sends the last record of a
WAL segment file, or a SWITCH_LOG record, and then stop the primary immediately.

There are two patterns that cause this issue.

Pattern 1.
If the primary is shut down immediately after walreceiver has received the
last record of a WAL segment file and is waiting for the next record in
walrcv_receive(), walreceiver exits without calling XLogArchiveNotify() or
XLogArchiveForceDone() in XLogWalRcvWrite(), because walrcv_receive() reports ERROR.
The startup process then restarts walreceiver and requests that it start
from the beginning of the next segment file. walreceiver receives that data
and writes it with XLogWalRcvWrite(), but it does not reach XLogArchiveNotify()
for the previous segment because it has no file open (recvFile == -1).

Pattern 2.
Only the trigger is different.
If the primary is shut down immediately after walreceiver has received
SWITCH_LOG and is waiting for the next record in walrcv_receive(),
walreceiver exits without notifying the archiver.
The startup process will tell walreceiver to start receiving from
the beginning of the next segment file.
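
Both patterns can be illustrated with a toy model. This is not the
walreceiver source: ToyReceiver and the toy_* functions are hypothetical
names, recv_seg stands in for recvFile/recvSegNo, and notified[] stands in
for the .ready files. The model only encodes the behavior described above:
a segment is notified when the receiver switches away from an open file, or
when it takes the endofwal path.

```c
#include <stdbool.h>

#define MAX_SEG 16

typedef struct
{
    int  recv_seg;            /* open segment, -1 if none (recvFile == -1) */
    bool notified[MAX_SEG];   /* true once the archiver was told about it */
} ToyReceiver;

/* A freshly (re)started receiver has no file open. */
static void
toy_start(ToyReceiver *r)
{
    r->recv_seg = -1;
}

/* Models XLogWalRcvWrite(): data arrives for segment `seg` (0 <= seg < MAX_SEG).
 * The previous segment is notified only if a file is currently open. */
static void
toy_write(ToyReceiver *r, int seg)
{
    if (r->recv_seg >= 0 && r->recv_seg != seg)
        r->notified[r->recv_seg] = true;
    r->recv_seg = seg;
}

/* Models walrcv_receive() reporting ERROR: the process exits, nothing is notified. */
static void
toy_exit_on_error(ToyReceiver *r)
{
    r->recv_seg = -1;
}

/* Models the endofwal path: the open segment is notified before going idle. */
static void
toy_end_streaming(ToyReceiver *r)
{
    if (r->recv_seg >= 0)
        r->notified[r->recv_seg] = true;
    r->recv_seg = -1;
}

/* Replays the scenario and reports whether segment 2 was ever notified. */
static bool
toy_scenario(bool use_endofwal_path)
{
    ToyReceiver r = {0};

    toy_start(&r);
    toy_write(&r, 1);
    toy_write(&r, 2);               /* switching 1 -> 2 notifies segment 1 */

    /* Segment 2 is complete; the primary goes down before more data arrives. */
    if (use_endofwal_path)
        toy_end_streaming(&r);
    else
        toy_exit_on_error(&r);

    toy_start(&r);                  /* restarted by the startup process */
    toy_write(&r, 3);               /* no file open, so segment 2 is skipped */
    return r.notified[2];
}
```

In this model, toy_scenario(false) leaves segment 2 without notification,
while toy_scenario(true), which corresponds to the endofwal idea above,
notifies it.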

Regards
Ryo Matsumura