Re: Unarchived WALs deleted after crash - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Unarchived WALs deleted after crash
Date	February 15, 2013 14:31:56
Msg-id	511E46CF.1090101@vmware.com
In response to	Unarchived WALs deleted after crash (Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Responses	Re: Unarchived WALs deleted after crash Re: Unarchived WALs deleted after crash
List	pgsql-hackers

On 14.02.2013 17:45, Jehan-Guillaume de Rorthais wrote:
> I am facing an unexpected behavior on a 9.2.2 cluster that I can
> reproduce on current HEAD.
>
> On a cluster with archive enabled but failing, after a crash of
> postmaster, the checkpoint occurring before leaving the recovery mode
> deletes any additional WALs, even those waiting to be archived.
> ...
> Is it expected?

No, it's a bug. Ouch. It was introduced in 9.2, by commit
5286105800c7d5902f98f32e11b209c471c0c69c:

> -  /*
> -   * Normally we don't delete old XLOG files during recovery to
> -   * avoid accidentally deleting a file that looks stale due to a
> -   * bug or hardware issue, but in fact contains important data.
> -   * During streaming recovery, however, we will eventually fill the
> -   * disk if we never clean up, so we have to. That's not an issue
> -   * with file-based archive recovery because in that case we
> -   * restore one XLOG file at a time, on-demand, and with a
> -   * different filename that can't be confused with regular XLOG
> -   * files.
> -   */
> -   if (WalRcvInProgress() || XLogArchiveCheckDone(xlde->d_name))
> +   if (RecoveryInProgress() || XLogArchiveCheckDone(xlde->d_name))
>          [ delete the file ]

With that commit, we started to keep WAL segments restored from the
archive in pg_xlog, so we needed to start deleting old segments during
archive recovery, even when streaming replication was not active. But
the above change was too broad; we also started to delete old segments
during crash recovery.

The above should check InArchiveRecovery, i.e. only delete old files when
in archive recovery, not when in crash recovery. But there's one little
complication: InArchiveRecovery is currently only valid in the startup
process, so we'll need to also share it in shared memory, so that the
checkpointer process can access it.
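
To make the difference concrete, here is a minimal standalone sketch of the
two conditions. It is not the attached patch and not actual xlog.c code: the
struct SharedRecoveryState and both function names are hypothetical, and the
archive_done argument merely stands in for the result of
XLogArchiveCheckDone().

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for state that the startup process would publish
 * in shared memory so that the checkpointer can read it.
 */
typedef struct SharedRecoveryState
{
    bool recovery_in_progress;  /* true in crash AND archive recovery */
    bool in_archive_recovery;   /* true only in archive recovery */
} SharedRecoveryState;

/*
 * Condition as changed in 9.2: any kind of recovery allows removal,
 * so an unarchived segment can be deleted during crash recovery.
 */
static bool
may_remove_segment_9_2(const SharedRecoveryState *st, bool archive_done)
{
    return st->recovery_in_progress || archive_done;
}

/*
 * Proposed condition: only archive recovery bypasses the archive check.
 * In crash recovery the segment must already have been archived.
 */
static bool
may_remove_segment_proposed(const SharedRecoveryState *st, bool archive_done)
{
    return st->in_archive_recovery || archive_done;
}
```

With recovery_in_progress set and in_archive_recovery clear (crash
recovery), a segment that has not been archived yet is removable under the
first condition but kept under the second. In archive recovery both
conditions still allow removal.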

I propose the attached patch to fix it.

- Heikki

Attachment

dont-delete-unarchived-xlog-files-during-crash-recovery.patch
