Re: Unnecessary WAL archiving after failover - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Unnecessary WAL archiving after failover
Date
Msg-id CA+Tgmoa5BBq_6LfUqa1v8HK1C-cT8UtjaTHFgPN4=6kWf2o95Q@mail.gmail.com
Whole thread Raw
In response to Re: Unnecessary WAL archiving after failover  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Fri, Mar 23, 2012 at 10:03 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On second thought, I found other issues about WAL archiving after
> failover. So let me clarify the issues again.
>
> Just after failover, there can be three kinds of WAL files in new
> master's pg_xlog directory:
>
> (1) WAL files which were recycled to by restartpoint
>
> I've already explained upthread the issue which these WAL files cause
> after failover.

Check.

> (2) WAL files which were restored from the archive
>
> In 9.1 or before, the restored WAL files don't remain after failover
> because they are always restored onto the temporary filename
> "RECOVERYXLOG". So the issue which I explain from now doesn't exist
> in 9.1 or before.
>
> In 9.2dev, as the result of supporting cascade replication,
> an archived WAL file is restored onto correct file name so that
> cascading walsender can send it to another standby. This restored
> WAL file has neither .ready nor .done archive status file. After
> failover, checkpoint checks the archive status file of the restored
> WAL file to attempt to recycle it, finds that it has neither .ready
> nor ,done, and creates .ready. Because of existence of .ready,
> it will be archived again even though it obviously already exists in
> the archival storage :(
>
> To prevent a restored WAL file from being archived again, I think
> that .done should be created whenever WAL file is successfully
> restored (of course this should happen only when archive_mode is
> enabled). Thought?
>
> Since this is the oversight of cascade replication, I'm thinking to
> implement the patch for 9.2dev.

Yes, I think we had better fix this in 9.2.  As you say, it's a loose
end from streaming replication.  Do you have a patch?

> (3) WAL files which were streamed from the master
>
> These WAL files also don't have any archive status, so checkpoint
> creates .ready for them after failover. And then, all or many of
> them will be archived at a time, which would cause I/O spike on
> both WAL and archival storage.
>
> To avoid this problem, I think that we should change walreceiver
> so that it creates .ready as soon as it completes the WAL file. Also
> we should change the archiver process so that it starts up even in
> standby mode and archives the WAL files.
>
> If each server has its own archival storage, the above solution would
> work fine. But if all servers share the archival storage, multiple archiver
> processes in those servers might archive the same WAL file to
> the shared area at the same time. Is this OK? If not, to avoid this,
> we might need to separate archive_mode into two: one for normal mode
> (i.e., master), another for standbfy mode. If the archive is shared,
> we can ensure that only one archiver in the master copies the WAL file
> at the same time by disabling WAL archiving in standby mode but
> enabling it in normal mode. Thought?

Another option would be to run the archiver in both modes and somehow
pass a flag indicating whether it's running in standby mode or normal
running.

> Invoking the archiver process in standby mode is new feature,
> not a bug fix. It's too late to propose new feature for 9.2. So I'll
> propose this for 9.3.

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: proposal: additional error fields
Next
From: Tom Lane
Date:
Subject: Re: Latch for the WAL writer - further reducing idle wake-ups.