Re: Enable WAL archiving even in standby - Mailing list pgsql-hackers

From Alexey Klyukin
Subject Re: Enable WAL archiving even in standby
Date
Msg-id CAAS3tyJrUOQqhDo76K8bZ_N8HRVGCE307XtxLwDoJDx0vOjPdw@mail.gmail.com
In response to Enable WAL archiving even in standby  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Wed, Aug 13, 2014 at 12:42 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> Hi,
>
> I'd propose the attached WIP patch which allows us to enable WAL archiving
> even in standby.
...
> I think that this feature is useful, e.g., in the case where a large database
> needs to be replicated between remote servers. Imagine the situation where
> the replicated database gets corrupted completely in the remote standby.
> How should we address this problematic situation and restart the standby?
>
> One approach is to take a fresh backup from the master and restore it onto
> the standby. But since the database is large and there is a long distance
> between the two servers, this approach might take a surprisingly long time.
>
> Another approach is to restore the backup which was taken from the standby
> before. But most of the WAL files which the backup needs might exist only
> in the master (because WAL archiving cannot be enabled in the standby) and
> they need to be transferred from the master to the standby via a long-distance
> network. So I think that this approach also would take a fairly long time.
> To shorten that time, you may think that archive_command in the master can
> be set so that it transfers WAL files from the master to the standby's
> archival storage. I agree that this setting can accelerate the database restore
> process. But this causes every WAL file to be transferred between remote
> servers twice (once by streaming replication, once by archive_command),
> which is a waste of network bandwidth.

Well, in theory one can also use pg_receivexlog to get the WAL files from the
master, and then replay them through PITR on the slave without setting
archive_cleanup_command.

I'm not sure you can do the same if the source of the WAL files is a cascading
slave, but I see no reason why not.

However, I find the patch useful, since it allows accomplishing this in a much
more straightforward way.

>
> Back to the patch. If archive_mode is set to "always", archive_command is
> always used to archive WAL files even during recovery. Do we need to separate
> the command into two for master and standby, respectively? We can add
> something like standby_archive_command parameter which is used to archive
> only WAL files walreceiver writes. The other WAL files are archived by
> archive_command. I'm not sure if it's really worth separating the command
> that way. Is there any use case?

I don't see a good use case for doing things only on the standby, but I can
imagine that different archiving methods might be used depending on the role
of the archiver: on the master, one may, for instance, make an additional copy
to an NFS partition. Does it make sense to expose the server role ('is_master')
via an additional variable available to the archive_command (e.g. %m)?
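
To make the question concrete, here is a minimal sketch of how an archiving wrapper might use such a role variable. Everything in it is an assumption for illustration: %m does not exist, the role strings "master" and "standby" are invented, and archive_targets is a hypothetical helper, not anything from the patch.

```python
from typing import List


def archive_targets(role: str, archive_dir: str, nfs_dir: str) -> List[str]:
    """Return the directories a WAL segment should be copied to.

    `role` is the value a hypothetical %m placeholder would expand to;
    this sketch assumes the strings "master" and "standby".
    """
    if role not in ("master", "standby"):
        raise ValueError(f"unknown server role: {role!r}")
    # Every server copies the segment to the regular archive.
    targets = [archive_dir]
    if role == "master":
        # Only the master makes the additional copy to the NFS partition.
        targets.append(nfs_dir)
    return targets
```

A wrapper script called from archive_command could then loop over the returned directories and copy the segment into each one.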

>
> The patch doesn't allow us to enable WAL archiving *only* during recovery.
> Should we support yet another archive_mode like "standby" which allows
> the archiver to be running only during recovery, but makes it end just after
> the server is promoted to master? I'm not sure if there is really a use case for
> that.

I do not see much use for this either.

>
> I've not included the documentation update in the patch yet. If we agree to
> support this feature, I will do the remaining work.

I think it is useful, and I gave this patch a spin by, essentially, creating a
cascaded archive-only slave. I made a base backup from the master, then ran
the standby from this base backup with archive_mode = 'always' and an
archive_command copying files to the archive_location. I then created another
base backup from that standby (without including WAL files in the backup) and
pointed the restore_command of the final slave at the archive created by the
intermediate one.

Recovery worked, as did the promotion of the intermediate slave to master; the
final slave simply followed the timeline change (with recovery_target_timeline
set to 'latest') and carried on with recovery.

One thing I've noticed is that pg_basebackup copies postgresql.conf from the
standby verbatim, including archive_mode. This means that if one runs the
cascaded replica without changing archive_mode, that replica will also try to
archive the WAL, and if both the source and the replica are running on the same
machine (or are attached to NFS using the same mount points), even the
archiving destination will be the same. This should not be a big problem if one
follows the recommendation of not overwriting files that already exist at the
destination, but it would be nice to reset the archive_mode flag to off.
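
For illustration, a minimal sketch of an archive script that follows the no-overwrite recommendation, so that two servers pointed at the same destination cannot clobber each other's files. The function name archive_wal and the paths in the usage comment are hypothetical; the only PostgreSQL behaviour assumed is that archive_command treats a zero exit status as success and a nonzero one as failure.

```python
import os
import shutil


def archive_wal(src_path: str, archive_dir: str) -> int:
    """Copy one WAL segment into archive_dir without overwriting.

    Returns 0 on success and 1 on failure, mirroring the exit status
    that archive_command expects.
    """
    dest = os.path.join(archive_dir, os.path.basename(src_path))
    # Copy to a private temporary name first so that a partially
    # written file never appears under the final name.
    tmp = f"{dest}.{os.getpid()}.tmp"
    try:
        shutil.copyfile(src_path, tmp)
        # os.link raises FileExistsError if dest is already present,
        # so an existing archived segment is never replaced.
        os.link(tmp, dest)
    except OSError:
        return 1
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)
    return 0


# Hypothetical usage from a small command-line wrapper:
#   archive_command = 'python3 archive_wal_cli.py %p /mnt/archive'
```

With this in place, the second server to archive a given segment name gets a failure instead of silently replacing the first copy.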

I do not know much about the WAL-related code, but one thing that I found
strange in the patch is the separate file xlogarchive.h, instead of putting
things into xlog.h. It does not make much sense for a single enum; are you
planning to put more things there?

Only a single hunk applied with an offset against the latest master:
>Hunk #4 succeeded at 4789 (offset -1 lines).

-- 
Regards,
Alexey Klyukin


