On Tue, 2009-12-22 at 14:40 +0900, Fujii Masao wrote:
> On Sat, Dec 19, 2009 at 1:03 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
> > I don't think it's worthwhile to modify pg_stop_backup() like that. We
> > should address the general problem. At the moment, you're fine if you
> > also configure WAL archiving and log file shipping, but it would be nice
> > to have some simpler mechanism to avoid the problem. For example, a GUC
> > in master to retain all log files (including backup history files) for X
> > days. Or some way to register the standby with the master so that
> > the master knows it's out there, and still needs the logs, even when
> > it's not connected.
>
> I propose the new GUC replication_reserved_segments (better name?) which
> determines the maximum number of WAL files held for the standby.
>
> Design:
>
> * Only the WAL files which are replication_reserved_segments segments older
> than the current write segment can be recycled. IOW, we can think that the
> standby which falls replication_reserved_segments segments behind is always
> connected to the primary, and the WAL files needed for the active standby
> are not recycled.

(I don't fully understand your words above, sorry.)

Possibly an easier way would be to have a size limit, not a number of
segments. Something like max_reserved_wal = 240GB. We made that change
to shared_buffers a few years back and it was really useful.
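
To show how a size limit would map back onto a segment count, here is a
rough sketch. It assumes the default 16MB WAL segment size and the usual
1024-based GUC units; max_reserved_wal is only the name suggested above, and
parse_size() and reserved_segments() are made-up helpers, not existing
functions.

```python
# Sketch only: max_reserved_wal is a suggested setting, not an existing GUC,
# and 16MB is assumed as the WAL segment size (the default).
WAL_SEGMENT_BYTES = 16 * 1024 * 1024

# GUC-style memory units, all 1024-based.
UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}


def parse_size(text):
    """Parse a size string such as '240GB' into a number of bytes."""
    for unit, factor in UNITS.items():
        if text.endswith(unit):
            return int(text[:-len(unit)]) * factor
    raise ValueError("unknown unit in %r" % text)


def reserved_segments(max_reserved_wal):
    """Number of whole WAL segments that fit within the size limit."""
    return parse_size(max_reserved_wal) // WAL_SEGMENT_BYTES


print(reserved_segments("240GB"))  # 15360
```

So under those assumptions 240GB corresponds to 15360 segments, which is
not a number most people would want to work out by hand.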

The problem I foresee is that doing it this way puts an upper limit on
the size of the database we can replicate. While we take the base backup
and transfer it, we must store WAL somewhere. Forcing us to store the WAL
on the master while this happens could be very limiting.

> * Disconnect the standby which falls more than
> replication_reserved_segments segments behind, in order to avoid the
> disk full failure, i.e., the primary server's PANIC error. This would
> only be possible in the asynchronous replication case.

Or at the start of replication.
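
For what it's worth, here is how I read the two rules, as a sketch rather
than a proposal for the actual code. Segments are plain integers instead of
real WAL file names, all function names are made up, and the boundary case
(a standby exactly replication_reserved_segments behind is still safe) is my
assumption.

```python
# Sketch only: segments are plain integers here, not real WAL file names,
# and every function name is hypothetical.

def oldest_reserved_segment(current_write_seg, reserved_segments):
    """Oldest segment the primary must keep for a standby."""
    return max(current_write_seg - reserved_segments, 0)


def can_recycle(seg, current_write_seg, reserved_segments):
    """A segment may be recycled only if it is older than the reserved window."""
    return seg < oldest_reserved_segment(current_write_seg, reserved_segments)


def must_disconnect(standby_seg, current_write_seg, reserved_segments):
    """True if the standby needs a segment outside the reserved window.

    The same check applies to a running standby that has fallen behind
    and to one that asks to start replication from an old position.
    """
    return standby_seg < oldest_reserved_segment(current_write_seg,
                                                 reserved_segments)
```

With the current write segment at 100 and 10 segments reserved, segment 90
is kept and a standby at 90 stays connected, while segment 89 may be
recycled and a standby asking for it is refused.
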
-- Simon Riggs www.2ndQuadrant.com