Streaming replication and a disk full in primary - Mailing list pgsql-hackers

From Fujii Masao
Subject Streaming replication and a disk full in primary
Date
Msg-id 3f0b79eb1001210544ndc95606ud37c26f8d66b643b@mail.gmail.com
Responses Re: Streaming replication and a disk full in primary  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
Hi,

If the primary has a connected standby, the WAL files that the
standby still needs cannot be deleted. So if the standby falls too
far behind for some reason, the primary might run out of disk space.
This is one of the problems that should be fixed for v9.0.

We can cope with that case by carefully monitoring the standby lag.
In addition to this, I think that we should put an upper limit on
the number of WAL files held in pg_xlog for the standby (i.e.,
the maximum delay of the standby) as a safeguard against a disk
full error.
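
To make the effect of such a limit concrete: with the default 16 MB
WAL segment size, a cap on the number of segments held for the standby
translates directly into a bound on the pg_xlog space pinned on its
behalf. The following is only back-of-the-envelope arithmetic in
Python; the function name and the example limit of 64 are illustrative
and not taken from the patch, and WAL kept for other reasons
(checkpoints, archiving) is ignored.

```python
# Rough arithmetic only; the names and the example limit are illustrative.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (16 MB)


def max_wal_bytes_for_standby(max_segments: int) -> int:
    """Upper bound on pg_xlog space held on behalf of a lagging standby."""
    if max_segments < 0:
        raise ValueError("max_segments must be non-negative")
    return max_segments * WAL_SEGMENT_BYTES


if __name__ == "__main__":
    limit = 64  # example value, not a recommendation
    mb = max_wal_bytes_for_standby(limit) // (1024 * 1024)
    print(f"{limit} segments -> at most {mb} MB held for the standby")
```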

The attached patch introduces a new GUC, 'replication_lag_segments',
which specifies the maximum number of WAL files held in pg_xlog
to send to the standby. Replication to a standby that falls more
than this limit behind is automatically terminated, which would
avoid a disk full error on the primary.
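
The termination rule described above can be modelled roughly as
follows. This is a simplified Python sketch, not the patch's code: it
treats WAL positions as plain segment numbers, and both function names
and the strict "more than the limit" comparison are assumptions made
for illustration.

```python
# Simplified model of the termination rule; not the code from the patch.
# Assumption: WAL positions are represented as plain segment numbers.


def segments_behind(primary_segno: int, standby_segno: int) -> int:
    """Number of whole WAL segments the standby lags behind the primary."""
    return max(0, primary_segno - standby_segno)


def should_terminate_replication(primary_segno: int,
                                 standby_segno: int,
                                 replication_lag_segments: int) -> bool:
    """True if the standby has fallen more than the limit behind."""
    lag = segments_behind(primary_segno, standby_segno)
    return lag > replication_lag_segments
```

For example, with a limit of 8 segments, a standby that is 10 segments
behind would be disconnected, while one that is 5 segments behind would
keep streaming.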

This GUC is also useful for holding some WAL files for an incoming
standby. To some extent, this would avoid the problem that a WAL file
required by the standby no longer exists on the primary when
replication starts.
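
One way to picture this retention behaviour is as a cutoff below which
old segments may be recycled. The Python sketch below is a hypothetical
model only: the helper names are invented for illustration, segments
are plain integers, and other reasons for keeping WAL (checkpoints,
archiving) are deliberately ignored.

```python
# Hypothetical model of segment retention; names are illustrative only.
# Assumption: segments are plain integers and only this limit applies.
from typing import Iterable, List


def oldest_segment_to_keep(current_segno: int,
                           replication_lag_segments: int) -> int:
    """Oldest segment number that must stay in pg_xlog under the limit."""
    return max(0, current_segno - replication_lag_segments)


def removable_segments(existing: Iterable[int],
                       current_segno: int,
                       replication_lag_segments: int) -> List[int]:
    """Segments older than the cutoff, which could be removed or recycled."""
    cutoff = oldest_segment_to_keep(current_segno, replication_lag_segments)
    return [segno for segno in sorted(existing) if segno < cutoff]
```

With segments 90 through 100 on disk, the primary currently writing
segment 100, and a limit of 8, only segments 90 and 91 fall below the
cutoff; a standby that connects later can still start from segment 92.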

The code is also available in the 'replication' branch in my
git repository.

    git://git.postgresql.org/git/users/fujii/postgres.git
    branch: replication

Comments? Objections? Review?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
