Re: The way to know whether the standby has caught up with the master - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: The way to know whether the standby has caught up with the master
Msg-id BANLkTikLqmT5hwrF2SeSQwBXvg+1aCAByw@mail.gmail.com
In response to Re: The way to know whether the standby has caught up with the master  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, May 25, 2011 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> On 25.05.2011 07:42, Fujii Masao wrote:
>>> To achieve that, I'm thinking to change walsender so that, when the standby
>>> has caught up with the master, it sends back the message indicating that to
>>> the standby. And I'm thinking to add new function (or view like
>>> pg_stat_replication) available on the standby, which shows that info.
>
>> By the time the standby has received that message, it might not be
>> caught-up anymore because new WAL might've been generated in the master
>> already.
>
> Even assuming that you believe this is a useful capability, there is no
> need to change walsender.  It *already* sends the current-end-of-WAL in
> every message, which indicates precisely whether the message contains
> all of available WAL data.

That's not enough to determine whether failover is safe. Even if the
standby's flush location equals the master's current end location, new
WAL might already have been generated, and the "success" indication of
the corresponding transaction might already have been returned to the
client (this is possible only in async mode). So, in addition to the
master's current end location, the standby must know its sync mode, which
walsender would need to send.

Another problem is that, even when we can safely promote the standby, the
standby's flush location isn't always equal to the master's current end
location. Imagine the case where there is some unsent WAL on the master
and the corresponding transactions are waiting for replication. In this
case, obviously those locations are not the same. But in sync replication,
we can guarantee that all the committed (from the client's view)
transactions have been replicated to the standby, so failover is safe.
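
The two points above can be sketched roughly as follows. This is only an
illustration of the argument, not walsender or standby code: the names
ReplicationState, caught_up and failover_is_safe are hypothetical, WAL
locations are modeled as plain integers, and the standby is assumed to have
been running in the stated mode the whole time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationState:
    """Hypothetical snapshot of what the standby knows (illustration only)."""
    master_end_lsn: int     # master's end-of-WAL as last reported to the standby
    standby_flush_lsn: int  # WAL location the standby has flushed to disk
    sync_mode: bool         # True if the master waits for this standby on commit


def caught_up(state: ReplicationState) -> bool:
    """Location comparison alone: has the standby flushed the reported end?"""
    return state.standby_flush_lsn >= state.master_end_lsn


def failover_is_safe(state: ReplicationState) -> bool:
    """Can the standby be promoted without losing acknowledged commits?

    Sync mode: a commit is acknowledged to the client only after the standby
    has flushed it, so unsent WAL belongs to transactions still waiting.
    Async mode: commits may have been acknowledged after the last report, so
    even equal locations prove nothing.
    """
    return state.sync_mode


# Async standby whose flush location equals the reported end: not provably safe.
async_equal = ReplicationState(master_end_lsn=100, standby_flush_lsn=100,
                               sync_mode=False)

# Sync standby behind the master's end (unsent WAL, commits still waiting): safe.
sync_behind = ReplicationState(master_end_lsn=120, standby_flush_lsn=100,
                               sync_mode=True)
```

Here caught_up(async_equal) is true while failover_is_safe(async_equal) is
false, and the reverse holds for sync_behind, which is why the end location
by itself is neither sufficient nor necessary for deciding on failover.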

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

