On 05.09.2012 01:03, Dimitri Fontaine wrote:
> Heikki Linnakangas<hlinnaka@iki.fi> writes:
>> On 04.09.2012 03:02, Dimitri Fontaine wrote:
>>> Heikki Linnakangas<hlinnaka@iki.fi> writes:
>>>> Hmm, I was thinking that when walsender gets the position it can send the
>>>> WAL up to, in GetStandbyFlushRecPtr(), it could atomically check the current
>>>> recovery timeline. If it has changed, refuse to send the new WAL and
>>>> terminate. That would be a fairly small change, it would just close the
>>>> window between requesting walsenders to terminate and them actually
>>>> terminating.
>>
>> No, only cascading replication is affected. In non-cascading situation, the
>> timeline never changes in the master. It's only in cascading mode that you
>> have a problem, where the standby can cross timelines while it's replaying
>> the WAL, and also sending it over to cascading standby.
>
> It seems to me that it applies to connecting a standby to a newly
> promoted standby too, as the timeline did change in this case too.

I was worried about that too at first, but Fujii pointed out that it's
OK: see the last paragraph at
http://archives.postgresql.org/pgsql-hackers/2012-08/msg01203.php.

If you connect to a standby that has already been promoted to be the new
master, it's no different from connecting to a master in general. It
works. If you connect just before a standby is promoted, it works
because a cascading standby pays attention to the recovery target
timeline and the pointer to the last replayed WAL record. Promoting a
standby doesn't change the recovery target timeline or the last replayed
WAL record; it sets XLogCtl->ThisTimeLineID. So a walsender in cascading
mode will send the WAL up to where the promotion happened, but will stop
there until it's terminated by the signal.
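
To make that concrete, here is a toy model in C. It is only a sketch of
the reasoning above, not the actual PostgreSQL code: ToyXLogCtl, its
field names, and the toy_* functions are all made up for illustration,
and using 0 to mean "still in recovery" is an assumption of the model.
The point it shows is that promotion touches only the field standing in
for XLogCtl->ThisTimeLineID, so the limit a cascading walsender reads
stays at the promotion point.

```c
#include <stdint.h>

typedef uint64_t ToyRecPtr;   /* hypothetical stand-in for a WAL position */
typedef uint32_t ToyTLI;      /* hypothetical stand-in for a timeline ID */

/* Toy shared state; every field name here is illustrative only. */
typedef struct
{
    ToyTLI    recovery_target_tli; /* timeline that recovery is following */
    ToyRecPtr last_replayed;       /* end of the last replayed WAL record */
    ToyTLI    this_tli;            /* models XLogCtl->ThisTimeLineID;
                                    * 0 means "still in recovery" (assumption) */
    ToyRecPtr insert_pos;          /* where newly generated WAL would go */
} ToyXLogCtl;

/* Replaying WAL during recovery advances only the replay pointer. */
void
toy_replay(ToyXLogCtl *ctl, ToyRecPtr upto)
{
    if (ctl->this_tli == 0 && upto > ctl->last_replayed)
        ctl->last_replayed = upto;
}

/*
 * Promotion: sets the current timeline and starts inserting new WAL at
 * the promotion point. The recovery target timeline and the replay
 * pointer are deliberately left untouched.
 */
void
toy_promote(ToyXLogCtl *ctl)
{
    ctl->this_tli = ctl->recovery_target_tli + 1;
    ctl->insert_pos = ctl->last_replayed;
}

/* New WAL generated after promotion moves only the insert position. */
void
toy_write_new_wal(ToyXLogCtl *ctl, ToyRecPtr nbytes)
{
    if (ctl->this_tli != 0)
        ctl->insert_pos += nbytes;
}

/*
 * What a walsender in cascading mode may send: everything up to the
 * last replayed record. It never looks at this_tli or insert_pos.
 */
ToyRecPtr
toy_cascading_send_limit(const ToyXLogCtl *ctl)
{
    return ctl->last_replayed;
}
```

In this model, if you replay up to position 1000, call toy_promote(),
and then generate more WAL with toy_write_new_wal(), the value returned
by toy_cascading_send_limit() is still 1000. That is the "will stop
there" behaviour: the walsender has nothing further to send and just
waits for the termination signal.
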
- Heikki