Thread: streaming replication and recovery

streaming replication and recovery

From
Anupama Ramaswamy
Date:
I have 2 postgres nodes setup in a replication and hot standby configuration. I am using pgpool for automatic failover and load balancing the read queries.

I have setup scripts for automatic failover when the master node fails. I want to understand how it would work in the following 2 scenarios. 
touch /tmp/pgsql.failover  - which is the trigger file.
I dont use pg_standby in my recovery process

Scenario 1
----------------
Suppose the secondary server is lagging behind the primary at the time of primary failure, will the secondary completely catch up to the primary state, before stopping replication. Or what in the process ensures this happens ?

Scenario 2
------------------
Both primary and secondary are running. 
Secondary goes down at time t1. 
Now primary goes down at t2. At this point, the trigger file is created.
Secondary node restarts at time t3.
What happens to all the transactions that occurred between t1 and t2.

Regards,
Anupama.



Re: streaming replication and recovery

From
Sergey Konoplev
Date:
On Sat, Apr 5, 2014 at 3:48 AM, Anupama Ramaswamy <anumr_0123@yahoo.com> wrote:
> Scenario 1
> ----------------
> Suppose the secondary server is lagging behind the primary at the time of
> primary failure, will the secondary completely catch up to the primary
> state, before stopping replication. Or what in the process ensures this
> happens ?

There are two lag types to consider about in case of a normal
streaming replication - delivery lag and replay lag. The secondary
will completely catch up to what have been delivered, but what have
not been is going to be lost. See [1][2].

> Scenario 2
> ------------------
> Both primary and secondary are running.
> Secondary goes down at time t1.
> Now primary goes down at t2. At this point, the trigger file is created.
> Secondary node restarts at time t3.
> What happens to all the transactions that occurred between t1 and t2.

They are going to be lost in case of a normal streaming replication.

Note that there is an ability to setup so called synchronous
replication [3] that guarantees delivery of changes that were commit
in exchange of some performance. Literally, it wont commit until the
changes are transferred.

So, if you have a mission critical or other high importance data you
should to setup synchronous replication, otherwise, if you can
sacrifice some latest commits, normal streaming replication.

[1] http://www.postgresql.org/docs/current/static/monitoring-stats.html#PG-STAT-REPLICATION-VIEW
[2] http://www.postgresql.org/docs/current/static/functions-admin.html#FUNCTIONS-RECOVERY-CONTROL
[3] http://www.postgresql.org/docs/current/static/warm-standby.html#SYNCHRONOUS-REPLICATION

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: streaming replication and recovery

From
Anupama Ramaswamy
Date:
Thanks for your response.

>>There are two lag types to consider about in case of a normal
>>streaming replication - delivery lag and replay lag. The secondary
>>will completely catch up to what have been delivered, but what have
>>not been is going to be lost. See [1][2].

Ok, I understand. I want to understand exact sequence of events that will happen. 
Lets say I have specified the following in my recovery.conf

trigger_file =/tmp/pgsql.failover

I detect that the master has failed and immediately I do the following on the standby
touch /tmp/pgsql.failover

Lets suppose at this point there is 0 delivery lag but XXXX bytes of replay lag. 

a) Will the replay complete before the standby stops replicating (because it sees the trigger file) ?
b) If I want to run this as new master and attach other secondaries to point to this master, can I do it soon after "touch /tmp/pgsql.failover" OR should I wait till the secondary has finished replaying all the delivered stream ?
c) How do I know if the replay is over and it is ready for a standalone operation ?

Thanks for your help
Anupama.

Re: streaming replication and recovery

From
Michael Paquier
Date:
On Sat, Apr 12, 2014 at 3:12 PM, Anupama Ramaswamy <anumr_0123@yahoo.com> wrote:
> Lets suppose at this point there is 0 delivery lag but XXXX bytes of replay
> lag.
>
All your answers are here:
http://www.postgresql.org/docs/devel/static/warm-standby.html
"Standby mode is exited and the server switches to normal operation
when pg_ctl promote is run or a trigger file is found (trigger_file).
Before failover, any WAL immediately available in the archive or in
pg_xlog will be restored, but no attempt is made to connect to the
master."

> a) Will the replay complete before the standby stops replicating (because it
> sees the trigger file) ?
Any WAL available in the archives or pg_xlog will be replayed before
the failover.

> b) If I want to run this as new master and attach other secondaries to point
> to this master, can I do it soon after "touch /tmp/pgsql.failover" OR should
> I wait till the secondary has finished replaying all the delivered stream ?
You need to wait until all the WAL has been replayed, which is the
point where failover occurs.

> c) How do I know if the replay is over and it is ready for a standalone
> operation ?
"SELECT pg_is_in_recovery();" returns true if server is still
performing recovery operations.
--
Michael


Re: streaming replication and recovery

From
Anupama Ramaswamy
Date:
Thanks so much. That clarifies.

-Anupama
On Monday, April 14, 2014 12:09 PM, Michael Paquier <michael.paquier@gmail.com> wrote:
On Sat, Apr 12, 2014 at 3:12 PM, Anupama Ramaswamy <anumr_0123@yahoo.com> wrote:
> Lets suppose at this point there is 0 delivery lag but XXXX bytes of replay
> lag.
>
All your answers are here:
http://www.postgresql.org/docs/devel/static/warm-standby.html
"Standby mode is exited and the server switches to normal operation
when pg_ctl promote is run or a trigger file is found (trigger_file).
Before failover, any WAL immediately available in the archive or in
pg_xlog will be restored, but no attempt is made to connect to the
master."

> a) Will the replay complete before the standby stops replicating (because it
> sees the trigger file) ?
Any WAL available in the archives or pg_xlog will be replayed before
the failover.

> b) If I want to run this as new master and attach other secondaries to point
> to this master, can I do it soon after "touch /tmp/pgsql.failover" OR should
> I wait till the secondary has finished replaying all the delivered stream ?
You need to wait until all the WAL has been replayed, which is the
point where failover occurs.


> c) How do I know if the replay is over and it is ready for a standalone
> operation ?

"SELECT pg_is_in_recovery();" returns true if server is still
performing recovery operations.
--
Michael