Thread: pg9.6 when is a promoted cluster ready to accept "rewind" request?

pg9.6 when is a promoted cluster ready to accept "rewind" request?

From
magodo
Date:
Dear supporters,

I'm writing some scripts to implement manual failover. I have two
clusters (say p1 and p2), where one is the primary (e.g. p1) and the
other is the standby (e.g. p2). The manual failover procedure is
straightforward:

1. promote on p2
2. wait for `pg_isready` to succeed on p2
3. rewind on p1
4. prepare a recovery.conf on p1
5. start p1
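
For reference, the steps above map roughly onto the commands below. This is only a sketch under assumptions: the data directories, host name, port, and connection strings are made-up placeholders, and the helper names `recovery_conf` and `failover_commands` are hypothetical. It only builds the command lines and the 9.6-style recovery.conf text; it does not run anything.

```python
def recovery_conf(primary_conninfo):
    """Return recovery.conf text (9.6 style) for a standby following the new primary."""
    return (
        "standby_mode = 'on'\n"
        "primary_conninfo = '{}'\n"
        "recovery_target_timeline = 'latest'\n"
    ).format(primary_conninfo)


def failover_commands(old_pgdata, new_pgdata, new_host, new_port, source_conninfo):
    """Build the command lines for steps 1, 2, 3 and 5 (step 4 is a file write)."""
    return [
        # step 1: promote the standby (p2)
        ["pg_ctl", "promote", "-D", new_pgdata],
        # step 2: check that p2 accepts connections
        ["pg_isready", "-h", new_host, "-p", str(new_port)],
        # step 3: rewind the old primary (p1) against p2
        ["pg_rewind", "--target-pgdata=" + old_pgdata,
         "--source-server=" + source_conninfo],
        # step 4 happens here: write recovery_conf(...) to <old_pgdata>/recovery.conf
        # step 5: start p1 as the new standby
        ["pg_ctl", "start", "-D", old_pgdata],
    ]


if __name__ == "__main__":
    # Placeholder values, not taken from the real setup.
    for cmd in failover_commands("/data/p1", "/data/p2", "p2", 5432,
                                 "host=p2 port=5432 user=postgres dbname=postgres"):
        print(" ".join(cmd))
    print(recovery_conf("host=p2 port=5432 user=replicator"))
```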

This should end up with the same HA setup, but with the roles switched.

It works fine if I run each step manually.

But if I run the steps sequentially in a script, it fails after I have
switched roles once and then try to switch back.

For example, with a fresh setup (timeline starts at 1), I first
switched roles, and it worked: p1 became a standby following p2,
the new primary. Then I switched roles again and an error occurred. The
log shows:

   < 2018-11-12 04:59:24.547 UTC > LOG:  entering standby mode
   < 2018-11-12 04:59:24.555 UTC > LOG:  redo starts at 0/4000028
   < 2018-11-12 04:59:24.566 UTC > LOG:  started streaming WAL from primary at 0/5000000 on timeline 1
   < 2018-11-12 04:59:24.566 UTC > FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000020000000000000005 has already been removed
   < 2018-11-12 04:59:24.577 UTC > LOG:  started streaming WAL from primary at 0/5000000 on timeline 1
   < 2018-11-12 04:59:24.577 UTC > FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000020000000000000005 has already been removed
   < 2018-11-12 04:59:25.413 UTC > FATAL:  the database system is starting up
   < 2018-11-12 04:59:26.416 UTC > FATAL:  the database system is starting up
   < 2018-11-12 04:59:27.419 UTC > FATAL:  the database system is starting up
   < 2018-11-12 04:59:28.422 UTC > FATAL:  the database system is starting up
   < 2018-11-12 04:59:29.425 UTC > FATAL:  the database system is starting up
   < 2018-11-12 04:59:29.576 UTC > LOG:  started streaming WAL from primary at 0/5000000 on timeline 1
   < 2018-11-12 04:59:29.576 UTC > FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment 000000020000000000000005 has already been removed


The pg_rewind output is as follows:

   servers diverged at WAL position 0/5000060 on timeline 1         
   rewinding from last common checkpoint at 0/4000060 on timeline 1 

From the log, it seems the divergence point is computed on the wrong
timeline: it should be timeline 2 rather than 1.
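
As a side note on reading the log: a WAL segment file name is 24 hex digits, and the first eight are the timeline ID. The small sketch below (the function name `parse_wal_segment_name` is just for illustration) decodes the segment named in the error, which belongs to timeline 2 even though streaming was started on timeline 1.

```python
def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL segment file name into (timeline, log, segment)."""
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)   # first 8 digits: timeline ID
    log = int(name[8:16], 16)       # middle 8 digits: high part of the segment number
    seg = int(name[16:24], 16)      # last 8 digits: low part of the segment number
    return timeline, log, seg


if __name__ == "__main__":
    # Segment name copied from the FATAL message in the log above.
    print(parse_wal_segment_name("000000020000000000000005"))  # (2, 0, 5)
```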

Furthermore, if I add a `sleep` between step 2 (waiting for the promoted
p2) and step 3 (rewind), it just works.

Hence, I suspect the promoted cluster is not ready to be used as a
rewind source right after promotion. Is there anything I need to wait
for before I rewind the old primary against the promoted cluster?

Thank you in advance!

---
magodo




Re: pg9.6 when is a promoted cluster ready to accept "rewind" request?

From
talk to ben
Date:
Hi, 

You might have to wait for pg_is_in_recovery() to return false after the promotion (in 9.6, pg_ctl promote doesn't wait for the promotion to complete, unlike in 10). [1]

You should run CHECKPOINT between steps 2 and 3 (or wait for the first checkpoint to finish).
In the thread [2], Michael Paquier explains that:

"This makes the promoted standby update its
timeline number in the on-disk control file, which is used by pg_rewind
to check if a rewind needs to happen or not."
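
A minimal sketch of that wait-then-checkpoint step, to run against the promoted cluster before calling pg_rewind. It assumes psql is on the PATH, that the standby accepts connections while in recovery (hot_standby on), and that the connection string belongs to a superuser; the helper names `psql_scalar` and `wait_until_promoted`, the timeout, and the polling interval are all made up for illustration.

```python
import subprocess
import time


def psql_scalar(conninfo, sql):
    """Run one SQL statement through psql and return its unaligned, tuples-only output."""
    out = subprocess.check_output(
        ["psql", "-X", "-A", "-t", "-d", conninfo, "-c", sql])
    return out.decode().strip()


def wait_until_promoted(conninfo, run_sql=psql_scalar, timeout=60.0,
                        interval=0.5, sleep=time.sleep):
    """Poll pg_is_in_recovery() until it returns false, then force a checkpoint."""
    deadline = time.monotonic() + timeout
    # psql prints booleans as "t" or "f" in unaligned, tuples-only mode.
    while run_sql(conninfo, "SELECT pg_is_in_recovery()") != "f":
        if time.monotonic() >= deadline:
            raise TimeoutError("still in recovery after %.0f s" % timeout)
        sleep(interval)
    # Force a checkpoint so the new timeline is written to the control file.
    run_sql(conninfo, "CHECKPOINT")


if __name__ == "__main__":
    # Placeholder connection string for the promoted cluster.
    wait_until_promoted("host=p2 port=5432 user=postgres dbname=postgres")
```

The `run_sql` and `sleep` parameters exist only so the loop can be exercised without a live server.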