Thread: how to identify the timeline of specified recovery_target_time whendo multiple PITR

I read the document about recovery configuration, it says:

   recovery_target_timeline (string)

   ...

   Other than that you only need to set this parameter in complex re-
   recovery situations, where you need to return to a state that itself
   was reached after a point-in-time recovery. See Section 25.3.5 for
   discussion.

   ...

Andin section 25.3.5 it says:

   ...

   If you wish to recover into some child timeline (that is, you want
   to return to some state that was itself generated after a recovery
   attempt), you need to specify the target timeline ID in
   recovery.conf.

   ...

Therefore, suppose I am going to recover to a point of some child
timeline, to identify the point, I have to specify either
recovery_target_name or recovery_target_time, and also specify the
recovery_target_timeline.

It is more like a tuple like (recovery_target_time,
recovery_target_timeline), that specify a real point among all history
branches. Am I understand this correctly?

If yes, what I want to ask is that, though the timeline is increasing
between different recovery, but each timestamp corresponds to a
timeline ID, one by one. So if I get a recovery_target_time, why should
I still specify the recovery_target_timeline? 

Suppose following illustration:

         A     B
BASE-----+-----+------o1 (recover to A)                              1
         |     |           C
         +.....|.......----+---o2 (regret, recover to B)             2
               |           |    
               +...........|..------o3 (regret again, recover to C)  3
                           | 
                           +........----                             4


Legend:

   BASE: basebackup
   A-Z: recovery point
   ---: active wal histroy (continuous among branches)
   ...: inactive wal history
   oN: point to do PITR

If am at "o3", and I want to recover to "C", if I don't specify
timeline ID, then i will ends up with state as "o1". Only if I specify
timeline ID as "2", then I can get the state as I expect (as
illustrated). 

Why not just recover along the history from timeline 1 to the timeline
specified by recovery target? The only reason I can imagine is that
there is a chance that there are two active timeline at same point, but
what is the use case of that(if it exists)?




magodo wrote:
> I read the document about recovery configuration, it says:

[...]

> Therefore, suppose I am going to recover to a point of some child
> timeline, to identify the point, I have to specify either
> recovery_target_name or recovery_target_time, and also specify the
> recovery_target_timeline.
> 
> It is more like a tuple like (recovery_target_time,
> recovery_target_timeline), that specify a real point among all history
> branches. Am I understand this correctly?

I think you understood the concept well enough.

> If yes, what I want to ask is that, though the timeline is increasing
> between different recovery, but each timestamp corresponds to a
> timeline ID, one by one. So if I get a recovery_target_time, why should
> I still specify the recovery_target_timeline? 
> 
> Suppose following illustration:
> 
>          A     B
> BASE-----+-----+------o1 (recover to A)                              1
>          |     |           C
>          +.....|.......----+---o2 (regret, recover to B)             2
>                |           |    
>                +...........|..------o3 (regret again, recover to C)  3
>                            | 
>                            +........----                             4

Consider this       ^   point in time.
                    |

Suppose you specify this point in time as recovery_target_time.

Then it is not clear which of the timelines you want to follow.
The point of time exists in timeline 1, 2 and 3.

In other words, should recovery use the WAL from
0000000100000ABC00000012, 0000000200000ABC00000012 or 0000000300000ABC00000012?

By default, recovery will stay on the timeline where it started.
If you want to go to timeline 2 or 3, you have to specify
recovery_target_timeline.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



On Wed, 2018-10-03 at 08:06 +0200, Laurenz Albe wrote:
> magodo wrote:
> > I read the document about recovery configuration, it says:
> 
> [...]
> 
> > Therefore, suppose I am going to recover to a point of some child
> > timeline, to identify the point, I have to specify either
> > recovery_target_name or recovery_target_time, and also specify the
> > recovery_target_timeline.
> > 
> > It is more like a tuple like (recovery_target_time,
> > recovery_target_timeline), that specify a real point among all
> > history
> > branches. Am I understand this correctly?
> 
> I think you understood the concept well enough.
> 
> > If yes, what I want to ask is that, though the timeline is
> > increasing
> > between different recovery, but each timestamp corresponds to a
> > timeline ID, one by one. So if I get a recovery_target_time, why
> > should
> > I still specify the recovery_target_timeline? 
> > 
> > Suppose following illustration:
> > 
> >          A     B
> > BASE-----+-----+------o1 (recover to
> > A)                              1
> >          |     |           C
> >          +.....|.......----+---o2 (regret, recover to
> > B)             2
> >                |           |    
> >                +...........|..------o3 (regret again, recover to
> > C)  3
> >                            | 
> >                            +........--
> > --                             4
> 
> Consider this       ^   point in time.
>                     |
> 
> Suppose you specify this point in time as recovery_target_time.
> 
> Then it is not clear which of the timelines you want to follow.
> The point of time exists in timeline 1, 2 and 3.
> 
> In other words, should recovery use the WAL from
> 0000000100000ABC00000012, 0000000200000ABC00000012 or
> 0000000300000ABC00000012?
> 
> By default, recovery will stay on the timeline where it started.
> If you want to go to timeline 2 or 3, you have to specify
> recovery_target_timeline.
> 
> Yours,
> Laurenz Albe

For me, the specified point corresponds to timeline 1, because at that
time, the timeline is 1 indeed (when there is no timeline 2 branched
out yet). 

So in other word, my expectation is like I want to get the first
timeline which covers the specified time.

As you mentioned, there is no active timeline concept in postgres, then
what is the best practice to meet my expectation? Do I have to somehow
record the timestamp when archived wal's timeline has been changed,
then compare the specified recovery target time with those records to
find the first timeline which covers that time?

Thank you in anticipation!

---
Magodo




magodo wrote:
> > By default, recovery will stay on the timeline where it started.
> > If you want to go to timeline 2 or 3, you have to specify
> > recovery_target_timeline.
> > 
> For me, the specified point corresponds to timeline 1, because at that
> time, the timeline is 1 indeed (when there is no timeline 2 branched
> out yet). 
> 
> So in other word, my expectation is like I want to get the first
> timeline which covers the specified time.
> 
> As you mentioned, there is no active timeline concept in postgres, then
> what is the best practice to meet my expectation? Do I have to somehow
> record the timestamp when archived wal's timeline has been changed,
> then compare the specified recovery target time with those records to
> find the first timeline which covers that time?

As I wrote, that is the default behavior; PostgreSQL will stay on
the timeline that was active when recovery was started.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



On Wed, 2018-10-10 at 18:21 +0200, Laurenz Albe wrote:
> magodo wrote:
> > > By default, recovery will stay on the timeline where it started.
> > > If you want to go to timeline 2 or 3, you have to specify
> > > recovery_target_timeline.
> > > 
> > For me, the specified point corresponds to timeline 1, because at
> > that
> > time, the timeline is 1 indeed (when there is no timeline 2
> > branched
> > out yet). 
> > 
> > So in other word, my expectation is like I want to get the first
> > timeline which covers the specified time.
> > 
> > As you mentioned, there is no active timeline concept in postgres,
> > then
> > what is the best practice to meet my expectation? Do I have to
> > somehow
> > record the timestamp when archived wal's timeline has been changed,
> > then compare the specified recovery target time with those records
> > to
> > find the first timeline which covers that time?
> 
> As I wrote, that is the default behavior; PostgreSQL will stay on
> the timeline that was active when recovery was started.
> 
> Yours,
> Laurenz Albe

OK... Just take another example:

         A     B
BASE-----+-----+------o1 (recover to A)                              1
         |     |           C
         +.....|.......----+---o2 (regret, recover to B)             2
               |           |    
               +...........|..------o3 (regret again, recover to C)  3
                           | 
                           +........----                             4


Suppose I'm at "o3" and want to recover to point "C". Because I want to
recover to the first timeline which covers this time point, it means I
wish to recover to timeline 2. Then the same question:

> > what is the best practice to meet my expectation? Do I have to
> > somehow
> > record the timestamp when archived wal's timeline has been changed,
> > then compare the specified recovery target time with those records
> > to
> > find the first timeline which covers that time?

---
Magodo
 




magodo wrote:
> OK... Just take another example:
> 
>          A     B
> BASE-----+-----+------o1 (recover to A)                              1
>          |     |           C
>          +.....|.......----+---o2 (regret, recover to B)             2
>                |           |    
>                +...........|..------o3 (regret again, recover to C)  3
>                            | 
>                            +........----                             4
> 
> 
> Suppose I'm at "o3" and want to recover to point "C". Because I want to
> recover to the first timeline which covers this time point, it means I
> wish to recover to timeline 2.

Ah, I finally understand your question.

You assume tht timeline 1 and 2 have ended (that's how you drew it),
and that consequently timeline 3 is the "earliest existing" timeline,
so why doesn't PostgreSQL choose that one automatically?

Even though you drew it that way, timeline 1 and 2 have not ended, in
a way.  There may be more on them.  How should PostgreSQL know what is
the last WAL entry on a certain timeline?  For that, it would have to
restore and examine *all* WAL segments on that timeline until that fails.

But that is unreasonable because of the potential amount of time
and work involved.  Rather, PostgreSQL has to decide at the point where
timeline 2 branches off which one it should follow.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



On Thu, 2018-10-11 at 06:35 +0200, Laurenz Albe wrote:
> magodo wrote:
> > OK... Just take another example:
> > 
> >          A     B
> > BASE-----+-----+------o1 (recover to
> > A)                              1
> >          |     |           C
> >          +.....|.......----+---o2 (regret, recover to
> > B)             2
> >                |           |    
> >                +...........|..------o3 (regret again, recover to
> > C)  3
> >                            | 
> >                            +........--
> > --                             4
> > 
> > 
> > Suppose I'm at "o3" and want to recover to point "C". Because I
> > want to
> > recover to the first timeline which covers this time point, it
> > means I
> > wish to recover to timeline 2.
> 
> Ah, I finally understand your question.
> 
> You assume tht timeline 1 and 2 have ended (that's how you drew it),
> and that consequently timeline 3 is the "earliest existing" timeline,
> so why doesn't PostgreSQL choose that one automatically?
> 
> Even though you drew it that way, timeline 1 and 2 have not ended, in
> a way.  There may be more on them.  How should PostgreSQL know what
> is
> the last WAL entry on a certain timeline?  For that, it would have to
> restore and examine *all* WAL segments on that timeline until that
> fails.
> 
> But that is unreasonable because of the potential amount of time
> and work involved.  Rather, PostgreSQL has to decide at the point
> where
> timeline 2 branches off which one it should follow.
> 
> Yours,
> Laurenz Albe

Thank God I finnally conveied my meaning :)

For one cluster, if , for example, it recovered to "A" at "o1", then
the switched WAL(in case of stop-then-recover) or .partial
corresponding WAL(in case of promote) is the last WAL of the timeline1,
and it makes no sense to consider timeline1 will continue grow after
"o1", because it has ended.

You meantioned: "There may be more on them.".

Would you please give me an example? I must have missed something
out...

---
Magodo




magodo wrote:
> On Thu, 2018-10-11 at 06:35 +0200, Laurenz Albe wrote:
> > magodo wrote:
> > > OK... Just take another example:
> > > 
> > >          A     B
> > > BASE-----+-----+------o1 (recover to
> > > A)                              1
> > >          |     |           C
> > >          +.....|.......----+---o2 (regret, recover to
> > > B)             2
> > >                |           |    
> > >                +...........|..------o3 (regret again, recover to
> > > C)  3
> > >                            | 
> > >                            +........--
> > > --                             4
> > > 
> > > 
> > > Suppose I'm at "o3" and want to recover to point "C". Because I want to
> > > recover to the first timeline which covers this time point, it means I
> > > wish to recover to timeline 2.
> > 
> > Ah, I finally understand your question.
> > 
> > You assume tht timeline 1 and 2 have ended (that's how you drew it),
> > and that consequently timeline 3 is the "earliest existing" timeline,
> > so why doesn't PostgreSQL choose that one automatically?
> > 
> > Even though you drew it that way, timeline 1 and 2 have not ended, in
> > a way.  There may be more on them.  How should PostgreSQL know what is
> > the last WAL entry on a certain timeline?  For that, it would have to
> > restore and examine *all* WAL segments on that timeline until that fails.
> > 
> > But that is unreasonable because of the potential amount of time
> > and work involved.  Rather, PostgreSQL has to decide at the point where
> > timeline 2 branches off which one it should follow.
> 
> Thank God I finnally conveied my meaning :)
> 
> For one cluster, if , for example, it recovered to "A" at "o1", then
> the switched WAL(in case of stop-then-recover) or .partial
> corresponding WAL(in case of promote) is the last WAL of the timeline1,
> and it makes no sense to consider timeline1 will continue grow after
> "o1", because it has ended.
> 
> You meantioned: "There may be more on them.".
> 
> Would you please give me an example? I must have missed something
> out...

The timeline is determined before any WAL segments are restored, because
the timeline history (as persisted in the *.history files) determines
which WAL segments will be restored.

You seem to assume that the point A at which recovery ended is the end of the
WAL stream of timeline 1, but you cannot know that:
- it could have been a point-in-time-recovery
- it could be that during the first recovery attempt some WAL archives
  were missing, which caused a premature end of recovery,
  but they are there at the second attempt.

Indeed, your sentence

> if , for example, it recovered to "A" at "o1", then
> the switched WAL(in case of stop-then-recover) or .partial
> corresponding WAL(in case of promote) is the last WAL of the timeline1

seems to contradict your drawing, which has B after A on timeline 1.

Example:
Assume that timeline 1 reaches to 000000010000001500000030.
We recover to point A, which is in the middle of 000000010000001500000020,
and there branch to timeline 2.
After some time, we decide to recover again, starting from a
checkpoint in 000000010000001500000010.
We want to recover to 2018-10-11 12:00:00.

How can you know how many WAL segments there are on timeline 1, and if
there is one that extends past 2018-10-11 12:00:00 or not?

The only way would be to restore and read them all *before* you even
decide which timeline you want to choose for recovery.

That would be forbiddingly expensive, wouldn't it?

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



On Thu, 2018-10-11 at 19:53 +0200, Laurenz Albe wrote:

Indeed, your sentence

if , for example, it recovered to "A" at "o1", then
the switched WAL(in case of stop-then-recover) or .partial
corresponding WAL(in case of promote) is the last WAL of the timeline1

seems to contradict your drawing, which has B after A on timeline 1.

Err... I mean "o1" is the end of timelien1, and the last WAL is the one "o1" was on just before recovering to "A".


Example:
Assume that timeline 1 reaches to 000000010000001500000030.
We recover to point A, which is in the middle of 000000010000001500000020,
and there branch to timeline 2.
After some time, we decide to recover again, starting from a
checkpoint in 000000010000001500000010.
We want to recover to 2018-10-11 12:00:00.

How can you know how many WAL segments there are on timeline 1, and if
there is one that extends past 2018-10-11 12:00:00 or not?

This is the exact problem I want to figure out. My approach is as you said, I will parse each archived WAL segment via `pg_xlogdump -r Transaction`, and try to find the first least earliest WAL against the specified time. This is a linear search, which has complexity of O(n).

So if you want to recover to that point of time, how do you choose the timeline?

---
Magodo
magodo wrote:
> > How can you know how many WAL segments there are on timeline 1, and if
> > there is one that extends past 2018-10-11 12:00:00 or not?
> 
> This is the exact problem I want to figure out. My approach is as you said,
> I will parse each archived WAL segment via `pg_xlogdump -r Transaction`,
> and try to find the first least earliest WAL against the specified time.
> This is a linear search, which has complexity of O(n).

That means that the time spent grows linearly, but it is still a lot of time
if there are a lot of WAL archives.

> So if you want to recover to that point of time, how do you choose the timeline?

PostgreSQL doesn't provide auch a functionality, and I can think of three reasons:
1) the high cost as mentioned above
2) during archive recovery, it does not happen very often that you have to
   choose between timelines at all.  It is more typical for streaming replication.
3) nobody saw a use case for such a functionality

If you really need this functionality, you'll have to write it yourself.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com