Re: A few nuances about specifying the timeline with START_REPLICATION - Mailing list pgsql-hackers

From	Jeff Davis
Subject	Re: A few nuances about specifying the timeline with START_REPLICATION
Date	June 18, 2021 19:55:17
Msg-id	484d05905c22c9ba5150bdde2511b161bb12c8aa.camel@j-davis.com
In response to	Re: A few nuances about specifying the timeline with START_REPLICATION (Heikki Linnakangas <hlinnaka@iki.fi>)
Responses	Re: A few nuances about specifying the timeline with START_REPLICATION
List	pgsql-hackers

On Fri, 2021-06-18 at 21:48 +0300, Heikki Linnakangas wrote:
> On 18/06/2021 20:27, Jeff Davis wrote:
> We could teach it to look into the timeline history to find the correct
> file, though.

That's how recovery_target_timeline behaves, and it would match my
intuition better if START_REPLICATION behaved that way.
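
To make the idea concrete, here is a minimal sketch of such a lookup. It is
not the server's code. It assumes the usual timeline history file layout, one
line per ancestor timeline giving the parent timeline ID, the switchpoint LSN
at which that ancestor ended, and a free-text reason. The function names are
made up for illustration.

```python
def parse_lsn(text):
    """Convert an LSN written as 'XXX/XXX' (two hex halves) to an integer."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def parse_history(text):
    """Parse timeline history text into [(ancestor_tli, switchpoint), ...].

    Assumed layout per line: '<parent tli> <switchpoint LSN> <reason>'.
    Blank lines and lines starting with '#' are skipped.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(None, 2)
        entries.append((int(fields[0]), parse_lsn(fields[1])))
    return entries


def timeline_of_lsn(history, target_tli, lsn):
    """Return the timeline ID that holds `lsn` in the history of `target_tli`.

    Each ancestor covers LSNs strictly below its switchpoint; anything at or
    beyond the last switchpoint belongs to `target_tli` itself.
    """
    for tli, switchpoint in history:
        if lsn < switchpoint:
            return tli
    return target_tli


# Hypothetical history of timeline 3: TLI 1 ended at 0/3000000,
# TLI 2 ended at 0/5000000.
example = (
    "1\t0/3000000\tno recovery target specified\n"
    "2\t0/5000000\tno recovery target specified\n"
)
history = parse_history(example)
print(timeline_of_lsn(history, 3, parse_lsn("0/2000000")))
print(timeline_of_lsn(history, 3, parse_lsn("0/4000000")))
```

With a lookup like this, a request for an LSN on an ancestor timeline could
be mapped to the timeline ID that actually holds it.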

> If the client asks for a historic timeline, the replication will stop
> when it reaches the end of that timeline. In hindsight, I think it would
> make more sense to send a message to the client to say that it's
> switching to a new timeline, and continue streaming from the new
> timeline.

Why is it important for the standby to be told explicitly in the
protocol about timeline switches? If it is important, why only for
historical timelines?

> Hmm, the timeline in the START_REPLICATION command is not specifying a
> recovery target timeline, so I don't think "latest" or "current" make
> much sense there. Per above, it just tells the server which timeline the
> requested starting point belongs to, so it's actually redundant.

That's not very clear from the docs: "if TIMELINE option is specified,
streaming starts on timeline tli...".

Part of the confusion is that there's not a good distinction in
terminology between:
   1. a timeline ID, which is a specific segment of a timeline
   2. a timeline made up of the given timeline ID and all its
      ancestors, terminating at the given ID
   3. the timeline made up of the current ID, all ancestor IDs, and all
      descendant IDs that the current active primary switches to
   4. the set of all timelines that contain a given ID
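
The four senses can be told apart with a small sketch. It models timeline IDs
as nodes in a tree stored as a parent map. The map, the IDs in it, and the
function names are all hypothetical, chosen only to illustrate the
distinction.

```python
# Hypothetical timeline tree: each timeline ID maps to its parent ID.
# TLI 1 is the root; TLI 2 and TLI 4 both forked from 1; TLI 3 forked from 2.
PARENT = {1: None, 2: 1, 3: 2, 4: 1}


def lineage(parent, tli):
    """Sense 2: the given ID plus all its ancestors, oldest first."""
    chain = []
    while tli is not None:
        chain.append(tli)
        tli = parent[tli]
    return chain[::-1]


def timelines_containing(parent, tli):
    """Sense 4: every sense-2 timeline that includes the given ID."""
    result = []
    for candidate in sorted(parent):
        chain = lineage(parent, candidate)
        if tli in chain:
            result.append(chain)
    return result


# Sense 1 is just a key of PARENT, for example 2.
# Sense 3 is lineage(PARENT, current), where `current` is whatever ID the
# active primary is on now; it grows each time the primary switches.
print(lineage(PARENT, 3))
print(timelines_containing(PARENT, 2))
```

In this model, sense 3 is the only one that changes over time without being
given a different ID, which is why it needs no timeline ID in the request.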

It seems you are saying that replication only concerns itself with #3,
which does not require a timeline ID at all. That seems basically
correct for now, but since we already document the protocol to take a
timeline, it makes sense to me to just have the primary serve it if
possible.

If we (continue to?) allow timelines for replication, replication will
start to treat the primary like an archive. That might not be quite what
was intended, but could be powerful. You could imagine a special archive
that implements the replication protocol, and have replicas stream
directly off the archive, or maybe do PITR off the archive.

Regards,
    Jeff Davis
