Re: Timeline following for logical slots - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Timeline following for logical slots
Msg-id 20160405060944.iw7xelzke33zm3kl@alap3.anarazel.de
In response to Re: Timeline following for logical slots  (Petr Jelinek <petr@2ndquadrant.com>)
Responses Re: Timeline following for logical slots  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-hackers
On 2016-04-05 05:53:53 +0200, Petr Jelinek wrote:
> On 04/04/16 17:15, Andres Freund wrote:
> >
> >>* Robust sequence decoding and replication. If you were following the later
> >>parts of that discussion you will've seen how fun that's going to be, but
> >>it's the simplest of all of the problems.
> >
> >Unconvinced. People used londiste and slony for years without that, and
> >it's not even remotely at the top of the list of problems with either.
> >
> 
> Londiste and Slony also support physical failover, unlike logical decoding,
> which is the main point of this discussion; let's not forget that.

Sure. But that level of failover isn't all that hard to implement.

I just want to reiterate: I'm not against failover slots per se, or
against timeline following[1], or something like that. I think it's just
getting the priorities backwards. Until there are basic features
available, you're going to have a hard time fighting for more advanced
features.

[1]: That said, I don't agree with the approach chosen so far, but
that's just a matter of discussion.


> >
> >>* Robust, seamless DDL replication, so things don't just break randomly.
> >>This makes the other points above look nice and simple by comparison.
> >
> >We're talking about a system which involves logical decoding. Whether
> >you have failover via physical rep or not, doesn't have anything to do
> >with this point.

> It is if you are insisting on using logical rep as solution for failover.

How on earth does this follow? If your replicas don't do DDL
propagation, then doing so for primary -> new primary surely isn't a
prerequisite.  I agree DDL rep is pretty crucial, but this argument here
isn't that.


> I also don't buy your argument that it's unsafe to use timeline following on
> logical decoding on replica.

I'm not saying that generally. I'm saying that you can't realistically
do it safely with just the timeline following patch, as committed.


> You can always keep master from moving too far
> ahead by other means (even if you just use dummy slot which is only used for
> this purpose, yes ugly I know).

Yes, that somewhat works. And I think this is actually kinda the design
"failover" slots should take instead.


> If we got failover slots into 9.6 it would
> be better, but that does not look realistic at this point. I don't think that
> the current design for failover slots is the best possible - I think failover
> slots should be created on the replica and send their status up to the master,
> which would then take them into account when calculating the oldest needed
> catalog xmin and lsn (a simple way of doing that would be to add this to the
> feedback protocol and let the physical slot keep the xmin/lsn as well)

Yes, that's not too far away from what I'm thinking of. If we do it
right that also solves the important problems for decoding on a standby.
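
For illustration only, the idea in the quoted paragraph (replicas report
their slots' status upstream, and the master folds it into its oldest
needed catalog xmin and lsn) can be sketched roughly as below. This is a
minimal sketch, not code from any patch: SlotStatus and oldest_required
are hypothetical names, LSNs are modelled as plain integers, and
transaction ID wraparound is deliberately ignored.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class SlotStatus:
    """Hypothetical per-slot status, as a replica might report it upstream."""
    name: str
    catalog_xmin: int  # oldest transaction ID whose catalog rows are still needed
    restart_lsn: int   # oldest WAL position still needed, as a plain integer


def oldest_required(
    local: Iterable[SlotStatus],
    reported: Iterable[SlotStatus],
) -> Optional[Tuple[int, int]]:
    """Return (oldest catalog xmin, oldest restart LSN) across all slots.

    Assumption: plain integer comparison. Real transaction IDs wrap
    around, which this sketch does not handle.
    """
    slots = [*local, *reported]
    if not slots:
        return None  # nothing holds back catalog cleanup or WAL removal
    return (
        min(s.catalog_xmin for s in slots),
        min(s.restart_lsn for s in slots),
    )
```

The point of the sketch is only that slots living on a replica constrain
cleanup on the master exactly as its local slots do: whichever slot is
furthest behind determines what has to be retained.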


> but that does not mean timeline following isn't a good thing on its own
> (not to mention that iterative development is a thing).

Yes, I'm not saying that. What scares me is that the patch wasn't
reviewed particularly carefully, combined with the intended usage as
described by Craig.

We most definitely need timeline following, independent of failover
slots. Most importantly for decoding on a standby.

Greetings,

Andres Freund
