Re: [HACKERS] WIP: Failover Slots - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] WIP: Failover Slots
Date
Msg-id 20170812000319.4higzrilqyxstbko@alap3.anarazel.de
In response to Re: [HACKERS] WIP: Failover Slots  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] WIP: Failover Slots
List pgsql-hackers
On 2017-08-02 16:35:17 -0400, Robert Haas wrote:
> I actually think failover slots are quite desirable, especially now
> that we've got logical replication in core.  In a review of this
> thread I don't see anyone saying otherwise.  The debate has really
> been about the right way of implementing that.

Given that I presumably was one of the people pushing back more
strongly: I agree with that.  Besides disagreeing with the proposed
implementation, our disagreements seem to have been solely about
prioritization.

I still think we should have a halfway agreed-upon *design* for logical
failover before we introduce a concept that's quite possibly going to
be incompatible with it, however. But that doesn't mean it has to be
submitted/merged to core.


> - When a standby connects to a master, it can optionally supply a list
> of slot names that it cares about.
> - The master responds by periodically notifying the standby of changes
> to the slot contents using some new replication sub-protocol message.
> - The standby applies those updates to its local copies of the slots.

> So, you could create a slot on a standby with an "uplink this" flag of
> some kind, and it would then try to keep it up to date using the
> method described above.  It's not quite clear to me how to handle the
> case where the corresponding slot doesn't exist on the master, or
> initially does but then it's later dropped, or it initially doesn't
> but it's later created.
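
To make the quoted flow concrete, here is a minimal sketch of the three
steps (standby names the slots it cares about, master reports their
state, standby overwrites its local copies). It is a toy model only:
SlotState, Master, Standby and their methods are hypothetical names, not
PostgreSQL APIs, and no such sub-protocol message exists. The unresolved
cases (slot missing on the master, dropped later, created later) are
deliberately left open: the master simply omits unknown slots and the
standby does nothing about them.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List


@dataclass
class SlotState:
    """Hypothetical subset of a slot's contents that would be shipped."""
    name: str
    restart_lsn: int
    confirmed_flush_lsn: int


class Master:
    def __init__(self) -> None:
        self.slots: Dict[str, SlotState] = {}

    def slot_updates(self, wanted: Iterable[str]) -> List[SlotState]:
        # Report a copy of each requested slot that currently exists.
        # Slots the master does not have are skipped; what the standby
        # should do in that case is an open question in the thread.
        return [replace(self.slots[n]) for n in wanted if n in self.slots]


class Standby:
    def __init__(self, wanted: Iterable[str]) -> None:
        # Slot names supplied when connecting to the master.
        self.wanted: List[str] = list(wanted)
        self.slots: Dict[str, SlotState] = {}

    def apply(self, updates: Iterable[SlotState]) -> None:
        # Overwrite the local copy with whatever the master reported.
        for update in updates:
            self.slots[update.name] = update


master = Master()
master.slots["sub_a"] = SlotState("sub_a", restart_lsn=100,
                                  confirmed_flush_lsn=120)

standby = Standby(["sub_a", "not_on_master"])
standby.apply(master.slot_updates(standby.wanted))   # one periodic update
```

In a real implementation the update would be repeated periodically over
the replication connection; here a single call stands in for one such
notification.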


I think there are a couple of design goals we need to agree upon before
going into the weeds of how exactly we want this to work. Some of the
axes I can think of are:

- How do we want to deal with cascaded setups? Do slots have to be
  available everywhere, or not?

- What kind of PITR integration do we want? Note that simple WAL based
  slots do *NOT* provide proper PITR support; there's not enough
  interlock easily available (you'd have to save slots at the end, then
  increment minRecoveryLSN to a point later than the slot saving).

- How much divergence are we going to accept between logical decoding
  on standbys and failover slots? I'm probably a lot closer to zero
  than Craig is.

- How much divergence are we going to accept between infrastructure for
  logical failover and logical failover via failover slots (or however
  we're naming this)? Again, I'm probably a lot closer to zero than
  Craig is.


Greetings,

Andres Freund


