Re: Timeline following for logical slots - Mailing list pgsql-hackers

From Petr Jelinek
Subject Re: Timeline following for logical slots
Msg-id 5706A7EC.2000709@2ndquadrant.com
In response to Re: Timeline following for logical slots  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 07/04/16 17:32, Robert Haas wrote:
>>> Second, I'm not sure whether it was a good design decision to make
>>> logical slots a special kind of object that sit off to the side,
>>> neither configuration (like postgresql.conf) nor WAL-protected data
>>> (like pg_clog and the data files themselves), but it was certainly a
>>> very deliberate decision.  I sort of expected them to be WAL-logged,
>>> but Andres argued (not unconvincingly) that we'd want to have slots on
>>> standbys, and making them WAL-logged would preclude that.
>>
>> Yeah. I understand the reasons for that decision. Per an earlier reply I
>> think we can avoid making them WAL-logged so they can be used on standbys
>> and still achieve usable failover support on physical replicas.
>
> I think at one point we may have discussed doing this via additional
> side-channel protocol messages.  Is that what you are thinking about
> now, or something else?
>

I think the most promising idea was to use a pull model instead of a push
model for slot updates, and to use feedback to tell the master how far
back the oldest slot on the standby is. This has the additional advantage
of solving the biggest problem with decoding on standby (keeping an old
enough catalog xid and lsn). It also solves your concern about
propagating through the whole cascade, as this can be done in a more
controllable way (from the bottom up in the replication chain).
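
To make the bottom-up idea concrete, here is a minimal sketch, not taken from any patch. It assumes each node knows its own slots and its directly attached standbys, and reports upstream the oldest catalog xid and lsn that anything below it still needs. The Slot and Node classes and the required_horizon() method are hypothetical; only the field names catalog_xmin and restart_lsn are borrowed from pg_replication_slots. Xid wraparound is ignored for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Slot:
    name: str
    catalog_xmin: int  # oldest catalog xid this slot still needs
    restart_lsn: int   # oldest WAL position this slot still needs


@dataclass
class Node:
    """Hypothetical model of one server in a replication cascade."""
    name: str
    slots: List[Slot] = field(default_factory=list)
    standbys: List["Node"] = field(default_factory=list)

    def required_horizon(self) -> Optional[Tuple[int, int]]:
        """Return the (catalog_xmin, restart_lsn) that must be retained
        for this node and everything below it, or None if nothing is needed.

        Plain integer comparison is an assumption: real xids wrap around.
        """
        xmins = [s.catalog_xmin for s in self.slots]
        lsns = [s.restart_lsn for s in self.slots]
        # Pull feedback from each standby instead of pushing slot state down.
        for standby in self.standbys:
            fb = standby.required_horizon()
            if fb is not None:
                xmins.append(fb[0])
                lsns.append(fb[1])
        if not xmins:
            return None
        return min(xmins), min(lsns)


# Example cascade: master -> standby1 -> standby2
leaf = Node("standby2", slots=[Slot("s1", catalog_xmin=900, restart_lsn=0x3000)])
mid = Node("standby1", slots=[Slot("s2", catalog_xmin=950, restart_lsn=0x2800)],
           standbys=[leaf])
master = Node("master", standbys=[mid])
print(master.required_horizon())
```

Each level only needs the aggregated minimum from the level below, which is why the information can travel up the chain one hop at a time.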

> For version one, I would cut all of
> the stuff that allows data to be sent in any format other than text,
> and just use in/outfuncs all the time.

Agreed here, I think doing the binary transfer for base types, etc. is
just an optimization, not a necessity. This is one of the reasons why I
wanted to get a bit broader feedback on the protocol - to get some kind
of consensus about what we want now, what we might want in the future,
and how to get from now to the future without too much complexity or
breakage. For example, currently we have flags for every protocol
message, which I am not sure are completely necessary there; OTOH we
probably don't want to bump the protocol version with every new version
of Postgres either.
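
As an illustration of that trade-off, here is a rough sketch of a message header carrying a per-message flags byte. The receiver can reject or ignore features it does not know without a protocol version bump. The layout (type byte, flags byte, payload length), the FLAG_BINARY bit and the encode/decode helpers are all made up for this example; this is not the actual wire format under discussion.

```python
import struct

# Hypothetical flag bit reserved for a future binary representation.
FLAG_BINARY = 0x01

# Network byte order: 1-byte message type, 1-byte flags, 4-byte payload length.
HEADER = struct.Struct("!cBI")


def encode(msg_type: bytes, text: str, flags: int = 0) -> bytes:
    """Encode a message whose payload is the text (output function) form."""
    payload = text.encode("utf-8")
    return HEADER.pack(msg_type, flags, len(payload)) + payload


def decode(buf: bytes):
    """Decode a message; a version-one receiver only understands text."""
    msg_type, flags, length = HEADER.unpack_from(buf)
    if flags & FLAG_BINARY:
        raise ValueError("binary payloads are not supported by this receiver")
    payload = buf[HEADER.size:HEADER.size + length]
    return msg_type, payload.decode("utf-8")


msg = encode(b"I", "42")
print(decode(msg))
```

With this shape a text-only first version stays simple, and a binary mode could later be signalled per message. The cost is a flags byte on every message, which is the overhead questioned above.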

>
> I do generally think that logical decoding relies too much on trying
> to set up situations where it will never fail, and I've said from the
> beginning that it should make more provision to cope with failure
> rather than just trying to avoid it.  If logical decoding never
> breaks, great.  But the approach I would favor is to set things up so
> that it automatically reclones if there is a replication break, and
> then as an optimization project, try to eliminate those cases one by
> one.
>

Well, that really depends. I've seen way too many cases where people use
logical replication as a transport mechanism, rather than as replication
where the destination is the same as the source, and in those scenarios
there is often no way to "reclone", either because the historical data
are no longer on the source or because the data on the target were
already updated after they were replicated. But in general, the idea of
recovering from errors rather than being hell-bent on preventing them is
something I am pushing as well. For example, it should be easier to look
at what's in the replication queue and remove things from there if needed.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


