Re: Notes on physical replica failover with logical publisher or subscriber - Mailing list pgsql-hackers

From Alexey Kondratov
Subject Re: Notes on physical replica failover with logical publisher or subscriber
Msg-id 51d91e87c48aa3dfa68e7ea6fa53e08a@postgrespro.ru
In response to Notes on physical replica failover with logical publisher or subscriber  (Craig Ringer <craig.ringer@enterprisedb.com>)
List pgsql-hackers
Hi Craig,

On 2020-11-30 06:59, Craig Ringer wrote:
> 
> https://wiki.postgresql.org/wiki/Logical_replication_and_physical_standby_failover
> 

Thank you for sharing these notes. I have not dealt much with 
physical/logical replication interoperability, so most of these problems 
were new to me.

One point from the wiki page, which seems clear enough to me:

```
Logical slots can fill pg_wal and can't benefit from archiving. Teach 
the logical decoding page read callback how to use the restore_command 
to retrieve WAL segs temporarily if they're not found in pg_wal...
```

It does not look like a big deal to teach the logical decoding process to 
use restore_command, but I have some doubts about how everything will 
perform once we start fetching WAL from the archive for decoding. If we 
have to use restore_command at all, it means the subscriber has lagged 
far enough to exceed max_slot_wal_keep_size. Taking into account that 
fetching WAL files from the archive has additional overhead and that the 
primary keeps generating (and archiving) new segments, the primary may 
end up doing this double duty forever: archiving each WAL file first and 
then fetching it back for decoding when requested.
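
To make the worry concrete, here is a toy model, not a measurement. It 
assumes the primary generates `gen` segments per time unit and that a 
decoder reading through the archive gets through `archive_rate` segments 
per time unit; both names and all numbers are made up for illustration.

```python
def steps_until_local(lag, keep, gen, archive_rate, max_steps=10_000):
    """Return how many time units pass before the decoder is back within
    `keep` segments of the primary (i.e. can read from pg_wal again),
    or None if that does not happen within `max_steps`.

    lag          -- how many segments the decoder is behind right now
    keep         -- segments retained locally (stand-in for
                    max_slot_wal_keep_size, expressed in segments)
    gen          -- segments the primary generates per time unit
    archive_rate -- segments decoded per time unit when each one has to
                    be fetched back from the archive first
    """
    for step in range(max_steps):
        if lag <= keep:
            return step
        # The primary keeps producing WAL while the decoder works
        # through archived segments.
        lag += gen - archive_rate
    return None


# Decoder only as fast as the primary: it never returns to pg_wal.
print(steps_until_local(lag=100, keep=10, gen=5, archive_rate=5))
# Decoder somewhat faster than the primary: it eventually catches up.
print(steps_until_local(lag=100, keep=10, gen=5, archive_rate=8))
```

In this model the decoder gets back to reading from pg_wal only if it 
decodes archived segments strictly faster than the primary generates 
new ones.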

Another problem is that there may be several active decoders, IIRC, so 
they had better coordinate with each other in order to avoid fetching 
the same segment twice.
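
A minimal sketch of the kind of coordination I mean, using threads 
purely for illustration. Real decoders are separate backend processes, 
so an actual implementation would need shared memory or files instead; 
the `SegmentFetcher` class and the `restore` callable (a stand-in for 
running restore_command) are hypothetical.

```python
import threading


class SegmentFetcher:
    """Let many decoders ask for a segment while restoring it only once."""

    def __init__(self, restore):
        self._restore = restore      # hypothetical stand-in for restore_command
        self._lock = threading.Lock()
        self._cache = {}             # segment name -> restored contents
        self._inflight = {}          # segment name -> Event set when fetch ends

    def get(self, segment):
        while True:
            with self._lock:
                if segment in self._cache:
                    return self._cache[segment]
                event = self._inflight.get(segment)
                owner = event is None
                if owner:
                    # Nobody is fetching this segment yet: we take the job.
                    event = threading.Event()
                    self._inflight[segment] = event
            if not owner:
                # Someone else is fetching it; wait, then re-check the cache.
                # If their fetch failed, the loop makes us retry as owner.
                event.wait()
                continue
            try:
                data = self._restore(segment)
                with self._lock:
                    self._cache[segment] = data
                return data
            finally:
                with self._lock:
                    del self._inflight[segment]
                event.set()
```

The sketch never evicts anything from its cache; a real version would 
have to remove restored segments once no slot needs them any more, 
otherwise the original problem of filling pg_wal comes back in another 
place.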

> 
> I tried to address many of these issues with failover slots, but I am
> not trying to beat that dead horse now. I know that at least some
> people here are of the opinion that effort shouldn't go into
> logical/physical replication interoperation anyway - that we should
> instead address the remaining limitations in logical replication so
> that it can provide complete HA capabilities without use of physical
> replication. So for now I'm just trying to save others who go looking
> into these issues some time and warn them about some of the less
> obvious booby-traps.
> 

Another point regarding the capabilities logical replication needs in 
order to build a logical-only HA system: a logical equivalent of 
pg_rewind. At least I did not notice anything about it after a brief 
reading of the wiki page. IIUC, there is currently no way to quickly 
return an ex-primary (ex-logical publisher) to the HA cluster without 
doing a pg_basebackup, is there? It seems that we have the same problem 
here as with physical replication: the ex-primary may accept some xacts 
after promotion of the new primary, so their histories diverge and the 
old primary has to be rewound before being returned as a standby 
(subscriber).
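
As a toy illustration of the divergence, assume each node's history is 
just an ordered list of applied transaction identifiers. This is a 
simplification for the sake of the example, not how either node 
actually tracks its history, and both function names are hypothetical.

```python
def divergence_point(old_primary, new_primary):
    """Return the length of the common prefix of the two histories."""
    common = 0
    for a, b in zip(old_primary, new_primary):
        if a != b:
            break
        common += 1
    return common


def xacts_to_undo(old_primary, new_primary):
    """Transactions the ex-primary applied after the histories diverged.

    A logical rewind would have to undo these before the node could
    follow the new primary as a subscriber.
    """
    return old_primary[divergence_point(old_primary, new_primary):]


# The ex-primary accepted 104 and 105 after the new primary was promoted,
# while the new primary went on with 201.
old = [101, 102, 103, 104, 105]
new = [101, 102, 103, 201]
print(xacts_to_undo(old, new))
```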


Regards
-- 
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company


