Thread: asynchronous commit&synchronous replication

asynchronous commit&synchronous replication

From
Konstantin Knizhnik
Date:

Hi hackers,

By default, the logical replication apply worker switches off synchronous commit, i.e. it commits asynchronously. Quoting the documentation of the subscription parameters:

```

synchronous_commit (enum) 

The value of this parameter overrides the synchronous_commit setting within this subscription's apply worker processes. The default value is off.
It is safe to use off for logical replication: If the subscriber loses transactions because of missing synchronization, the data will be sent again from the publisher.

```

So the subscriber can confirm transactions that are not yet persisted. But consider a PostgreSQL HA setup with:

  • primary node
  • (cold) standby node streaming WAL from the primary
  • synchronous replication enabled, so that you get zero data loss if the primary dies
  • the primary/standby cluster is a subscriber to a remote PostgreSQL server
It can happen that:
  • the primary streams some transactions from the remote PostgreSQL, with logical replication
  • the primary crashes. Failover to the standby happens
  • the standby tries to stream the transactions from the remote PostgreSQL server (the publisher). But some transactions are missed, because the primary had already reported a higher flush LSN.
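
The sequence above can be reproduced with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `Publisher`, `SubscriberNode` and `simulate_failover` are hypothetical names, LSNs are plain integers, and the publisher is assumed never to resend anything at or below the flush LSN it has already been told about.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Publisher:
    """Toy remote server: remembers the highest flush LSN confirmed to it."""
    commits: list[int]          # commit LSNs of the published transactions
    confirmed_flush: int = 0    # highest LSN the subscriber reported as flushed

    def stream_from(self, requested_lsn: int) -> list[int]:
        # Assumption: streaming never restarts before a confirmed position.
        start = max(requested_lsn, self.confirmed_flush)
        return [lsn for lsn in self.commits if lsn > start]


@dataclass
class SubscriberNode:
    """Toy subscriber node: the remote commit LSNs durably present on it."""
    applied: list[int] = field(default_factory=list)

    @property
    def origin_lsn(self) -> int:
        # Position this node would ask the publisher to resume from.
        return max(self.applied, default=0)


def simulate_failover() -> list[int]:
    """Return the remote commit LSNs missing on the promoted standby."""
    publisher = Publisher(commits=[100, 200, 300])
    primary, standby = SubscriberNode(), SubscriberNode()

    # The primary applies all three transactions with synchronous_commit = off.
    primary.applied.extend(publisher.stream_from(primary.origin_lsn))
    # Only the first two have reached the standby when the crash happens.
    standby.applied.extend(primary.applied[:2])
    # The primary has nevertheless reported a flush LSN covering all three.
    publisher.confirmed_flush = primary.origin_lsn

    # Failover: the promoted standby resumes from its own, older position.
    standby.applied.extend(publisher.stream_from(standby.origin_lsn))
    return [lsn for lsn in publisher.commits if lsn not in standby.applied]
```

In this model `simulate_failover()` returns `[300]`: the transaction that the old primary confirmed but never shipped to the standby is not sent again.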


Does the community consider such a scenario "expected behavior" or a bug?
It seems to be quite easy to fix (see the attached patch).

So should the LR apply worker take synchronous replication into account or not?
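
To make the question concrete, here is a minimal sketch of one possible reading of "taking synchronous replication into account". It is not the attached patch and not PostgreSQL code. The function `confirmable_remote_lsn`, its parameters, and the idea of a list mapping remote commit LSNs to local WAL end positions are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def confirmable_remote_lsn(
    lsn_mapping: List[Tuple[int, int]],
    local_flush_lsn: int,
    sync_standby_lsn: Optional[int] = None,
) -> int:
    """Highest remote commit LSN that is safe to confirm to the publisher.

    lsn_mapping      -- (remote_commit_lsn, local_end_lsn) pairs, one per
                        applied transaction (hypothetical bookkeeping)
    local_flush_lsn  -- local WAL position already flushed to disk
    sync_standby_lsn -- local WAL position acknowledged by the synchronous
                        standby, or None if synchronous replication is off
    """
    # A transaction counts as durable only once it is flushed locally and,
    # when synchronous replication is configured, acknowledged by the standby.
    durable = local_flush_lsn
    if sync_standby_lsn is not None:
        durable = min(durable, sync_standby_lsn)

    confirmed = 0
    for remote_lsn, local_end_lsn in lsn_mapping:
        if local_end_lsn <= durable:
            confirmed = max(confirmed, remote_lsn)
    return confirmed
```

With the mapping `[(100, 10), (200, 20), (300, 30)]` and a local flush position of 30, the function confirms 300 when no synchronous standby is configured, but only 200 when the standby has acknowledged WAL up to 20. The publisher would then still resend the last transaction after a failover.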

Thanks to Heikki Linnakangas <hlinnaka@iki.fi> for describing this scenario and Arseny Sher <ars@neon.tech> for providing the patch.

Attachment

Re: asynchronous commit&synchronous replication

From
"Andrey M. Borodin"
Date:

> On 10 Aug 2024, at 17:25, Konstantin Knizhnik <knizhnik@garret.ru> wrote:
>
> So should we take in account sync replication in LR apply worker or not?

There was some relevant discussion of this topic at the PGCon 2020 Unconference [0].
My recollection is that it would be nice to have an LR slot setting, akin to synchronous_standby_names, that describes what
kind of durability guarantees should be met by streamed data.


Best regards, Andrey Borodin.

[0]
https://wiki.postgresql.org/wiki/PgCon_2020_Developer_Unconference/Edge_cases_of_synchronous_replication_in_HA_solutions