[HACKERS] Commits don't block for synchronous replication - Mailing list pgsql-hackers

From Xin Zhang
Subject [HACKERS] Commits don't block for synchronous replication
Date
Msg-id CABrsG8j3kPD+kbbsx_isEpFvAgaOBNGyGpsqSjQ6L8vwVUaZAQ@mail.gmail.com
Whole thread Raw
Responses Re: [HACKERS] Commits don't block for synchronous replication  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-hackers
By setting up synchronized replication between primary and standby, commits on primary should be blocked by function `SyncRepWaitForLSN()` until its effects replicated to the standby.

Currently, `SyncRepWaitForLSN()` fast exits if the `SyncStandbysDefined()` is false, which means GUC `synchronous_standby_names` is empty.

There is a race condition where the GUC is set but not reflected by some backends. However, the `pg_stat_replication` `sync_state` is `sync`, and the `state` is `streaming`, which means the commit should block and get replicated to the mirror before it returns.

The proposed fix is to NOT do the fast exit based on the GUC, but only rely on `WalSndCtl->sync_standbys_defined`.

Here is a quick repro:
- setup async replication
- create a backend
- set breakpoint at SyncRepWaitForLSN()
- Issue a ddl like `create table foo(c int)`
- Then the backend will be break at breakpoint SyncRepWaitForLSN()
- Set the GUC `synchronous_standby_names` to '*' in `postgresql.conf` to trigger the synchronous replication.
- `pg_ctl` reload to signal the backend to reload the conf
- Check the `pg_stat_replication` to ensure `state` is `streaming` and `sync_state` is `sync`.
- In this case, we expect the DDL is blocked.
- Then, step through the breakpoint until line:
```
-> 163 if (!SyncRepRequested() || !SyncStandbysDefined())
   164 return;
```
- Check the content of the GUC as well as the `WalSndCtl->sync_standbys_defined`
```
(lldb) p SyncRepStandbyNames
(char *) $1 = 0x00007fae31c03a80 ""
(lldb) p WalSndCtl->sync_standbys_defined
(bool) $2 = '\x01'
(lldb)
​```
​- As you can see the GUC is still not reflect with new content '*', but the `sync_standbys_defined` is already changed to `1` by checkpointer process.
- If we continue, the function will return, means the foo table will only be created on primary, not on the standby.

If primary crashed at that moment, and failover to standby, the foo table is lost, even though the replication is synced according to `pg_stat_replication` view.

This is the patch we suggest as the fix:

```
diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c
index 8677235411..962772ef8f 100644
--- a/src/backend/replication/syncrep.c
+++ b/src/backend/replication/syncrep.c
@@ -156,11 +156,9 @@ SyncRepWaitForLSN(XLogRecPtr lsn, bool commit)
                mode = Min(SyncRepWaitMode, SYNC_REP_WAIT_FLUSH);

        /*
-        * Fast exit if user has not requested sync replication, or there are no
-        * sync replication standby names defined. Note that those standbys don't
-        * need to be connected.
+        * Fast exit if user has not requested sync replication.
         */
-       if (!SyncRepRequested() || !SyncStandbysDefined())
+       if (!SyncRepRequested())
                return;

        Assert(SHMQueueIsDetached(&(MyProc->syncRepLinks)));
```

A separate question, is the `pg_stat_replication` view the reliable way to find when to failover to a standby, or there are some other ways to ensure the standby is in-sync with the primary?

Thanks,
Xin Zhang, Ashwin Agrawal, and Asim R Praveen

pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: [HACKERS] CREATE COLLATION does not sanitize ICU's BCP 47 language tags. Should it?
Next
From: David Steele
Date:
Subject: Re: [HACKERS] Creating backup history files for backups taken from standbys