Thread: BUG #16498: The master inserts data successfully when the standby stopped in synchronous stream replication

The following bug has been logged on the website:

Bug reference:      16498
Logged by:          yi Ding
Email address:      abcxiaod@126.com
PostgreSQL version: 10.12
Operating system:   linux
Description:

Master PGserver:
1. Table test_1 currently has no data
postgres=# select * from test_1;
 a
---
(0 rows)

2. Synchronous streaming replication is configured
postgres=# show synchronous_standby_names ;
 synchronous_standby_names
---------------------------
 sb2019abcd514
(1 row)

3. Stop the standby PGserver

4. Insert data

postgres=# begin;
BEGIN
postgres=# insert into test_1 values(2);
INSERT 0 1
postgres=# insert into test_1 values(3);
INSERT 0 1
postgres=# commit;

The commit hangs here.

5. Restart the database

6. Check table test_1; the data has been committed anyway

postgres=# select * from test_1;
 a
---
 2
 3
(2 rows)

7. Code structure
CommitTransaction -> RecordTransactionCommit:
    if ((wrote_xlog && markXidCommitted &&
         synchronous_commit > SYNCHRONOUS_COMMIT_OFF) ||
        forceSyncCommit || nrels > 0)
    {
        XLogFlush(XactLastRecEnd);
        /*
         * Now we may update the CLOG, if we wrote a COMMIT record above
         */
        if (markXidCommitted)
            TransactionIdCommitTree(xid, nchildren, children);
    }
    else
    {
        XLogSetAsyncXactLSN(XactLastRecEnd);
        if (markXidCommitted)
            TransactionIdAsyncCommitTree(xid, nchildren, children,
                                         XactLastRecEnd);
    }
    ...
    if (wrote_xlog && markXidCommitted)
        SyncRepWaitForLSN(XactLastRecEnd, true);

RecordTransactionCommit -> SyncRepWaitForLSN:
    if (!SyncRepRequested())
        return;
    for (;;)
    {
        ...
    }
    ...

#define SyncRepRequested() \
    (max_wal_senders > 0 && synchronous_commit > SYNCHRONOUS_COMMIT_LOCAL_FLUSH)


PG Bug reporting form <noreply@postgresql.org> writes:
> [ $subject ]

I don't think this is a bug; you are just misunderstanding the guarantee
that synchronous replication offers.  In syncrep mode, the primary server
commits and then waits for some number of standbys to acknowledge having
replicated that commit action before it tells the client the commit is
complete.  Killing the primary during that wait does not, and cannot,
cause the commit not to have happened.
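
As a rough illustration of that ordering, here is a toy model in C. It is
not PostgreSQL source: every name in it (ToyCommit, toy_commit, and so on)
is hypothetical, and it only mirrors the sequence described above: flush
the commit locally, then wait for the standby, then answer the client.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a syncrep commit; not PostgreSQL code. */
typedef struct
{
    bool commit_flushed_locally; /* commit record is durable on the primary */
    bool standby_acked;          /* a synchronous standby confirmed it */
    bool client_notified;        /* the client received the COMMIT reply */
} ToyCommit;

/* Step 1: make the commit durable on the primary. */
static void
toy_flush_commit(ToyCommit *c)
{
    c->commit_flushed_locally = true;
}

/*
 * Step 2: wait for the standby. In this model the wait simply fails when
 * the standby is down, standing in for "the primary was stopped while
 * the session was still waiting".
 */
static bool
toy_wait_for_standby(ToyCommit *c, bool standby_up)
{
    c->standby_acked = standby_up;
    return standby_up;
}

/* Returns true only if the client was told that the commit completed. */
static bool
toy_commit(ToyCommit *c, bool standby_up)
{
    assert(c != NULL);
    toy_flush_commit(c);                 /* the commit has happened here */
    if (!toy_wait_for_standby(c, standby_up))
        return false;                    /* interrupted: no reply sent */
    c->client_notified = true;
    return true;
}

/* After a restart, visibility depends only on the local flush. */
static bool
toy_visible_after_restart(const ToyCommit *c)
{
    return c->commit_flushed_locally;
}
```

With the standby down, toy_commit() returns false and client_notified
stays false, yet toy_visible_after_restart() still returns true. That
matches what the report shows: the rows are there after the restart even
though the client never saw the COMMIT complete.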

What I think you are looking for is two-phase commit, which is a whole
different animal that is far more complex and expensive than syncrep.

            regards, tom lane