Re: Synchronous replication patch built on SR - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Synchronous replication patch built on SR
Msg-id 201004302057.o3UKvWj26902@momjian.us
In response to Synchronous replication patch built on SR  (zb@cybertec.at)
Responses Re: Synchronous replication patch built on SR  (Boszormenyi Zoltan <zb@cybertec.at>)
List pgsql-hackers
Please add it to the next commit-fest:
https://commitfest.postgresql.org/action/commitfest_view/inprogress

---------------------------------------------------------------------------

zb@cybertec.at wrote:
> Resending, my ISP lost my mail yesterday. :-(
> 
> ===========================================================
> 
> Hi,
> 
> Attached is a patch that does $SUBJECT; we are submitting it for 9.1.
> I have updated it to today's CVS after the "wal_level" GUC went in.
> 
> How does it work?
> 
> First, the walreceiver and the walsender are now able to communicate
> in a duplex way on the same connection, so while COPY OUT is
> in progress from the primary server, the standby server is able to
> issue PQputCopyData() to pass the transaction IDs that were seen
> with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE
> signatures. I did this by adding a new protocol message type, with letter
> 'x', that is only acknowledged by the walsender process. The regular
> backend was intentionally left unchanged, so an SQL client gets a protocol
> error. A new libpq call, PQsetDuplexCopy(), sends this
> new message before sending START_REPLICATION. The primary
> makes a note of it in the walsender process' entry.
> 
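
To make the 'x' handshake described above concrete, here is a minimal sketch of the acceptance rule only: a walsender records the duplex request, while a regular backend rejects it. None of this is taken from the patch; BackendKind, BackendEntry, MsgResult and handle_duplex_request() are hypothetical names, and only the 'x' message letter comes from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical types; the patch's real structures are not shown in the mail. */
typedef enum { BACKEND_REGULAR, BACKEND_WALSENDER } BackendKind;
typedef enum { MSG_ACCEPTED, MSG_PROTOCOL_ERROR, MSG_NOT_HANDLED } MsgResult;

typedef struct
{
    BackendKind kind;
    bool        duplex_copy;    /* set once the 'x' message has been seen */
} BackendEntry;

/*
 * Handle the duplex-copy request message.
 * Only a walsender acknowledges 'x'; a regular backend reports a
 * protocol error. Every other message type is outside this sketch.
 */
MsgResult
handle_duplex_request(BackendEntry *be, char msgtype)
{
    assert(be != NULL);

    if (msgtype != 'x')
        return MSG_NOT_HANDLED;

    if (be->kind != BACKEND_WALSENDER)
        return MSG_PROTOCOL_ERROR;

    /* The primary makes a note of the request in the walsender's entry. */
    be->duplex_copy = true;
    return MSG_ACCEPTED;
}
```
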
> I had to move the TransactionIdLatest(xid, nchildren, children) call
> that computes latestXid earlier in RecordTransactionCommit(), so
> it's in the critical section now, just before the
> XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata)
> call. Otherwise, there was a race condition between the primary
> and the standby server, where the standby server might have seen
> the XLOG_XACT_COMMIT record for some XIDs before the
> transaction in the primary server marked itself waiting for this XID,
> resulting in stuck transactions.
> 
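
The ordering problem described above can be illustrated with a small single-threaded model. This is only a sketch under assumptions: WaitTable, register_waiter() and report_commit_seen() are hypothetical stand-ins for whatever shared state the patch uses, and a standby report that finds no registered waiter is assumed to be dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WAITERS 8

/* Hypothetical model of a transaction waiting for standby reports. */
typedef struct
{
    uint32_t xid;
    bool     in_use;
    bool     released;
} Waiter;

typedef struct
{
    Waiter slots[MAX_WAITERS];
} WaitTable;

/* Mark a transaction as waiting for xid; returns the slot index or -1. */
int
register_waiter(WaitTable *t, uint32_t xid)
{
    assert(t != NULL);

    for (int i = 0; i < MAX_WAITERS; i++)
    {
        if (!t->slots[i].in_use)
        {
            t->slots[i].xid = xid;
            t->slots[i].in_use = true;
            t->slots[i].released = false;
            return i;
        }
    }
    return -1;
}

/*
 * A standby reports that it has seen the commit record for xid.
 * Returns true if a waiter was found and released, false if the
 * report arrived before anyone was waiting (the report is lost).
 */
bool
report_commit_seen(WaitTable *t, uint32_t xid)
{
    assert(t != NULL);

    for (int i = 0; i < MAX_WAITERS; i++)
    {
        Waiter *w = &t->slots[i];

        if (w->in_use && !w->released && w->xid == xid)
        {
            w->released = true;
            return true;
        }
    }
    return false;
}
```

In this model, calling report_commit_seen() before register_waiter() leaves the waiter unreleased, which corresponds to the stuck transaction; registering first, as the reordering in RecordTransactionCommit() is meant to guarantee, lets the report release it.
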
> I have added 3 new options, two GUCs in postgresql.conf and one
> setting in recovery.conf. These options are:
> 
> 1. min_sync_replication_clients = N
> 
> where N is the number of standby reports required for a given transaction
> before it's released as committed synchronously. 0 means completely
> asynchronous; the value is capped at max_wal_senders. Anything
> in between 0 and max_wal_senders means different levels of partially
> synchronous replication.
> 
> 2. strict_sync_replication = boolean
> 
> where the expected number of synchronous reports from standby
> servers is further limited to the actual number of connected synchronous
> standby servers if the value of this GUC is false. This means that if
> no standby servers are connected yet then the replication is asynchronous
> and transactions are allowed to finish without waiting for synchronous
> reports. If the value of this GUC is true, then transactions wait until
> enough synchronous standbys connect and report back.
> 
> 3. synchronous_slave = boolean (in recovery.conf)
> 
> this instructs the standby server to tell the primary that it's a
> synchronous replication server and it will send the committed XIDs
> back to the primary.
> 
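
One possible reading of how options 1 and 2 combine, as a minimal sketch: the function below computes how many standby reports a committing transaction would wait for. The function name and its parameters are hypothetical and not taken from the patch; only the GUC names and the described semantics come from the mail.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: number of synchronous reports a transaction
 * waits for before it is released as committed.
 */
int
required_sync_reports(int min_sync_replication_clients,
                      bool strict_sync_replication,
                      int max_wal_senders,
                      int connected_sync_standbys)
{
    int needed = min_sync_replication_clients;

    assert(max_wal_senders >= 0 && connected_sync_standbys >= 0);

    /* 0 (or less) means completely asynchronous. */
    if (needed < 0)
        needed = 0;

    /* The setting cannot exceed max_wal_senders. */
    if (needed > max_wal_senders)
        needed = max_wal_senders;

    /*
     * Non-strict mode: further limit the count to the synchronous
     * standbys that are actually connected, so with none connected
     * the commit does not wait at all.
     */
    if (!strict_sync_replication && needed > connected_sync_standbys)
        needed = connected_sync_standbys;

    return needed;
}
```

With strict_sync_replication set to true the result does not depend on the connected count, which matches the description that transactions wait until enough synchronous standbys connect and report back.
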
> I also added a contrib module for monitoring the synchronous replication,
> but it abuses the procarray.c code by exposing the procArray pointer,
> which is ugly. It either needs to be abandoned or moved to core if or when
> this code is discussed enough.  :-)
> 
> Best regards,
> Zoltán Böszörményi

[ Attachment, skipping... ]


--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

