Thread: Synchronous replication patch built on SR
Resending, my ISP lost my mail yesterday. :-(

===========================================================

Hi,

attached is a patch that does $SUBJECT; we are submitting it for 9.1.
I have updated it to today's CVS after the "wal_level" GUC went in.

How does it work?

First, the walreceiver and the walsender are now able to communicate
in a duplex way on the same connection, so while COPY OUT is
in progress from the primary server, the standby server is able to
issue PQputCopyData() to pass back the transaction IDs that were seen
with XLOG_XACT_COMMIT or XLOG_XACT_PREPARE signatures. I did this by
adding a new protocol message type, with letter 'x', that is only
acknowledged by the walsender process. The regular backend was
intentionally left unchanged, so an SQL client gets a protocol error.
A new libpq call, PQsetDuplexCopy(), sends this new message before
sending START_REPLICATION. The primary makes a note of it in the
walsender process' entry.

I had to move the TransactionIdLatest(xid, nchildren, children) call
that computes latestXid earlier in RecordTransactionCommit(), so
it's in the critical section now, just before the
XLogInsert(RM_XACT_ID, XLOG_XACT_COMMIT, rdata) call. Otherwise,
there was a race condition between the primary and the standby
server: the standby server might have seen the XLOG_XACT_COMMIT
record for some XIDs before the transaction in the primary server
marked itself as waiting for this XID, resulting in stuck transactions.

I have added 3 new options: two GUCs in postgresql.conf and one
setting in recovery.conf. These options are:

1. min_sync_replication_clients = N

   where N is the number of standby reports required for a given
   transaction before it's released as committed synchronously.
   0 means completely asynchronous; the value is capped at
   max_wal_senders. Anything in between 0 and max_wal_senders means
   different levels of partially synchronous replication.

2. strict_sync_replication = boolean

   If the value of this GUC is false, the expected number of
   synchronous reports from standby servers is further limited to the
   actual number of connected synchronous standby servers. This means
   that if no standby servers are connected yet, then the replication
   is asynchronous and transactions are allowed to finish without
   waiting for synchronous reports. If the value of this GUC is true,
   then transactions wait until enough synchronous standbys connect
   and report back.

3. synchronous_slave = boolean (in recovery.conf)

   This instructs the standby server to tell the primary that it's a
   synchronous replication server and that it will send the committed
   XIDs back to the primary.

I also added a contrib module for monitoring the synchronous replication,
but it abuses the procarray.c code by exposing the procArray pointer,
which is ugly. It either needs to be abandoned or moved to core if or
when this code is discussed enough. :-)

Best regards,
Zoltán Böszörményi
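To make the interaction of the two postgresql.conf settings concrete, here is a minimal sketch in C of how the number of awaited standby reports could be derived from the descriptions above. The function name required_sync_reports() and the connected_sync_standbys parameter are hypothetical and are not taken from the patch; only the GUC names and max_wal_senders come from the message.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, NOT taken from the patch: computes how many
 * standby reports a committing transaction would wait for, based only
 * on the GUC descriptions in the message above.
 */
static int
required_sync_reports(int min_sync_replication_clients,
                      bool strict_sync_replication,
                      int max_wal_senders,
                      int connected_sync_standbys)
{
    int required = min_sync_replication_clients;

    /* The setting is capped at max_wal_senders and cannot be negative. */
    if (required > max_wal_senders)
        required = max_wal_senders;
    if (required < 0)
        required = 0;

    /*
     * Non-strict mode: never wait for more synchronous standbys than
     * are currently connected (assumed to be tracked by the caller).
     */
    if (!strict_sync_replication && required > connected_sync_standbys)
        required = connected_sync_standbys;

    return required;
}
```

With min_sync_replication_clients = 2 and no synchronous standby connected, this sketch yields 0 when strict_sync_replication is false (the commit behaves asynchronously) and 2 when it is true (the commit waits until two standbys connect and report).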
Attachment
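The race condition described in the first message can be illustrated with a small single-threaded model in C. This is only a sketch under assumptions: the SyncWaiter struct, the fixed-size table, and the functions register_waiter(), standby_report() and reports_for() are all hypothetical, and the model assumes that a standby report for an XID nobody is waiting on yet is simply dropped. The patch's real data structures are not shown in the message.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Hypothetical model of one transaction waiting on the primary. */
typedef struct
{
    unsigned int xid;      /* transaction ID being waited on */
    int          reports;  /* standby reports counted so far */
    bool         in_use;   /* slot occupied? */
} SyncWaiter;

/* Static storage is zero-initialized, so all slots start unused. */
static SyncWaiter waiters[MAX_WAITERS];

/* Mark a transaction as waiting; returns false if the table is full. */
static bool
register_waiter(unsigned int xid)
{
    for (size_t i = 0; i < MAX_WAITERS; i++)
    {
        if (!waiters[i].in_use)
        {
            waiters[i].xid = xid;
            waiters[i].reports = 0;
            waiters[i].in_use = true;
            return true;
        }
    }
    return false;
}

/*
 * A standby reported that it has seen the commit record for xid.
 * Returns false when no transaction is registered for that XID yet,
 * in which case the report is dropped (an assumption of this model).
 */
static bool
standby_report(unsigned int xid)
{
    for (size_t i = 0; i < MAX_WAITERS; i++)
    {
        if (waiters[i].in_use && waiters[i].xid == xid)
        {
            waiters[i].reports++;
            return true;
        }
    }
    return false;
}

/* Reports counted for xid, or -1 if the XID is not registered. */
static int
reports_for(unsigned int xid)
{
    for (size_t i = 0; i < MAX_WAITERS; i++)
    {
        if (waiters[i].in_use && waiters[i].xid == xid)
            return waiters[i].reports;
    }
    return -1;
}
```

In this model, calling standby_report(100) before register_waiter(100) drops the report, so the waiter for XID 100 stays at zero reports and would never be released. Registering first and reporting afterwards counts the report, which is the ordering the message describes enforcing by doing the bookkeeping before the XLogInsert() call.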
Please add it to the next commit-fest:

    https://commitfest.postgresql.org/action/commitfest_view/inprogress

---------------------------------------------------------------------------

zb@cybertec.at wrote:
> Resending, my ISP lost my mail yesterday. :-(
>
> attached is a patch that does $SUBJECT, we are submitting it for 9.1.
> I have updated it to today's CVS after the "wal_level" GUC went in.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
Hi,

Bruce Momjian wrote:
> Please add it to the next commit-fest:
>
> https://commitfest.postgresql.org/action/commitfest_view/inprogress

It was already added two days ago:
https://commitfest.postgresql.org/action/patch_view?id=297

Best regards,
Zoltán Böszörményi

--
Bible has answers for everything. Proof:
"But let your communication be, Yea, yea; Nay, nay:
for whatsoever is more than these cometh of evil." (Matthew 5:37) - basics
of digital technology.
"May your kingdom come" - superficial description of plate tectonics

----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
http://www.postgresql.at/