Thread: Review: Patch for Synchronous Replication
This is a basic review of Fujii Masao's synchronous replication patch from http://archives.postgresql.org/message-id/AANLkTik2c3kV7HgJnM4MjkCWVG-QvDJXD3iR9TqsCnpP@mail.gmail.com Review Description ================== This patch extends existing asynchronous streaming replication with options to enable different levels of synchronous behaviour. On the primary, the is a new standbys.conf file containing a manifest of standbys it seeks a form a confirmation from before committing transactions. These contain standby name, level of synchronicity (if it wasn't a word, it is now), and a timeout value specifying how long the primary is willing to wait for confirmation before aborting the transaction. On the standby, the new option "standby_name" has been added to publish a named identity of the standby server to the primary. Patch application ================= The patch applies cleanly to HEAD. All regression tests pass successfully as expected. Testing ======= I configured the primary's standby.conf to provide synchronous replication to a single slave using fsync as it's replication level and a timeout of 100ms. The wal_level was set to hot_standby in postgresql.conf. The standby's recovery.conf had its standby_name value set to match that expected by the primary's standby.conf. This is summarised as follows: ## primary postgresql.conf: wal_level = hot_standby max_wal_senders = 2 wal_keep_segments = 2 ## primary standbys.conf # STANDBY NAME SYNCHRONOUS TIMEOUT cougar fsync 100ms ## primary pg_hba.conf # TYPE DATABASE USER CIDR-ADDRESS METHOD host replication postgres 192.168.102.17/32 trust ## standby postgresql.conf hot_standby = on ## standby recovery.conf standby_name = 'cougar' standby_mode = 'on' primary_conninfo = 'host=192.168.102.125 port=5432' Issues ====== The primary started up fine and was accepting connections. A base backup was taken for the standby, and after configuring the standby, I attempted to bring it up, but received the following error: postgres@cougar:~/project/data$ pg_ctl start server starting postgres@cougar:~/project/data$ LOG: database system was shut down in recovery at 2010-09-29 22:52:24 BST LOG: entering standby mode LOG: redo starts at 0/1000020 LOG: record with zero length at 0/10000B0 FATAL: could not connect to the primary server: invalid connection option "standby_name" I believe I am using the correct parameter as I followed the additional documentation provided in the patch. Conclusion ========== It at least appears to me that it isn't functional in its current state, or there is setup information missing from the documentation.... or I've make a stupid mistake somewhere (likely). -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935
On 29 September 2010 23:29, Thom Brown <thom@linux.com> wrote: > This is a basic review of Fujii Masao's synchronous replication patch > from http://archives.postgresql.org/message-id/AANLkTik2c3kV7HgJnM4MjkCWVG-QvDJXD3iR9TqsCnpP@mail.gmail.com > > Review Description > ================== > This patch extends existing asynchronous streaming replication with > options to enable different levels of synchronous behaviour. On the > primary, the is a new standbys.conf file containing a manifest of > standbys it seeks a form a confirmation from before committing > transactions. These contain standby name, level of synchronicity (if > it wasn't a word, it is now), and a timeout value specifying how long > the primary is willing to wait for confirmation before aborting the > transaction. On the standby, the new option "standby_name" has been > added to publish a named identity of the standby server to the > primary. > > > Patch application > ================= > The patch applies cleanly to HEAD. All regression tests pass > successfully as expected. > > > Testing > ======= > I configured the primary's standby.conf to provide synchronous > replication to a single slave using fsync as it's replication level > and a timeout of 100ms. The wal_level was set to hot_standby in > postgresql.conf. The standby's recovery.conf had its standby_name > value set to match that expected by the primary's standby.conf. This > is summarised as follows: > > ## primary postgresql.conf: > wal_level = hot_standby > max_wal_senders = 2 > wal_keep_segments = 2 > > ## primary standbys.conf > # STANDBY NAME SYNCHRONOUS TIMEOUT > cougar fsync 100ms > > ## primary pg_hba.conf > # TYPE DATABASE USER CIDR-ADDRESS METHOD > host replication postgres 192.168.102.17/32 trust > > ## standby postgresql.conf > hot_standby = on > > ## standby recovery.conf > standby_name = 'cougar' > standby_mode = 'on' > primary_conninfo = 'host=192.168.102.125 port=5432' > > > Issues > ====== > > The primary started up fine and was accepting connections. A base > backup was taken for the standby, and after configuring the standby, I > attempted to bring it up, but received the following error: > > postgres@cougar:~/project/data$ pg_ctl start > server starting > postgres@cougar:~/project/data$ LOG: database system was shut down in > recovery at 2010-09-29 22:52:24 BST > LOG: entering standby mode > LOG: redo starts at 0/1000020 > LOG: record with zero length at 0/10000B0 > FATAL: could not connect to the primary server: invalid connection > option "standby_name" > > I believe I am using the correct parameter as I followed the > additional documentation provided in the patch. > > > Conclusion > ========== > > It at least appears to me that it isn't functional in its current > state, or there is setup information missing from the > documentation.... or I've make a stupid mistake somewhere (likely). Quick back-peddle... it appears the patch was only successful on the standby. Only doc changes appeared to make it to the primary. Re-attempt tomorrow. Apologies. -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935
On 29 September 2010 23:57, Thom Brown <thom@linux.com> wrote: > On 29 September 2010 23:29, Thom Brown <thom@linux.com> wrote: >> This is a basic review of Fujii Masao's synchronous replication patch >> from http://archives.postgresql.org/message-id/AANLkTik2c3kV7HgJnM4MjkCWVG-QvDJXD3iR9TqsCnpP@mail.gmail.com >> >> Review Description >> ================== >> This patch extends existing asynchronous streaming replication with >> options to enable different levels of synchronous behaviour. On the >> primary, the is a new standbys.conf file containing a manifest of >> standbys it seeks a form a confirmation from before committing >> transactions. These contain standby name, level of synchronicity (if >> it wasn't a word, it is now), and a timeout value specifying how long >> the primary is willing to wait for confirmation before aborting the >> transaction. On the standby, the new option "standby_name" has been >> added to publish a named identity of the standby server to the >> primary. >> >> >> Patch application >> ================= >> The patch applies cleanly to HEAD. All regression tests pass >> successfully as expected. >> >> >> Testing >> ======= >> I configured the primary's standby.conf to provide synchronous >> replication to a single slave using fsync as it's replication level >> and a timeout of 100ms. The wal_level was set to hot_standby in >> postgresql.conf. The standby's recovery.conf had its standby_name >> value set to match that expected by the primary's standby.conf. This >> is summarised as follows: >> >> ## primary postgresql.conf: >> wal_level = hot_standby >> max_wal_senders = 2 >> wal_keep_segments = 2 >> >> ## primary standbys.conf >> # STANDBY NAME SYNCHRONOUS TIMEOUT >> cougar fsync 100ms >> >> ## primary pg_hba.conf >> # TYPE DATABASE USER CIDR-ADDRESS METHOD >> host replication postgres 192.168.102.17/32 trust >> >> ## standby postgresql.conf >> hot_standby = on >> >> ## standby recovery.conf >> standby_name = 'cougar' >> standby_mode = 'on' >> primary_conninfo = 'host=192.168.102.125 port=5432' >> >> >> Issues >> ====== >> >> The primary started up fine and was accepting connections. A base >> backup was taken for the standby, and after configuring the standby, I >> attempted to bring it up, but received the following error: >> >> postgres@cougar:~/project/data$ pg_ctl start >> server starting >> postgres@cougar:~/project/data$ LOG: database system was shut down in >> recovery at 2010-09-29 22:52:24 BST >> LOG: entering standby mode >> LOG: redo starts at 0/1000020 >> LOG: record with zero length at 0/10000B0 >> FATAL: could not connect to the primary server: invalid connection >> option "standby_name" >> >> I believe I am using the correct parameter as I followed the >> additional documentation provided in the patch. >> >> >> Conclusion >> ========== >> >> It at least appears to me that it isn't functional in its current >> state, or there is setup information missing from the >> documentation.... or I've make a stupid mistake somewhere (likely). > > Quick back-peddle... it appears the patch was only successful on the > standby. Only doc changes appeared to make it to the primary. > Re-attempt tomorrow. Apologies. Well, that doesn't seem to have made any difference. Confirmed the patch was applied in both cases, rebuilt, base backup again etc... no change. Same error as in original review. *shrug* -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935
Thom - Just as a logistical note, reviews should be posted to -hackers. -rrreviewers is for discussion of assigning reviewers, coordinating who is doing what, etc. ...Robert On Wed, Sep 29, 2010 at 6:29 PM, Thom Brown <thom@linux.com> wrote: > This is a basic review of Fujii Masao's synchronous replication patch > from http://archives.postgresql.org/message-id/AANLkTik2c3kV7HgJnM4MjkCWVG-QvDJXD3iR9TqsCnpP@mail.gmail.com > > Review Description > ================== > This patch extends existing asynchronous streaming replication with > options to enable different levels of synchronous behaviour. On the > primary, the is a new standbys.conf file containing a manifest of > standbys it seeks a form a confirmation from before committing > transactions. These contain standby name, level of synchronicity (if > it wasn't a word, it is now), and a timeout value specifying how long > the primary is willing to wait for confirmation before aborting the > transaction. On the standby, the new option "standby_name" has been > added to publish a named identity of the standby server to the > primary. > > > Patch application > ================= > The patch applies cleanly to HEAD. All regression tests pass > successfully as expected. > > > Testing > ======= > I configured the primary's standby.conf to provide synchronous > replication to a single slave using fsync as it's replication level > and a timeout of 100ms. The wal_level was set to hot_standby in > postgresql.conf. The standby's recovery.conf had its standby_name > value set to match that expected by the primary's standby.conf. This > is summarised as follows: > > ## primary postgresql.conf: > wal_level = hot_standby > max_wal_senders = 2 > wal_keep_segments = 2 > > ## primary standbys.conf > # STANDBY NAME SYNCHRONOUS TIMEOUT > cougar fsync 100ms > > ## primary pg_hba.conf > # TYPE DATABASE USER CIDR-ADDRESS METHOD > host replication postgres 192.168.102.17/32 trust > > ## standby postgresql.conf > hot_standby = on > > ## standby recovery.conf > standby_name = 'cougar' > standby_mode = 'on' > primary_conninfo = 'host=192.168.102.125 port=5432' > > > Issues > ====== > > The primary started up fine and was accepting connections. A base > backup was taken for the standby, and after configuring the standby, I > attempted to bring it up, but received the following error: > > postgres@cougar:~/project/data$ pg_ctl start > server starting > postgres@cougar:~/project/data$ LOG: database system was shut down in > recovery at 2010-09-29 22:52:24 BST > LOG: entering standby mode > LOG: redo starts at 0/1000020 > LOG: record with zero length at 0/10000B0 > FATAL: could not connect to the primary server: invalid connection > option "standby_name" > > I believe I am using the correct parameter as I followed the > additional documentation provided in the patch. > > > Conclusion > ========== > > It at least appears to me that it isn't functional in its current > state, or there is setup information missing from the > documentation.... or I've make a stupid mistake somewhere (likely). > > -- > Thom Brown > Twitter: @darkixion > IRC (freenode): dark_ixion > Registered Linux user: #516935 > > -- > Sent via pgsql-rrreviewers mailing list (pgsql-rrreviewers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-rrreviewers > -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company