Thread: pgsql: Efficient transaction-controlled synchronous replication.

pgsql: Efficient transaction-controlled synchronous replication.

From
Simon Riggs
Date:
Efficient transaction-controlled synchronous replication.
If a standby is broadcasting reply messages and we have named
one or more standbys in synchronous_standby_names then allow
users who set synchronous_replication to wait for commit, which
then provides strict data integrity guarantees. Design avoids
sending and receiving transaction state information so minimises
bookkeeping overheads. We synchronize with the highest priority
standby that is connected and ready to synchronize. Other standbys
can be defined to take over in case of standby failure.
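
As an illustration of the "highest priority standby that is connected and ready" rule, here is a minimal sketch in C. It is not the committed code: the struct, the function names, and the assumption that priority follows a standby's position in the synchronous_standby_names list (earlier means higher priority) are all hypothetical simplifications made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one connected standby. */
typedef struct StandbySketch
{
    const char *name;   /* name the standby identifies itself with */
    int         ready;  /* nonzero once it can acknowledge WAL */
} StandbySketch;

/*
 * Assumed priority rule: position in the configured name list,
 * 1 being the highest priority; 0 means "not a listed standby".
 */
static int
sketch_priority(const char *const *sync_names, size_t n_names,
                const char *name)
{
    for (size_t i = 0; i < n_names; i++)
        if (strcmp(sync_names[i], name) == 0)
            return (int) (i + 1);
    return 0;
}

/*
 * Return the index of the standby to synchronize with, or -1 if no
 * listed standby is currently connected and ready.
 */
int
sketch_pick_sync_standby(const char *const *sync_names, size_t n_names,
                         const StandbySketch *standbys, size_t n_standbys)
{
    int best = -1;
    int best_priority = 0;

    for (size_t i = 0; i < n_standbys; i++)
    {
        int p;

        assert(standbys[i].name != NULL);
        if (!standbys[i].ready)
            continue;           /* connected but not ready yet */
        p = sketch_priority(sync_names, n_names, standbys[i].name);
        if (p == 0)
            continue;           /* not named as a synchronous standby */
        if (best == -1 || p < best_priority)
        {
            best = (int) i;
            best_priority = p;
        }
    }
    return best;
}
```

In this sketch, a lower-priority standby is chosen only while every higher-priority one is missing or not ready, which is the takeover behaviour the message describes.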

This version has very strict behaviour; more relaxed options
may be added at a later date.

Simon Riggs and Fujii Masao, with reviews by Yeb Havinga, Jaime
Casanova, Heikki Linnakangas and Robert Haas, plus the assistance
of many other design reviewers.

Branch
------
master

Details
-------
http://git.postgresql.org/pg/commitdiff/a8a8a3e0965201df88bdfdff08f50e5c06c552b7

Modified Files
--------------
doc/src/sgml/config.sgml                      |   86 +++++++++++
doc/src/sgml/high-availability.sgml           |  203 +++++++++++++++++++++++++
doc/src/sgml/monitoring.sgml                  |    7 +-
src/backend/access/transam/twophase.c         |   25 +++
src/backend/access/transam/xact.c             |   11 ++-
src/backend/catalog/system_views.sql          |    4 +-
src/backend/postmaster/autovacuum.c           |    7 +
src/backend/postmaster/postmaster.c           |    3 +
src/backend/replication/Makefile              |    2 +-
src/backend/replication/walreceiver.c         |    9 +-
src/backend/replication/walsender.c           |   65 +++++++-
src/backend/storage/ipc/shmqueue.c            |   21 +++-
src/backend/storage/lmgr/proc.c               |   12 ++
src/backend/utils/misc/guc.c                  |   19 +++
src/backend/utils/misc/postgresql.conf.sample |   11 ++-
src/include/catalog/pg_proc.h                 |    2 +-
src/include/replication/walsender.h           |   22 +++
src/include/storage/lwlock.h                  |    1 +
src/include/storage/proc.h                    |   14 ++
src/include/storage/shmem.h                   |    3 +
src/test/regress/expected/rules.out           |    2 +-
21 files changed, 507 insertions(+), 22 deletions(-)


Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Andrew Dunstan
Date:

On 03/06/2011 05:51 PM, Simon Riggs wrote:
> Efficient transaction-controlled synchronous replication.
>

I'm glad this is in, but I thought we agreed NOT to call it "synchronous
replication".

cheers

andrew

Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> Efficient transaction-controlled synchronous replication.

This patch broke the build.  Kindly fix or revert at once.

            regards, tom lane

Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Jaime Casanova
Date:
On Sun, Mar 6, 2011 at 6:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> Efficient transaction-controlled synchronous replication.
>
> This patch broke the build.  Kindly fix or revert at once.
>

It seems Simon forgot to include src/include/replication/syncrep.h in the commit.

--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training

Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Jaime Casanova
Date:
On Sun, Mar 6, 2011 at 6:36 PM, Jaime Casanova <jaime@2ndquadrant.com> wrote:
> On Sun, Mar 6, 2011 at 6:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Simon Riggs <simon@2ndQuadrant.com> writes:
>>> Efficient transaction-controlled synchronous replication.
>>
>> This patch broke the build.  Kindly fix or revert at once.
>>
>
> It seems Simon forgot to include src/include/replication/syncrep.h in the commit.
>

The commit doesn't include src/backend/replication/syncrep.c either.


--
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: PostgreSQL support and training

Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Simon Riggs
Date:
On Sun, 2011-03-06 at 18:28 -0500, Tom Lane wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
> > Efficient transaction-controlled synchronous replication.
>
> This patch broke the build.  Kindly fix or revert at once.

I think that's fixed it now. I was in the middle of doing that when your
last commit hit, so I had to rewind and try again.

--
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services



Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Fujii Masao
Date:
On Mon, Mar 7, 2011 at 7:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> Efficient transaction-controlled synchronous replication.
> If a standby is broadcasting reply messages and we have named
> one or more standbys in synchronous_standby_names then allow
> users who set synchronous_replication to wait for commit, which
> then provides strict data integrity guarantees. Design avoids
> sending and receiving transaction state information so minimises
> bookkeeping overheads. We synchronize with the highest priority
> standby that is connected and ready to synchronize. Other standbys
> can be defined to take over in case of standby failure.
>
> This version has very strict behaviour; more relaxed options
> may be added at a later date.

Pretty cool! I very much appreciate your efforts and contributions.

And I found one bug ;) You seem to have wrongly removed the check
of max_wal_senders in SyncRepWaitForLSN. This can make the
backend wait for replication even if max_wal_senders = 0. I could reproduce
this problem on my machine. The attached patch fixes it.
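
To make the reported condition concrete, here is a minimal sketch in C of the kind of guard under discussion. It is neither the attached patch nor any committed fix, neither of which is reproduced in this thread; the struct, its fields, and the function name are hypothetical stand-ins for the real settings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of the settings relevant to the wait decision. */
typedef struct SyncRepSettingsSketch
{
    bool sync_rep_requested;     /* user asked to wait for the standby */
    int  max_wal_senders;        /* 0 means no walsender can ever start */
    bool sync_standbys_defined;  /* synchronous_standby_names is non-empty */
} SyncRepSettingsSketch;

/*
 * Decide whether a committing backend should wait for a standby.
 * With max_wal_senders = 0 no standby can connect, so no reply can
 * ever arrive and waiting would block the backend indefinitely.
 */
bool
sketch_must_wait_for_standby(const SyncRepSettingsSketch *s)
{
    assert(s != NULL);

    if (!s->sync_rep_requested)
        return false;
    if (s->max_wal_senders <= 0)
        return false;            /* the check reported as missing */
    if (!s->sync_standbys_defined)
        return false;
    return true;
}
```

Without the max_wal_senders test, the function in this sketch would return true for a server that can never have a standby attached, which matches the hang described above.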

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Simon Riggs
Date:
On Mon, 2011-03-07 at 17:27 +0900, Fujii Masao wrote:
> On Mon, Mar 7, 2011 at 7:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

> And I found one bug ;) You seem to have wrongly removed the check
> of max_wal_senders in SyncRepWaitForLSN. This can make the
> backend wait for replication even if max_wal_senders = 0. I could reproduce
> this problem on my machine. The attached patch fixes it.

There may be a bug, but that's not the fix.

I spotted that issue myself in testing. I added a check that rejects
setting synchronous_standby_names when max_wal_senders is zero and
reports an error message.

Are you saying the committed version doesn't trigger that ERROR?

--
 Simon Riggs           http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services



Re: pgsql: Efficient transaction-controlled synchronous replication.

From
Robert Haas
Date:
On Mon, Mar 7, 2011 at 3:27 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Mar 7, 2011 at 7:51 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
>> Efficient transaction-controlled synchronous replication.
>> If a standby is broadcasting reply messages and we have named
>> one or more standbys in synchronous_standby_names then allow
>> users who set synchronous_replication to wait for commit, which
>> then provides strict data integrity guarantees. Design avoids
>> sending and receiving transaction state information so minimises
>> bookkeeping overheads. We synchronize with the highest priority
>> standby that is connected and ready to synchronize. Other standbys
>> can be defined to take over in case of standby failure.
>>
>> This version has very strict behaviour; more relaxed options
>> may be added at a later date.
>
> Pretty cool! I very much appreciate your efforts and contributions.
>
> And I found one bug ;) You seem to have wrongly removed the check
> of max_wal_senders in SyncRepWaitForLSN. This can make the
> backend wait for replication even if max_wal_senders = 0. I could reproduce
> this problem on my machine. The attached patch fixes it.

I committed a slightly different fix for this problem.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company