Issues with two-server Synch Rep - Mailing list pgsql-hackers

From Josh Berkus
Subject Issues with two-server Synch Rep
Date
Msg-id 4CAE0BFF.5010806@agliodbs.com
List pgsql-hackers

Simon, Fujii,

What follows are what I see as the major issues with making two-server
synch replication work well.  I would like to have you each answer them,
explaining how your patch and your design address each issue.  I
believe this will go a long way towards helping the majority of the
community understand the options we have from your code, as well as
where help is still needed.

Adding a Synch Standby
-----------------------
What is the procedure for adding a new synchronous standby in your
implementation?  That is, how do we go from having a standby server with
an empty PGDATA to having a working synchronous standby?

Snapshot Publication
---------------------
During 9.0 development discussion, one of the things we realized we
needed for synch standby was publication of snapshots back to the master
in order to prevent query cancel on the standby.  Without this, the
synch standby is useless for running read queries.  Does your patch
implement this?  Please describe.

Management
-----------
One of the serious flaws currently in HS/SR is complexity of
administration.  Setting up and configuring even a single master and
single standby requires editing up to 6 configuration files in Postgres,
as well as dealing with file permissions.  As such, any Synch Rep patch
must work together with attempts to simplify administration.  How does
your design do this?

Monitoring
-----------
Synch rep imposes severe penalties on availability if a synch standby
gets behind or goes down.  What replication-specific monitoring tools
and hooks are available to allow administrators to take action before the
database becomes unavailable?

Degradation
------------
In the event that the synch rep standby falls too far behind, becomes
unavailable, or is deliberately taken offline, what process do you
envision for the DBA to resolve the situation?  Is there any
ability to commit "stuck" transactions?

Client Consistency
---------------------
With a standby in "apply" mode, and a master failure at the wrong time,
there is the possibility that the standby will apply a transaction at
the same time that the master crashes, so that the client never
receives a commit message.  Once the client reconnects to the standby,
how will it know whether its transaction was committed or not?

As a lesser case, a standby in "apply" mode will show the results of
committed transactions *before* they are visible on the master.  Is
there any need to handle this?  If so, how?
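
To make the two cases above concrete, here is a minimal toy model in
Python.  It is not PostgreSQL code and not taken from either patch; the
names Server, commit_with_apply and crash_before_ack are hypothetical,
and the ordering (standby applies first, then the master makes the
transaction visible and acknowledges the client) is an assumption about
how "apply" mode might behave.

```python
class Server:
    """Toy stand-in for a database server: tracks visible transaction IDs."""
    def __init__(self):
        self.applied = set()


class MasterCrashed(Exception):
    """Raised when the simulated master dies before acknowledging the client."""


def commit_with_apply(master, standby, xid, crash_before_ack=False):
    # Assumed "apply" mode ordering: the standby applies the transaction
    # before the master makes it visible and acknowledges the client.
    standby.applied.add(xid)
    if crash_before_ack:
        # The master fails inside the window; the client gets no reply.
        raise MasterCrashed(xid)
    master.applied.add(xid)
    return "COMMIT"


master, standby = Server(), Server()
try:
    ack = commit_with_apply(master, standby, xid=42, crash_before_ack=True)
except MasterCrashed:
    ack = None

client_saw_commit = ack is not None          # client never got an answer
standby_has_txn = 42 in standby.applied      # yet the standby applied it
master_has_txn = 42 in master.applied        # and the master never showed it
```

In this sketch the client ends up with no commit message while the
standby already shows the transaction, which is exactly the state a
reconnecting client would have to detect somehow.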

Performance
------------
As with XA, synch rep has the potential to be so slow as to be unusable.
What optimizations do you make in your approach to synch rep to make it
faster than two-phase commit?  What other performance optimizations have
you added?

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com