Re: Issues with two-server Synch Rep - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Issues with two-server Synch Rep
Msg-id 1286981709.1709.2376.camel@ebony
In response to Issues with two-server Synch Rep  (Josh Berkus <josh@agliodbs.com>)
List pgsql-hackers
On Thu, 2010-10-07 at 11:05 -0700, Josh Berkus wrote:
> Simon, Fujii,
> 
> What follows are what I see as the major issues with making two-server
> synch replication work well.  I would like to have you each answer them,
> explaining how your patch and your design addresses each issue.  I
> believe this will go a long way towards helping the majority of the
> community understand the options we have from your code, as well as
> where help is still needed.

Happy to answer your questions. Please add me to the copy list if you
address me directly.

> Adding a Synch Standby
> -----------------------
> What is the procedure for adding a new synchronous standby in your
> implementation?  That is, how do we go from having a standby server with
> an empty PGDATA to having a working synchronous standby?

Same as adding a streaming standby.

The only difference is that *if* you don't want the standby to be a synch
standby, you would set synchronous_replication_service = off.

My understanding is that other approaches are significantly more complex
at this point, with required changes on the master, and also on the
standby should we wish the standby to be a failover target.
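To illustrate, the setup on the standby side would be the same as for any streaming standby, plus the single proposed parameter. This is a sketch of the proposal in this thread, not committed syntax; the connection details are placeholders:

```
# recovery.conf on the standby - same as plain streaming replication
standby_mode = 'on'
primary_conninfo = 'host=master port=5432'

# proposed parameter: set to off only if this standby should NOT
# offer synchronous replication service to the master
synchronous_replication_service = on
```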

> Snapshot Publication
> ---------------------
> During 9.0 development discussion, one of the things we realized we
> needed for synch standby was publication of snapshots back to the master
> in order to prevent query cancel on the standby.  Without this, the
> synch standby is useless for running read queries.  

Don't see much difference there.

This isn't needed for sync rep. It can be added as soon as we have a
channel to pass info back from standby to master. That is a small
follow-on commit once the main patch is in; I will handle that - it is a
requirement that will be addressed.

> Does your patch
> implement this?  Please describe.

No, but that isn't needed for sync rep.

> Management
> -----------
> One of the serious flaws currently in HS/SR is complexity of
> administration.  Setting up and configuring even a single master and
> single standby requires editing up to 6 configuration files in Postgres,
> as well as dealing with file permissions.  As such, any Synch Rep patch
> must work together with attempts to simplify administration.  How does
> your design do this?

Simplification of the existing framework is possible, though is not a
goal of sync rep. My proposed approach is to add as few mandatory
parameters as possible to avoid over-complexity.

Complexity of administration is very important, because getting it wrong
has a critical impact on availability and can lead to data loss.

In the two-node case this post covers, my patch requires one parameter,
added to the existing postgresql.conf on the master. That parameter does
not need to be changed should failover occur. So no parameter changes
are required at failover, nor can mistakes happen because of
misconfiguration.

> Monitoring
> -----------
> Synch rep offers severe penalties to availability if a synch standby
> gets behind or goes down.  What replication-specific monitoring tools
> and hooks are available to allow administrators to take action before the
> database becomes unavailable?

I don't see any differences here. It's easy to add an SRF that shows
current status of standbys.
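As a sketch of what such a status SRF could return (the function name here is hypothetical; this monitoring role was later filled in PostgreSQL by the pg_stat_replication view):

```sql
-- hypothetical standby-status function, illustrative only:
-- one row per connected standby, with its replay progress
SELECT standby_name, state, sent_location, applied_location
FROM pg_standby_status();
```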

> Degradation
> ------------
> In the event that the synch rep standby falls too far behind or becomes
> unavailable, or is deliberately taken offline, what are you envisioning
> as the process for the DBA resolving the situation?  

Add a new standby as quickly as possible. This only happens if the DBA
had not provided sufficient standbys in the first place.

> Is there any
> ability to commit "stuck" transactions?

Yes, an operator function.

> Client Consistency
> ---------------------
> With a standby in "apply" mode, and a master failure at the wrong time,
> there is the possibility that the Standby will apply a transaction at
> the same time that the master crashes, causing the client to never
> receive a commit message.  Once the client reconnects to the standby,
> how will it know whether its transaction was committed or not?

It wouldn't, but this situation already occurs even without sync rep.
Any user issuing COMMIT at the time of a server crash may find that
their transaction was committed even though they never received a commit
message. There is no "tell me if the last thing I did worked" function,
since the client doesn't record the xid.
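A client that does record its xid can check after reconnecting. For example (txid_current() exists today; a lookup function such as txid_status() did not exist at the time of this thread and only appeared much later, in PostgreSQL 10):

```sql
BEGIN;
INSERT INTO accounts VALUES (...);  -- application work; table is a placeholder
SELECT txid_current();              -- client records this xid before committing
COMMIT;                             -- connection may die right here

-- after reconnecting (PostgreSQL 10+ only):
SELECT txid_status(:recorded_xid);  -- 'committed', 'aborted' or 'in progress'
```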

> As a lesser case, a standby in "apply" mode will show the results of
> committed transactions *before* they are visible on the master.  Is
> there any need to handle this?  If so, how?

No need to handle it. It's how it works. As long as there is more than
one clog, commits will become visible at different times.

> Performance
> ------------
> As with XA, synch rep has the potential to be so slow as to be unusable.
>  What optimizations do you make in your approach to synch rep to make it
> faster than two-phase commit?  What other performance optimizations have
> you added?

Applications implementing sync rep will be very sensitive to the
performance we provide. Designed-in performance will be critical.
Providing both performance and application flexibility is a cornerstone
of my design. Sync rep does not have to be slow, nor do we need to make
it unusable.

* Master-side transaction controlled replication allows parameters to
control behaviour within applications.

* Bulk acknowledgement - the standby doesn't send back details of each
individual waiting transaction, nor does it know or care. This means the
response messages are very small and only sent when status changes.

* First-responder processing is faster than quorum_commit > 1.

* In a later patch, WAL writer will be active during recovery, so that
WALreceiver doesn't need to fsync. This is required to implement the
"recv" mode, which is very important for performance. Heikki asked me to
remove this from my current patch, but it's easy to add back in again
soon afterwards. It's very important, IMHO.

--
Simon Riggs           www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services


