Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Support for N synchronous standby servers - take 2
Date
Msg-id 558D87F6.7080203@agliodbs.com
In response to Support for N synchronous standby servers - take 2  (Beena Emerson <memissemerson@gmail.com>)
Responses Re: Support for N synchronous standby servers - take 2  (Robert Haas <robertmhaas@gmail.com>)
Re: Support for N synchronous standby servers - take 2  (Michael Paquier <michael.paquier@gmail.com>)
Re: Support for N synchronous standby servers - take 2  (Peter Eisentraut <peter_e@gmx.net>)
List pgsql-hackers
On 06/26/2015 09:42 AM, Robert Haas wrote:
> On Fri, Jun 26, 2015 at 1:46 AM, Michael Paquier
>> That's where the micro-language idea makes sense to use. For example,
>> we can define a group using separators and like (elt1,...eltN) or
>> [elt1,elt2,eltN]. Appending a number in front of a group is essential
>> as well for quorum commits. Hence for example, assuming that '()' is
>> used for a group whose element order does not matter, if we use that:
>> - k(elt1,elt2,eltN) means that we need for the k elements in the set
>> to return true (aka commit confirmation).
>> - k[elt1,elt2,eltN] means that we need for the first k elements in the
>> set to return true.
>>
>> When k is not defined for a group, k = 1. Using only elements
>> separated by commas for the upper group means that we wait for the
>> first element in the set (for backward compatibility), hence:
>> 1(elt1,elt2,eltN) <=> elt1,elt2,eltN
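
To make sure we're all reading the quoted proposal the same way, here is a rough Python sketch of the two group forms. The function names and the `confirmed` set (standbys that have acknowledged the commit) are made up for illustration; this follows a literal reading of the quoted text and is not taken from any patch.

```python
def quorum_group_satisfied(k, members, confirmed):
    """k(elt1,...,eltN): any k members of the group have confirmed."""
    return sum(1 for m in members if m in confirmed) >= k


def ordered_group_satisfied(k, members, confirmed):
    """k[elt1,...,eltN]: the first k members, in listed order, have all confirmed."""
    return all(m in confirmed for m in members[:k])


if __name__ == "__main__":
    members = ["elt1", "elt2", "elt3"]
    confirmed = {"elt2", "elt3"}  # hypothetical: elt1 has not acknowledged yet
    print(quorum_group_satisfied(2, members, confirmed))
    print(ordered_group_satisfied(2, members, confirmed))
```

With elt1 silent, the unordered form 2(elt1,elt2,elt3) is satisfied by elt2 and elt3, while the ordered form 2[elt1,elt2,elt3] is not, because elt1 is one of the first two elements.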

This really feels like we're going way beyond what we want from a single
string GUC.  I feel that this feature, as outlined, is a terrible hack
which we will regret supporting in the future.  You're taking something
which was already a fast hack because we weren't sure if anyone would
use it, and building two levels on top of that.

If we're going to do quorum, multi-set synchrep, then we need to have a
real management interface.  Like, we really ought to have a system
catalog and some built in functions to manage this instead, e.g.

pg_add_synch_set(set_name NAME, quorum INT, set_members VARIADIC)

pg_add_synch_set('bolivia', 1, 'bsrv-2','bsrv-3','bsrv-5')

pg_modify_synch_set(set_name NAME, quorum INT, set_members VARIADIC)

pg_drop_synch_set(set_name NAME)
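
As a rough illustration of how these three calls would behave, here is a small in-memory model in Python. Nothing here exists in PostgreSQL: the class, the validation rule (quorum between 1 and the number of members), and the assumption that the modify call also takes the set name are all mine, just to show the intended shape of the interface.

```python
from dataclasses import dataclass


@dataclass
class SynchSet:
    name: str
    quorum: int
    members: list


class SynchSetCatalog:
    """Hypothetical stand-in for the proposed system catalog."""

    def __init__(self):
        self._sets = {}

    @staticmethod
    def _validate(quorum, set_members):
        # Assumed rule: a quorum must be reachable with the listed members.
        if not set_members:
            raise ValueError("a synch set needs at least one member")
        if not 1 <= quorum <= len(set_members):
            raise ValueError("quorum must be between 1 and the member count")

    def add_synch_set(self, set_name, quorum, *set_members):
        if set_name in self._sets:
            raise ValueError(f"synch set {set_name!r} already exists")
        self._validate(quorum, set_members)
        self._sets[set_name] = SynchSet(set_name, quorum, list(set_members))

    def modify_synch_set(self, set_name, quorum, *set_members):
        if set_name not in self._sets:
            raise KeyError(set_name)
        self._validate(quorum, set_members)
        self._sets[set_name] = SynchSet(set_name, quorum, list(set_members))

    def drop_synch_set(self, set_name):
        del self._sets[set_name]  # raises KeyError if the set is unknown

    def get(self, set_name):
        return self._sets[set_name]


if __name__ == "__main__":
    catalog = SynchSetCatalog()
    catalog.add_synch_set("bolivia", 1, "bsrv-2", "bsrv-3", "bsrv-5")
    # A link goes down: shrink the set without touching a config file.
    catalog.modify_synch_set("bolivia", 1, "bsrv-2", "bsrv-3")
    print(catalog.get("bolivia"))
    catalog.drop_synch_set("bolivia")
```

The point of the sketch is only that each change is a single validated call, which is what makes it scriptable.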

Users who want the new functionality would just set
synchronous_standby_names='catalog' in postgresql.conf.

Having a function interface for this would make it worlds easier for the
DBA to reconfigure in order to accommodate network changes as well.
Let's face it, a DBA with three synch sets in different geos is NOT
going to want to edit postgresql.conf by hand and reload when the link to
Brazil goes down.  That's a really sucky workflow, and near-impossible to
automate.

We'll also want a new system view, pg_stat_synchrep:

pg_stat_synchrep
standby_name | client_addr | replication_status | synch_set | synch_quorum | synch_status

Alternately, we could overload those columns onto pg_stat_replication,
but that seems messy.

Finally, while I'm raining on everyone's parade: the mechanism of
identifying synchronous replicas by setting the application_name on the
replica is confusing and error-prone; if we're building out synchronous
replication into a sophisticated system, we ought to think about
replacing it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


