Re: Multimaster - Mailing list pgsql-general

From Konstantin Knizhnik
Subject Re: Multimaster
Date
Msg-id 57166309.5090907@postgrespro.ru
In response to Re: Multimaster  (Craig Ringer <craig@2ndquadrant.com>)
List pgsql-general


On 19.04.2016 15:56, Craig Ringer wrote:
On 18 April 2016 at 16:28, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
 
I intend to make the same split in pglogical itself: a receiver and apply worker split. My intent, though, is to have them communicate via a shared memory segment until/unless the apply worker gets too far behind and spills to disk.


In the case of multimaster, the "too far behind" scenario can never happen.

I disagree. In the case of tightly coupled synchronous multi-master it can't happen, sure. But that's hardly the only case of multi-master out there.

Sorry, this is just a matter of terminology. By multimaster I really mean "synchronous multimaster", because from my point of view the main characteristic of multimaster is symmetric access to all nodes. If there is no guarantee that all cluster nodes have the same state, then, from my point of view, it is not multimaster at all. But I have not registered "multimaster" as a trademark, so I cannot insist on this interpretation of the term :)




 
2. Logical backup: transfer data to different database (including new version of Postgres)

I think that's more HA than logical backup. Needs to be able to be synchronous or asynchronous, much like our current phys.rep.

Closely related but not quite the same is logical read replicas/standbys.

This is a use case from a real production system (a Galera use case). If a customer wants to migrate data to a new data center, then multimaster is one of the possible (and safest) ways to do it. You can redirect the workload to the new data center step by step and transparently for users.

 
3. Change notification: there are many different subscribers which can be interested in receiving notifications about database changes.

Yep. I suspect we'll want a json output plugin for this, separate to pglogical etc, but we'll need to move a bunch of functionality from pglogical into core so it can be shared rather than duplicated.

JSON is not an efficient format for this, and here performance may be critical.

 
4. Synchronous replication: multimaster

"Synchronous multimaster". Not all multimaster is synchronous, not all synchronous replication is multimaster.

We do not enforce commit order as Galera does. Consistency is enforced by the DTM, which ensures that transactions on all nodes are given consistent snapshots and assigned the same CSNs. We also have a global deadlock detection algorithm that builds a global lock graph (false positives are still possible, because this graph is built incrementally and so does not correspond to any single global snapshot).

OK, so you're relying on a GTM to determine safe, conflict-free apply orderings.

I'm ... curious ... about how you do that. Do you have a global lock manager too? How do you determine ordering for things that in a single-master case are addressed via unique b-tree indexes, not (just) heavyweight locking?


We have tried both a DTM with a global arbiter (an analogue of the XL GTM) and a DTM based on timestamps. In the latter case there is no centralized arbiter. But we are using "raftable", another of our plugins, which provides consistent distributed storage based on the Raft protocol.
Using raftable, we build the global deadlock graph from local subgraphs.
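
To illustrate the idea of assembling a global deadlock graph from local subgraphs, here is a minimal Python sketch. It is not the multimaster or raftable code: the function names, the dictionary representation of a wait-for graph, and the transaction ids are all assumptions made for this example. Each node is assumed to publish a map from a waiting transaction to the set of transactions it waits for; the maps are merged, and a depth-first search looks for a cycle.

```python
from collections import defaultdict

# Hypothetical representation: each node reports {waiting_txn: {txns it waits for}}.

def merge_wait_graphs(local_graphs):
    """Union the wait-for edges reported by every node into one graph."""
    merged = defaultdict(set)
    for graph in local_graphs:
        for waiter, holders in graph.items():
            merged[waiter].update(holders)
    return dict(merged)

def find_cycle(graph):
    """Return the transactions on one wait-for cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current DFS path, finished
    color = defaultdict(int)
    path = []

    def visit(txn):
        color[txn] = GREY
        path.append(txn)
        for nxt in sorted(graph.get(txn, ())):
            if color[nxt] == GREY:
                # Back edge to a transaction on the current path: a cycle.
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[txn] = BLACK
        return None

    for txn in sorted(graph):
        if color[txn] == WHITE:
            cycle = visit(txn)
            if cycle:
                return cycle
    return None

# Neither node sees a deadlock locally; the merged graph does.
node1 = {"T1": {"T2"}}   # on node 1, T1 waits for T2
node2 = {"T2": {"T1"}}   # on node 2, T2 waits for T1
global_graph = merge_wait_graphs([node1, node2])
cycle = find_cycle(global_graph)
```

Because the subgraphs are collected at different moments, an edge in the merged graph may already be stale by the time the cycle is found, which is one way the false positives mentioned above can arise.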



Or we need to somehow add the original DDL statements to the log.

Actually you need to be able to add normalized statements to the xlog. The original DDL text isn't quite good enough due to issues with search_path among other things. Hence DDL deparse.

Yes, for general-purpose use we need some DBMS-independent representation of DDL.
But for multimaster's needs the original SQL statement will be enough.


 

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
