Re: replication docs: split single vs. multi-master - Mailing list pgsql-patches

From Markus Schiltknecht
Subject Re: replication docs: split single vs. multi-master
Msg-id 455CCE3B.4080809@bluegap.ch
In response to Re: replication docs: split single vs. multi-master  (Bruce Momjian <bruce@momjian.us>)
List pgsql-patches
Hello Bruce,

Bruce Momjian wrote:
> Actually the patch moves down data partitioning.  I am confused.

Uh.. yeah, sorry, that's what I meant.

> I thought a long time about this.  I have always liked splitting the
> solutions up into single and multi-master, but in doing this
> documentation section, I realized that the split isn't all that helpful,
> and can be confusing.

Leaving that categorization out doesn't help clear up the confusion.
Just look around: most people use these terms. MySQL and Oracle use
them. Even Microsoft's Active Directory seems to have a multi-master
operation mode.

> For example, Slony is clearly single-master,

Agreed.

> but
> what about data partitioning?  That is multi-master, in that there is
> more than one master, but only one master per data set.

Data Partitioning is a way to work around the trouble of database
replication in the application layer. Instead of trying to categorize it
like a replication algorithm, we should explain that working around the
trouble may be worthwhile in many cases.
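
To illustrate what "working around replication in the application
layer" can look like, here is a minimal sketch in Python. It is not
taken from any of the solutions discussed; the node names and the
function master_for() are hypothetical, and hashing the key is just one
assumed way of assigning data sets to masters.

```python
import zlib

# Hypothetical node names: each data set (partition) has exactly one master.
MASTERS = ["db-a", "db-b", "db-c"]


def master_for(customer_id: str) -> str:
    """Return the single master responsible for this key's data set."""
    # crc32 is stable across processes, unlike Python's built-in hash(),
    # so every application server routes a given key to the same master.
    partition = zlib.crc32(customer_id.encode("utf-8")) % len(MASTERS)
    return MASTERS[partition]
```

The application sends all writes for a given customer to
master_for(customer_id), so no two masters ever accept writes for the
same data set and nothing needs to be replicated between them.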

> And for
> multi-master, Oracle RAC is clearly multi master,

Yes.

>  and I can see pgpool
> as multi-master, or as several single-master systems, in that they
> operate independently.

Several single-master systems? C'mon! Pgpool simply implements the
simplest form of multi-master replication. Just because you can access
the individual databases inside the cluster doesn't make it less
Multi-Master, does it?

> After much thought, it seems that putting things
> into single/multi-master categories just adds more confusion, because
> several solutions just aren't clear

Agreed, I'm not saying you must categorize all solutions you describe.
But please do categorize the ones which can be (and have so often been)
categorized.

> or fall into neither, e.g. Shared Disk Failover.

Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a
warning "about the risk of having two postmaster come up...".

What about other means of sharing disks or filesystems? NBDs or even
worse: NFS?

> Another issue is that you mentioned heavy locking for
> multi-master, when in fact pgpool doesn't do any special inter-server
> locking, so it just doesn't apply.

Sure it does apply, in the sense that *every* single lock is granted and
released on *every* node. The total number of locks scales linearly with
the number of nodes in the cluster.
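
The linear scaling can be made concrete with a small model in Python.
This is only a sketch under the assumption that every write statement,
and therefore every lock it takes, is replayed on every node; it is not
pgpool code, and replay_on_all_nodes() and the lock and node names are
made up for illustration.

```python
from collections import Counter


def replay_on_all_nodes(statement_locks, nodes):
    """Count the locks granted per node when each lock is taken everywhere."""
    granted = Counter()
    for node in nodes:
        # Each node executes the same statements and so takes the same locks.
        for _lock in statement_locks:
            granted[node] += 1
    return granted


locks = ["accounts row 1", "accounts row 2"]
per_node = replay_on_all_nodes(locks, ["node1", "node2", "node3"])
total_locks = sum(per_node.values())  # len(locks) * number of nodes
```

With two locks and three nodes the cluster grants six locks in total;
adding a fourth node raises that to eight, even though the work done on
behalf of the client has not changed.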

> In summary, it just seemed clearer to talk about each item and how it
> works, rather than try to categorize them.  The categorization just
> seems to do more harm than good.
>
> Of course, I might be totally wrong, and am still looking for feedback,
> but these are my current thoughts.  Feedback?

AFAICT, the categorization into Single- and Multi-Master replication is
very common. I think that's partly because it's focused on the solution.
One can ask: do I want to write on all nodes, or is a failover solution
sufficient? Or can I perhaps get away with a read-only Slave?

It's a categorization the user makes, often before getting a glimpse of
how complicated database replication really is. Thus, IMO, it would make
sense to help users and allow them to quickly find answers. (And we
can still tell them that it's not easy, or even possible, to categorize
all the solutions.)

> I didn't mention distributed shared memory as a separate item because I
> felt it was an implementation detail of clustering, rather than
> something separate.  I kept two-phase in the cluster item for the same
> reason.

Why is pgpool not an implementation detail of clustering, then?

> Current version at:
>
>     http://momjian.us/main/writings/pgsql/sgml/failover.html

That somehow doesn't work for me:

--- momjian.us ping statistics ---
15 packets transmitted, 0 received, 100% packet loss, time 14011ms


Just my 2 cents, in the hope of being of help.

Regards

Markus


[1]: Brad Nicholson's suggestion:
http://archives.postgresql.org/pgsql-admin/2006-11/msg00154.php
