Thread: Re: [PATCHES] replication docs: split single vs. multi-master

Re: [PATCHES] replication docs: split single vs. multi-master

From

Bruce Momjian

Date:

17 November 2006, 01:06:42

Markus Schiltknecht wrote:
> Not mentioning that categorization doesn't help in clearing the 
> confusion. Just look around, most people use these terms. They're used 
> by MySQL and Oracle. Even Microsoft's Active Directory seems to have a 
> multi-master operation mode.

OK.

> > For example, Slony is clearly single-master, 
> 
> Agreed.
> 
> > but
> > what about data partitioning?  That is multi-master, in that there is
> > more than one master, but only one master per data set.  
> 
> Data Partitioning is a way to work around the trouble of database 
> replication in the application layer. Instead of trying to categorize it 
> like a replication algorithm, we should explain that working around the 
> trouble may be worthwhile in many cases.

OK.  I am still feeling that data partitioning is like master/slave
replication because you have to get that read-only copy to the other
server.  If you split things up so data sets resided on only one
machine, you are right that would not be replication, but do people do
that?  If so, it is almost another solution.

> 
> > And for
> > multi-master, Oracle RAC is clearly multi master,
> 
> Yes.
> 
> >  and I can see pgpool
> > as multi-master, or as several single-master systems, in that they
> > operate independently.  
> 
> Several single-master systems? C'mon! Pgpool simply implements the most 
> simplistic form of multi-master replication. Just because you can access 
> the single databases inside the cluster doesn't make it less 
> Multi-Master, does it?

OK, changed to "Multi-Master Replication Using Query Broadcasting".

> 
> > After much thought, it seems that putting things
> > into single/multi-master categories just adds more confusion, because
> > several solutions just aren't clear
> 
> Agreed, I'm not saying you must categorize all solutions you describe. 
> But please do categorize the ones which can be (and have so often been) 
> categorized.

OK.

> > or fall into neither, e.g. Shared Disk Failover.
> 
> Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a 
> warning "about the risk of having two postmaster come up...".


Added.

> 
> What about other means of sharing disks or filesystems? NBDs or even 
> worse: NFS?

Added.

> 
> > Another issue is that you mentioned heavy locking for
> > multi-master, when in fact pgpool doesn't do any special inter-server
> > locking, so it just doesn't apply.
> 
> Sure it does apply, in the sense that *every* single lock is granted and 
> released on *every* node. The total amount of locks scales linearly with 
> the amount of nodes in the cluster.

Uh, but the locks are the same on each machine as if it was a single
server, while in a cluster, the locks are more intertwined with other
things that are happening on the server, no?

> > In summary, it just seemed clearer to talk about each item and how it
> > works, rather than try to categorize them.  The categorization just
> > seems to do more harm than good.
> > 
> > Of course, I might be totally wrong, and am still looking for feedback,
> > but these are my current thoughts.  Feedback?
> 
> AFAICT, the categorization in Single- and Multi-Master replication is 
> very common. I think that's partly because it's focused on the solution. 
> One can ask: do I want to write on all nodes or is a failover solution 
> sufficient? Or can I probably get away with a read-only Slave?

OK.

> It's a categorization the user does, often before having a glimpse about 
> how complicated database replication really is. Thus, IMO, it would make 
> sense to help the user and allow him to quickly find answers. (And we 
> can still tell them that it's not easy or even possible to categorize 
> all the solutions.)
> 
> > I didn't mention distributed shared memory as a separate item because I
> > felt it was an implementation detail of clustering, rather than
> > something separate.  I kept two-phase in the cluster item for the same
> > reason.
> 
> Why is pgpool not an implementation detail of clustering, then?
> 
> > Current version at:
> > 
> >     http://momjian.us/main/writings/pgsql/sgml/failover.html
> 
> That somehow doesn't work for me:

I lost power for a few hours.  I am back online.  I have updated the
docs at that URL.  Please check and let me know.

--  Bruce Momjian   bruce@momjian.us EnterpriseDB    http://www.enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +

Re: [PATCHES] replication docs: split single vs.

From

Hannu Krosing

Date:

17 November 2006, 03:30:48

On one fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> Markus Schiltknecht wrote:
> > Not mentioning that categorization doesn't help in clearing the 
> > confusion. Just look around, most people use these terms. They're used 
> > by MySQL and Oracle. Even Microsoft's Active Directory seems to have a 
> > multi-master operation mode.
> 
> OK.
> 
> > > For example, Slony is clearly single-master, 
> > 
> > Agreed.
> > 
> > > but
> > > what about data partitioning?  That is multi-master, in that there is
> > > more than one master, but only one master per data set.  
> > 
> > Data Partitioning is a way to work around the trouble of database 
> > replication in the application layer. Instead of trying to categorize it 
> > like a replication algorithm, we should explain that working around the 
> > trouble may be worthwhile in many cases.
> 
> OK.  I am still feeling that data partitioning is like master/slave
> replication because you have to get that read-only copy to the other
> server.  If you split things up so data sets resided on only one
> machine, you are right that would not be replication, but do people do
> that?  If so, it is almost another solution.

People do that in cases where there are high write loads ("high" as in
"not 10+ times less than reads") and just replicating the RO copies
would be prohibitively expensive in network, CPU or memory terms.

pl/proxy is one tool for doing it. You can get the latest stable version
from https://developer.skype.com/SkypeGarage/DbProjects.
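
The idea of giving each data set exactly one home server can be illustrated with a minimal sketch. This is not how pl/proxy is implemented; the class name, the server names and the choice of CRC32 as the hash are all assumptions made only for illustration.

```python
import zlib


class PartitionRouter:
    """Route each key to exactly one server, so every data set has one master.

    Hypothetical helper for illustration only; not part of pl/proxy.
    """

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self.servers = list(servers)

    def server_for(self, key):
        # crc32 is stable across runs, unlike Python's built-in hash() for str.
        bucket = zlib.crc32(str(key).encode("utf-8")) % len(self.servers)
        return self.servers[bucket]


# Assumed server names; in practice these would be connection strings.
router = PartitionRouter(["db0", "db1", "db2"])
home = router.server_for("alice")
```

Because a key always maps to the same server, all writes for that key go to one place and nothing has to be replicated. The cost is that a query spanning several keys may have to visit several servers.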

> > > And for
> > > multi-master, Oracle RAC is clearly multi master,
> > 
> > Yes.
> > 
> > >  and I can see pgpool
> > > as multi-master, or as several single-master systems, in that they
> > > operate independently.  
> > 
> > Several single-master systems? C'mon! Pgpool simply implements the most 
> > simplistic form of multi-master replication. 

In what way is pgpool multimaster? The last time I looked, it did nothing
but apply DML to several databases, i.e. it is not replication at all,
or at least it is masterless, unless we think of the pgpool process
itself as the _single_ master :)

> Just because you can access 
> > the single databases inside the cluster doesn't make it less 
> > Multi-Master, does it?
> 
> OK, changed to "Multi-Master Replication Using Query Broadcasting".

I think this gives a completely wrong picture of what pgpool does.

How about just "Query Broadcasting"?

> > 
> > > After much thought, it seems that putting things
> > > into single/multi-master categories just adds more confusion, because
> > > several solutions just aren't clear
> > 
> > Agreed, I'm not saying you must categorize all solutions you describe. 
> > But please do categorize the ones which can be (and have so often been) 
> > categorized.
> 
> OK.
> 
> > > or fall into neither, e.g. Shared Disk Failover.
> > 
> > Oh, yes, this reminds me of Brad Nicholson's suggestion in [1] to add a 
> > warning "about the risk of having two postmaster come up...".
> 
> 
> Added.
> 
> > 
> > What about other means of sharing disks or filesystems? NBDs or even 
> > worse: NFS?
> 
> Added.
> 
> > 
> > > Another issue is that you mentioned heavy locking for
> > > multi-master, when in fact pgpool doesn't do any special inter-server
> > > locking, so it just doesn't apply.
> > 
> > Sure it does apply, in the sense that *every* single lock is granted and 
> > released on *every* node. The total amount of locks scales linearly with 
> > the amount of nodes in the cluster.
> 
> Uh, but the locks are the same on each machine as if it was a single
> server, while in a cluster, the locks are more intertwined with other
> things that are happening on the server, no?
> 
> > > In summary, it just seemed clearer to talk about each item and how it
> > > works, rather than try to categorize them.  The categorization just
> > > seems to do more harm than good.
> > > 
> > > Of course, I might be totally wrong, and am still looking for feedback,
> > > but these are my current thoughts.  Feedback?
> > 
> > AFAICT, the categorization in Single- and Multi-Master replication is 
> > very common. I think that's partly because it's focused on the solution. 
> > One can ask: do I want to write on all nodes or is a failover solution 
> > sufficient? Or can I probably get away with a read-only Slave?
> 
> OK.
> 
> > It's a categorization the user does, often before having a glimpse about 
> > how complicated database replication really is. Thus, IMO, it would make 
> > sense to help the user and allow him to quickly find answers. (And we 
> > can still tell them that it's not easy or even possible to categorize 
> > all the solutions.)
> > 
> > > I didn't mention distributed shared memory as a separate item because I
> > > felt it was an implementation detail of clustering, rather than
> > > something separate.  I kept two-phase in the cluster item for the same
> > > reason.
> > 
> > Why is pgpool not an implementation detail of clustering, then?
> > 
> > > Current version at:
> > > 
> > >     http://momjian.us/main/writings/pgsql/sgml/failover.html
> > 
> > That somehow doesn't work for me:
> 
> I lost power for a few hours.  I am back online.  I have updated the
> docs at that URL.  Please check and let me know.
> 
-- 
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com

Re: [PATCHES] replication docs: split single vs.

From

Hannu Krosing

Date:

17 November 2006, 03:45:50

On one fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> > > Current version at:
> > > 
> > >     http://momjian.us/main/writings/pgsql/sgml/failover.html

It refers to "Warm Standby Using Point-In-Time Recovery"
(http://momjian.us/main/writings/pgsql/sgml/warm-standby.html). Maybe it's
a good idea to give pointers to SkyTools
(description: https://developer.skype.com/SkypeGarage/DbProjects/SkyTools
code: http://pgfoundry.org/projects/skytools/ ), which includes a
walmgr.py script that sets up and manages WAL-based standby servers.



Re: [PATCHES] replication docs: split single vs. multi-master

From

Markus Schiltknecht

Date:

17 November 2006, 04:03:07

Hello Bruce,

You wrote:
> I am still feeling that data partitioning is like master/slave
> replication because you have to get that read-only copy to the other
> server.  

Yes, that's where replication comes into play. But data partitioning per 
se has nothing to do with replication, has it? You can partition your 
data however you want: among tablespaces, among databases or among 
multiple servers. Data partitioning solves different problems than 
replication. I think it's important to keep them separate. Why do you 
mix-in Slony-I in the Data Partitioning Section? One can use any other 
replication solution to "get that read-only copy to the other server".

> If you split things up so data sets resided on only one
> machine, you are right that would not be replication, but do people do
> that?  If so, it is almost another solution.

Yes, as I say: Data Partitioning solves another problem.

>>> And for
>>> multi-master, Oracle RAC is clearly multi master,
>> Yes.
>>
>>>  and I can see pgpool
>>> as multi-master, or as several single-master systems, in that they
>>> operate independently.  
>> Several single-master systems? C'mon! Pgpool simply implements the most 
>> simplistic form of multi-master replication. Just because you can access 
>> the single databases inside the cluster doesn't make it less 
>> Multi-Master, does it?
> 
> OK, changed to "Multi-Master Replication Using Query Broadcasting".

Good. That reads already better for me. ;-)

As Jim Nasby pointed out in [1], not all solutions are as simplistic as 
pgpool and do not necessarily have the same disadvantages - while using 
the very same algorithm: Query Broadcasting.

I suggest we make sure to clarify that and better point out some of the 
aspects all Multi-Master Replication have in common (see 
replication_doku_4.diff of my patches).

> Added.
> 
> Added.

(the additions to "Shared Disk Failover")

Good. Short and clear. (Except perhaps: how can I find out if NFS has 
full POSIX behavior? Do we have to go into more detail there? I dunno.)

> Uh, but the locks are the same on each machine as if it was a single
> server, while in a cluster, the locks are more intertwined with other
> things that are happening on the server, no?

Sure.

Maybe you are right and we had better not use the term locking there. 
It seems confusing because it's not clear what a 'lock' is for some 
replication systems (i.e. also Postgres-R, how do you compare its 
"amount of locks"?).

Regards

Markus

Re: [PATCHES] replication docs: split single vs. multi-master

From

Markus Schiltknecht

Date:

17 November 2006, 04:06:41

Good morning Hannu,

Hannu Krosing wrote:
> People do that in cases where there is high write loads ("high" as in
> "not 10+ times less than reads") and just replicating the RO copies
> would be prohibitively expensive in either network, cpu or memory terms.

Okay. In that case it's even less like any type of replication.

IMO, Data Partitioning is the most simple method of Load Balancing. It's 
like saying: hey, if your database server is overloaded, simply split 
your data over multiple servers.

Which is not always possible and can lead to other problems, some of 
which can be solved by replication solutions.

> In what way is pgpool multimaster ? last time I looked it did nothing
> but applying DML to several databases. i.e. it is not replication at all,

Please give your definition of replication.

Wikipedia gives us [1]: "Replication refers to the use of redundant 
resources, such as software or hardware components, to improve 
reliability, fault-tolerance, or performance."

Pgpool does that by Query Broadcasting, no?

> or at least it is masterless, unless we think of the pgpool process
> itself as the _single_ master :)

Hm. That's a good point. Pgpool allows writes to only one master (the 
pgpool process) but reads from multiple, synchronous masters. I admit 
that makes it a little hard to classify as Single- or Multi-Master.

Doesn't Sequoia support multiple Query Broadcasting processes? Would it 
qualify as Multi-Master *Replication*, then?

In an ideal implementation, every Master could broadcast queries to all 
other masters. Thus giving a *real* Multi-Master solution. Postgres-R 
(6.4) did fall back into that mode for transactions which change a lot 
of tuples, so that the writeset didn't exceed a certain size limit.

> I think this gives completely wrong picture of what pgpool does.

As I see it, that's because pgpool is a very limited implementation of 
Query Broadcasting. But pgpool is not the only solution implementing 
that algorithm. Do we want to describe the general algorithm or pgpool here?
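
To make the general algorithm concrete, here is a minimal sketch of Query Broadcasting. It is not pgpool's code: the class and method names are hypothetical, and in-memory SQLite databases stand in for the PostgreSQL servers so that the example runs on its own.

```python
import sqlite3


class QueryBroadcaster:
    """Send every write to all nodes; serve each read from a single node.

    Hypothetical illustration of query broadcasting, not pgpool itself.
    """

    def __init__(self, node_count):
        # Each in-memory database plays the role of one independent server.
        self.nodes = [sqlite3.connect(":memory:") for _ in range(node_count)]
        self._next_reader = 0

    def write(self, sql, params=()):
        # The same statement is executed on every node, one after the other.
        for node in self.nodes:
            node.execute(sql, params)
            node.commit()

    def read(self, sql, params=()):
        # Reads are spread over the nodes round-robin.
        node = self.nodes[self._next_reader]
        self._next_reader = (self._next_reader + 1) % len(self.nodes)
        return node.execute(sql, params).fetchall()


pool = QueryBroadcaster(3)
pool.write("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
pool.write("INSERT INTO t (v) VALUES (?)", ("hello",))
answers = [pool.read("SELECT v FROM t") for _ in range(3)]
```

All writes pass through the one broadcasting process, which is why it is hard to call this either Single- or Multi-Master. Note also that a statement whose result differs per node (for example one calling random()) would leave the copies diverging, since each node evaluates it independently.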

Regards

Markus

[1]: Wikipedia about Replication (Computer Science):
http://en.wikipedia.org/wiki/Replication_%28computer_science%29

Re: [PATCHES] replication docs: split single vs.

From

Bruce Momjian

Date:

17 November 2006, 09:26:05

Hannu Krosing wrote:
> > OK.  I am still feeling that data partitioning is like master/slave
> > replication because you have to get that read-only copy to the other
> > server.  If you split things up so data sets resided on only one
> > machine, you are right that would not be replication, but do people do
> > that?  If so, it is almost another solution.
> 
> People do that in cases where there is high write loads ("high" as in
> "not 10+ times less than reads") and just replicating the RO copies
> would be prohibitively expensive in either network, cpu or memory terms.

OK, as Markus suggested, I have moved Data Partitioning down to the
bottom, and mentioned it as only optionally keeping a read-only copy on
each server.  Is this better?

> > > Several single-master systems? C'mon! Pgpool simply implements the most 
> > > simplistic form of multi-master replication. 
> 
> In what way is pgpool multimaster ? last time I looked it did nothing
> but applying DML to several databases. i.e. it is not replication at all,
> or at least it is masterless, unless we think of the pgpool process
> itself as the _single_ master :)

I have removed the mention of "multi-master" from query broadcast.

> 
> > Just because you can access 
> > > the single databases inside the cluster doesn't make it less 
> > > Multi-Master, does it?
> > 
> > OK, changed to "Multi-Master Replication Using Query Broadcasting".
> 
> I think this gives completely wrong picture of what pgpool does.
> 
> How about just "Query Broadcasting" ?
> 

Done.


Re: [PATCHES] replication docs: split single vs.

From

Bruce Momjian

Date:

17 November 2006, 09:27:31

Hannu Krosing wrote:
> On one fine day, Fri, 2006-11-17 at 00:01, Bruce Momjian wrote:
> > > > Current version at:
> > > > 
> > > >     http://momjian.us/main/writings/pgsql/sgml/failover.html
> 
> it refers to "Warm Standby Using Point-In-Time Recovery"
> (http://momjian.us/main/writings/pgsql/sgml/warm-standby.html), maybe it's
> a good idea to give pointers to SkyTools
> (description: https://developer.skype.com/SkypeGarage/DbProjects/SkyTools
> code: http://pgfoundry.org/projects/skytools/ ) which includes a
> walmgr.py script which sets up and manages WAL-based standby servers.

Isn't that functionality included in 8.2, which is what this
documentation is being included with?


Re: [PATCHES] replication docs: split single vs.

From

Bruce Momjian

Date:

17 November 2006, 09:55:37

Markus Schiltknecht wrote:
> Hello Bruce,
> 
> You wrote:
> > I am still feeling that data partitioning is like master/slave
> > replication because you have to get that read-only copy to the other
> > server.  
> 
> Yes, that's where replication comes into play. But data partitioning per 
> se has nothing to do with replication, has it? You can partition your 
> data however you want: among tablespaces, among databases or among 
> multiple servers. Data partitioning solves different problems than 
> replication. I think it's important to keep them separate. Why do you 
> mix-in Slony-I in the Data Partitioning Section? One can use any other 
> replication solution to "get that read-only copy to the other server".

Yes, updated.

> >>>  and I can see pgpool
> >>> as multi-master, or as several single-master systems, in that they
> >>> operate independently.  
> >> Several single-master systems? C'mon! Pgpool simply implements the most 
> >> simplistic form of multi-master replication. Just because you can access 
> >> the single databases inside the cluster doesn't make it less 
> >> Multi-Master, does it?
> > 
> > OK, changed to "Multi-Master Replication Using Query Broadcasting".
> 
> Good. That reads already better for me. ;-)

Oops, now modified to just "Query Broadcasting".

> As Jim Nasby pointed out in [1], not all solutions are as simplistic as 
> pgpool and do not necessarily have the same disadvantages - while using 
> the very same algorithm: Query Broadcasting.
> 
> I suggest we make sure to clarify that and better point out some of the 
> aspects all Multi-Master Replication have in common (see 
> replication_doku_4.diff of my patches).
> 
> > Added.
> > 
> > Added.
> 
> (the additions to "Shared Disk Failover")
> 
> Good. Short and clear. (Except perhaps: how can I find out if NFS has 
> full POSIX behavior? Do we have to go into more detail there? I dunno.)

Uh, I am unclear on that myself.  I think NFSv3 or NFSv4 is OK, but am
unsure.

> > Uh, but the locks are the same on each machine as if it was a single
> > server, while in a cluster, the locks are more intertwined with other
> > things that are happening on the server, no?
> 
> Sure.
> 
> Maybe you are right and we should better not use the term locking there. 
> It seems confusing because it's not clear what a 'lock' is for some 
> replication systems (i.e. also Postgres-R, how do you compare its 
> "amount of locks"?).

OK, locks are currently mentioned only for clustering.

URL updated:
http://momjian.us/main/writings/pgsql/sgml/failover.html


Re: [PATCHES] replication docs: split single vs.

From

Bruce Momjian

Date:

17 November 2006, 12:39:30

I have renamed the documentation section "High Availability and Load
Balancing".  I think the current version takes many of your comments
below into account.  Please let me know.

---------------------------------------------------------------------------

Markus Schiltknecht wrote:
> Good morning Hannu,
> 
> Hannu Krosing wrote:
> > People do that in cases where there is high write loads ("high" as in
> > "not 10+ times less than reads") and just replicating the RO copies
> > would be prohibitively expensive in either network, cpu or memory terms.
> 
> Okay. In that case it's even less like any type of replication.
> 
> IMO, Data Partitioning is the most simple method of Load Balancing. It's 
> like saying: hey, if your database server is overloaded, simply split 
> your data over multiple servers.
> 
> Which is not always possible and can lead to other problems, some of 
> which can be solved by replication solutions.
> 
> > In what way is pgpool multimaster ? last time I looked it did nothing
> > but applying DML to several databases. i.e. it is not replication at all,
> 
> Please give your definition of replication.
> 
> Wikipedia gives us [1]: "Replication refers to the use of redundant 
> resources, such as software or hardware components, to improve 
> reliability, fault-tolerance, or performance."
> 
> Pgpool does that by Query Broadcasting, no?
> 
> > or at least it is masterless, unless we think of the pgpool process
> > itself as the _single_ master :)
> 
> Hm. That's a good point. Pgpool allows to write to only one master (the 
> pgpool process) but read from multiple, synchronous masters. I admit 
> that makes it a little hard to split into Single- or Multi-Master.
> 
> Doesn't Sequoia support multiple Query Broadcasting processes? Would it 
> qualify as Multi-Master *Replication*, then?
> 
> In an ideal implementation, every Master could broadcast queries to all 
> other masters. Thus giving a *real* Multi-Master solution. Postgres-R 
> (6.4) did fall back into that mode for transactions which change a lot 
> of tuples, so that the writeset didn't exceed a certain size limit.
> 
> > I think this gives completely wrong picture of what pgpool does.
> 
> As I see it, that's because pgpool is a very limited implementation of 
> Query Broadcasting. But pgpool is not the only solution implementing 
> that algorithm. Do we want to describe the general algorithm or pgpool here?
> 
> Regards
> 
> Markus
> 
> 
> [1]: Wikipedia about Replication (Computer Science):
> http://en.wikipedia.org/wiki/Replication_%28computer_science%29


Re: [PATCHES] replication docs: split single vs.

From

Tatsuo Ishii

Date:

21 November 2006, 10:03:10

From high-availability.sgml:

    Clustering For Parallel Query Execution

    This allows multiple servers to work concurrently on a single
    query. One possible way this could work is for the data to be
    split among servers and for each server to execute its part of the
    query and results sent to a central server to be combined and
    returned to the user. There currently is no PostgreSQL open source
    solution for this.
 

I think pgpool-II can do this.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

Re: [PATCHES] replication docs: split single vs.

From

Bruce Momjian

Date:

21 November 2006, 17:38:05

Tatsuo Ishii wrote:
> From high-availability.sgml:
> 
>     Clustering For Parallel Query Execution
> 
>     This allows multiple servers to work concurrently on a single
>     query. One possible way this could work is for the data to be
>     split among servers and for each server to execute its part of the
>     query and results sent to a central server to be combined and
>     returned to the user. There currently is no PostgreSQL open source
>     solution for this.
> 
> I think pgpool-II can do this.

Thanks, I suspected it could, added:
    This allows multiple servers to work concurrently on a single
    query.  One possible way this could work is for the data to be
    split among servers and for each server to execute its part of
    the query and results sent to a central server to be combined
    and returned to the user.  Pgpool-II has this capability.
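
The "split the data, execute each part, combine centrally" scheme described in that paragraph can be sketched as follows. This is only an illustration of the general idea and makes no claim about how Pgpool-II works internally: the function names are hypothetical, plain lists stand in for the per-server data, and threads stand in for the servers.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_sum(rows):
    """Work done by one server on its own slice: return (sum, row count)."""
    return sum(rows), len(rows)


def parallel_average(partitions):
    """Central step: fan the query out, then combine the partial results."""
    # One worker per partition; max(1, ...) keeps the executor valid
    # even when the list of partitions is empty.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as executor:
        partials = list(executor.map(partial_sum, partitions))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count if count else None


# Assumed sample data: three servers, each holding part of one column.
partitions = [[10, 20], [30], [40, 50, 60]]
result = parallel_average(partitions)
```

An average is computed from partial sums and counts rather than from per-server averages, because averaging the averages would weight small partitions too heavily.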
