Thread: Bidirectional replication

Bidirectional replication

From
tushar nehete
Date:
Hi,
Is there any way to do bidirectional replication for PostgreSQL Plus Advanced Server 8.4.5?

I tried Slony-I, but it's master-slave asynchronous replication.
Can we configure master-master replication with Slony?


Or is there any trusted tool to do it?


Regards,
Tushar

Re: Bidirectional replication

From
John R Pierce
Date:
On 05/02/11 11:15 PM, tushar nehete wrote:
> Hi,
> Is there any way to do bidirectional replication for Postgresql Plus
> Advance Server 8.4.5?
>

PostgreSQL Plus Advanced Server is a commercial product sold by
EnterpriseDB; you should probably ask them.

> I tried SLONY-I but its master-slave asynchronous replication.
> Can we configure master-master replication by slony?
>
>
> Or is there any trusted tool to do it?


In general, master-master replication is not easy to do efficiently and
correctly.  Every implementation on any database suffers either from
very poor performance, due to global synchronous locking and two-phase
commits, or from data collisions, which can only be avoided with careful
application design and programming and are not easily prevented at the
database server.

AFAIK, the only postgres replication systems that even pretend to
support master-master are things like Bucardo that do the replication at
the SQL layer, by sending all update/insert/delete commands to both
servers, and under certain sequences of concurrent queries, you could
end up with different results on the two servers.

Re: Bidirectional replication

From
Sim Zacks
Date:
On 05/03/2011 09:15 AM, tushar nehete wrote:

> Hi,
> Is there any way to do bidirectional replication for Postgresql Plus
> Advance Server 8.4.5?
>
> I tried SLONY-I but its master-slave asynchronous replication.
> Can we configure master-master replication by slony?
>
>
> Or is there any trusted tool to do it?
>
>
> Regards,
> Tushar
I have heard good things about Bucardo, though I haven't tried it myself
yet. I was warned that it would be risky to have 2 masters that have the
same tables modified in both because of issues such as delayed sync,
race conditions and other such goodies that may corrupt the meaning of
the data.

Re: Bidirectional replication

From
Simon Riggs
Date:
On Tue, May 3, 2011 at 7:31 AM, Sim Zacks <sim@compulab.co.il> wrote:

> I have heard good things about Bucardo, though I haven't tried it myself
> yet. I was warned that it would be risky to have 2 masters that have the
> same tables modified in both because of issues such as delayed sync, race
> conditions and other such goodies that may corrupt the meaning of the data.


Just to be clear and fair to Bucardo, I would add a few points.

All multi-master replication solutions that use an optimistic
mechanism require "conflict resolution" cases and code. This is the
same with SQL Server, Oracle, etc. Referring to a well-known problem
as a race condition seems to introduce doubt and fear into a situation
that is well understood. Bucardo does offer hooks for conflict
resolution to allow you to program around the issues.
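
To make "conflict resolution" concrete, here is a minimal sketch in Python. It is not Bucardo's hook API: the names RowVersion, latest_wins and sync_row, and the changed_at timestamp, are hypothetical, and "latest change wins" is only one of several possible policies.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RowVersion:
    """One node's copy of a row plus the time of its last local change."""
    value: str
    changed_at: float  # hypothetical "last modified" timestamp


# A resolver receives both conflicting copies and returns the one to keep.
Resolver = Callable[[RowVersion, RowVersion], RowVersion]


def latest_wins(a: RowVersion, b: RowVersion) -> RowVersion:
    """Keep whichever copy changed most recently (ties favour a)."""
    return a if a.changed_at >= b.changed_at else b


def sync_row(a: RowVersion, b: RowVersion,
             changed_on_a: bool, changed_on_b: bool,
             resolve: Resolver) -> RowVersion:
    """Pick the row both nodes should hold after one sync pass."""
    if changed_on_a and changed_on_b:
        # Both masters modified the row since the last sync: a true conflict.
        return resolve(a, b)
    if changed_on_a:
        return a
    # Changed only on b, or unchanged on both (the copies already agree).
    return b


node_a = RowVersion("shipped", changed_at=100.0)
node_b = RowVersion("cancelled", changed_at=105.0)
winner = sync_row(node_a, node_b, True, True, latest_wins)
print(winner.value)
```

Whatever the policy, one of the two committed changes is discarded, so the application has to be designed with that outcome in mind.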

So if I felt that multi-master replication was the right way to go for
a solution, Bucardo is a good choice.

Just to add other info: if multi-master replication uses pessimistic
coherence, then the coherence mechanism can also be a source of
contention and/or cause the need for alternative kinds of conflict
resolution.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Bidirectional replication

From
tushar nehete
Date:
Thank you all,
I started with Bucardo. I installed ActivePerl 5.12 on my Linux (RHEL 5.5) server.
Can you please suggest a link that describes the installation steps in detail?


Thanks,
Tushar

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: Bidirectional replication

From
Raghavendra
Date:
Best to start with..

http://bucardo.org/wiki/Bucardo/Installation

Best Regards,
Raghavendra
EnterpriseDB Corporation





Re: Bidirectional replication

From
Raghavendra
Date:
One more point: please take into consideration the points mentioned by Simon Riggs in your testing.

Best Regards,
Raghavendra
EnterpriseDB Corporation






Re: Bidirectional replication

From
Merlin Moncure
Date:
On Tue, May 3, 2011 at 4:19 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Tue, May 3, 2011 at 7:31 AM, Sim Zacks <sim@compulab.co.il> wrote:
>
>> I have heard good things about Bucardo, though I haven't tried it myself
>> yet. I was warned that it would be risky to have 2 masters that have the
>> same tables modified in both because of issues such as delayed sync, race
>> conditions and other such goodies that may corrupt the meaning of the data.
>
>
> Just to be clear and fair to Bucardo, I would add a few points.
>
> All multi-master replication solutions that use an optimistic
> mechanism require "conflict resolution" cases and code. This is the
> same with SQLServer and Oracle etc.. Referring to a well known problem
> as a race condition seems to introduce doubt and fear into a situation
> that is well understood. Bucardo does offer hooks for conflict
> resolution to allow you to program around the issues.
>
> So if I felt that multi-master replication was the right way to go for
> a solution, Bucardo is a good choice.
>
> Just to add other info: if multi-master replication uses pessimistic
> coherence, then the coherence mechanism can also be a source of
> contention and/or cause the need for alternative kinds of conflict
> resolution.

Yeah.  One nasty property that async multi-master solutions share is
that they change the definition of what 'COMMIT' means -- the database
can't guarantee the transaction is valid because not all the
supporting facts are necessarily known.  Even after libpq gives you
the green light, that transaction could fail an arbitrary length of
time later, and you can't rely on the assumption it's valid until
you've done some synchronizing with the other 'masters'.  Maybe you
don't need to rely on that assumption, so a 'fix it later, or possibly
never' methodology works well.  Those cases are unfortunately fairly rare
in the real world.
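
The point about COMMIT can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: the Node class, the users table with a unique username, and the sync function are all hypothetical and do not model any particular replication tool.

```python
class Node:
    """A toy 'master' holding a users table with a unique username."""

    def __init__(self, name):
        self.name = name
        self.users = {}     # username -> owner; unique on this node only
        self.pending = []   # rows committed locally but not yet replicated

    def commit_insert(self, username, owner):
        # The uniqueness check can only consult facts this node knows about.
        if username in self.users:
            raise ValueError(f"duplicate key {username!r} on {self.name}")
        self.users[username] = owner
        self.pending.append((username, owner))
        return "COMMIT"


def sync(src, dst):
    """Ship src's pending rows to dst and return the rows that collide."""
    conflicts = []
    for username, owner in src.pending:
        if username in dst.users and dst.users[username] != owner:
            conflicts.append((username, owner, dst.users[username]))
        else:
            dst.users[username] = owner
    src.pending.clear()
    return conflicts


a, b = Node("a"), Node("b")
result_a = a.commit_insert("merlin", "alice")   # accepted on node a
result_b = b.commit_insert("merlin", "bob")     # also accepted on node b
conflicts = sync(a, b)                          # collision found only now
print(result_a, result_b, conflicts)
```

Both clients were told COMMIT, yet at most one of the two rows can survive once the nodes exchange changes.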

Multi master replication, at least those implementations that don't
hold locks and release the transaction until you've got a guarantee
it's valid and will stay valid, are fundamentally incompatible with
SQL.  I know some people do some cool, usable things with that stuff,
but the whole concept seems awfully awkward to me.  I suppose I'm a
crotchety, cane shaking fundamentalist, but the old school approach of
dividing work logically and developing communication protocols is
often the best approach to take.

merlin

Re: Bidirectional replication

From
John R Pierce
Date:
On 05/03/11 5:04 AM, tushar nehete wrote:
> I started with Bucardo. I installed activeperl 5.12 on my
> Linux(RHEL5.5) server.

Why ActivePerl, which is usually used by MS Windows users, rather than
the Perl built into RHEL 5.5? (BTW, 5.6 is out now; you really should run
'yum update'.)



Re: Bidirectional replication

From
Greg Smith
Date:
Merlin Moncure wrote:
> I know some people do some cool, usable things with that stuff,
> but the whole concept seems awfully awkward to me.  I suppose I'm a
> crotchety, cane shaking fundamentalist...

It's possible--do you sometimes find yourself yelling at young
developers, telling them to stop replicating in your yard?

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD



Re: Bidirectional replication

From
tushar nehete
Date:
Hi, thanks to all.
John, I tried the Perl built into RHEL 5.5 but got some errors, so I downloaded ActivePerl 5.12 and
installed it.
After that, when I started the installation, I got stuck with this error:

FAILED! (psql:/usr/local/share/bucardo/bucardo.schema:40: ERROR:  didn't get a return item from mksafefunc )

Can anyone help me deal with this error?

Thanks,
Tushar




Re: Bidirectional replication

From
tushar nehete
Date:
Hi,
I went through the steps from the Bucardo site as follows:

[root@billingtest1 Bucardo-4.4.3]# perl Makefile.PL
WARNING: LICENSE is not a known parameter.
Warning: prerequisite DBD::Pg 2.0 not found. We have 1.49.
Warning: prerequisite ExtUtils::MakeMaker 6.32 not found. We have 6.30.
'LICENSE' is not a known MakeMaker parameter name.
Writing Makefile for Bucardo
[root@billingtest1 Bucardo-4.4.3]# make
cp bucardo_ctl blib/script/bucardo_ctl
/usr/bin/perl "-MExtUtils::MY" -e "MY->fixin(shift)" blib/script/bucardo_ctl
Manifying blib/man1/bucardo_ctl.1pm
Manifying blib/man3/Bucardo.3pm
[root@billingtest1 Bucardo-4.4.3]# make install
Installing /usr/bin/bucardo_ctl
Writing /usr/lib64/perl5/site_perl/5.8.8/x86_64-linux-thread-multi/auto/Bucardo/.packlist
Appending installation info to /usr/lib64/perl5/5.8.8/x86_64-linux-thread-multi/perllocal.pod


I went through the script and found that this is what happens behind the scenes:

bucardo=#
bucardo=#
bucardo=# CREATE OR REPLACE FUNCTION bucardo.plperlu_test()
bucardo-# RETURNS TEXT
bucardo-# LANGUAGE plperlu
bucardo-# AS $bc$
bucardo$# return 'Pl/PerlU was successfully installed';
bucardo$# $bc$;
ERROR: didn't get a return item from mksafefunc
bucardo=#

So there must be something wrong in mksafefunc or in that Perl file.

Is there any solution?




Re: Bidirectional replication

From
Andrew Sullivan
Date:
On Thu, May 05, 2011 at 03:07:20PM +0530, tushar nehete wrote:
> Hi,
> I gone through the steps from bucardo sites as,
>
> [root@billingtest1 Bucardo-4.4.3]# perl Makefile.PL
> WARNING: LICENSE is not a known parameter.
> Warning: prerequisite DBD::Pg 2.0 not found. We have 1.49.
> Warning: prerequisite ExtUtils::MakeMaker 6.32 not found. We have 6.30.

I don't know anything about Bucardo, but it sure looks to me like you
need to do some upgrading before continuing past this point.

A

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: Bidirectional replication

From
Joshua Tolley
Date:
On Mon, May 02, 2011 at 11:31:28PM -0700, John R Pierce wrote:
> AFAIK, the only postgres replication systems that even pretend to
> support master-master are things like Bucardo that do the replication at
> the SQL layer, by sending all update/insert/delete commands to both
> servers, and under certain sequences of concurrent queries, you could
> end up with different results on the two servers.

Actually, Bucardo doesn't do statement replication. It, like Slony for
instance, replicates data, not SQL statements. And as you pointed out, it does
do bidirectional replication in a way that's sufficient for some use cases.

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


Re: Bidirectional replication

From
John R Pierce
Date:
On 05/05/11 8:05 PM, Joshua Tolley wrote:
> Actually, Bucardo doesn't do statement replication. It, like Slony for
> instance, replicates data, not SQL statements. And as you pointed out, it does
> do bidirectional replication in a way that's sufficient for some use cases.


Does it use triggers for replication, similar to Slony, then?
Obviously, it can't be doing WAL-level replication or it wouldn't be
able to do any sort of master-master.



Re: Bidirectional replication

From
Joshua Tolley
Date:
On Thu, May 05, 2011 at 03:07:20PM +0530, tushar nehete wrote:
> Warning: prerequisite DBD::Pg 2.0 not found. We have 1.49.
> Warning: prerequisite ExtUtils::MakeMaker 6.32 not found. We have 6.30.

You need to install DBD::Pg, version 2.0 or greater. You also need to install
ExtUtils::MakeMaker version 6.32 or greater. These are both Perl packages,
available several different ways. Sometimes your operating system will
provide sufficiently recent versions through its own packaging system (e.g.
"yum install perl-DBD-Pg"); the more difficult way is to get it through CPAN,
per instructions here: http://www.cpan.org/modules/INSTALL.html
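
As a rough illustration of what the two warnings are checking, the comparison can be sketched in Python. This assumes plain decimal version strings like the ones in the output above (it does not handle dotted versions such as "1.2.3"), and the helper name satisfies is made up for the example.

```python
from decimal import Decimal


def satisfies(installed: str, required: str) -> bool:
    """Compare decimal-style version numbers numerically."""
    return Decimal(installed) >= Decimal(required)


# (installed, required) pairs taken from the Makefile.PL warnings above.
prerequisites = {
    "DBD::Pg": ("1.49", "2.0"),
    "ExtUtils::MakeMaker": ("6.30", "6.32"),
}

too_old = [name for name, (have, need) in prerequisites.items()
           if not satisfies(have, need)]
print(too_old)
```

Comparing numerically matters here: 1.5 is newer than 1.49 as a decimal, whereas splitting on the dot and comparing 5 with 49 would give the opposite answer.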

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


Re: Bidirectional replication

From
Joshua Tolley
Date:
On Thu, May 05, 2011 at 08:13:55PM -0700, John R Pierce wrote:
> On 05/05/11 8:05 PM, Joshua Tolley wrote:
>> Actually, Bucardo doesn't do statement replication. It, like Slony for
>> instance, replicates data, not SQL statements. And as you pointed out, it does
>> do bidirectional replication in a way that's sufficient for some use cases.
>
> does it use triggers for replication, similar to Slony, then?
> obviously, it can't be doing WAL level replication or it wouldn't be
> able to do any sort of master-master.

Exactly. It doesn't function exactly like Slony does under the hood, of
course, but it is trigger based. One notable difference between Bucardo and
Slony is that whereas Slony's triggers store the entire row data in a separate
log table when something changes, Bucardo stores only the primary key. As a
result, Bucardo doesn't apply each transaction to the replica databases, but
rather a set of all transactions that took place on the source since the last
time it synchronized things. For whatever that's worth.
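
The difference can be sketched with a toy model in Python. This is only an illustration of the two logging strategies as described above, not the actual schema or code of either tool; the function names and the in-memory "tables" are hypothetical, and deletes are ignored.

```python
def run_workload():
    """Apply four changes on the source and record them both ways."""
    source = {}          # pk -> current value on the source
    full_row_log = []    # one entry per change, carrying the row data
    changed_pks = set()  # only the primary keys that changed
    for pk, value in [(1, "a"), (1, "b"), (2, "x"), (1, "c")]:
        source[pk] = value
        full_row_log.append((pk, value))
        changed_pks.add(pk)
    return source, full_row_log, changed_pks


def replay_full_rows(full_row_log):
    """Apply every logged change in order, keeping each intermediate state."""
    replica, states = {}, []
    for pk, value in full_row_log:
        replica[pk] = value
        states.append(dict(replica))
    return replica, states


def copy_latest(source, changed_pks):
    """Copy the current source row for each changed primary key."""
    replica = {}
    for pk in sorted(changed_pks):
        replica[pk] = source[pk]
    return replica


source, full_row_log, changed_pks = run_workload()
replayed, states = replay_full_rows(full_row_log)
latest = copy_latest(source, changed_pks)
print(len(full_row_log), len(changed_pks))
print(replayed == latest == source)
```

Both replicas end up identical to the source, but the primary-key approach tracks two keys instead of four changes and never passes through the intermediate values "a" and "b".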

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com


Re: Bidirectional replication

From
John R Pierce
Date:
On 05/05/11 8:14 PM, Joshua Tolley wrote:
> On Thu, May 05, 2011 at 03:07:20PM +0530, tushar nehete wrote:
>> Warning: prerequisite DBD::Pg 2.0 not found. We have 1.49.
>> Warning: prerequisite ExtUtils::MakeMaker 6.32 not found. We have 6.30.
> You need to install DBD::Pg, version 2.0 or greater. You also need to install
> ExtUtils::MakeMaker version 6.32 or greater. These are both Perl packages,
> available several different ways. Sometimes your operating system will
> provide sufficiently recent versions through its own packaging system (e.g.
> "yum install perl-DBD-Pg"); the more difficult way is to get it through CPAN,
> per instructions here: http://www.cpan.org/modules/INSTALL.html

If you do get it into your mind that you need a newer version of Perl
than was supplied with RHEL 5 or whatever, do NOT replace the system
Perl in /usr/lib/perl ... instead, build your own Perl to run in
/usr/local/perl5 or /opt/perl5 or something.

I only see perl-DBD-Pg 1.49 in the RHEL repos, and I don't see
perl-ExtUtils-MakeMaker in there at all (or in EPEL or in RpmForge), so
you might be stuck with going the CPAN route.  This will likely require
you to install the development tools (gcc etc.) as well as perl-devel.

If you want to do it cleanly, there exist scripts to turn CPAN modules
into RPMs, so your system files remain under RPM management... otherwise
a future yum upgrade could step on what you've manually installed.



Re: Bidirectional replication

From
Andrew Sullivan
Date:
On Thu, May 05, 2011 at 09:22:14PM -0600, Joshua Tolley wrote:

> course, but it is trigger based. One notable difference between Bucardo and
> Slony is that whereas Slony's triggers store the entire row data in a separate
> log table when something changes, Bucardo stores only the primary key.

That's interesting.  An earlier replication system we had at Afilias
(erserver, which was descended from the rserv code that used to be in
contrib/) used this strategy.[1]

I liked to distinguish between the "latest consistent data" strategy
and the "logical order application" strategy.

There are some advantages to the latest consistent data strategy, the
greatest of which is that you don't get the "lag" problems.  Under
Slony, you have to capture all the state between the last replication
sync and the current one, even if there are multiple changes to the
same row.

There is a problem, however, in that if you want to use your replica
to capture various changes along the way, you can't do it.  Moreover,
there's no guarantee under such a system that your replica is ever
consistent with the way a given _client_ saw the database (there is a
guarantee that it is consistent with some database state on the
master, of course, but not a guarantee that it ever looks just as a
client would have seen it at the moment of the client's action).
These two counter-considerations were among the things that made the
erserver strategy undesirable from my point of view given what we were
trying to do at Afilias at the time.  So that's why I was happy we
changed direction with Slony.  (But that decision came with its own
complications.)

A

[1] The code is still hanging around somewhere, I think, mostly as an
example of what not to do.  For instance: copying entire result sets
into memory and then sorting them is a bad idea.  Also, if someone
imposes on you a programmer you are fairly sure doesn't understand the
problem you're working on, you should quit on the spot.  (I have to
keep relearning this one, though.)

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: Bidirectional replication

From
Vick Khera
Date:
On Fri, May 6, 2011 at 8:59 AM, Andrew Sullivan <ajs@crankycanuck.ca> wrote:
> That's interesting.  An earlier replication system we had at Afilias
> (erserver, which was descended from the rserv code that used to be in
> contrib/) used this strategy.[1]
>

Oh... I remember erserver.  It served us well for about 2 years for a
simple, not very high-velocity database that was 99.44% read-only.  I
did have to monitor it closely and restart regularly.  At least with
slony I don't worry about it crashing out from under me... :)

Re: Bidirectional replication

From
Andrew Sullivan
Date:
On Fri, May 06, 2011 at 09:15:37AM -0400, Vick Khera wrote:
>
> Oh... I remember erserver.  It served us well for about 2 years for a
> simple, not very high-velocity database that was 99.44% read-only.  I
> did have to monitor it closely and restart regularly.  At least with
> slony I don't worry about it crashing out from under me... :)

<oldtimer style="doddering">

To be fair to it, erserver managed to live through the launch of the
.info registry and its expansion, as well as the technical takeover of
.org by Afilias.  These were both fairly high-volume systems.  It was
plenty challenging to use for this arrangement, but it did manage to
do the job.

The big flaw in it was the decision on the part of the "architect" at
the time to use Java.  Java was picked because, well, because it said
Enterprise in the name, as far as I could tell.  This was made
considerably worse by the decision to employ a programmer (who shall
remain nameless) on the project who didn't know anything about
Postgres.  (He was gone from the project and from anything to do with
the Afilias systems by the end of 2001, IIRC.)

This was all to re-implement POC code that Vadim Mikheev had written,
which was an extension of the rserv code.  That used named pipes, and
the guy who got to make these decisions thought named pipes were for
weenies.  All the deep and clever tricks in erserver were ones Vadim
wrote.  (I suspect if you ask Jan, he'll tell you that some of the
Slony code, especially in the way the triggers work, was also inspired
by that code.)

The fellow who reimplemented all this in Java desperately wanted to
use multiple threads to get all the data, so he could run in parallel.
(He had a mania for threads.  He had an error reporting thread that
was supposed to handle all exceptions.  He stubbed it out and then
never wrote the exception handler, which meant that every single
exception you ever got from the code came from the same line number
and said "Exception" but nothing more.  This was extremely fun for
debugging.  To this day, whenever someone starts telling me that
threads are the obvious solution to some problem, I want to quiz them
really hard.  I am pretty sure I can count in single digits the number
of people I've ever met who actually know what they're doing with
threads.)

His plan, in fact, was to open one connection to the database for
every table in the replication set.  When I told him that this was a
non-starter (we'd have used 50 or 60 connections just for this -- in
the days when the JDBC driver would issue BEGIN and sit there like a
dummy, blocking all your vacuums), he didn't revise his plan, so we
ended up with a system that could in principle pull inconsistent views
of the data into the replication engine.  The way to fix this, as it
turned out, was to force the connection pool only to use one
connection.  Anyway, because of this multithreading plan, he couldn't
sort the data to be applied in a reasonable way in the database, so
he'd pull all the pending data into the replication engine, then sort
it, and then apply it on the slave.  Well, no, actually.  He'd sort
and copy it, because if he had more than one slave the sort might have
to be different.  Or something -- I never actually understood the
reason for this, and I stopped trying after all my teeth had to be
replaced from the grinding.  But it meant that if you got far enough
behind, the replication engine would blow up because it ran out of
memory it could allocate.  IIRC, the VM was limited to 4G total.

We had a large number of mitigation strategies we used to catch the
possibility that the replication engine would get behind.  I'm pretty
sure it was the analysis of on-call billing hours that ultimately
convinced management we needed to start over.  I know that some people
(I am among them) find Slony complicated and hard to manage.  I can
tell you, however, that compared to the stone tools we had for
replicating in 2001, it was a dream.

</oldtimer>

A

--
Andrew Sullivan
ajs@crankycanuck.ca

Re: Bidirectional replication

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> I only see perl-DBD-Pg 1.49 in the RHEL repos, and I don't see
> perl-ExtUtils-MakeMaker in there at all (or in EPEL or in RpmForge).

For the record, only DBD::Pg is really necessary - everything will
still work fine with an older version of ExtUtils::MakeMaker.
DBD::Pg 1.49 is pretty old, but the good news is that nearly every
other repo in the world has a newer version, and that it has very
few dependencies if you want to install it manually.

- --
Greg Sabino Mullane greg@turnstep.com
PGP Key: 0x14964AC8 201105082239
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk3HU/wACgkQvJuQZxSWSsipmQCeK4pys0eZtBmlnhg+QVbKSzBE
J9wAnR6K7fxRTt/MIO1EqlHoMh/t8FcV
=+oAc
-----END PGP SIGNATURE-----



Re: Bidirectional replication

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> Yeah.  One nasty property that async multi master solutions share is
> that they change the definition of what 'COMMIT' means -- the database
> can't guarantee the transaction is valid because not all the
> supporting facts are necessarily known.  Even after libpq gives you
> the green light, that transaction could fail an arbitrary length of
> time later, and you can't rely on the assumption it's valid until
> you've done some synchronizing with the other 'masters'.  Maybe you
> don't need to rely on that assumption, so a 'fix it later, or possibly
> never' methodology works well.  Those cases are unfortunately fairly rare
> in the real world.

I don't quite follow you here. Are you talking about *synchronous* multi-master?
Async multi-master works just fine, as long as you are not expecting the
servers to give the exact same answer at the exact same time. But certainly
transactions are "valid".

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201105082243
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk3HVPgACgkQvJuQZxSWSsgouACfSUJuEy8rg3mosu+WQNU0wpHU
mJgAoJmprgcDef4Wb3wowwfuulvR46FI
=Sedp
-----END PGP SIGNATURE-----



Re: Bidirectional replication

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


>> course, but it is trigger based. One notable difference between
>> Bucardo and Slony is that whereas Slony's triggers store the entire
>> row data in a separate log table when something changes, Bucardo
>> stores only the primary key.

> That's interesting.  An earlier replication system we had at Afilias
> (erserver, which was descended from the rserv code that used to be in
> contrib/) used this strategy.[1]

Yeah, I've talked with Jan about the similarities and differences
between erserver and Bucardo. They seem philosophically similar, although
a bit diverged technically at this point.

> I liked to distinguish between the "latest consistent data" strategy
> and the "logical order application" strategy.
>
> There are some advantages to the latest consistent data strategy, the
> greatest of which is that you don't get the "lag" problems.  Under
> Slony, you have to capture all the state between the last replication
> sync and the current one, even if there are multiple changes to the
> same row.
>
> There is a problem, however, in that if you want to use your replica
> to capture various changes along the way, you can't do it.  Moreover,
> there's no guarantee under such a system that your replica is ever
> consistent with the way a given _client_ saw the database (there is a
> guarantee that it is consistent with some database state on the
> master, of course, but not a guarantee that it ever looks just as a
> client would have seen it at the moment of the client's action).

Not sure I really see why this is important. You mean as far as the
fact that tables X, Y, and Z are in a replicated set, but client A
makes changes to X and Y, and then client B makes changes to table
Z, and thus Bucardo slurps in X, Y, and Z, but never as client A
or B saw them? Any client connecting to the master after client
B commits would have the same "problem", no?
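
To make the two strategies discussed above concrete, here is a minimal Python sketch. It is not code from Slony, Bucardo, or erserver; the function names and the in-memory "tables" are hypothetical and only illustrate the idea. One function replays a full row log in order, so the replica passes through every intermediate state. The other records only which primary keys changed and copies the current row at sync time, so the replica jumps straight to the latest state.

```python
# Hypothetical sketch: two ways a trigger-based system could capture changes.
# None of these names come from Slony, Bucardo, or erserver.

def replay_full_log(changes):
    """'Logical order application': log every row change, replay in order."""
    replica = {}
    history = []  # every state the replica passes through
    for pk, row in changes:
        replica[pk] = row
        history.append(dict(replica))
    return replica, history


def sync_by_primary_key(changes):
    """'Latest consistent data': record only changed primary keys, then
    copy the current version of each changed row at sync time."""
    master = {}
    dirty = set()  # primary keys touched since the last sync
    for pk, row in changes:
        master[pk] = row
        dirty.add(pk)
    replica = {pk: master[pk] for pk in dirty}
    return replica, len(dirty)


# Row 1 is updated three times between two syncs; row 2 once.
changes = [(1, "draft"), (1, "reviewed"), (1, "published"), (2, "draft")]

full_replica, history = replay_full_log(changes)
pk_replica, rows_copied = sync_by_primary_key(changes)
```

Both replicas end in the same final state, but the full-log replica applied four changes while the primary-key replica copied two rows and never saw the "draft" or "reviewed" versions of row 1. That is the trade-off described above: less lag to work through, but no record of the intermediate states.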

> [1] The code is still hanging around somewhere, I think, mostly
> as an example of what not to do.

Heh, I gotta look that up someday.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201105082255
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk3HV7wACgkQvJuQZxSWSsg23wCfemaF8EBf58C47omG0Fc8TMeb
WB4AoIZPZ57zDKLfoJ/wN2CFpUbQuq3k
=CrQ3
-----END PGP SIGNATURE-----



Re: Bidirectional replication

From
Sim Zacks
Date:
>> Yeah.  One nasty property that async multi master solutions share is
>> that they change the definition of what 'COMMIT' means -- the database
>> can't guarantee the transaction is valid because not all the
>> supporting facts are necessarily known.  Even after libpq gives you
>> the green light, that transaction could fail an arbitrary length of
>> time later, and you can't rely on the assumption it's valid until
>> you've done some synchronizing with the other 'masters'.  Maybe you
>> don't need to rely on that assumption so a 'fix it later, or possibly
>> never' methodology works well.  Those cases are unfortunately fairly rare
>> in the real world.
> I don't quite follow you here. Are you talking about *synchronous* multi-master?
> Async multi-master works just fine, as long as you are not expecting the
> servers to give the exact same answer at the exact same time. But certainly
> transactions are "valid".
Let's say you have a foreign key constraint with ON DELETE RESTRICT. On one
master you delete the key as there are no child entities. On the other
master you add a child entity, which should prevent deleting the parent
record. Both masters allowed the transaction to be committed, which
means that the users have both been given acknowledgement that their
actions are valid. If the rule is that the guy who put in the child
wins, that means the committed delete never happened. If the parent wins,
that means that the insert of the child was illegal.

Sim
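
The foreign key scenario above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how any real replication system works: the `Master` class, its `outbox` list, and `find_conflicts` are all hypothetical names. Each master checks the constraint against its own local data only, so both transactions commit, and the conflict becomes visible only when the two change lists are compared.

```python
class Master:
    """Toy model of one master; constraints are checked locally only."""

    def __init__(self, parents, children):
        self.parents = set(parents)
        self.children = dict(children)  # child id -> parent id
        self.outbox = []  # committed changes not yet sent to the peer

    def delete_parent(self, parent_id):
        # ON DELETE RESTRICT, evaluated against this master's data
        if parent_id in self.children.values():
            raise ValueError("parent still has children")
        self.parents.discard(parent_id)
        self.outbox.append(("delete_parent", parent_id))
        return "COMMIT"

    def insert_child(self, child_id, parent_id):
        # Foreign key check, evaluated against this master's data
        if parent_id not in self.parents:
            raise ValueError("no such parent")
        self.children[child_id] = parent_id
        self.outbox.append(("insert_child", child_id, parent_id))
        return "COMMIT"


def find_conflicts(a, b):
    """Parent ids deleted on one master and referenced by a new child on the other."""
    conflicts = set()
    for x, y in ((a, b), (b, a)):
        deleted = {c[1] for c in x.outbox if c[0] == "delete_parent"}
        referenced = {c[2] for c in y.outbox if c[0] == "insert_child"}
        conflicts |= deleted & referenced
    return conflicts


# Both masters start with parent 10 and no children.
a = Master(parents={10}, children={})
b = Master(parents={10}, children={})

ack_a = a.delete_parent(10)    # succeeds: no children visible on a
ack_b = b.insert_child(1, 10)  # succeeds: parent still present on b
conflicts = find_conflicts(a, b)
```

Both calls return "COMMIT", yet `conflicts` contains parent 10. Whatever rule resolves it, one of the two acknowledged transactions has to be undone after the fact.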

Re: Bidirectional replication

From
"Greg Sabino Mullane"
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


>>> Yeah.  One nasty property that async multi master solutions share is
>>> that they change the definition of what 'COMMIT' means -- the database
>>> can't guarantee the transaction is valid because not all the
>>> supporting facts are necessarily known.  Even after libpq gives you
>>> the green light, that transaction could fail an arbitrary length of
>>> time later, and you can't rely on the assumption it's valid until
>>> you've done some synchronizing with the other 'masters'.  Maybe you
>>> don't need to rely on that assumption so a 'fix it later, or possibly
>>> never' methodology works well.  Those cases are unfortunately fairly rare
>>> in the real world.

>> I don't quite follow you here. Are you talking about *synchronous* multi-master?
>> Async multi-master works just fine, as long as you are not expecting the
>> servers to give the exact same answer at the exact same time. But certainly
>> transactions are "valid".

> Let's say you have a foreign key constraint with ON DELETE RESTRICT. On one
> master you delete the key as there are no child entities. On the other
> master you add a child entity, which should prevent deleting the parent
> record. Both masters allowed the transaction to be committed, which
> means that the users have both been given acknowledgement that their
> actions are valid. If the rule is that the guy who put in the child
> wins, that means the committed delete never happened. If the parent wins,
> that means that the insert of the child was illegal.

Well, that's one way to look at it, but you have to remember to treat the
async replication as the invisible hand of another session, that may
change what you have just committed, just like any other user may. If I
add a child entry, then user X deletes said entry, and then user Y deletes
the parent entry, that is for all intents and purposes the same as what happens
in a replication scenario. The difference is that technically I add the child
entry, user Y deletes said entry, and /then/ user R (replication) deletes both
the parent and child (or inserts the parent back in). But in both cases, both
the child creator and the parent deleter receive back an "ok commit". If you
have a very large async response time, and your application has a very tight
control over things, it may cause a problem, but in real life the syncing
happens quite quickly, and the window for even catching both writes, not to
mention sorting it out, is quite small. And I would expect an application
running against a MM database would be able to handle such events anyway.

- --
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201105282339
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAk3hwKIACgkQvJuQZxSWSsgu9gCgpBrlVa5xvmRNdIdcstlv60oJ
tQsAn0sPvDHNZI+CVIT46SP4mEP7aGLM
=4c4P
-----END PGP SIGNATURE-----