Thread: Replication

Replication

From
"Craig A. James"
Date:
Looking for replication solutions, I find:

Slony-I
  Seems good, single master only, master is a single point of failure,
  no good failover system for electing a new master or having a failed
  master rejoin the cluster.  Slave databases are mostly for safety or
  for parallelizing queries for performance.  Suffers from O(N^2)
  communications (N = cluster size).

Slony-II
  Seems brilliant, a solid theoretical foundation, at the forefront of
  computer science.  But can't find project status -- when will it be
  available?  Is it a pipe dream, or a nearly-ready reality?

PGReplication
  Appears to be a page that someone forgot to erase from the old GBorg site.

PGCluster
  Seems pretty good, but the web site is not current, there are releases in use
  that are not on the web site, and it also seems to always be a couple of steps
  behind the current release of Postgres.  Two single points of failure: the
  load balancer and the data replicator.

Is this a good summary of the status of replication?  Have I missed any important solutions or mischaracterized
anything?

Thanks!
Craig

(Sorry about the premature send of this message earlier, please ignore.)



Re: Replication

From
Eugene Ogurtsov
Date:
What about "Daffodil Replicator" - GPL -
http://sourceforge.net/projects/daffodilreplica/


--
Thanks,

Eugene Ogurtsov
Internal Development Chief Architect
SWsoft, Inc.



Craig A. James wrote:
> Looking for replication solutions, I find:
>
> Slony-I
>  Seems good, single master only, master is a single point of failure,
>  no good failover system for electing a new master or having a failed
>  master rejoin the cluster.  Slave databases are mostly for safety or
>  for parallelizing queries for performance.  Suffers from O(N^2)
>  communications (N = cluster size).
>
> Slony-II
>  Seems brilliant, a solid theoretical foundation, at the forefront of
>  computer science.  But can't find project status -- when will it be
>  available?  Is it a pipe dream, or a nearly-ready reality?
>
> PGReplication
>  Appears to be a page that someone forgot to erase from the old GBorg
> site.
>
> PGCluster
>  Seems pretty good, but web site is not current, there are releases in
> use
>  that are not on the web site, and also seems to always be a couple steps
>  behind the current release of Postgres.  Two single-points failure
> spots,
>  load balancer and the data replicator.
>
> Is this a good summary of the status of replication?  Have I missed
> any important solutions or mischaracterized anything?
>
> Thanks!
> Craig
>
> (Sorry about the premature send of this message earlier, please ignore.)
>
>
>

Re: Replication

From
"Andreas Kostyrka"
Date:
Ok, Slony supports two kinds of operation here: failover (which moves the master role to a new node without the old
master node being present; it also drops the old node from replication) and move set (which moves the master role with
the old node's cooperation).

The use cases for these two are slightly different: one is for all kinds of scheduled maintenance, while the other is
what you do when you've got a hardware failure.
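
As a rough illustration, the two operations correspond to distinct slonik commands (cluster and node ids here are hypothetical, and a planned move requires locking the set first; consult the Slony-I admin guide for the exact syntax of your version):

```
# Planned switchover: node 1 cooperates and hands set 1 to node 2.
lock set (id = 1, origin = 1);
wait for event (origin = 1, confirmed = 2);
move set (id = 1, old origin = 1, new origin = 2);

# Emergency failover: node 1 is unreachable and is dropped from replication.
failover (id = 1, backup node = 2);
```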

Andreas

-- Original message --
Subject:    Re: [PERFORM] Replication
From:    Craig James <craig_james@emolecules.com>
Date:        15.06.2007 01:48

Andreas Kostyrka wrote:
> Slony provides near instantaneous failovers (in the single digit seconds
>  range). You can script an automatic failover if the master server
> becomes unreachable.

But Slony slaves are read-only, correct?  So the system isn't fully functional once the master goes down.

> That leaves you the problem of restarting your app
> (or making it reconnect) to the new master.

Don't you have to run a Slony app to convert one of the slaves into the master?

> 5-10MB data implies such a fast initial replication, that making the
> server rejoin the cluster by setting it up from scratch is not an issue.

The problem is to PREVENT it from rejoining the cluster.  If you have some semi-automatic process that detects the dead
server and converts a slave to the master, and in the meantime the dead server manages to reboot itself (or its network
gets fixed, or whatever the problem was), then you have two masters sending out updates, and you're screwed.
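
One common guard against exactly this split-brain scenario (not something Slony provides out of the box) is a generation or epoch number: every promotion bumps the epoch, and nodes refuse writes stamped with a stale one, so a rebooted old master is fenced off. A minimal sketch in Python, with all names hypothetical:

```python
# Generation-number fencing: each promotion bumps an epoch counter, and
# replicas reject updates stamped with a stale epoch.  Illustrative only;
# this is not part of Slony.

class Replica:
    def __init__(self):
        self.current_epoch = 0

    def observe_promotion(self, new_epoch):
        # Called when a slave is promoted and the cluster agrees on a new epoch.
        self.current_epoch = max(self.current_epoch, new_epoch)

    def accept_update(self, update_epoch):
        # A master that rebooted with an old epoch gets its writes refused.
        return update_epoch >= self.current_epoch

r = Replica()
r.observe_promotion(1)           # a slave was promoted after the master died
assert r.accept_update(1)        # the new master's writes are accepted
assert not r.accept_update(0)    # the old master rejoins -> fenced off
```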

>> The problem is, there don't seem to be any "vote a new master" type of
>> tools for Slony-I, and also, if the original master comes back online,
>> it has no way to know that a new master has been elected.  So I'd have
>> to write a bunch of SOAP services or something to do all of this.
>
> You don't need SOAP services, and you do not need to elect a new master.
> if dbX goes down, dbY takes over, you should be able to decide on a
> static takeover pattern easily enough.

I can't see how that is true.  Any self-healing distributed system needs something like the following:

  - A distributed system of nodes that check each other's health
  - A way to detect that a node is down and to transmit that
    information across the nodes
  - An election mechanism that nominates a new master if the
    master fails
  - A way for a node coming online to determine if it is a master
    or a slave

Any solution less than this can cause corruption, because you can have two nodes that both think they're master, or end
up with no master and no process for electing one.  As far as I can tell, Slony doesn't do any of this.  Is there a
simpler solution?  I've never heard of one.
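
The requirements above can be condensed into one decision rule: a node may act as master only if it is the highest-priority node among those it can currently reach, and it can reach a majority of the cluster (so a partitioned minority never promotes itself). A sketch in Python, with node names and the health-check mechanism left hypothetical:

```python
# Static-priority election with a majority requirement, as described above.
# How 'reachable' is determined (pings, heartbeats, etc.) is out of scope.

def should_be_master(me, priority_order, reachable):
    cluster_size = len(priority_order)
    alive = set(reachable) | {me}
    if len(alive) <= cluster_size // 2:
        return False  # no majority: stand down rather than risk split-brain
    for node in priority_order:
        if node in alive:
            return node == me  # highest-priority live node wins
    return False

nodes = ["db1", "db2", "db3"]
# db1 is down; db2 can still see db3, so db2 takes over
assert should_be_master("db2", nodes, {"db3"})
# db3 is partitioned off alone, so it must not promote itself
assert not should_be_master("db3", nodes, set())
```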

> The point here is, that the servers need to react to a problem, but you
> probably want to get the admin on duty to look at the situation as
> quickly as possible anyway.

No, our requirement is no administrator interaction.  We need instant, automatic recovery from failure so that the
system stays online.

> Furthermore, you need to checkout pgpool, I seem to remember that it has
> some bad habits in routing queries. (E.g. it wants to apply write
> queries to all nodes, but slony makes the other nodes readonly.
> Furthermore, anything inside a BEGIN is sent to the master node, which
> is bad with some ORMs, that by default wrap any access into a transaction)

I should have been more clear about this.  I was planning to use PGPool in the PGPool-1 mode (not the new PGPool-2
features that allow replication).  So it would only be acting as a failover mechanism.  Slony would be used as the
replication mechanism.

I don't think I can use PGPool as the replicator, because then it becomes a new single point of failure that could
bring the whole system down.  If you're using it for INSERT/UPDATE, then there can only be one PGPool server.

I was thinking I'd put a PGPool server on every machine in failover mode only.  It would have the Slony master as the
primary connection, and a Slony slave as the failover connection.  The applications would route all INSERT/UPDATE
statements directly to the Slony master, and all SELECT statements to the PGPool on localhost.  When the master failed,
all of the PGPool servers would automatically switch to one of the Slony slaves.

This way, the system would keep running on the Slony slaves (so it would be read-only) until a sysadmin could get the
master Slony back online.  And when the master came online, the PGPool servers would automatically reconnect and
write access would be restored.
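
The routing rule above is simple enough to sketch: writes go straight to the single Slony master, reads go to the local PGPool, which holds the master/slave failover pair. Connection strings here are purely hypothetical:

```python
# Application-side routing as described: writes to the Slony master,
# reads to the local PGPool (which fails over to a slave on its own).
# DSNs are illustrative placeholders.

MASTER_DSN = "host=slony-master dbname=app"             # hypothetical
LOCAL_POOL_DSN = "host=localhost port=9999 dbname=app"  # local PGPool

def route(sql):
    verb = sql.lstrip().split(None, 1)[0].upper()
    if verb in ("INSERT", "UPDATE", "DELETE"):
        return MASTER_DSN       # writes must hit the one writable node
    return LOCAL_POOL_DSN       # SELECTs can use the pool's failover pair

assert route("SELECT * FROM t") == LOCAL_POOL_DSN
assert route("update t set x = 1") == MASTER_DSN
```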

Does this make sense?

Craig


Re: Replication

From
"Merlin Moncure"
Date:
On 6/14/07, Craig A. James <cjames@emolecules.com> wrote:
> Looking for replication solutions, I find:
>
> Slony-I
>   Seems good, single master only, master is a single point of failure,
>   no good failover system for electing a new master or having a failed
>   master rejoin the cluster.  Slave databases are mostly for safety or
>   for parallelizing queries for performance.  Suffers from O(N^2)
>   communications (N = cluster size).

With reasonable sysadmin effort you can implement a failover system yourself.
Regarding communications, you can cascade the replication to reduce
load on the master.  If you were implementing a large replication
cluster, this would probably be a good idea.  Slony is powerful,
trigger based, and highly configurable.
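
The point of cascading is that the master feeds only a handful of relay nodes, which in turn feed the rest, instead of every subscriber talking to the master directly. A small sketch of how the fan-out works (node ids hypothetical):

```python
# Cascaded replication layout: arrange nodes in a tree with fan-out k,
# so the master (first node) serves only k direct subscribers and the
# relays carry the rest of the load.  Purely illustrative.

def cascade_children(nodes, fanout):
    """Map each node to the nodes that subscribe through it."""
    tree = {n: [] for n in nodes}
    for i, n in enumerate(nodes[1:], start=1):
        parent = nodes[(i - 1) // fanout]
        tree[parent].append(n)
    return tree

nodes = list(range(1, 8))        # 7 nodes; node 1 is the master
tree = cascade_children(nodes, 2)
assert tree[1] == [2, 3]         # the master feeds only two relays...
assert tree[2] == [4, 5]         # ...which feed the remaining slaves
```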

> Slony-II
>   Seems brilliant, a solid theoretical foundation, at the forefront of
>   computer science.  But can't find project status -- when will it be
>   available?  Is it a pipe dream, or a nearly-ready reality?

As I understand it, this has not gone beyond the early planning phases.

> PGReplication
>   Appears to be a page that someone forgot to erase from the old GBorg site.
>
> PGCluster
>   Seems pretty good, but web site is not current, there are releases in use
>   that are not on the web site, and also seems to always be a couple steps
>   behind the current release of Postgres.  Two single-points failure spots,
>   load balancer and the data replicator.
>
> Is this a good summary of the status of replication?  Have I missed any important solutions or mischaracterized
anything?

pgpool 1/2 is a reasonable solution.  It's statement-level
replication, which has some downsides, but is good for certain things.
pgpool 2 has a neat distributed-table mechanism which is interesting.
You might want to be looking here if you have extremely high ratios of
reads to writes but need to service a huge transaction volume.

PITR is an HA solution which 'replicates' a database cluster to an
archive or a warm standby server (one that can be brought up quickly,
but is not available for querying).  Overhead is very low and it's easy
to set up.  This is maybe the simplest and best solution if all you
care about is continuous backup.  There are plans (a GSoC project,
actually) to make the warm standby live for (read-only)
queries... if/when complete, this would provide a replication mechanism
similar to, but significantly better than, MySQL binary log replication,
and would provide an excellent complement to Slony.
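
For reference, a warm-standby PITR setup boils down to a couple of settings (paths are hypothetical; see the PostgreSQL continuous-archiving documentation for details):

```
# postgresql.conf on the primary: ship each completed WAL segment
archive_command = 'cp %p /mnt/archive/%f'

# recovery.conf on the standby: a restore_command that waits for the
# next segment keeps the server in warm-standby (continuous recovery)
# mode -- e.g. the pg_standby contrib helper, or a similar waiting script
restore_command = 'pg_standby /mnt/archive %f %p'
```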

There is also the Mammoth Replicator... I don't know anything about it;
maybe someone could comment?

merlin