Thread: Clustering, mirroriing, or replication?

Clustering, mirroriing, or replication?

From

"Jon Brisbin"

Date:

11 August 2004, 19:03:26

We're trying to figure out how we can implement a reasonably simple cluster
of postgres servers on a private network at our store locations.  The idea
is to have a group of 2-6 PCs each be able to share data in a replicated
manner.  Having a separate database server is not an option due to the
immense cost and logistics (we have over 800 stores), so we need to find a
way to mirror or replicate data between all the nodes currently up and on
the network.  Some nodes might be not always be up, so we need a failover
system.  I'm testing to see if the dbmirror that comes with Postgres will
work, but I haven't got it completely figured out yet.

If anyone has experience setting up and maintaing clusters with
replication/mirroring, I'd like to hear of your experience and maybe pick
your brain a little.  Any help you could give would be excellent!


Thanks!

Jon Brisbin
=======================================
Development Programmer
NPC International
(620) 235-7726

Re: Clustering, mirroriing, or replication?

From

Richard Huxton

Date:

12 August 2004, 05:32:27

Jon Brisbin wrote:
> We're trying to figure out how we can implement a reasonably simple cluster
> of postgres servers on a private network at our store locations.  The idea
> is to have a group of 2-6 PCs each be able to share data in a replicated
> manner.  Having a separate database server is not an option due to the
> immense cost and logistics (we have over 800 stores), so we need to find a
> way to mirror or replicate data between all the nodes currently up and on
> the network.  Some nodes might be not always be up, so we need a failover
> system.  I'm testing to see if the dbmirror that comes with Postgres will
> work, but I haven't got it completely figured out yet.

I'm talking well out of my experience here, having never tried to set up
800 sets of desktop PCs as replicated shared DB servers, but I can't
believe it's more expensive to put servers in. In my experience, even a
cheap white-box dedicated server is more reliable than a PC being used
for desktop apps.

Assume each user kicks the power cable out of their PC once a year, that
means you'll be dealing with several failovers a day on average.

Assuming you can't get a server for each store (that'd be a minimum of
£400 * 800 = £320,000 in the UK), is there any networking between them?

--
   Richard Huxton
   Archonet Ltd

Re: Clustering, mirroriing, or replication?

From

Richard Huxton

Date:

12 August 2004, 15:27:31

Jon Brisbin wrote:
> At each store, there are a number of POS registers that are networked.  If
> they enter a ticket at one register, they should be able to go to any other
> register to tender the transaction, etc...
>
> We already have mechanisms in place to remotely download and install
> software on existing PCs, so no additional outlay of cash would be required
> to set up these existing machines with more software.  To install new PCs in
> 800 stores at the same time would be a nightmare of logistics that we can't
> currently contemplate.  We also can't contemplate what nasty things will be
> done to us if we go to the executives and say we're going to fix their
> problems by forcing the stores to rely on only one machine.  The restaurant
> business is not kind to PC hardware.

OK, fair enough.

> One option is to have the POS software always write to a local database,
> with that change getting replicated to all other active nodes.  Another
> option is to have a "master" database, with a failover as backup.  Downside
> here is that if both machines fail, then they are dead in the water.  We
> can't make money if the registers aren't operational.  However, this is
> similar to what happens now (which is why we want to change it.)
>
> Having mucked with postgres replication a little in the last couple of days,
> I'm starting to wonder just how long it will take us to develop a good
> replication mechanism.  If one is already in place, I'd just like to use
> that.  Or at least learn from someone else trying to set up something
> similar.

Well, sounds like Slony would pretty much do what you want (one master,
multiple backups) apart from the failover.

I'm assuming your POS database is fairly simple and your transactions
are at a slow rate which means the traffic generated by replication
would be minimal. What constitutes a "dead" database is the tricky part.
Do you have a "master" terminal at the moment or are they all strictly
peers?

--
   Richard Huxton
   Archonet Ltd

Re: Clustering, mirroriing, or replication?

From

"Jon Brisbin"

Date:

12 August 2004, 17:13:55

At each store, there are a number of POS registers that are networked.  If
they enter a ticket at one register, they should be able to go to any other
register to tender the transaction, etc...

We already have mechanisms in place to remotely download and install
software on existing PCs, so no additional outlay of cash would be required
to set up these existing machines with more software.  To install new PCs in
800 stores at the same time would be a nightmare of logistics that we can't
currently contemplate.  We also can't contemplate what nasty things will be
done to us if we go to the executives and say we're going to fix their
problems by forcing the stores to rely on only one machine.  The restaurant
business is not kind to PC hardware.

One option is to have the POS software always write to a local database,
with that change getting replicated to all other active nodes.  Another
option is to have a "master" database, with a failover as backup.  Downside
here is that if both machines fail, then they are dead in the water.  We
can't make money if the registers aren't operational.  However, this is
similar to what happens now (which is why we want to change it.)

Having mucked with postgres replication a little in the last couple of days,
I'm starting to wonder just how long it will take us to develop a good
replication mechanism.  If one is already in place, I'd just like to use
that.  Or at least learn from someone else trying to set up something
similar.

Thanks!

Jon Brisbin
=======================================
Development Programmer
NPC International
(620) 235-7726

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org]On Behalf Of Richard Huxton
Sent: Thursday, August 12, 2004 3:32 AM
To: Jon Brisbin
Cc: pgSQL General
Subject: Re: [GENERAL] Clustering, mirroriing, or replication?

Jon Brisbin wrote:
> We're trying to figure out how we can implement a reasonably simple
cluster
> of postgres servers on a private network at our store locations.  The idea
> is to have a group of 2-6 PCs each be able to share data in a replicated
> manner.  Having a separate database server is not an option due to the
> immense cost and logistics (we have over 800 stores), so we need to find a
> way to mirror or replicate data between all the nodes currently up and on
> the network.  Some nodes might be not always be up, so we need a failover
> system.  I'm testing to see if the dbmirror that comes with Postgres will
> work, but I haven't got it completely figured out yet.

I'm talking well out of my experience here, having never tried to set up
800 sets of desktop PCs as replicated shared DB servers, but I can't
believe it's more expensive to put servers in. In my experience, even a
cheap white-box dedicated server is more reliable than a PC being used
for desktop apps.

Assume each user kicks the power cable out of their PC once a year, that
means you'll be dealing with several failovers a day on average.

Assuming you can't get a server for each store (that'd be a minimum of
£400 * 800 = £320,000 in the UK), is there any networking between them?

--
   Richard Huxton
   Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

               http://archives.postgresql.org

Re: Clustering, mirroriing, or replication?

From

Lincoln Yeoh

Date:

13 August 2004, 11:26:05

Most replication systems add a fair amount of complexity. How reliable are
current replication systems? Are they replication systems for performance
or for reliability+availability?

How much does it cost to make sure that the probability of both master and
failover machines failing is lower or as low as the probability of the
replication system failing? What would the impact be e.g. resulting
downtime etc?

I have seen some HA + load balancing firewall (not DB) systems which were
hardly worth the hassle - seemed to have just about the same amount of
downtime (if not more due to the added complexity and interactions with
other stuff -  e.g. buggy switches).

Some people have made similar mutterings about some DB clustering
solutions. Even if that's because they didn't do it right, it may be
because it's difficult to do it right, and that's not good is it? Could
just be a "feel good thing" that doesn't really add much to reliability for
all the added effort and overhead.

Does anyone know what's the most reliable platform postgresql can run on?
With or without scheduled downtime?

At 11:03 AM 8/12/2004 -0500, Jon Brisbin wrote:
>One option is to have the POS software always write to a local database,
>with that change getting replicated to all other active nodes.  Another
>option is to have a "master" database, with a failover as backup.  Downside
>here is that if both machines fail, then they are dead in the water.  We
>can't make money if the registers aren't operational.  However, this is
>similar to what happens now (which is why we want to change it.)
>
>Having mucked with postgres replication a little in the last couple of days,
>I'm starting to wonder just how long it will take us to develop a good
>replication mechanism.  If one is already in place, I'd just like to use
>that.  Or at least learn from someone else trying to set up something
>similar.

Re: Clustering, mirroriing, or replication?

From

Date:

13 August 2004, 12:15:32

I find 7.3 and 7.4 to be rock solid on dual processor servers of Intel P4 and Xeon running RH9 and
Xeon running Fedora Core 2.

My systems run heavy load for about 6 hours straight each night when it is syncing data with legacy
systems (disk io and cpu maxed out).

I don't use replication, just nightly dump and restore to backup server.

Indeed the only problems I have ever had turned out to be hardware, a bad motherboard.

Terry Fielder
Manager Software Development and Deployment
Great Gulf Homes / Ashton Woods Homes
terry@greatgulfhomes.com
Fax: (416) 441-9085


> -----Original Message-----
> From: pgsql-general-owner@postgresql.org
> [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Lincoln Yeoh
> Sent: Friday, August 13, 2004 10:25 AM
> To: Jon Brisbin; Richard Huxton
> Cc: pgSQL General
> Subject: Re: [GENERAL] Clustering, mirroriing, or replication?
>
>
> Most replication systems add a fair amount of complexity. How
> reliable are
> current replication systems? Are they replication systems for
> performance
> or for reliability+availability?
>
> How much does it cost to make sure that the probability of
> both master and
> failover machines failing is lower or as low as the
> probability of the
> replication system failing? What would the impact be e.g. resulting
> downtime etc?
>
> I have seen some HA + load balancing firewall (not DB)
> systems which were
> hardly worth the hassle - seemed to have just about the same
> amount of
> downtime (if not more due to the added complexity and
> interactions with
> other stuff -  e.g. buggy switches).
>
> Some people have made similar mutterings about some DB clustering
> solutions. Even if that's because they didn't do it right, it may be
> because it's difficult to do it right, and that's not good is
> it? Could
> just be a "feel good thing" that doesn't really add much to
> reliability for
> all the added effort and overhead.
>
> Does anyone know what's the most reliable platform postgresql
> can run on?
> With or without scheduled downtime?
>
> At 11:03 AM 8/12/2004 -0500, Jon Brisbin wrote:
> >One option is to have the POS software always write to a
> local database,
> >with that change getting replicated to all other active
> nodes.  Another
> >option is to have a "master" database, with a failover as
> backup.  Downside
> >here is that if both machines fail, then they are dead in
> the water.  We
> >can't make money if the registers aren't operational.
> However, this is
> >similar to what happens now (which is why we want to change it.)
> >
> >Having mucked with postgres replication a little in the last
> couple of days,
> >I'm starting to wonder just how long it will take us to
> develop a good
> >replication mechanism.  If one is already in place, I'd just
> like to use
> >that.  Or at least learn from someone else trying to set up something
> >similar.
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index
> scan if your
>       joining column's datatypes do not match
>

Re: Clustering, mirroriing, or replication?

From

"Chris Ochs"

Date:

13 August 2004, 12:55:08

>
> Does anyone know what's the most reliable platform postgresql can run on?
> With or without scheduled downtime?


This reminded me...  Not long ago I was looking at used sun servers.   You
can pick up refurbished sun enterprise systems for between $4,000 and
$10,000 fairly easily.  For instance an enterprise E3500 with 8 processors,
4gb ram, and 8 18gb fibre channel drives was only $5,000.  I couldn't put
together an equivalent intel system for the same price.

Chris