Thread: Parallel databases?

Parallel databases?

From
A James Lewis
Date:
Does anyone have any suggestions for a way to keep 2 databases in sync?

Ideally updates need to be made to both... this can't be too uncommon a
requirement..... any kind of HA would need it....

A. James Lewis (james@fsck.co.uk)
- Linux is swift and powerful.  Beware its wrath...


Re: Parallel databases?

From
Andrew Schmeder
Date:
> Does anyone have any suggestions for a way to keep 2 databases in sync?
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

I use a PHP or Perl codebase to access databases (MySQL or Postgres).  I always
use a wrapper class to control the database connection because it makes the
code much simpler.  In addition, I could easily insert logic in the
class to open a second connection to another database and duplicate all
inserts, updates and deletes.  Alternatively, the statements could be queued
in a log file and replayed later.  However, this is still far from a true HA
setup.  For that you need either a lot of intelligent code or Oracle Parallel
Server.
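
A minimal sketch of the wrapper idea, written in Python with sqlite3 only so
that it runs anywhere (the setup described above is PHP or Perl against MySQL
or Postgres). MirroredDB, make_db and the pending list are hypothetical names
invented for this illustration. Writes the secondary rejects are kept in
memory here; a real version would queue them in a log file.

```python
import sqlite3

WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")


class MirroredDB:
    """Hypothetical wrapper: runs every statement on the primary and
    repeats INSERT/UPDATE/DELETE statements on a secondary connection."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = []  # writes the secondary missed, kept for later replay

    def execute(self, sql, params=()):
        cur = self.primary.execute(sql, params)
        self.primary.commit()
        if sql.lstrip().upper().startswith(WRITE_VERBS):
            try:
                self.secondary.execute(sql, params)
                self.secondary.commit()
            except sqlite3.Error:
                # Secondary unavailable or it rejected the write: queue it.
                self.pending.append((sql, params))
        return cur


def make_db():
    # Assumption: both databases start with the same schema.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.commit()
    return conn


primary, secondary = make_db(), make_db()
db = MirroredDB(primary, secondary)
db.execute("INSERT INTO users (name) VALUES (?)", ("james",))
db.execute("UPDATE users SET name = ? WHERE name = ?", ("andy", "james"))
print(secondary.execute("SELECT name FROM users").fetchall())
```

Note that the two commits are independent: if the process dies between them
the databases disagree, which is one reason this falls short of real HA.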

Andy

Re: Parallel databases?

From
Colin Smith
Date:
On Fri, 14 Apr 2000, A James Lewis wrote:

>
> Does anyone have any suggestions for a way to keep 2 databases in sync?
>
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

I believe the drbd system can do this across a LAN. I haven't looked in
detail at it. You should be able to find info on the linux-ha.org site. I
know it mirrors blocks across multiple servers but I have no idea how well
it handles locking etc.

--
|Colin Smith:  Colin.Smith@yelm.freeserve.co.uk  |   Windows 2000    |
|Configuration management library for Unix/Linux |        AKA        |
|    http://www.yelm.freeserve.co.uk/libcfg/     |    The W2K Bug    |


Re: Parallel databases?

From
Colin Smith
Date:
On Sat, 15 Apr 2000, Colin Smith wrote:

> On Fri, 14 Apr 2000, A James Lewis wrote:
>
> >
> > Does anyone have any suggestions for a way to keep 2 databases in sync?
> >
> > Ideally updates need to be made to both... this can't be too uncommon a
> > requirement..... any kind of HA would need it....

I've had a closer look at drbd. The home site is:
http://www.complang.tuwien.ac.at/reisner/drbd/

It will replicate the blocks on a device out to backup servers. If the
primary system fails the backup can take over. It doesn't do any
distributed locking so it's strictly a failover service at the moment.

It also looks like a real performance killer, which is pretty much what
you'd expect with each block being sent over the LAN as well as to disk.
Definitely a case for a very high bandwidth, low latency network (SCI, SP
switch), though a dedicated 100Mbit/1Gbit link might be acceptable.
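
A rough back-of-the-envelope sketch of why the link matters, assuming every
block must reach both the local disk and the network before the write
completes. The function name and the 20 MB/s disk figure are made up for
illustration, and the estimate ignores latency and protocol overhead, so real
numbers would be worse.

```python
def mirrored_write_ceiling(disk_mb_s, link_mbit_s):
    """Upper bound on sustained write rate (MB/s) when each block goes
    to the local disk and over the network link."""
    link_mb_s = link_mbit_s / 8  # 8 bits per byte, no protocol overhead
    return min(disk_mb_s, link_mb_s)


# Hypothetical disk that sustains 20 MB/s on its own.
for link in (10, 100, 1000):
    print(link, "Mbit/s link ->", mirrored_write_ceiling(20, link), "MB/s")
```

Under these assumptions a 100Mbit link caps writes at 12.5 MB/s, below what
the disk alone could do, while a 1Gbit link leaves the disk as the limit.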

--
|Colin Smith:  Colin.Smith@yelm.freeserve.co.uk  |   Windows 2000    |
|Linux: Delivers on the promises Microsoft make. |        AKA        |
|             http://www.linux.org/              |    The W2K Bug    |


Re: Parallel databases?

From
Jurgen Defurne
Date:

James Lewis wrote:

> Does anyone have any suggestions for a way to keep 2 databases in sync?
>
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

No. The way HA works is that the system is made in such a way that you
can't lose data, or that you don't lose CPU cycles. HA does not make any
assumptions on the kind of applications that are running on the system.

If you want to experiment with HA, start with building a mirroring disk on
your Linux system to get the feel of it.

Then try to assess what you really want: low down-time or 7x24 operation.
This is what determines your HA system.

If you don't want to lose CPU cycles, then you have to build a cluster
with e.g. two CPUs. These should share their mass storage. This mass
storage should be organised as RAID. With hot-plug capabilities, it is
possible to keep the system running whether a CPU goes down or a
drive fails.

The worst thing that you can do is to base the implementation of your
application upon the fact that the system should have high-availability
requirements. That is not a database issue, but an operating system issue.

Jurgen Defurne
defurnj@glo.be



Re: Parallel databases?

From
Jurgen Defurne
Date:
A James Lewis wrote:

> How would you describe Oracle Parallel Server then?

I know Oracle and PG. The big difference is that PG is an application which
runs on top of an OS, while Oracle bypasses a large part of the OS and
replaces it with functions of its own.

This means that for these parallel systems, one needs a product called
SQL*Net, which provides the functionality needed to channel data over a
network. Oracle Parallel Server is implemented only on top of that.

The database administrator sets up the replication etc., but to the
programmer this is all COMPLETELY TRANSPARENT!!

>
> I am well aware of RAID/Mirroring etc... but I want the ability for one
> database to seamlessly take over from the other (OR even with the
> application getting thrown off and having to re-connect) but the DATA must
> be kept in sync across both machines...
>
> What I want is some sort of replication, bi-directional would be nice but
> not vital...
>

Referring to the part above, this would mean that you will need to dig very
deep into the code, to the part where data is physically written to the data
files, and put code there that also writes this data across the network to
the data files of the replication database. (Hey, guys, what would you think
of that?)
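
As a toy illustration of that idea (not PostgreSQL code), here is a minimal
Python sketch of a storage-layer hook that forwards every physical write to
a replica. ReplicatedFile and write_at are hypothetical names, and in-memory
buffers stand in for the local data file and for the network connection to
the replica's data file.

```python
import io


class ReplicatedFile:
    """Hypothetical hook: each write goes to the local data file and to a
    replica target before it is reported as done."""

    def __init__(self, local, replica):
        self.local = local
        self.replica = replica

    def write_at(self, offset, data):
        # Apply the same bytes at the same offset on both sides.
        for target in (self.local, self.replica):
            target.seek(offset)
            target.write(data)
            target.flush()
        return len(data)


# BytesIO objects stand in for the real file and the remote end.
local, replica = io.BytesIO(), io.BytesIO()
datafile = ReplicatedFile(local, replica)
datafile.write_at(0, b"page-0..")
datafile.write_at(8, b"page-1..")
print(local.getvalue() == replica.getvalue())
```

The sketch says nothing about what happens when the replica is unreachable
or the two sides are written in a different order, which is where the real
work would be.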

>
> Even if the machine has mirrored disks its CPU can fail and a secondary
> machine is useless unless it has the identical data...

The hardware way of doing this (also see the High-Availability HOWTO):
- Build a RAID: mirroring etc., something which makes your data survive a
  disk crash
- SHARE this RAID between two CPUs (for sharing strategies, also see the HA
  HOWTO)

When a CPU fails, the other one can take over with the exact data.
When a disk fails, your data is still safe, and the system can continue
running until the defective part is replaced.

> The only way round this would be to re-write the application to write to
> BOTH databases and this could be a problem with some pre-existing software
> packages and give the potential to get the DB's out of sync.
>
> Writing the application from scratch would mean that it was not so much of
> a problem.
>

If you do such a thing at the application level, it will always be a burden.
The goal of a database programmer is to write solutions to problems, not to
be a systems programmer. If, for every problem you must solve, you have to
take into account that data must be replicated, then you will double the
time needed to write every application. That is why I put the emphasis of
replication on the system level.

Jurgen Defurne