Thread: Parallel databases?
Does anyone have any suggestions for a way to keep 2 databases in sync? Ideally updates need to be made to both... this can't be too uncommon a requirement..... any kind of HA would need it.... A. James Lewis (james@fsck.co.uk) - Linux is swift and powerful. Beware its wrath...
> Does anyone have any suggestions for a way to keep 2 databases in sync?
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

I use a PHP or Perl codebase to access databases (mysql or postgres). I always use a wrapper class to control the database connection because it makes the code a lot simpler. Additionally, I could easily insert logic in the class to open a second connection to another database and duplicate all inserts, updates and deletes. Alternatively, the info could be queued in a log file for delayed use.

However, this is still far from a true HA setup. For that you need either a lot of intelligent code or Oracle Parallel Server.

Andy
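As a rough illustration of the wrapper-class idea above, here is a minimal sketch in Python rather than the PHP or Perl mentioned in the post. The class name `MirroredDB`, the `pending` queue, and the use of two in-memory SQLite databases as stand-ins for the two servers are all assumptions made for the example, not anything from the original setup.

```python
import sqlite3

# Statement types that change data and so must reach both databases.
WRITE_VERBS = ("INSERT", "UPDATE", "DELETE")


class MirroredDB:
    """Hypothetical wrapper: reads use the primary, writes go to both."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary
        self.pending = []  # writes the secondary missed, queued for replay

    def execute(self, sql, params=()):
        cur = self.primary.execute(sql, params)
        if sql.lstrip().upper().startswith(WRITE_VERBS):
            self.primary.commit()
            self._mirror(sql, params)
        return cur

    def _mirror(self, sql, params):
        try:
            self.secondary.execute(sql, params)
            self.secondary.commit()
        except sqlite3.Error:
            # Secondary unavailable or rejected the write: queue it.
            self.pending.append((sql, params))


# Two in-memory databases stand in for the two servers (an assumption).
primary = sqlite3.connect(":memory:")
secondary = sqlite3.connect(":memory:")
for conn in (primary, secondary):
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

db = MirroredDB(primary, secondary)
db.execute("INSERT INTO users (name) VALUES (?)", ("james",))
db.execute("UPDATE users SET name = ? WHERE id = ?", ("andy", 1))
```

Note that the two commits are independent: a crash between them leaves the databases out of sync, and nothing here replays the `pending` queue. That gap is part of why this approach falls short of a true HA setup.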
On Fri, 14 Apr 2000, A James Lewis wrote:

> Does anyone have any suggestions for a way to keep 2 databases in sync?
>
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

I believe the drbd system can do this across a LAN. I haven't looked at it in detail. You should be able to find info on the linux-ha.org site. I know it mirrors blocks across multiple servers, but I have no idea how well it handles locking etc.

--
|Colin Smith: Colin.Smith@yelm.freeserve.co.uk | Windows 2000 |
|Configuration management library for Unix/Linux | AKA |
| http://www.yelm.freeserve.co.uk/libcfg/ | The W2K Bug |
On Sat, 15 Apr 2000, Colin Smith wrote:

> On Fri, 14 Apr 2000, A James Lewis wrote:
>
> > Does anyone have any suggestions for a way to keep 2 databases in sync?
> >
> > Ideally updates need to be made to both... this can't be too uncommon a
> > requirement..... any kind of HA would need it....

I've had a closer look at drbd. The home site is:

http://www.complang.tuwien.ac.at/reisner/drbd/

It replicates the blocks on a device out to backup servers. If the primary system fails, the backup can take over. It doesn't do any distributed locking, so it's strictly a failover service at the moment. It also looks like a real performance killer, which is pretty much what you'd expect with each block being sent over the LAN as well as to disk. Definitely a case for a very high bandwidth, low latency network (SCI, SP switch), though a dedicated 100Mbit/1Gbit link might be acceptable.

--
|Colin Smith: Colin.Smith@yelm.freeserve.co.uk | Windows 2000 |
|Linux: Delivers on the promises Microsoft make. | AKA |
| http://www.linux.org/ | The W2K Bug |
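A back-of-envelope check on the bandwidth point above: if every block has to cross the link before a write counts as done (an assumption about how the mirroring is configured), the link speed puts a hard ceiling on sustained write throughput. The sketch below ignores protocol overhead and latency, so real figures would be lower. The function name is made up for the example.

```python
def max_write_mb_per_s(link_mbit_per_s):
    """Upper bound on mirrored write throughput, in decimal MB/s.

    Assumes each written block crosses the link exactly once and that
    the link carries nothing else; 8 bits per byte, no overhead.
    """
    return link_mbit_per_s / 8


for link in (10, 100, 1000):
    ceiling = max_write_mb_per_s(link)
    print(f"{link} Mbit/s link: at most {ceiling:.2f} MB/s of writes")
```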
James Lewis wrote:

> Does anyone have any suggestions for a way to keep 2 databases in sync?
>
> Ideally updates need to be made to both... this can't be too uncommon a
> requirement..... any kind of HA would need it....

No. The way HA works is that the system is built in such a way that you can't lose data, or that you don't lose CPU cycles. HA does not make any assumptions about the kind of applications that are running on the system.

If you want to experiment with HA, start by building a mirrored disk on your Linux system to get the feel of it.

Then try to assess what you really want: low downtime or 7x24 operation. This is what determines your HA system.

If you don't want to lose CPU cycles, then you have to build a cluster with e.g. two CPUs. These should share their mass storage. This mass storage should be organised as RAID. With hot-plug capabilities, it is possible to keep the system running if either a CPU goes down or a drive fails.

The worst thing that you can do is to base the implementation of your application on the fact that the system has high-availability requirements. That is not a database issue, but an operating system issue.

Jurgen Defurne
defurnj@glo.be
A James Lewis wrote:

> What would you describe Oracle Parallel Server as then?

I know Oracle and PG. The big difference is that PG is an application which runs on top of an OS, while Oracle bypasses a whole lot of the OS and replaces it with functions of its own. This means that for these parallel systems, one needs a product called SQL*Net, which provides the functionality needed to channel data over a network. Oracle Parallel Server is implemented only on top of that. The database administrator sets up the replication etc., but to the programmer this is all COMPLETELY TRANSPARENT!!

> I am well aware of RAID/Mirroring etc... but I want the ability for one
> database to seamlessly take over from the other (OR even with the
> application getting thrown off and having to re-connect) but the DATA must
> be kept in sync across both machines...
>
> What I want is some sort of replication, bi-directional would be nice but
> not vital...

Referring to the part above, this would mean that you need to dig very deep into the code, to the part where data is physically written to the data files, and put code there that also writes this data across the network to the data files of the replication database. (Hey, guys, what would you think of that?)

> Even if the machine has mirrored disks its CPU can fail and a secondary
> machine is useless unless it has the identical data...

The hardware way of doing this (also see the High-Availability HOWTO):

- Build a RAID: mirroring etc., something which makes your data survive a disk crash
- SHARE this RAID between two CPUs (for sharing strategies, also see the HA HOWTO)

When a CPU fails, the other one can take over with the exact data. When a disk fails, your data is still safe, and the system can continue running until the defective part is replaced.
> The only way round this would be to re-write the application to write to
> BOTH databases and this could be a problem with some pre-existing software
> packages and give the potential to get the DB's out of sync.
>
> Writing the application from scratch would mean that it was not so much of
> a problem.

If you do such a thing at the application level, it will always be a burden. The goal of a database programmer is to write solutions to problems, not to be a systems programmer. If you have to take into account, for every problem you solve, that data must be replicated, then you will double the time needed to write every application. That is why I put the emphasis of replication on the system level.

Jurgen Defurne
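To make the system-level idea from this message a little more concrete (hooking the point where data is physically written and sending the same write to the replica's data files), here is a toy sketch in Python. It is not how PostgreSQL or Oracle work internally. `ReplicatedFile`, `write_at` and `make_replica_applier` are hypothetical names, in-memory buffers stand in for data files, and a plain function call stands in for the network.

```python
import io


class ReplicatedFile:
    """Toy data file whose every write is also shipped to a replica."""

    def __init__(self, local, ship):
        self.local = local  # seekable binary file object (the data file)
        self.ship = ship    # callable(offset, data); stands in for the network

    def write_at(self, offset, data):
        # Apply the write locally, then forward the identical write.
        self.local.seek(offset)
        self.local.write(data)
        self.ship(offset, data)


def make_replica_applier(replica):
    """Return a function that replays a shipped write on the replica file."""
    def apply(offset, data):
        replica.seek(offset)
        replica.write(data)
    return apply


primary_file = io.BytesIO()
replica_file = io.BytesIO()
datafile = ReplicatedFile(primary_file, make_replica_applier(replica_file))

datafile.write_at(0, b"row-1")
datafile.write_at(5, b"row-2")
datafile.write_at(0, b"ROW-1")  # an in-place update is replicated too
```

Because the code above the storage layer only ever calls `write_at`, application code needs no changes, which is the transparency argued for in the message. A real implementation would also have to handle ordering, acknowledgements, and what happens when the replica is unreachable.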