Thread: Asynchronous replication of a PostgreSQL DB to a MySQL target
Hi!

I'd like to export schema and data from a PostgreSQL database to a remote MySQL database; any changes to the PG master should be reflected on the MySQL target within a few minutes, one hour at most. Has anybody done something like this before?

Here's some more background: we've got an Oracle database as our backend and a couple of PostgreSQL DBs as our frontend databases; the schema of the backend DB is stable. There are so-called "publishing jobs" running every few minutes; these jobs not only update the frontend databases with any changes in the backend, they also change the frontend DBs' schemas whenever the backend says so. The frontend schemas differ from the backend's; the DDL of the frontend DBs is partly defined by data in the backend.

The logical thing to do would be to create another set of publishing jobs for the MySQL databases; however, our current network layout makes this quite difficult, so I'd rather try to keep the MySQL DB and one of the PostgreSQL DBs in near sync.

My first problem is that the PostgreSQL schema is not stable, so if I simply write a couple of jobs to transport the data, I need to alter these jobs and the MySQL schema whenever the PG schema changes. The second problem lies in PostgreSQL specifics such as tsearch2: I neither need nor want to replicate such metadata. Custom datatypes and functions should also be exempt from this kind of replication.

My hopes aren't all too high that there's an easy way to accomplish this, so any advice would be very much welcome, even a "can't be done that way" from somebody who has tried to travel that path before :)

Kind regards
Markus
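[For concreteness, a minimal sketch of the kind of "transport job" Markus mentions, written so that it re-reads the table definition from information_schema on every run instead of hard-coding it; columns whose types have no entry in the map (tsvector and other tsearch2 or custom types) are simply skipped. The drivers (psycopg2, MySQLdb), connection details, and the table name are assumptions for illustration, not part of the original setup.]

    # Sketch: mirror one PG table into MySQL by deriving the target schema
    # from information_schema at run time, then doing a full drop-and-reload.
    import psycopg2
    import MySQLdb

    # Minimal PG -> MySQL type map; anything not listed (tsvector, custom
    # types, ...) is treated as non-replicable and dropped from the copy.
    # Note: mapping "character varying" to VARCHAR(255) discards the length.
    TYPE_MAP = {
        "integer": "INT",
        "bigint": "BIGINT",
        "text": "TEXT",
        "character varying": "VARCHAR(255)",
        "timestamp without time zone": "DATETIME",
        "boolean": "TINYINT(1)",
    }

    def mirror_table(pg, my, table):
        cur = pg.cursor()
        cur.execute(
            """SELECT column_name, data_type
                 FROM information_schema.columns
                WHERE table_schema = 'public' AND table_name = %s
                ORDER BY ordinal_position""",
            (table,))
        cols = [(name, TYPE_MAP[dt]) for name, dt in cur.fetchall()
                if dt in TYPE_MAP]        # skip tsearch2 & custom types
        mycur = my.cursor()
        # Identifiers can't be passed as query parameters, hence the string
        # interpolation here; fine for a sketch with trusted table names.
        mycur.execute("DROP TABLE IF EXISTS %s" % table)
        mycur.execute("CREATE TABLE %s (%s)" % (
            table, ", ".join("%s %s" % c for c in cols)))
        collist = ", ".join(name for name, _ in cols)
        cur.execute("SELECT %s FROM %s" % (collist, table))
        for row in cur:
            mycur.execute(
                "INSERT INTO %s (%s) VALUES (%s)" % (
                    table, collist, ", ".join(["%s"] * len(cols))),
                row)
        my.commit()

    pg = psycopg2.connect("dbname=frontend")       # hypothetical DSNs
    my = MySQLdb.connect(db="frontend_mirror")
    mirror_table(pg, my, "articles")               # hypothetical table

[This is a bulk refresh rather than replication, but it shows how a fixed type map makes the tsearch2 and custom-type columns easy to exclude, and it tolerates schema drift for plain tables because nothing about the table is hard-coded.]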
On Thu, 2006-12-07 at 09:56 +0100, Markus Wollny wrote:
> My first problem is that the PostgreSQL schema is not stable, so if I
> simply write a couple of jobs to transport the data, I need to alter
> these jobs and the MySQL schema whenever the PG schema changes. The
> second problem lies in PostgreSQL specifics such as tsearch2: I
> neither need nor want to replicate such metadata. Custom datatypes
> and functions should also be exempt from this kind of replication.
>
> My hopes aren't all too high that there's an easy way to accomplish
> this, so any advice would be very much welcome, even a "can't be done
> that way" from somebody who has tried to travel that path before

Wow, tough problem. You're trying to do two quite difficult things at once: replicate schema changes and convert from PostgreSQL to MySQL.

I think your best bet might be to hack PgPool to do what you need. First, do some analysis to figure out what kinds of schema changes are pushed onto PostgreSQL. Since the schema changes are automated, you should see only a few types of queries that change the schema; maybe all the commands begin with "ALTER" or "CREATE". Then hack PgPool to recognize any DDL statement and write it, as raw query text, to a log in PostgreSQL, in addition to passing it on to PostgreSQL as normal.

You then need a process that reads this log of DDL changes, recognizes which type of DDL statement each one is, converts it to the MySQL equivalent, and applies it to MySQL. For this to work, the statements must match one of a few patterns so that you can pre-define the necessary translations; it won't work for people issuing arbitrary schema changes.

Then, for the data replication, do something clever with triggers and a function that can speak the MySQL protocol. Make sure that the DDL changes reach the MySQL database before your trigger tries to send the data.

I have no idea whether that will work for you, but there isn't going to be a simple solution. You should strongly reconsider your options; you're asking for a lot of trouble.

Regards,
	Jeff Davis
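[A minimal sketch of the translation step Jeff describes, assuming the hacked PgPool appends each raw DDL statement to a hypothetical ddl_log table (columns id, statement, done) in PostgreSQL; the single ALTER TABLE ... ADD COLUMN pattern and the type map are illustrative, not a complete translator. Python with the psycopg2 and MySQLdb drivers:]

    # Sketch: read raw DDL statements from an (assumed) ddl_log table that a
    # hacked PgPool fills, translate the known patterns to MySQL syntax, and
    # replay them. ddl_log and its columns are assumptions, not PgPool features.
    import re
    import psycopg2
    import MySQLdb

    # e.g.  ALTER TABLE articles ADD COLUMN rating integer
    ADD_COLUMN = re.compile(
        r"ALTER TABLE (\w+) ADD COLUMN (\w+) (\w+)", re.IGNORECASE)

    PG_TO_MY = {"integer": "INT", "text": "TEXT", "boolean": "TINYINT(1)"}

    def translate(stmt):
        m = ADD_COLUMN.match(stmt.strip())
        if m and m.group(3).lower() in PG_TO_MY:
            return "ALTER TABLE %s ADD COLUMN %s %s" % (
                m.group(1), m.group(2), PG_TO_MY[m.group(3).lower()])
        return None    # unknown shape: leave it in the log for a human

    pg = psycopg2.connect("dbname=frontend")
    my = MySQLdb.connect(db="frontend_mirror")
    cur = pg.cursor()
    cur.execute(
        "SELECT id, statement FROM ddl_log WHERE done = false ORDER BY id")
    for log_id, stmt in cur.fetchall():
        mysql_stmt = translate(stmt)
        if mysql_stmt is None:
            continue
        my.cursor().execute(mysql_stmt)
        upd = pg.cursor()
        upd.execute("UPDATE ddl_log SET done = true WHERE id = %s", (log_id,))
    pg.commit()
    my.commit()

[Anything that doesn't match a known pattern stays in the log, which matches Jeff's point: this only works because the automated publishing jobs emit a predictable handful of statement shapes.]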
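[For the data side, stock PL/pgSQL has no way to speak the MySQL protocol from inside a trigger, so this sketch uses a common substitute for Jeff's trigger idea: the trigger records each change in a queue table, and an external process drains the queue into MySQL after the DDL log has been replayed. Everything here (repl_queue, the articles table and its id/title columns) is made up for the example:]

    # Sketch: trigger fills a queue table inside PostgreSQL; a separate
    # process ships the queued changes to MySQL in commit order.
    import psycopg2
    import MySQLdb

    SETUP = """
    CREATE TABLE repl_queue (
        id        serial PRIMARY KEY,
        tablename text,
        op        text,      -- 'I' or 'U'; deletes omitted for brevity
        pk        integer,
        payload   text       -- simplistic: one text column per row
    );
    CREATE OR REPLACE FUNCTION queue_change() RETURNS trigger AS $$
    BEGIN
        INSERT INTO repl_queue (tablename, op, pk, payload)
        VALUES (TG_TABLE_NAME, substr(TG_OP, 1, 1), NEW.id, NEW.title);
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;
    CREATE TRIGGER articles_repl AFTER INSERT OR UPDATE ON articles
        FOR EACH ROW EXECUTE PROCEDURE queue_change();
    """

    pg = psycopg2.connect("dbname=frontend")
    cur = pg.cursor()
    cur.execute(SETUP)
    pg.commit()

    # Drain loop: ship queued changes to MySQL, oldest first. REPLACE INTO
    # covers both inserts and updates on the MySQL side.
    my = MySQLdb.connect(db="frontend_mirror")
    cur.execute(
        "SELECT id, tablename, op, pk, payload FROM repl_queue ORDER BY id")
    for qid, table, op, pk, payload in cur.fetchall():
        my.cursor().execute(
            "REPLACE INTO %s (id, title) VALUES (%%s, %%s)" % table,
            (pk, payload))
        done = pg.cursor()
        done.execute("DELETE FROM repl_queue WHERE id = %s", (qid,))
    my.commit()
    pg.commit()

[Deletes and multi-column payloads are left out to keep the sketch short; running the drain only after the DDL translator has caught up preserves Jeff's ordering requirement that schema changes reach MySQL before the data that depends on them.]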
I think Sequoia (open source) and Continuent (proprietary) do this.

---------------------------------------------------------------------------

Markus Wollny wrote:
> Hi!
>
> I'd like to export schema and data from a PostgreSQL database to a
> remote MySQL database; any changes to the PG master should be
> reflected on the MySQL target within a few minutes, one hour at most.
>
> Has anybody done something like this before?
>
> [...]

--
  Bruce Momjian   bruce@momjian.us
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +