Thread: How to merge data from two separate databases into one (maybe using xlogs)?

How to merge data from two separate databases into one (maybe using xlogs)?

From
Daniel.Crespo@l-3com.com
Date:

Hi,

 

The quick question is:

How (if possible) can I do to get data (maybe through xlogs) from two separate databases and merge them into one?

 

For those that want to know my situation, here it is:

 

I have two postgresql 9.0 that are working as master/slave using streaming replication. At some point, the slave will become primary. As most of you already know, I can do this by just creating a trigger file. Everything is fine so far: when I create a trigger file, the slave becomes writable, therefore becoming master.

 

My setup is using pgpool on top of 2 databases, as shown in the following drawing:

 

    Server A                   Server B

+--------------+           +--------------+

|     AP0      |           |     AP1      |

|      |       |           |      |       |

|      V       |           |      V       |

|   pgpool0----|---.   .---|---pgpool1    |

|      |       |    \ /    |      |       |

|      V       |     X     |      V       |

|     DB0 <----|----' '----|---> DB1      |

+--------------+           +--------------+

  172.10.10.2                172.10.10.3

 

If Server A is disconnected from the network, pgpool0 will not see DB1, and pgpool1 will not see DB0. Therefore, pgpool0 will degenerate the failed backend. (For those who don’t know pgpool, it’s just a database pool that has the same interface as postgresql. The application thinks it’s talking to a postgres database)

 

In my case above, this is what will happen:

1. pgpool0 detects DB1 disconnection and issues a failover to DB0, which is already Master. That is, nothing to do.

2. pgpool1 detects DB0 disconnection and issues a failover to DB1, which is slave. That is, create trigger file on DB1. At this point, both DBs are writable.

3. At this point, split-brain problem could raise. That is, AP0 could potentially insert new valid records through pgpool0, as well as AP1 through pgpool1. This means, both DB0 and DB1 could potentially have valid data.

 

I’m aware of the existence of techniques like STONITH, or heartbeat. The problem is that AP* must be running and available in the network and connected to a database, therefore can not be just shutdown.

 

So, the question would be: How can I do to merge data from DB0 and DB1 and make it available in the new master, whichever is chosen? Any ideas?

 

Thanks in advance,

-Daniel

 

 

On Wed, Oct 27, 2010 at 9:58 AM,  <Daniel.Crespo@l-3com.com> wrote:
> So, the question would be: How can I do to merge data from DB0 and DB1 and
> make it available in the new master, whichever is chosen? Any ideas?

Perhaps investigate bucardo for replication, as it is supposed to be
able to help in situations like this.  I think you will have to write
some policy so it knows how to resolve conflicting updates unless you
don't care which one wins.

--- On Wed, 10/27/10, Vick Khera <vivek@khera.org> wrote:

From: Vick Khera <vivek@khera.org>
Subject: Re: [GENERAL] How to merge data from two separate databases into one (maybe using xlogs)?
To: "pgsql-general" <pgsql-general@postgresql.org>
Date: Wednesday, October 27, 2010, 8:26 PM

On Wed, Oct 27, 2010 at 9:58 AM,  <Daniel.Crespo@l-3com.com> wrote:
> So, the question would be: How can I do to merge data from DB0 and DB1 and
> make it available in the new master, whichever is chosen? Any ideas?

Perhaps investigate bucardo for replication, as it is supposed to be
able to help in situations like this.  I think you will have to write
some policy so it knows how to resolve conflicting updates unless you
don't care which one wins.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

IMHO pgpool is the solution

On Wed, Oct 27, 2010 at 4:37 PM, Lennin Caro <lennin.caro@yahoo.com> wrote:
IMHO pgpool is the solution

How does that solve the problem of having two disconnected networks, each thinking their DB is the master?

Re: How to merge data from two separate databases into one (maybe using xlogs)?

From
Daniel.Crespo@l-3com.com
Date:
> On Wed, Oct 27, 2010 at 4:37 PM, Lennin Caro <lennin.caro@yahoo.com>
wrote:
> IMHO pgpool is the solution
>
> How does that solve the problem of having two disconnected networks,
each thinking their DB is the master?

The original question is how can I do to merge the data of the two
master databases. The answer I'm looking for is the answer to your
question as well. My solution does not solve that problem. I was
thinking about the possibility of shipping all xlogs of both databases
and putting them into the final master (one of them), and replay them to
have all data. Later, I would take care of the conflicts.

Any ideas?

Thanks,
-Daniel

On Wed, Oct 27, 2010 at 5:19 PM,  <Daniel.Crespo@l-3com.com> wrote:
> thinking about the possibility of shipping all xlogs of both databases
> and putting them into the final master (one of them), and replay them to
> have all data. Later, I would take care of the conflicts.
>

Again, I recommend you look at Bucardo if you want multi-master replication.

Re: How to merge data from two separate databases into one (maybe using xlogs)?

From
Daniel.Crespo@l-3com.com
Date:
> On Wed, Oct 27, 2010 at 5:19 PM,  <Daniel.Crespo@l-3com.com> wrote:
> > thinking about the possibility of shipping all xlogs of both
> databases
> > and putting them into the final master (one of them), and replay
them
> to
> > have all data. Later, I would take care of the conflicts.
> >
>
> Again, I recommend you look at Bucardo if you want multi-master
> replication.

Thanks for your response, Vick.

Unfortunately, I need to replicate DDLs, therefore Bucardo is not an
option. I was thinking about having a way to replay xlogs from different
servers into one, to have "all" data. I just don't know if it's
possible. Anybody knows?

Thanks,
-Daniel

Re: How to merge data from two separate databases into one (maybe using xlogs)?

From
John R Pierce
Date:
On 10/28/10 7:49 AM, Daniel.Crespo@l-3com.com wrote:
> Unfortunately, I need to replicate DDLs, therefore Bucardo is not an
> option. I was thinking about having a way to replay xlogs from different
> servers into one, to have "all" data. I just don't know if it's
> possible. Anybody knows?

you want multimaster replication WITH multimaster DDL changes?


wow.

how do you resolve conflicts?



Re: How to merge data from two separate databases into one (maybe using xlogs)?

From
Daniel.Crespo@l-3com.com
Date:
> you want multimaster replication WITH multimaster DDL changes?
>
>
> wow.
>
> how do you resolve conflicts?
>

I don't know how to get to that point yet. The question is: Can I use
xlogs from two different servers and replay them all in one? Depending
on how it is done, I can tell you how would I resolve the conflicts.

Re: How to merge data from two separate databases into one (maybe using xlogs)?

From
John R Pierce
Date:
On 10/28/10 12:55 PM, Daniel.Crespo@l-3com.com wrote:
>> you want multimaster replication WITH multimaster DDL changes?
>>
>>
>> wow.
>>
>> how do you resolve conflicts?
>>
> I don't know how to get to that point yet. The question is: Can I use
> xlogs from two different servers and replay them all in one? Depending
> on how it is done, I can tell you how would I resolve the conflicts.

xlogs just contain blocks by block number.   not high level SQL
operations that could be resolved.