Re: [HACKERS] replicator - Mailing list pgsql-hackers

From	Philippe Marchesseault
Subject	Re: [HACKERS] replicator
Date	January 3, 2000 23:20:48
Msg-id	38714B6F.2DECAEC0@Videotron.ca
In response to	replicator (Karel Zak - Zakkr <zakkr@zf.jcu.cz>)
Responses	Re: [HACKERS] replicator
List	pgsql-hackers

Hi Karel!

Karel Wrote:
>    node1:  SQL --IPC--> node-broker
>                       |
>                     TCP/IP
>                       |
>                    master-node --IPC--> replikator
>                                         |   |   |
>                                           libpq
>                                         |   |   |
>                                       node2 node..n
>
>(Is it right picture?)

Yes, you got the concept right. I admit it's a bit complicated. Your comments
made me go back to the drawing board and I found several flaws with the design.
The first one is that this design does not allow us to use Transaction Blocks.
An example might go a long way:

Node 1, Client 1 (1,1) issues a BEGIN statement. ---> Node 2, Client 1 (2,1)
(the replicator process) sends this command.
(1,1) sends an INSERT statement. ---> (2,1) sends the INSERT to the backend.
Node 2, Client 2 (2,2) checks (SELECT) the data and the INSERT of (1,1) is not
there. That's normal, it was not committed.

Node 1, Client 2 (1,2) issues a BEGIN statement. ---> (2,1) receives a warning
about the transaction state (a transaction is already in progress on its
connection).
(1,2) does some stuff...
(1,2) issues a ROLLBACK statement. ---> (2,1) sends the ROLLBACK. Node 2 rolls
back everything done since (1,1) sent the BEGIN.
(1,1) sends the final COMMIT. It fails on the remote nodes because its work was
already rolled back.
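
To make the sequence above concrete, here is a small toy model in Python. It is
only a sketch: SharedSession and replay_interleaved are names made up for this
illustration, and the class imitates one backend session in the simplest way I
could think of. It is not how the real backend is implemented.

```python
class SharedSession:
    """Toy model of ONE backend session shared by several clients (hypothetical)."""

    def __init__(self):
        self.in_transaction = False
        self.pending = []    # rows inserted since BEGIN, not yet visible
        self.committed = []  # rows visible to other sessions
        self.log = []        # (client, result) pairs seen by the replicator

    def execute(self, client, statement, row=None):
        if statement == "BEGIN":
            if self.in_transaction:
                self.log.append((client, "WARNING: transaction already in progress"))
            else:
                self.in_transaction = True
                self.log.append((client, "BEGIN"))
        elif statement == "INSERT":
            # Inside a transaction block the row stays pending until COMMIT.
            target = self.pending if self.in_transaction else self.committed
            target.append(row)
            self.log.append((client, "INSERT"))
        elif statement == "ROLLBACK":
            # The session cannot tell whose rows these are: it drops them all.
            self.pending.clear()
            self.in_transaction = False
            self.log.append((client, "ROLLBACK"))
        elif statement == "COMMIT":
            if self.in_transaction:
                self.committed.extend(self.pending)
                self.pending.clear()
                self.in_transaction = False
                self.log.append((client, "COMMIT"))
            else:
                self.log.append((client, "WARNING: no transaction in progress"))
        else:
            raise ValueError(f"unsupported statement: {statement}")


def replay_interleaved():
    """Replay the interleaving described above through a single shared session."""
    session = SharedSession()
    session.execute("1,1", "BEGIN")
    session.execute("1,1", "INSERT", row="row from 1,1")
    session.execute("1,2", "BEGIN")                       # warning, already in progress
    session.execute("1,2", "INSERT", row="row from 1,2")  # "does some stuff"
    session.execute("1,2", "ROLLBACK")                    # also discards the row from 1,1
    session.execute("1,1", "COMMIT")                      # nothing left to commit
    return session
```

After replay_interleaved() the committed list is empty and the last log entry is
a warning for (1,1), which is the lost INSERT described above.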

So the problem is that more than one client connection is multiplexed on a
single link. It could be fixed by buffering the statements of a transaction and
sending them as one block only when the client does a COMMIT. But then we might
have some performance problems with big blocks of INSERTs. Also, I am worried
about UPDATEs that could be done between separate COMMITs, thus putting the
databases out of sync. :-(
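
A minimal sketch of the "send the block only on COMMIT" idea, in Python. The
names CommitBatcher and send_block are hypothetical, and the keyword parsing is
deliberately naive (it only looks at the first word of each statement).

```python
class CommitBatcher:
    """Buffer each client's transaction and forward it as one block on COMMIT.

    send_block is a hypothetical callable that ships a list of statements
    to the remote node(s); it is an assumption of this sketch.
    """

    def __init__(self, send_block):
        self._send_block = send_block
        self._buffers = {}  # client id -> statements since that client's BEGIN

    def on_statement(self, client, statement):
        text = statement.strip().rstrip(";").strip()
        if not text:
            return  # ignore empty statements
        keyword = text.split(None, 1)[0].upper()

        if keyword == "BEGIN":
            self._buffers[client] = []
        elif keyword == "ROLLBACK":
            # Nothing was sent yet, so a rollback never reaches the remote nodes.
            self._buffers.pop(client, None)
        elif keyword == "COMMIT":
            block = self._buffers.pop(client, [])
            if block:
                self._send_block(["BEGIN", *block, "COMMIT"])
        elif client in self._buffers:
            self._buffers[client].append(text)
        else:
            # Statement outside a transaction block: forward it on its own.
            self._send_block([text])
```

With the interleaving from the example above, the ROLLBACK of (1,2) only clears
the buffer of (1,2), so the COMMIT of (1,1) still ships its INSERT. The sketch
does nothing about the two worries mentioned above: a large transaction is
still sent in one big block, and blocks from different clients can still reach
the nodes in different orders.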

> IMHO is problem with node registration / authentification on master node.
> Why concept is not more upright? As:
>
>         SQL --IPC--> node-replicator
>                         |  |  |
>                      via libpq send data to all nodes with
>                      current client/backend auth.

Yes, the concept can be simpler, but the above would create some performance
problems. If you had many nodes, it would take a long time to send the last
statement, because you would have to wait until the statement was completely
processed by all the nodes. A better solution IMHO would be to have a bit more
padding between the node-replicator and the backend.

So it could become:

    SQL --IPC--> node-replicator
                    |   |   |
         via TCP send statements to each node
                        |
              replicator (on local node)
                        |
              via libpq send data to
              current (local) backend.
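
One possible reading of the "padding" is a queue per remote node, so that the
sending side hands a statement off and does not wait for every node to finish
processing it. The Python sketch below only illustrates that reading.
NodeForwarder, broadcast and the deliver callable are hypothetical names;
deliver stands in for the TCP hop to the replicator on the remote node.

```python
import queue
import threading


class NodeForwarder:
    """Per-node queue plus worker thread (hypothetical sketch).

    deliver(node_name, statement) is an assumed callable that performs the
    actual send to the remote replicator.
    """

    def __init__(self, name, deliver):
        self.name = name
        self._deliver = deliver
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, statement):
        # Returns immediately; the worker thread does the slow part.
        self._queue.put(statement)

    def _run(self):
        while True:
            statement = self._queue.get()
            try:
                self._deliver(self.name, statement)
            finally:
                self._queue.task_done()

    def drain(self):
        # Block until everything queued so far has been delivered.
        self._queue.join()


def broadcast(forwarders, statement):
    """Queue one statement for every node without waiting for any of them."""
    for forwarder in forwarders:
        forwarder.submit(statement)
```

Statements for one node keep their order because each node has a single queue
and a single worker. The price is that the sender no longer knows when, or
whether, a node has applied a statement, so error reporting would need its own
channel.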
 

>  (not exist any master node, all nodes have connection to all nodes)

Exactly, if the replicator dies only the node dies, everything else keeps
working.

Looking forward to hearing from you,

Philippe Marchesseault

PS: Please excuse me for the DIFF, it's the first time I'm contributing to an
OSS project.
