Re: Big 7.4 items - Replication - Mailing list pgsql-hackers
From | Al Sutton |
---|---|
Subject | Re: Big 7.4 items - Replication |
Date | |
Msg-id | 01d101c2a36f$7c529740$0100a8c0@cloud |
In response to | Re: Big 7.4 items (<darren@up.hrcoxmail.com>) |
Responses | Re: Big 7.4 items - Replication |
List | pgsql-hackers |
For live replication, could I propose that we consider the systems A, B, and C connected to each other independently (i.e. A has links to B and C, B has links to A and C, and C has links to A and B), and that replication is handled by the node receiving the write-based transaction.

If we consider a write transaction that arrives at A (called WT(A)), system A will then send WT(A) to systems B and C via its direct connections. System A will receive back either an OK response if there are no conflicts, a NOT_OK response if there are conflicts, or no response if the system is unavailable.

If system A receives a NOT_OK response from any other node, it begins the process of rolling back the transaction from all nodes which previously issued an OK, and the transaction returns a failure code to the client which submitted WT(A). The other systems (B and C) would track recent transactions, and there would be a specified timeout after which the transaction is considered safe and could not be rolled back.

Any system not returning an OK or NOT_OK state is assumed to be down, error messages are logged to state that the transaction could not be sent to the system due to its unavailability, and any monitoring system would alert the administrator that a replicant is faulty.

There would also need to be code developed to ensure that a system could be brought into sync with the current state of other systems within the group, in order to allow new databases to be added and faulty databases to be re-entered to the group. This code could also be used for non-realtime replication to allow databases to be synchronised with the live master.

This would give a multi-master solution whereby a write transaction to any one node would guarantee that all available replicants would also hold the data once it is completed, and would also provide the code to handle scenarios where non-realtime data replication is required.
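To make the write path above a little more concrete (send WT(A) to B and C, collect OK / NOT_OK / no response, roll back on any NOT_OK), here is a rough sketch in Python. It is only an illustration of the idea, not a proposed implementation: the names `Reply`, `Peer`, `replicate_write`, and the `rollback` and `log` callbacks are all made up for this example, peers are contacted one after another rather than in parallel, and a peer that does not answer is modelled simply as returning `None`.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional


class Reply(Enum):
    OK = "OK"
    NOT_OK = "NOT_OK"


# Hypothetical peer model: a callable that takes the transaction and returns
# Reply.OK, Reply.NOT_OK, or None when the peer does not answer.
Peer = Callable[[str], Optional[Reply]]


def replicate_write(txn: str,
                    peers: Dict[str, Peer],
                    rollback: Callable[[str, str], None],
                    log: Callable[[str], None] = print) -> bool:
    """Forward txn from the receiving node to each peer in turn.

    Returns True if the write stands, False if any peer reported a conflict
    (the caller would then roll back locally and fail the client request).
    """
    accepted: List[str] = []
    for name, send in peers.items():
        reply = send(txn)
        if reply is Reply.OK:
            accepted.append(name)
        elif reply is Reply.NOT_OK:
            # Conflict: undo the write on every peer that already said OK.
            for done in accepted:
                rollback(done, txn)
            return False
        else:
            # No response: assume the peer is down, log it, and carry on.
            log(f"{name} unavailable, {txn} not delivered")
    return True
```

For example, with `peers = {"B": lambda t: Reply.OK, "C": lambda t: Reply.NOT_OK}` the call returns `False` after asking for a rollback on B. The timeout after which B and C treat a transaction as safe is not modelled here.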
This system assumes that a majority of transactions will be successful (which should be the case for a well designed system).

Comments?

Al.

----- Original Message -----
From: "Darren Johnson" <darren@up.hrcoxmail.com>
To: "Jan Wieck" <JanWieck@Yahoo.com>
Cc: "Bruce Momjian" <pgman@candle.pha.pa.us>; <shridhar_daithankar@persistent.co.in>; "PostgreSQL-development" <pgsql-hackers@postgresql.org>
Sent: Saturday, December 14, 2002 1:28 AM
Subject: [mail] Re: [HACKERS] Big 7.4 items

> >>Let's say we have systems A, B and C. Each one has some
> >>changes and sends a writeset to the group communication
> >>system (GSC). The total order dictates WS(A), WS(B), and
> >>WS(C) and the write sets are received in that order at
> >>each system. Now C gets WS(A) no conflict, gets WS(B) no
> >>conflict, and receives WS(C). Now C can commit WS(C) even
> >>before the commit messages C(A) or C(B), because there is no
> >>conflict.
> >
> >And that is IMHO not synchronous. C does not have to wait for A and B to
> >finish the same tasks. If now at this very moment two new transactions
> >query system A and system C (assuming A has not yet committed WS(C)
> >while C has), they will get different data back (thanks to non-blocking
> >reads). I think this is pretty asynchronous.
>
> So if we hold WS(C) until we receive commit messages for WS(A) and
> WS(B), will that meet your synchronous expectations, or do all the
> systems need to commit the WS in the same order and at the same exact
> time?
>
> >It doesn't lead to inconsistencies, because the transaction on A cannot
> >do something that is in conflict with the changes made by WS(C), since
> >its WS(A)2 will come back after WS(C) arrived at A and thus WS(C)
> >arriving at A will cause WS(A)2 to rollback (WS used synonymous to Xact
> >in this context).
>
> Right
>
> >Hope this doesn't add too much confusion :-)
>
> No, however I guess I need to adjust my slides to include your
> definition of synchronous replication. ;-)
>
> Darren