Re: Proposal for a cascaded master-slave replication system - Mailing list pgsql-general
From | Jan Wieck |
---|---|
Subject | Re: Proposal for a cascaded master-slave replication system |
Date | |
Msg-id | 3FB29D21.4050308@Yahoo.com Whole thread Raw |
In response to | Re: Proposal for a cascaded master-slave replication system (Christopher Browne <cbbrowne@acm.org>) |
List | pgsql-general |
Christopher Browne wrote: > In the last exciting episode, JanWieck@Yahoo.com (Jan Wieck) wrote: >> I look forward to your comments. > > It is not evident from the paper what approach is taken to dealing > with the duplicate key conflicts. > > The example: > > UPDATE table SET col1 = 'temp' where col = 'A'; > UPDATE table SET col1 = 'A' where col = 'B'; > UPDATE table SET col1 = 'B' where col = 'temp'; > > I can think of several approaches to this: One fundamental flaw in eRServer is that it tries to "combine" multiple updates into one update at snapshot-time in the first place. The application can do these three steps in one single transaction, how do you split that? You can develop an automatic recovery for that. At the time you got a dupkey error, you rollback but remember the _rserv_ts and table_id that caused the dupkey. In the next sync attempt, you fetch the row with that _rserv_ts and delete all rows from the slave table with that primary key plus fake INSERT log rows on the master for the same. Then you prepare and apply and cross fingers that nobody touched the same row again already between your last attempt and now ... which was how many hours ago? And since you can only find one dupkey per round, you might do this a few times with larger and larger lists of _rserv_ts,table_id. The idea of not accumulating log forever, but just holding this status table (the name log is misleading in eRServer, it holds flags telling "the row with _rserv_ts=nnnn got INS|UPD|DEL'd") has one big advantage. However long your slave does not sync, your master will not run out of space. But I don't think that there is value in the attempt to let a slave catch up the last 4 days at once anyway. Drop it and use COPY. When your slave does not come up before you have modified half your database, it will be faster this way anyway. Jan > > 1. The present eRserv code reads what is in the table at the time of > the 'snapshot', and so tries to pass on: > > update table set col1 = 'B' where otherkey = 123; > update table set col1 = 'A' where otherkey = 456; > > which breaks because at some point, col1 is not unique, irrespective > of what order we apply the changes in. > > 2. If the contents as at the time of the COMMIT are stored in the log > table, then we would do all three updates in the destination DB, in > order, as shown above. > > Either we have to: > a) Store the updated fields in the replication tables somewhere, or > b) Make the third UPDATE wait for the updates to be stored in a > file somewhere. > > 3. The replication code requires that any given key only be updated > once in a 'snapshot', so that the updates may be unambiguously > partitioned: > > UPDATE table SET col1 = 'temp' where col = 'A' ; -- and otherkey = 123 > UPDATE table SET col1 = 'A' where col = 'B'; -- and otherkey = 456 > -- Must partition here before hitting #123 again -- > UPDATE table SET col1 = 'B' where col = 'temp'; -- and otherkey = 123 > > The third UPDATE may have to be held up until the "partition" is set > up, right? > > 4. I seem to recall a recent discussion about the possibility of > deferring the UNIQUE constraint 'til the END of a commit, with the > result that we could simplify to > > update table set col1 = 'B' where otherkey = 123; > update table set col1 = 'A' where otherkey = 456; > > and discover that the UNIQUE constraint was relaxed just long enough > for us to make the TWO changes that in the end combined to being > unique. > > None of these look like they turn out totally happily, or am I missing > an approach? -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com #
pgsql-general by date: