Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit) - Mailing list pgsql-general

From: Ed L.
Subject: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit
Date:
Msg-id: 200304111418.41056.pgsql@bluepolka.net
In response to: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit (Jan Wieck <JanWieck@Yahoo.com>)
Responses:
  Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit (Stephan Szabo <sszabo@megazone23.bigpanda.com>)
  Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit ("Jim C. Nasby" <jim@nasby.net>)
  Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit (Steven Singer <ssinger@navtechinc.com>)
List: pgsql-general
On Friday April 11 2003 1:07, Jan Wieck wrote:
> "Ed L." wrote:
> > On Friday April 11 2003 10:08, Jan Wieck wrote:
> > > Clearly a bug, but we had memory leaks that clear up at transaction
> > > end.
> >
> > That seems like yet another reason for constraining the size of a
> > batch of transactions.
>
> What I cannot imagine is why one would want to try to make batches any
> other size than the original transaction.

Again, if there are a large number of transactions queued on the master, possibly comprising an even larger number of actual row changes, a batch size of 1 transaction means the master will be hit once for every single transaction to retrieve that transaction's queued rows. A batch size of, say, 1000 transactions, on the other hand, saves 999 queries. Granted, you may not share my concern about the impact of that on the master or the network.

> "the original transaction" - singular!!! Not a couple, few, maybe some,
> part, fraction or anything in between, above or below. Exactly ONE.

Repeated exclamation points, pedantry, and all-caps shouting duly noted. Now, suppose there are in fact memory leaks that only clear up at transaction end. Suppose the memory leak grows with the size of the transaction (which nobody asserted, but which I assume may be true). Suppose one wishes to avoid a per-transaction hit on the master during replication. Suppose one needs to process a replication queue in batches because the queue has grown overly large. Then, I claim, one may quite reasonably wish to retrieve replication tuples from the master in batches containing rows from more than one transaction. All complete transactions, yes, but more than one.

> If you have a deferred referential integrity constraint defined (one of
> the reasons why half of a transaction cannot work at all), where does
> the backend remember the ctid's and other information for the triggers
> to call at commit time?
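To make the round-trip arithmetic concrete, here is a tiny sketch (the queue size is a made-up number for illustration, not a figure from this thread):

```python
import math

# Hypothetical: 100,000 transactions queued on the master.
queued_txns = 100_000

def round_trips(batch_size):
    """Queries needed against the master to drain the queue,
    fetching batch_size whole transactions per query."""
    return math.ceil(queued_txns / batch_size)

print(round_trips(1))     # one query per transaction
print(round_trips(1000))  # 999 fewer queries per 1000 transactions
```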
Again, I have no interest in partial transactions, only groups of multiple transactions.

===

For those of you who do see the validity in batching multiple transactions, my question is restated here: is there an ordering algorithm that would make consistent but limited-batch-size replication possible? I propose one below.

Suppose you have a replication queue on the master, with dbmirror-like trigger-based insertions, that looks something like this:

    create table replication_queue (
        xid integer,
        seqid serial primary key,
        data ...
    );

Here, 'xid' is the transaction ID, 'seqid' is the queue insertion order, and 'data' has all the info needed to replicate the update. Every row update/delete/insert results in a row in this queue, and a transaction may consist of one to many rows in the queue.

Suppose further you wish to limit the number of updates replicated in a particular cycle. For example, suppose you have a million changes queued for replication on the master, but for obvious reasons you don't want to select a million rows to replicate all at once. Suppose you also don't want to grab them one tuple or one transaction at a time, preferring to avoid hammering the master. Rather, you want to grab them in batches of no more than N transactions, replicate them all to a slave and commit on the slave, take a breather, and repeat until the slave is caught up. And during each breather, you want to have committed only complete transactions so that any slave clients see consistent data.

The algorithm I'm considering right now is the following:

    select xid, max(seqid) as "max_seqid"
    into temp replication_order
    from replication_queue
    group by xid
    order by max(seqid)
    limit N;

Then, to get the actual queue replication order:

    select q.xid, q.seqid, q.data
    from replication_queue q, replication_order o
    where q.xid = o.xid
    order by o.max_seqid, q.seqid;

[This is a batched variation of dbmirror's original algorithm.]
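The two queries above can be sanity-checked with a small in-memory simulation. This is my own sketch, not part of dbmirror; it uses SQLite (so `into temp ... limit N` becomes `create temp table ... as select ... limit 2`), and the table contents are invented to show three interleaved transactions:

```python
import sqlite3

# Mock of the master's replication queue, following the schema in the post.
# 'data' is just a label here; seqid is queue insertion order.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE replication_queue
               (xid INTEGER, seqid INTEGER PRIMARY KEY, data TEXT)""")

# Transactions 1 and 2 interleave in the queue; transaction 3 comes last.
rows = [(1, 1, "t1-a"), (2, 2, "t2-a"), (1, 3, "t1-b"),
        (2, 4, "t2-b"), (3, 5, "t3-a"), (3, 6, "t3-b")]
con.executemany("INSERT INTO replication_queue VALUES (?, ?, ?)", rows)

# First query: pick the N (= 2) transactions with the lowest max(seqid).
con.execute("""
    CREATE TEMP TABLE replication_order AS
    SELECT xid, MAX(seqid) AS max_seqid
    FROM replication_queue
    GROUP BY xid
    ORDER BY MAX(seqid)
    LIMIT 2""")

# Second query: fetch the whole rows of those transactions, grouped by
# transaction and ordered by each transaction's max_seqid.
batch = con.execute("""
    SELECT q.xid, q.seqid, q.data
    FROM replication_queue q, replication_order o
    WHERE q.xid = o.xid
    ORDER BY o.max_seqid, q.seqid""").fetchall()

# The batch holds only whole transactions: all of xid 1, then all of xid 2;
# xid 3 waits for the next cycle.
print(batch)
# → [(1, 1, 't1-a'), (1, 3, 't1-b'), (2, 2, 't2-a'), (2, 4, 't2-b')]
```

Note that the batch can contain rows whose seqid is lower than rows left behind in the queue (here it cannot, but with deeper interleaving it could), which is exactly why ordering by each transaction's max(seqid) rather than raw seqid matters.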
So, replication is done by transaction groupings, in ascending order according to the maximum seqid in each transaction. I'm hoping someone can poke holes in this algorithm if they exist.

Ed