From: Ed L.
Subject: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?)
Msg-id: 200304111729.01028.pgsql@bluepolka.net
In response to: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?) (Jan Wieck <JanWieck@Yahoo.com>)
Responses: Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?) ("Ed L." <pgsql@bluepolka.net>)
List: pgsql-general
On Friday April 11 2003 4:34, Jan Wieck wrote:
> >
> > Repeated exclamation points, pedanticism, and all caps shouting duly
> > noted.
>
> I apologize. Wasn't meant that way.

Apology accepted, thanks.

> > Again, I have no interest in partial transactions, only groups of
> > multiple transactions.
>
> Okay, you want to lower the impact on the master by slurping the effects
> of multiple transactions (across all tables they hit) in one
> transaction.

Your phrase "across all tables they hit" makes me wonder if you're
talking about dbmirror-style replication.  In this modified dbmirror
context, replication essentially means reading one queue table on the
master and replaying its contents on the slave, not counting a couple of
bookkeeping tables.

During normal master activity, when table T1 has a row updated, a trigger
inserts a new row into the queue table Q with enough information for a
replicator to update the same row in table T1 on the slave.  When table T2
gets an insert or delete, the same trigger mechanism again adds a new row
to Q.  Then, during master-to-slave replication, the replicator reads rows
from Q; it reads no other user tables, just Q and a bookkeeping table or
two.
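
To make that concrete, here is a minimal sketch of the scheme in
present-day PL/pgSQL (the names and schema are illustrative, not
dbmirror's actual code):

    -- Queue table Q: one row per changed row on the master.
    CREATE TABLE q (
        seqid   bigserial PRIMARY KEY,  -- global ordering of changes
        xid     bigint NOT NULL,        -- originating transaction id
        tbl     text   NOT NULL,        -- which user table was touched
        op      text   NOT NULL,        -- 'INSERT', 'UPDATE', or 'DELETE'
        rowdata text                    -- enough info to replay the change
    );

    CREATE FUNCTION log_change() RETURNS trigger AS $$
    BEGIN
        INSERT INTO q (xid, tbl, op, rowdata)
        VALUES (txid_current(), TG_TABLE_NAME, TG_OP,
                CASE WHEN TG_OP = 'DELETE' THEN OLD::text ELSE NEW::text END);
        RETURN NULL;  -- AFTER trigger, so the return value is ignored
    END;
    $$ LANGUAGE plpgsql;

    -- One such trigger per replicated user table, e.g. T1:
    CREATE TRIGGER t1_log AFTER INSERT OR UPDATE OR DELETE ON t1
        FOR EACH ROW EXECUTE PROCEDURE log_change();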

> Now you can do two basic things.
>
> A) You apply those changes in the order you read them out of the master
> on the slave. This requires that you do it all in one big transaction on
> the slave side and that you can set all foreign key constraints to
> deferred, getting you into the deferred trigger queue overflow risk
> mentioned.

Well, this leads back to my original question.  If the queue rows can be
ordered into transaction groupings, then intermediate commits are possible
and one seems much less likely to slam into the deferred trigger queue
issue.  The ordering under examination would allow you to replay them in
transaction order, committing every M transactions with consistency
intact, and even resetting the connection if you wish.  But is it a
correct ordering?  I think so, but I dunno.
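
For illustration, replay on the slave might then look like this sketch
(assuming the batch ordering exists and the FK constraints were created
DEFERRABLE, since SET CONSTRAINTS only affects deferrable constraints):

    BEGIN;
    SET CONSTRAINTS ALL DEFERRED;  -- FK checks wait until COMMIT
    -- ... replay the Q rows of the next M transactions here, in
    --     seqid order within each transaction ...
    COMMIT;  -- intermediate commit every M transactions keeps the
             -- deferred trigger queue bounded to one batch
    -- repeat for the next M transactions, optionally resetting the
    -- connection between batches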

> B) You read all the changes across all tables, but regroup them into
> their correct order and original transaction boundaries for playback on
> the slaves. Now you need some staging area where you collect and sort it
> all running the risk to run out of disk space before you even begin the
> first transaction on the slave side.

If the ordering I seek exists via the algorithm I've described, I don't
think you need to stage the entire queue at once.  The overhead, I think,
is minimal: about 56 bytes per transaction (int xid + bigint max_seqid +
int slaveid = 16 bytes, plus ~40 bytes of Pg row overhead), not counting
my own audit/debug columns.  If I add my 40 bytes of debug/audit column
bloat, that's 96 bytes per row in the ordering.  Since we only need N
transactions at a time, the staging area only needs to hold N rows of 96
bytes or less.  Even in the bloated-row case, a batchsize of 100,000
transactions would require only about 10MB, and I'd expect to use a much
smaller batchsize.  And there's no need to keep the batch ordering around
once it has been replicated.
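
In other words, the staging area could be as small as a per-batch
ordering table along these lines (hypothetical names, columns taken from
the size estimate above):

    CREATE TABLE batch_order (
        xid       integer NOT NULL,  -- 4 bytes
        max_seqid bigint  NOT NULL,  -- 8 bytes
        slaveid   integer NOT NULL   -- 4 bytes, plus ~40 bytes Pg overhead
    );
    -- rows are deleted once their batch has been replicated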

ISTM the key is whether or not one can group the queue rows by
transaction and replay them group by group, with the groups ordered by
each transaction's max seqid.
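
As a sketch, that ordering falls out of a single aggregate query over Q
(same hypothetical names as above):

    -- Pick the next N transactions, ordered by the highest seqid each
    -- transaction wrote to Q; within a transaction, replay in seqid order.
    INSERT INTO batch_order (xid, max_seqid, slaveid)
    SELECT xid, max(seqid), 1   -- slave #1
    FROM q
    GROUP BY xid
    ORDER BY max(seqid)
    LIMIT 10000;                -- batch of N transactions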

Ed

