Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?) - Mailing list pgsql-general

From Ed L.
Subject Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?)
Date
Msg-id 200304102003.18271.pgsql@bluepolka.net
In response to Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?)  (Richard Huxton <dev@archonet.com>)
Re: Batch replication ordering (was Re: [GENERAL] 32/64-bit transaction IDs?)  (elein <elein@sbcglobal.net>)
List pgsql-general
On Thursday April 10 2003 5:26, Tom Lane wrote:
> "Ed L." <pgsql@bluepolka.net> writes:
> > I don't think so.  Can you imagine a replication queue big enough
> > that someone might not want to process it entirely in one transaction?
>
> No, I can't.  The bigger the queue is, the further behind you are, and
> the more you need to catch up; twiddling your thumbs for awhile gets
> progressively less attractive.

Well, if an arbitrarily large queue of transactions doesn't persuade you,
you are not going to be persuaded.  If (a) the catch-up rate is fast
enough, (b) the master load periodically ebbs enough, as it does at various
predictably slow times, (c) you care about reducing traffic between master
and slave, and/or (d) you care about making the replication load on the
master less constant, then having the replicator periodically back off is
very attractive.

> Also, AFAIR from prior discussion, the *slave* side doesn't need to
> commit the whole batch in one transaction.  I can't recall if this
> could expose transaction intermediate states on the slave, but if
> you're that far behind you'd best not be having any live clients
> querying the slave anyway.

Exposing intermediate transaction states is precisely the issue and the
reason for my original question.  You seem to presume there is no value in
querying a slave that's running significantly behind, and as a blanket
assumption that is false.  It depends on the situation and the nature
of the data.  I can think of a number of past instances where
considerable lag time in data propagation was just fine, but
inconsistency was not.  Unless you replicate to the slave and commit in
one big all-inclusive batch, you need to take care to commit on
transaction boundaries if you don't want to expose inconsistent views to
slave clients.  We left the conversation a while back
thinking there was no need for anything other than a serial replay by
insertion order, but a batching requirement seems to change that, which is
why I posed the original question.
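
To make the batching concern concrete, here is a minimal sketch of one way
to cut an insertion-ordered queue into batches without splitting any
master transaction across two slave commits.  It is only an illustration
under assumptions of mine: the queue is a list of hypothetical
(seq, xid, change) rows, it contains only committed transactions, and
safe_batches and min_rows are made-up names, not part of any existing
replication tool.

```python
def safe_batches(queue, min_rows):
    """Split an insertion-ordered queue into batches on transaction boundaries.

    queue is a list of (seq, xid, change) tuples in insertion order
    (a hypothetical layout).  Rows from concurrent transactions may be
    interleaved.  A batch is closed only at a position where every xid
    seen so far has no later rows in the queue.
    """
    # Position of the last row belonging to each transaction.
    last_pos = {}
    for pos, (_seq, xid, _change) in enumerate(queue):
        last_pos[xid] = pos

    batch = []
    open_until = -1  # furthest position still needed by any xid seen so far
    for pos, row in enumerate(queue):
        batch.append(row)
        open_until = max(open_until, last_pos[row[1]])
        # Safe to cut: no transaction seen so far continues past this row.
        if pos >= open_until and len(batch) >= min_rows:
            yield batch
            batch = []
    if batch:
        yield batch  # the queue ended, so the remainder is complete too


if __name__ == "__main__":
    demo = [(1, 100, "a"), (2, 101, "b"), (3, 100, "c"),
            (4, 102, "d"), (5, 101, "e"), (6, 103, "f")]
    for b in safe_batches(demo, min_rows=1):
        print([seq for seq, _xid, _change in b])
```

Rows are still replayed in insertion order; the only change is where the
slave-side commits fall.  Whether replaying and committing this way is
sufficient for consistency in every case is exactly the open question.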

> In any case you can throttle the load by sleeping between selects while
> holding the transaction open.  I think your concern is predicated on
> Oracle-ish assumptions about the cost of holding open a transaction.
> The only such cost in PG is that it interferes with VACUUM cleaning
> out old tuples --- which I'm not sure VACUUM could do anyway for stuff
> that still hasn't propagated to a slave.

I'd have thought there might be other problems resulting from holding a
transaction open for hours, like the progress you lose if the master or
slave or the connection between them goes down before the mondo commit.
Maybe that kind of thing just never happens.
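
For illustration, here is a rough sketch of the alternative I have in mind:
commit each batch on the slave together with a checkpoint, so a crash
costs at most the batch in flight.  Everything in it is an assumption for
the sake of the example: it uses Python's sqlite3 module as a stand-in for
the slave database, and the data and checkpoint tables and the function
names are hypothetical.

```python
import sqlite3


def make_slave():
    """Create an in-memory stand-in for the slave (hypothetical schema)."""
    slave = sqlite3.connect(":memory:")
    slave.execute("CREATE TABLE data(key TEXT PRIMARY KEY, value TEXT NOT NULL)")
    slave.execute("CREATE TABLE checkpoint(last_seq INTEGER NOT NULL)")
    slave.execute("INSERT INTO checkpoint VALUES (0)")
    slave.commit()
    return slave


def apply_batch(slave, batch):
    """Apply one batch of (seq, key, value) rows and advance the checkpoint.

    Both happen in a single slave transaction, so either the whole batch
    and its checkpoint become visible, or neither does.
    """
    with slave:  # commits on success, rolls back if an exception escapes
        for _seq, key, value in batch:
            slave.execute(
                "INSERT OR REPLACE INTO data(key, value) VALUES (?, ?)",
                (key, value),
            )
        slave.execute("UPDATE checkpoint SET last_seq = ?", (batch[-1][0],))


def resume_point(slave):
    """Return the sequence number of the last row known to be applied."""
    return slave.execute("SELECT last_seq FROM checkpoint").fetchone()[0]
```

After a failure, the replicator would restart from resume_point() rather
than from the beginning of the queue.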

Ed

