Re: Changeset Extraction Interfaces - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Changeset Extraction Interfaces
Date
Msg-id CA+TgmoZTSRqOWqgkJAirvG4foiAFNd3nanTN3y3om2HLEoaKLg@mail.gmail.com
In response to Re: Changeset Extraction Interfaces  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Changeset Extraction Interfaces
List pgsql-hackers
On Thu, Dec 12, 2013 at 1:52 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Puh. I honestly have zero confidence in DBAs making an informed decision
> about something like this. Honestly, for a replication solution, how
> often do you think this will be an issue?

If you imagine a scenario where somebody establishes a replication
slot and then keeps it forever, not often.  But if you're trying to do
something more ad hoc, where replication slots might be used just for
short periods of time and then abandoned, I think it could come up
pretty frequently.  Generally, I think you're being too dismissive of
the stuff I'm complaining about here.  If we just can't get this, well
then I suppose we can't.  But I think the amount of time that it takes
Hot Standby to open for connections is an issue, precisely because
it's got to wait until certain criteria are met before it can
establish a snapshot, and sometimes that takes an unpleasantly long
time.  I think it unlikely that we can export that logic to this case
also and experience no pain as a result.

In fact, I think that even restricting things to streaming changes
from transactions started after we initiate replication is going to be
an annoying amount of delay for some purposes.  People will accept it
because, no matter how you slice it, this is an awesome new
capability.  Full stop.  That having been said, I don't find it at all
hard to imagine someone wanting to jump into the replication stream at
an arbitrary point in time and see changes from every transaction that
*commits* after that point, even if it began earlier, or even to see
changes from transactions that have not yet committed as they happen.
I realize that's asking for a pony, and I'm not saying you have to go
off and do that right now in order for this to move forward, or indeed
that it will ever happen at all.  What I am saying is that I find it
entirely likely that people are going to push the limits of this
thing, that this is one of the limits I expect them to push, and that
the more we can do to put policy in the hands of the user without
pre-judging the sanity of what they're trying to do, the happier we
(and our users) will be.
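
To make the distinction above concrete, here is a toy sketch of the two policies: "only transactions that started after we joined" versus "every transaction that commits after we joined". Nothing here is PostgreSQL code; the `Txn` type, the integer timestamps, and both function names are made up purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Txn:
    name: str
    start: int              # hypothetical logical time the transaction began
    commit: Optional[int]   # logical commit time, or None if still in progress


def started_after(txns: List[Txn], join_point: int) -> List[str]:
    """Committed transactions that began at or after the join point."""
    return [t.name for t in txns
            if t.commit is not None and t.start >= join_point]


def committed_after(txns: List[Txn], join_point: int) -> List[str]:
    """Transactions that commit after the join point, even if they began earlier."""
    return [t.name for t in txns
            if t.commit is not None and t.commit > join_point]


TXNS = [
    Txn("t1", start=1, commit=3),      # finished before we joined
    Txn("t2", start=4, commit=12),     # straddles the join point
    Txn("t3", start=11, commit=15),    # entirely after the join point
    Txn("t4", start=13, commit=None),  # still running
]
JOIN_POINT = 10

print(started_after(TXNS, JOIN_POINT))    # ['t3']
print(committed_after(TXNS, JOIN_POINT))  # ['t2', 't3']
```

The straddling transaction `t2` is the case in question: it shows up only under the second policy. Streaming uncommitted work such as `t4` as it happens would be a third policy that this sketch does not model.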

>> > It's not too difficult to provide an option to do that. What I've been
>> > thinking of was to correlate the confirmation of consumption with the
>> > transaction the SRF is running in. So, confirm the data as consumed if
>> > it commits, and don't if not. I think we could do that relatively easily
>> > by registering a XACT_EVENT_COMMIT.
>>
>> That's a bit too accident-prone for my taste.  I'd rather the DBA had
>> some equivalent of peek_at_replication(nchanges int).
>
> One point for my suggested behaviour is that it closes a bigger
> race condition. Currently as soon as start_logical_replication() has
> finished building the tuplestore it marks the end position as
> received. But we may very well fail before the user has received all
> those changes.

Right.  I think your idea is good, but maybe there should also be a
version of the function that never confirms receipt even if the
transaction commits.  That would be useful for ad-hoc poking at the
queue.
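
As a rough illustration of the two behaviours discussed here, the following toy model contrasts a peek that never confirms receipt with a consume that confirms only when the surrounding "transaction" finishes without error. It is a sketch under loud assumptions: `ChangeQueue`, `peek`, `consume`, and `confirmed` are invented names, and a Python `with` block stands in for a database transaction. It says nothing about how the real functions are or will be implemented.

```python
from contextlib import contextmanager


class ChangeQueue:
    """Toy stand-in for a stream of decoded changes (not a PostgreSQL API)."""

    def __init__(self, changes):
        self._changes = list(changes)
        self.confirmed = 0  # index of the first change not yet confirmed

    def peek(self, nchanges):
        """Return up to nchanges pending changes; never confirms receipt."""
        return self._changes[self.confirmed:self.confirmed + nchanges]

    @contextmanager
    def consume(self, nchanges):
        """Yield pending changes; confirm them only if the block exits cleanly."""
        batch = self.peek(nchanges)
        yield batch
        # Reached only if the with-body raised nothing (the "commit" case).
        # If the body raises, the exception propagates from the yield above
        # and the confirmed position is left untouched (the "abort" case).
        self.confirmed += len(batch)


if __name__ == "__main__":
    q = ChangeQueue(["insert 1", "update 1", "delete 1"])
    print(q.peek(2), q.confirmed)      # peeking leaves the position alone
    with q.consume(2) as batch:
        print("processing", batch)
    print(q.confirmed)                 # advanced only after a clean exit
```

In this model a failure while the consumer is still processing the batch leaves the queue position unchanged, which is the race the commit-tied confirmation is meant to close, while `peek` stays safe for ad-hoc inspection.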

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


