Re: Changeset Extraction Interfaces - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Changeset Extraction Interfaces
Msg-id: CA+TgmoYXhW1fhewbeNWixMge+P3A7m3d9YEFLW_WjRH1i+nvmw@mail.gmail.com
In response to: Re: Changeset Extraction Interfaces (Andres Freund <andres@2ndquadrant.com>)
Responses: Re: Changeset Extraction Interfaces
List: pgsql-hackers
On Fri, Dec 13, 2013 at 9:14 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> If you imagine a scenario where somebody establishes a replication
>> slot and then keeps it forever, not often.  But if you're trying to do
>> something more ad hoc, where replication slots might be used just for
>> short periods of time and then abandoned, I think it could come up
>> pretty frequently.
>
> But can you imagine those users needing an exported snapshot? I can
> think of several short-lived usages, but all of those are unlikely to
> need a consistent view of the overall database. And those are less
> likely to be full blown replication solutions.
> I.e. it's not the DBA making that decision but the developer making the
> decision based on whether he requires the snapshot or not.

Well, it still seems to me that the right way to think about this is
that the change stream begins at a certain point, and then once you
cross a certain threshold (all transactions in progress at that time
have ended), any subsequent snapshot is a possible point from which to
roll forward.  You'll need to avoid applying any transactions that are
already included in the snapshot, but I don't really think that's any
great matter.

You're focusing on the first point at which the consistent snapshot
can be taken, and on throwing away any logical changes that might have
been available before that point so that they don't have to be ignored
in the application code, but I think that's myopic.  For example,
suppose somebody is replicating tables from node A to node B.  Then
they decide to replicate some of the same tables to node C as well.
One way to do this is to have node C connect to node A and acquire its
own slot, but that means decoding everything twice.  Alternatively,
you could reuse the same change stream, but you'll need a new snapshot
to roll forward from.  That doesn't seem like a problem unless the API
makes it a problem.
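[Editorial sketch, not from the thread: the roll-forward idea above can be illustrated in a few lines of Python.  All names here are invented; the point is only that a second consumer (node C) starting from a later snapshot skips transactions the snapshot already contains and applies the rest, so no second decoding pass is needed.]

```python
def roll_forward(changes, snapshot_xids):
    """Yield only changes from transactions NOT already visible in the snapshot.

    changes:       iterable of (xid, change) pairs in commit order
    snapshot_xids: set of transaction ids whose effects the initial
                   snapshot copy already contains
    """
    for xid, change in changes:
        if xid in snapshot_xids:
            continue  # already reflected in the snapshot; applying it twice would be wrong
        yield change

# The shared change stream from node A, by committing transaction id:
stream = [(100, "INSERT a"), (101, "INSERT b"), (102, "INSERT c")]

# Suppose the snapshot exported for node C already includes xids 100 and 101;
# only the change from xid 102 still needs to be applied:
applied = list(roll_forward(stream, {100, 101}))
```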
>> Generally, I think you're being too dismissive of the stuff I'm
>> complaining about here.  If we just can't get this, well then I
>> suppose we can't.
>
> I think I am just scared of needing to add more features before getting
> the basics done and in consequence overrunning 9.4...

I am sensitive to that.  On the other hand, this API is going to be a
lot harder to change once it's released, so we really need to avoid
painting ourselves into a corner with v1.  As far as high-level design
concerns go, there are three things that I'm not happy with:

1. Slots.  We know we need physical slots as well as logical slots, but
the patch as currently constituted only offers logical slots.

2. Snapshot Management.  This issue.

3. Incremental Decoding.  So that we can begin applying a really big
transaction speculatively before it's actually committed.

I'm willing to completely punt #3 as far as 9.4 is concerned, because I
see a pretty clear path to fixing that later.  I am not yet convinced
that either of the other two can or should be postponed.

>> Right.  I think your idea is good, but maybe there should also be a
>> version of the function that never confirms receipt even if the
>> transaction commits.  That would be useful for ad-hoc poking at the
>> queue.
>
> Ok, that sounds easy enough, maybe
> pg_decoding_slot_get_[binary_]changes()
> pg_decoding_slot_peek_[binary_]changes()
> ?

s/pg_decoding_slot/pg_logical_stream/?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
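[Editorial sketch, not from the thread: the get/peek distinction being proposed above can be modeled in a few lines of Python.  This is an invented toy class, not PostgreSQL code; it only shows the semantics under discussion — "get" confirms receipt and advances the slot, while "peek" returns the same pending changes repeatedly without consuming them, which is what makes it safe for ad-hoc poking at the queue.]

```python
class LogicalSlot:
    """Toy model of a logical slot's change queue (illustrative only)."""

    def __init__(self, changes):
        self._changes = list(changes)
        self._confirmed = 0  # index of the first not-yet-confirmed change

    def get_changes(self):
        """Return pending changes and confirm receipt, advancing the slot."""
        pending = self._changes[self._confirmed:]
        self._confirmed = len(self._changes)
        return pending

    def peek_changes(self):
        """Return pending changes WITHOUT confirming receipt."""
        return self._changes[self._confirmed:]

slot = LogicalSlot(["BEGIN", "INSERT x", "COMMIT"])
first_peek = slot.peek_changes()    # nothing consumed...
second_peek = slot.peek_changes()   # ...so peeking again sees the same changes
consumed = slot.get_changes()       # confirms receipt and advances the slot
after_get = slot.get_changes()      # nothing left once confirmed
```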