Re: Logical decoding & exported base snapshot - Mailing list pgsql-hackers
From:           Steve Singer
Subject:        Re: Logical decoding & exported base snapshot
Date:
Msg-id:         BLU0-SMTP8417A16727A6855996B47DC4F0@phx.gbl
In response to: Logical decoding & exported base snapshot (Andres Freund <andres@2ndquadrant.com>)
Responses:      Re: Logical decoding & exported base snapshot
List:           pgsql-hackers
On 12-12-11 06:52 PM, Andres Freund wrote:
> Hi,
>
> Problem 1:
>
> One problem I see is that while exporting a snapshot solves the
> visibility issues of the table's contents it does not protect against
> schema changes. I am not sure whether that's a problem.
>
> If somebody runs a CLUSTER or something like that, the table's contents
> will be preserved including MVCC semantics. That's fine.
> The more problematic cases I see are TRUNCATE, DROP and ALTER
> TABLE. Imagine the following:
>
> S1: INIT_LOGICAL_REPLICATION
> S1: get snapshot: 00000333-1
> S2: ALTER TABLE foo ALTER COLUMN blub text USING (blub::text);
> S3: pg_dump --snapshot 00000333-1
> S1: START_LOGICAL_REPLICATION
>
> In that case the pg_dump would dump foo using the schema *after* the
> ALTER TABLE but showing only rows visible to our snapshot. After
> START_LOGICAL_REPLICATION all changes after the xlog position from
> INIT_LOGICAL_REPLICATION will be returned though - including all the
> tuples from the ALTER TABLE and potentially - if some form of schema
> capturing was in place - the ALTER TABLE itself. The copied schema would
> have the new format already though.
>
> Does anybody see that as a problem or is it just a case of PEBKAC? One
> argument for the latter is that that's already a problematic case for
> normal pg_dump's. It's just that the window is a bit larger here.

Is there any way to detect this situation as part of the pg_dump? If I
could detect this, abort my pg_dump, and then go and get a new snapshot,
then I don't see this as a big deal. I can live with telling users "don't
do DDL-like things while subscribing a new node; if you do, the
subscription will restart". I am less keen on telling users "don't do
DDL-like things while subscribing a new node or the results will be
unpredictable".

I'm trying to convince myself that I will be able to take a pg_dump from
an exported snapshot, plus the changes made between the snapshot and some
later point in time, and turn the results into a consistent database.

What if something like this comes along:

INIT REPLICATION
insert into foo (id,bar) values (1,2);
alter table foo drop column bar;
pg_dump --snapshot

The schema I get as part of the pg_dump won't have bar, because it has
been dropped, even though the dump will contain the rows that existed
with bar. I then go to process the INSERT statement. It will have a WAL
record with column data for bar, and logical replication replay will look
up the catalog rows from before the ALTER TABLE, so it will generate a
logical INSERT record with bar. That will fail on the replica.

I'm worried that it will be difficult to programmatically stitch together
the inconsistent snapshot from the pg_dump plus the logical records
generated between the snapshot and the dump (along with any record of the
DDL, if it exists).
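To make the mismatch concrete, this is roughly what I'd expect to end up
with in that example (the CREATE TABLE here is just a stand-in for
whatever pg_dump actually emits, and the error text is my guess at what
the replica would report):

  -- schema as dumped under the exported snapshot, but read *after* the
  -- ALTER TABLE: the column bar is already gone
  CREATE TABLE foo (
      id integer
  );

  -- change decoded from WAL for the INSERT that ran before the ALTER
  -- TABLE; it is reconstructed against the old catalog state, so it
  -- still carries the dropped column, and replay on the replica fails
  INSERT INTO foo (id, bar) VALUES (1, 2);
  -- ERROR:  column "bar" of relation "foo" does not exist

The dump and the decoded change stream simply disagree about the shape of
foo, which is the kind of mismatch I don't see how to repair after the
fact.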
> Problem 2:
>
> Control Flow.
>
> To be able to do a "SET TRANSACTION SNAPSHOT" the source transaction
> needs to be alive. That's currently solved by exporting the snapshot in
> the walsender connection that did the INIT_LOGICAL_REPLICATION. The
> question is how long should we preserve that snapshot?
>
> There is no requirement - and there *cannot* be one in the general case,
> the slot needs to be usable after a restart - that
> START_LOGICAL_REPLICATION has to be run in the same replication
> connection as INIT.
>
> Possible solutions:
> 1) INIT_LOGICAL_REPLICATION waits for an answer from the client that
> confirms that logical replication initialization is finished. Before
> that the walsender connection cannot be used for anything else.
>
> 2) we remove the snapshot as soon as any other command is received, this
> way the replication connection stays usable, e.g. to issue a
> START_LOGICAL_REPLICATION in parallel to the initial data dump. In that
> case the snapshot would have to be imported *before* the next command
> was received as SET TRANSACTION SNAPSHOT requires the source transaction
> to be still open.
>
> Opinions?

Option 2 sounds more flexible. Is it more difficult to implement? (I've
sketched the client flow I'm picturing for option 2 at the bottom of this
mail.)

> Andres
> --
> Andres Freund                     http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
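For what it's worth, the client flow I'm picturing for option 2 is
roughly the following (the command names are the ones from your proposal,
with their arguments elided since I haven't looked at the exact syntax;
the middle part is just the existing snapshot-import mechanism):

  -- connection 1: the replication (walsender) connection
  INIT_LOGICAL_REPLICATION ...
  -- returns, among other things, the exported snapshot name
  -- (00000333-1 in your example) and the xlog position to start from

  -- connection 2: an ordinary SQL connection, set up *before*
  -- connection 1 sends its next command, because the exporting
  -- transaction has to still be open when the snapshot is imported
  BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ;
  SET TRANSACTION SNAPSHOT '00000333-1';
  -- ... copy out the initial table contents under that snapshot ...
  COMMIT;

  -- connection 1: once the snapshot has been imported elsewhere the
  -- walsender connection is free again, so streaming can start while
  -- the initial copy is still running on connection 2
  START_LOGICAL_REPLICATION ...

(pg_dump --snapshot 00000333-1 would play the same role as connection 2,
importing the snapshot into its own connection.)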