Re: logical changeset generation v6.2 - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: logical changeset generation v6.2 |
Date | |
Msg-id | CA+TgmoY9MY0hh4Od=fBZW3n+5e9dPh8Ey3axdR547TT_ZfnG7Q@mail.gmail.com |
In response to | Re: logical changeset generation v6.2 (Andres Freund <andres@2ndquadrant.com>) |
Responses | Re: logical changeset generation v6.2 |
List | pgsql-hackers |
On Tue, Oct 22, 2013 at 11:02 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-10-22 10:52:48 -0400, Robert Haas wrote:
>> On Fri, Oct 18, 2013 at 2:26 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> > So. As it turns out that solution isn't sufficient in the face of VACUUM
>> > FULL and mixed DML/DDL transaction that have not yet been decoded.
>> >
>> > To reiterate, as published it works like:
>> > For every modification of catalog tuple (insert, multi_insert, update,
>> > delete) that has influence over visibility issue a record that contains:
>> > * filenode
>> > * ctid
>> > * (cmin, cmax)
>> >
>> > When doing a visibility check on a catalog row during decoding of mixed
>> > DML/DDL transaction lookup (cmin, cmax) for that row since we don't
>> > store both for the tuple.
>> >
>> > That mostly works great.
>> >
>> > The problematic scenario is decoding a transaction that has done mixed
>> > DML/DDL *after* a VACUUM FULL/CLUSTER has been performed. The VACUUM
>> > FULL obviously changes the filenode and the ctid of a tuple, so we
>> > cannot successfully do a lookup based on what we logged before.
>>
>> So I have a new idea for handling this problem, which seems obvious in
>> retrospect. What if we make the VACUUM FULL or CLUSTER log the old
>> CTID -> new CTID mappings? This would only need to be done for
>> catalog tables, and maybe could be skipped for tuples whose XIDs are
>> old enough that we know those transactions must already be decoded.
>
> Ah. If it only were so simple ;). That was my first idea, and after I'd
> bragged in an 2ndq internal chat that I'd found a simple idea I
> obviously had to realize it doesn't work.
>
> Consider:
> INIT_LOGICAL_REPLICATION;
> CREATE TABLE foo(...);
> BEGIN;
> INSERT INTO foo;
> ALTER TABLE foo ...;
> INSERT INTO foo;
> COMMIT TX 3;
> VACUUM FULL pg_class;
> START_LOGICAL_REPLICATION;
>
> When we decode tx 3 we haven't yet read the mapping from the vacuum
> freeze. That scenario can happen either because decoding was stopped for
> a moment, or because decoding couldn't keep up (slow connection,
> whatever).

That strikes me as a flaw in the implementation rather than the idea.
You're presupposing a patch where the necessary information is
available in WAL yet you don't make use of it at the proper time. It
seems to me that you have to think of the CTID map as tied to a
relfilenode; if you try to use one relfilenode's map with a different
relfilenode, it's obviously not going to work. So don't do that.

/me looks innocent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
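
To make the "CTID map tied to a relfilenode" point concrete, here is a minimal sketch in Python. It is not PostgreSQL code and not what any posted patch does: the class `CatalogLookup`, its methods, and the relfilenode and ctid values are all hypothetical. It assumes that each logged (cmin, cmax) record is keyed by (relfilenode, ctid), and that each rewrite (VACUUM FULL or CLUSTER) logs an old CTID -> new CTID map together with the old and new relfilenode. A lookup then only ever applies a map to the relfilenode it was logged for. The sketch deliberately ignores the ordering question raised above, namely whether the map has been read from WAL by the time it is needed.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogLookup:
    """Hypothetical bookkeeping for (cmin, cmax) lookups across rewrites."""

    # (relfilenode, ctid) -> (cmin, cmax), as logged for catalog modifications
    combocids: dict = field(default_factory=dict)
    # new relfilenode -> (old relfilenode, {new ctid: old ctid})
    rewrites: dict = field(default_factory=dict)

    def log_change(self, relfilenode, ctid, cmin, cmax):
        self.combocids[(relfilenode, ctid)] = (cmin, cmax)

    def log_rewrite(self, old_node, new_node, ctid_map):
        # Store the map reversed, keyed by the relfilenode it leads to, so a
        # lookup can walk from the current file back to the one that was logged.
        reverse = {new: old for old, new in ctid_map.items()}
        self.rewrites[new_node] = (old_node, reverse)

    def lookup(self, relfilenode, ctid):
        """Return (cmin, cmax) for a tuple, or None if nothing was logged."""
        while True:
            hit = self.combocids.get((relfilenode, ctid))
            if hit is not None:
                return hit
            if relfilenode not in self.rewrites:
                return None
            old_node, reverse = self.rewrites[relfilenode]
            if ctid not in reverse:
                return None
            # Step back one rewrite, using only this relfilenode's own map.
            relfilenode, ctid = old_node, reverse[ctid]


lookup_state = CatalogLookup()
lookup_state.log_change(100, (0, 1), cmin=0, cmax=2)
lookup_state.log_rewrite(100, 200, {(0, 1): (0, 5)})
result = lookup_state.lookup(200, (0, 5))
```

In this sketch, looking up the old ctid `(0, 1)` against the new relfilenode `200` finds nothing, which is the "don't do that" case: a ctid is only meaningful together with the relfilenode it came from.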