Re: logical changeset generation v6.2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: logical changeset generation v6.2
Date
Msg-id 20131022150212.GC7435@awork2.anarazel.de
Whole thread Raw
In response to Re: logical changeset generation v6.2  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: logical changeset generation v6.2  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2013-10-22 10:52:48 -0400, Robert Haas wrote:
> On Fri, Oct 18, 2013 at 2:26 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > So. As it turns out that solution isn't sufficient in the face of VACUUM
> > FULL and mixed DML/DDL transaction that have not yet been decoded.
> >
> > To reiterate, as published it works like:
> > For every modification of catalog tuple (insert, multi_insert, update,
> > delete) that has influence over visibility issue a record that contains:
> > * filenode
> > * ctid
> > * (cmin, cmax)
> >
> > When doing a visibility check on a catalog row during decoding of mixed
> > DML/DDL transaction lookup (cmin, cmax) for that row since we don't
> > store both for the tuple.
> >
> > That mostly works great.
> >
> > The problematic scenario is decoding a transaction that has done mixed
> > DML/DDL *after* a VACUUM FULL/CLUSTER has been performed. The VACUUM
> > FULL obviously changes the filenode and the ctid of a tuple, so we
> > cannot successfully do a lookup based on what we logged before.
> 
> So I have a new idea for handling this problem, which seems obvious in
> retrospect.  What if we make the VACUUM FULL or CLUSTER log the old
> CTID -> new CTID mappings?  This would only need to be done for
> catalog tables, and maybe could be skipped for tuples whose XIDs are
> old enough that we know those transactions must already be decoded.

Ah. If it only were so simple ;). That was my first idea, and after I'd
bragged in an 2ndq internal chat that I'd found a simple idea I
obviously had to realize it doesn't work.

Consider:
INIT_LOGICAL_REPLICATION;
CREATE TABLE foo(...);
BEGIN;
INSERT INTO foo;
ALTER TABLE foo ...;
INSERT INTO foo;
COMMIT TX 3;
VACUUM FULL pg_class;
START_LOGICAL_REPLICATION;

When we decode tx 3 we haven't yet read the mapping from the vacuum
freeze. That scenario can happen either because decoding was stopped for
a moment, or because decoding couldn't keep up (slow connection,
whatever).

There also can be nasty variations where the VACUUM FULL happens while a
transaction is writing data since we don't hold locks on system
relations for very long.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Why the asprintf patch is still breaking the buildfarm
Next
From: Rajeev rastogi
Date:
Subject: Re: COPY table FROM STDIN doesn't show count tag