Re: logical changeset generation v6.2 - Mailing list pgsql-hackers

From Andres Freund
Subject Re: logical changeset generation v6.2
Msg-id 20131029154326.GD21284@awork2.anarazel.de
In response to Re: logical changeset generation v6.2  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: logical changeset generation v6.2  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2013-10-29 11:28:44 -0400, Robert Haas wrote:
> On Tue, Oct 29, 2013 at 10:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > On 2013-10-28 11:54:31 -0400, Robert Haas wrote:
> >> > There's one snag I can currently see, namely that we actually need to
> >> > prevent a formerly dropped relfilenode from being reused. I'm not
> >> > entirely sure what the best way to do that is.
> >>
> >> I'm not sure in detail, but it seems to me that this is all part of the
> >> same picture.  If you're tracking changed relfilenodes, you'd better
> >> track dropped ones as well.
> >
> > What I am thinking about is the way GetNewRelFileNode() checks for
> > preexisting relfilenodes. It uses SnapshotDirty to scan for existing
> > relfilenodes for a newly created OID, which means the relfilenode of an
> > already dropped relation could be reused.
> > I guess it could be as simple as using SatisfiesAny (or even better a
> > wrapper around SatisfiesVacuum that knows about recently dead tuples).
>
> I think modifying GetNewRelFileNode() is attacking the problem from
> the wrong end.  The point is that when a table is dropped, that fact
> can be communicated to the same machinery that's been tracking
> the CTID->CTID mappings.  Instead of saying "hey, the tuples that were
> in relfilenode 12345 are now in relfilenode 67890 in these new
> positions", it can say "hey, the tuples that were in relfilenode 12345
> are now GONE".

Unfortunately I don't understand what you're suggesting. What I am
worried about is something like:

<- decoding is here
VACUUM FULL pg_class; -- rewrites filenode 1 to 2
VACUUM FULL pg_class; -- rewrites filenode 2 to 3
VACUUM FULL pg_class; -- rewrites filenode 3 to 1
<- now decode up to here

In this case there are two possible (cmin,cmax) values for a specific
tuple: one from the original filenode 1 and one from the filenode 1
generated by rewriting filenode 3.
That will only happen after an OID wraparound, which hopefully
shouldn't happen very often, but I'd rather not rely on that.
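
To make the collision concrete, here is a minimal Python sketch of the
scenario above. It assumes the (cmin,cmax) lookup is keyed by
(relfilenode, ctid). The helper names build_mapping and ambiguous_keys,
the ctid (0, 1) and all cmin/cmax numbers are made up for illustration;
none of this is taken from the patch.

```python
from collections import defaultdict


def build_mapping(records):
    """Collect every (cmin, cmax) pair seen for each (relfilenode, ctid) key."""
    mapping = defaultdict(set)
    for relfilenode, ctid, cmin, cmax in records:
        mapping[(relfilenode, ctid)].add((cmin, cmax))
    return mapping


def ambiguous_keys(mapping):
    """Return the keys that resolve to more than one (cmin, cmax) pair."""
    return sorted(key for key, values in mapping.items() if len(values) > 1)


# Hypothetical records for one pg_class tuple, followed through the rewrites.
# Each record is (relfilenode, ctid, cmin, cmax); the numbers are invented.
records = [
    (1, (0, 1), 10, 0),  # original filenode 1
    (2, (0, 1), 11, 0),  # after the first VACUUM FULL (1 -> 2)
    (3, (0, 1), 12, 0),  # after the second VACUUM FULL (2 -> 3)
    (1, (0, 1), 13, 0),  # after the third VACUUM FULL (3 -> 1, filenode reused)
]

mapping = build_mapping(records)
ambiguous = ambiguous_keys(mapping)
print(ambiguous)
```

The keys for filenodes 2 and 3 stay unique; only the reused filenode 1
ends up with two candidate (cmin,cmax) pairs, which is the ambiguity I am
worried about.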

> >> Completely aside from this issue, what
> >> keeps a relation from being dropped before we've decoded all of the
> >> changes made to its data before the point at which it was dropped?  (I
> >> hope the answer isn't "nothing".)
> >
> > Nothing. But there's no need to prevent it, it'll still be in the
> > catalog and we don't ever access a non-catalog relation's data during
> > decoding.
>
> Oh, right.  But what about a drop of a user-catalog table?

Currently nothing prevents that. I am not sure it's worth worrying
about; do you think we should?

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


