Re: preserving forensic information when we freeze - Mailing list pgsql-hackers

From Andres Freund
Subject Re: preserving forensic information when we freeze
Date
Msg-id 20130703170715.GD5667@awork2.anarazel.de
Whole thread Raw
In response to Re: preserving forensic information when we freeze  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: preserving forensic information when we freeze
Re: preserving forensic information when we freeze
List pgsql-hackers
On 2013-07-02 16:29:56 -0400, Robert Haas wrote:
> On Mon, Jun 24, 2013 at 8:12 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > Agreed. And it also improves the status quo when debugging. I wish this
> > would have been the representation since the beginning.
> >
> > Some remarks:
> > * I don't really like that HeapTupleHeaderXminFrozen() now is frequently
> >   performed without checking for FrozenTransactionId. I think the places
> >   where that's done are currently safe, but it seems likely that we
> >   introduce bugs due to people writing similar code.
> >   I think replacing *GetXmin by a wrapper that returns
> >   FrozenTransactionId if the hint bit tell us so would be a good
> >   idea. Then add *GetRawXmin for the rest (probably mostly display).
> >   Seems like it would make the patch actually smaller.
> 
> I did think about this approach, but it seemed like it would add
> cycles in a significant number of places.  For example, consider the
> HeapTupleSatisfies() routines, which are perhaps PostgreSQL's finest
> example of a place where you DON'T want to add any cycles.  All of the
> checks on xmin are conditional on HeapTupleHeaderXminCommitted()
> having been found already to be false.  That implies that the frozen
> bits aren't set either, so if HeapTupleHeaderGetXmin() were to recheck
> the bits it would be a waste.  As I got to the end of the patch I was
> a little dismayed by the number of places that did need adjustment to
> check both things, but there are still plenty of important places that
> don't.

Well, nothing would prevent using the HeapTupleHeaderGetRawXmin() in
those places. Exactly the number of callsites is what makes me think
that somebody will get this wrong in the future.

> > * The PLs imo should set fn_xmin to FrozenTransactionId if the hint bit
> >   tell us the tuple is frozen.
> 
> Why?  I thought about that, but it seemed to me that the purpose of
> that caching was to avoid confusing two functions whose pg_proc tuples
> ended up at the same TID.
> [good reasoning]

Man. I hate this hack. I wonder what happens if somebody does a VACUUM
FULL of pg_class and there are a bunch of functions created in the same
transaction...

> > * I think rewrite_heap_dead_tuple needs to check for a frozen xmin and
> >   store that. We might looking at a chain which partially was done in
> >   <9.4. Not sure if that's a realistic scenario, but I'd rather be safe.
> 
> IIUC, you're talking about the scenario where we have an update chain
> X -> Y, where X is dead but not actually removed and Y is
> (forensically) frozen.   We're examining tuple Y and trying to
> determine whether X has been entered in rs_unresolved_tups.  If, as I
> think you're proposing, we consider the xmin of Y to be
> FrozenTransactionId, we will definitely not find it - because the way
> it got into the table in the first place was based on the value
> returned by HeapTupleHeaderGetUpdateXid().  And that value is certain
> not to be FrozenTransactionId, because we never set the xmax of a
> tuple to FrozenTransactionId.

I am thinking of something slightly different. rewrite_heap_dead_tuple()
now removes tuples/xids from the unresolved table that could be from a
different xid epoch since it unconditionally does a HASH_REMOVE if it
finds an entry doing a lookup using the *preserved* xid. Earlier that
was harmless since for frozen tuples it only ever used
FrozenTransactionId which obviously cannot be part of a valid chain and
couldn't even get entered into unresolved_tups.

I am not sure at all if that actually can be harmful but there isn't any
reason we would need to do the delete so I wouldn't. There can be
complex enough situation where later parts of a ctid chain are dead and
earlier ones are recently dead and such that I would rather be cautious.

> There's no possibility of getting confused here; if X is still around
> at all, it's xmax is of the same generation as Y's xmin.  Otherwise,
> we've had an undetected XID wraparound.

Another issue I thought about is what we will return for SELECT xmin
FROM blarg; Some people use that in their applications (IIRC
skytools/pqg/londiste does so) and they might get confused if we
suddently return xids from the future. On the other hand, not returning
the original xid would be a shame as well...

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: proposal: simple date constructor from numeric values
Next
From: Pavel Stehule
Date:
Subject: Re: proposal: simple date constructor from numeric values