Re: preserving forensic information when we freeze - Mailing list pgsql-hackers

From Robert Haas
Subject Re: preserving forensic information when we freeze
Msg-id CA+TgmoZ8Lcv+b614DsrZeROf1bosLy-MAxjzj0uVgXnRURc=yA@mail.gmail.com
In response to Re: preserving forensic information when we freeze  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
On Wed, Jul 3, 2013 at 1:07 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Well, nothing would prevent using the HeapTupleHeaderGetRawXmin() in
> those places. Exactly the number of callsites is what makes me think
> that somebody will get this wrong in the future.

Well, I guess I could go rework the whole patch that way.  It's a fair
request, but I kinda doubt it's going to make the patch smaller.
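
To make the distinction concrete, here is a toy model of the two kinds of accessor being discussed. It is only a sketch, not the patch's code: the struct layout, the flag bit, and every toy_ name are assumptions invented for illustration. The one thing taken from PostgreSQL itself is that FrozenTransactionId is 2. The idea is that a "raw" accessor always returns the stored xmin, while the ordinary accessor reports FrozenTransactionId once the tuple is marked frozen.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* FrozenTransactionId is 2 in PostgreSQL; everything else here is a toy. */
#define TOY_FROZEN_XID ((TransactionId) 2)

/* Hypothetical flag bit: "xmin is frozen, but the original value is kept". */
#define TOY_XMIN_FROZEN 0x0001u

typedef struct ToyTupleHeader
{
    TransactionId xmin;     /* inserting XID, preserved even after freezing */
    TransactionId xmax;     /* updating/deleting XID, 0 if none */
    uint16_t      infomask; /* status bits */
} ToyTupleHeader;

/* Raw accessor: always returns the stored (forensic) xmin. */
static inline TransactionId
toy_get_raw_xmin(const ToyTupleHeader *tup)
{
    return tup->xmin;
}

/* True if the hypothetical frozen bit is set. */
static inline bool
toy_xmin_frozen(const ToyTupleHeader *tup)
{
    return (tup->infomask & TOY_XMIN_FROZEN) != 0;
}

/* Ordinary accessor: hides the preserved xmin once the tuple is frozen. */
static inline TransactionId
toy_get_xmin(const ToyTupleHeader *tup)
{
    return toy_xmin_frozen(tup) ? TOY_FROZEN_XID : toy_get_raw_xmin(tup);
}
```

With accessors shaped like this, a call site that forgets the frozen check gets the safe answer by default, and only code that really wants the forensic value has to ask for the raw one.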

>> > * I think rewrite_heap_dead_tuple needs to check for a frozen xmin and
>> >   store that. We might be looking at a chain which partially was done in
>> >   <9.4. Not sure if that's a realistic scenario, but I'd rather be safe.
>>
>> IIUC, you're talking about the scenario where we have an update chain
>> X -> Y, where X is dead but not actually removed and Y is
>> (forensically) frozen.   We're examining tuple Y and trying to
>> determine whether X has been entered in rs_unresolved_tups.  If, as I
>> think you're proposing, we consider the xmin of Y to be
>> FrozenTransactionId, we will definitely not find it - because the way
>> it got into the table in the first place was based on the value
>> returned by HeapTupleHeaderGetUpdateXid().  And that value is certain
>> not to be FrozenTransactionId, because we never set the xmax of a
>> tuple to FrozenTransactionId.
>
> I am thinking of something slightly different. rewrite_heap_dead_tuple()
> now removes tuples/xids from the unresolved table that could be from a
> different xid epoch since it unconditionally does a HASH_REMOVE if it
> finds an entry doing a lookup using the *preserved* xid. Earlier that
> was harmless since for frozen tuples it only ever used
> FrozenTransactionId which obviously cannot be part of a valid chain and
> couldn't even get entered into unresolved_tups.
>
> I am not sure at all if that actually can be harmful but there isn't any
> reason we would need to do the delete so I wouldn't. There can be
> complex enough situations where later parts of a ctid chain are dead and
> earlier ones are recently dead and such that I would rather be cautious.

OK, I think I see your point, and I think you're right.
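
One possible reading of that suggestion, as a toy sketch: skip the lookup-and-remove step entirely when the tuple's xmin is frozen, so a preserved XID from an older epoch can never match a live entry. This is not rewriteheap.c. The fixed-size array stands in for the rs_unresolved_tups hash table, and all toy_ names and the key layout are assumptions made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define TOY_MAX_UNRESOLVED 8

/* Toy key: the xmin expected of the successor tuple, plus its location. */
typedef struct ToyKey
{
    TransactionId xmin;
    uint32_t      tid;
} ToyKey;

/* Toy stand-in for the unresolved-tuples hash table. */
typedef struct ToyUnresolved
{
    ToyKey keys[TOY_MAX_UNRESOLVED];
    bool   used[TOY_MAX_UNRESOLVED];
} ToyUnresolved;

/* Add an entry; returns false if the table is full. */
static bool
toy_unresolved_add(ToyUnresolved *tab, TransactionId xmin, uint32_t tid)
{
    for (size_t i = 0; i < TOY_MAX_UNRESOLVED; i++)
    {
        if (!tab->used[i])
        {
            tab->keys[i].xmin = xmin;
            tab->keys[i].tid = tid;
            tab->used[i] = true;
            return true;
        }
    }
    return false;
}

/*
 * Handle a dead tuple located at tid.  If its xmin is frozen, do nothing:
 * the preserved XID may belong to an earlier epoch, so a match would be
 * coincidental.  Otherwise remove the matching entry, if there is one.
 * Returns true only when an entry was removed.
 */
static bool
toy_resolve_dead_tuple(ToyUnresolved *tab, TransactionId raw_xmin,
                       bool xmin_frozen, uint32_t tid)
{
    if (xmin_frozen)
        return false;

    for (size_t i = 0; i < TOY_MAX_UNRESOLVED; i++)
    {
        if (tab->used[i] &&
            tab->keys[i].xmin == raw_xmin &&
            tab->keys[i].tid == tid)
        {
            tab->used[i] = false;
            return true;
        }
    }
    return false;
}
```

In this model a frozen tuple whose preserved xmin happens to equal a key in the table leaves the entry alone, which matches the old behavior where such tuples only ever presented FrozenTransactionId.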

>> There's no possibility of getting confused here; if X is still around
>> at all, its xmax is of the same generation as Y's xmin.  Otherwise,
>> we've had an undetected XID wraparound.
>
> Another issue I thought about is what we will return for SELECT xmin
> FROM blarg; Some people use that in their applications (IIRC
> skytools/pgq/londiste does so) and they might get confused if we
> suddenly return xids from the future. On the other hand, not returning
> the original xid would be a shame as well...

Yeah.  I think the system columns that we have today are pretty much
crap.  I wonder if we couldn't throw them out and replace them with
some kind of functions that you can pass a row to.  That would allow
us to expose a lot more detail without adding a bajillion hidden
columns, and for a bonus we'd save substantially on catalog bloat.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


