Re: preserving forensic information when we freeze - Mailing list pgsql-hackers

From Tom Lane
Subject Re: preserving forensic information when we freeze
Msg-id 32543.1388684794@sss.pgh.pa.us
In response to Re: preserving forensic information when we freeze  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: preserving forensic information when we freeze  (Robert Haas <robertmhaas@gmail.com>)
Re: preserving forensic information when we freeze  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-hackers
Andres Freund <andres@2ndquadrant.com> writes:
> On 2014-01-02 09:40:54 -0500, Tom Lane wrote:
>> Actually, I thought the function approach was a good proposal.  You are
>> right that func(tab.*) isn't going to work, because it's going to get a
>> Datum-ified tuple not a pointer to raw on-disk storage.  But an inspection
>> function that's handed a ctid could work.

> Well, we discussed that upthread, and the overhead of going through a
> function is quite noticeable because the tuple needs to be fetched from
> the heap again.

Yeah, I read those results, but that seems like it could probably be
optimized.  I'm guessing the function was doing a new heap_open every
time?  There's probably a way around that.

In any case, upon further reflection I'm not convinced that doing this
with a SELECT-based query is the right thing, no matter whether the query
looks at a function or a system column, because by definition, you'll only
be able to see tuples that are visible to your current snapshot.  For real
forensics work, you need to be able to see all tuples, which makes me
think that something akin to pgstattuple is the right API; that is, "return
a set of the header info for all tuples on such-and-such pages of this
relation".  That should dodge any performance problem, because the
heap_open overhead could be amortized across lots of tuples, and it also
sidesteps all problems with adding new system columns.
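
To make the contrast concrete, here is a toy Python model of the two
approaches.  It is only a sketch under stated assumptions, not PostgreSQL
code: ToyRelation, TupleHeader, select_headers, page_headers and
sample_relation are hypothetical names, the visibility rule is a
deliberately simplified stand-in for real snapshot checks, and open()
merely counts how often something like heap_open would be called.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TupleHeader:
    page: int             # page number within the toy relation
    offset: int           # 0-based slot within the page (toy convention)
    xmin: int             # inserting transaction id
    xmax: Optional[int]   # deleting transaction id, or None if never deleted


class ToyRelation:
    """Hypothetical stand-in for a heap relation; counts open() calls."""

    def __init__(self, pages: List[List[TupleHeader]]) -> None:
        self.pages = pages
        self.open_count = 0

    def open(self) -> "ToyRelation":
        # Models the per-call cost of opening the relation.
        self.open_count += 1
        return self


def visible(t: TupleHeader, snapshot_xid: int) -> bool:
    # Simplified assumption: a tuple is visible if it was inserted before
    # the snapshot and not deleted before it. Commit status is ignored.
    return t.xmin < snapshot_xid and (t.xmax is None or t.xmax >= snapshot_xid)


def select_headers(rel: ToyRelation, snapshot_xid: int) -> List[TupleHeader]:
    # SELECT-style: only visible tuples are reached, and the per-row
    # inspection function re-opens the relation for every one of them.
    out = []
    for page in rel.pages:
        for t in page:
            if visible(t, snapshot_xid):
                handle = rel.open()
                out.append(handle.pages[t.page][t.offset])
    return out


def page_headers(rel: ToyRelation, first_page: int, last_page: int) -> List[TupleHeader]:
    # Page-level style: open once, return every header on the requested
    # pages regardless of visibility.
    handle = rel.open()
    return [t for page in handle.pages[first_page:last_page + 1] for t in page]


def sample_relation() -> ToyRelation:
    # Two pages; one tuple was deleted by xid 15, one inserted by xid 40.
    return ToyRelation([
        [TupleHeader(0, 0, 10, None), TupleHeader(0, 1, 11, 15)],
        [TupleHeader(1, 0, 20, None), TupleHeader(1, 1, 40, None)],
    ])
```

With a snapshot at xid 30, select_headers(sample_relation(), 30) reaches
only the two visible tuples and opens the relation once per row, whereas
page_headers(sample_relation(), 0, 1) returns all four headers, including
the deleted tuple, after a single open.  The pageinspect contrib module
offers a page-at-a-time view in a similar spirit through
heap_page_items(get_raw_page(...)).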

> Upthread there's a POC patch of mine, that started to explore what's
> necessary to simply never store system columns (except maybe oid) in
> pg_attribute. While it passes the regression tests it's not complete,
> but the amount of work looks reasonable.

I think this will inevitably break a lot of code, not all of it ours,
so I'm not in favor of pursuing that direction.
        regards, tom lane


