Re: preserving forensic information when we freeze - Mailing list pgsql-hackers

From Greg Stark
Subject Re: preserving forensic information when we freeze
Date
Msg-id CAM-w4HO47osezyR=x0hGabhExp1T0z0VfDEmMq4d4Vk2nvGUcA@mail.gmail.com
Whole thread
In response to Re: preserving forensic information when we freeze  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: preserving forensic information when we freeze
Re: preserving forensic information when we freeze
List pgsql-hackers

> I fail to see why.  What is so ugly about this:

> select x.* from pgbench_accounts a, pg_tuple_header(a.tableoid, a.ctid) x;

Two points:

1) it's a bit weird to go to this effort to eliminate system columns by using a scheme that depends on having a system column -- ctid

2) refetching a row could conceivably end up retrieving different data than was present when the row was originally read. (In some cases that might actually be the intended behaviour)

If this came up earlier I'm sorry but I suppose it's too hard to have a function like foo(tab.*) which magically can tell that the record is a heap tuple and look in the header? And presumably throw an error if passed a non heap tuple.

--
greg

On 1 Jan 2014 21:34, "Robert Haas" <robertmhaas@gmail.com> wrote:
On Fri, Dec 27, 2013 at 9:24 PM, Stephen Frost <sfrost@snowman.net> wrote:
>> I'm not sure what you mean by "doesn't work", because it clearly does
>> work.  I've already posted a patch.  You may find it ugly, but that's
>> not the same as not working.
>
> I meant *practically*, it doesn't work.  By which, I mean, "it sucks" as
> a solution. :)

I fail to see why.  What is so ugly about this:

select x.* from pgbench_accounts a, pg_tuple_header(a.tableoid, a.ctid) x;

It may be true that that query wouldn't have worked before Tom added
LATERAL, but he did add LATERAL and we have it now and nobody's going
to care about this function on versions that haven't got LATERAL,
because it won't exist there anyway.  The whole point of adding
features like LATERAL (which doesn't even appear in that query,
interestingly enough) is that it was really annoying to use functions
like this before we had it.  But now we *do* have it, so the fact a
function like this would have been annoying to use in older releases
isn't a reason not to add it now.

I will admit that performance may be a reason.  After pgbench -i:

rhaas=# explain analyze select xmax from pgbench_accounts;
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pgbench_accounts  (cost=0.00..2640.00 rows=100000
width=4) (actual time=0.009..14.950 rows=100000 loops=1)
 Total runtime: 20.354 ms
(2 rows)

rhaas=# explain analyze select (pg_tuple_header(tableoid,ctid)).xmax
from pgbench_accounts;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Seq Scan on pgbench_accounts  (cost=0.00..2890.00 rows=100000
width=10) (actual time=0.023..314.783 rows=100000 loops=1)
 Total runtime: 322.714 ms
(2 rows)

>> > Hindsight being what it is, perhaps we should have stuffed the system
>> > columns into a complex type instead of having individual columns, but
>> > I'm not sure changing that now would be worth the backwards
>> > compatibility break (yes, I know they're system columns, but I've seen
>> > more than one case of using ctid to break ties in otherwise identical
>> > rows..).
>>
>> Well, if the consensus is in favor of adding more system columns,
>> that's not my first choice, but I'm OK with it.  However, I wonder how
>> we plan to name them.  If we add pg_infomask and pg_infomask2, it
>> won't be consistent with the existing naming convention which doesn't
>> include any kind of pg-prefix, but if we don't use such a prefix then
>> every column we add further pollutes the namespace.
>
> Yeah, I agree that it gets a bit ugly...  What would you think of doing
> *both*?  Keep the existing system columns for backwards compatibility,
> but then also have a complex 'pg_header' type which provides all of the
> existing columns, as well as infomask && infomask2 ...?

I don't really see what that accomplishes, TBH.  If we further modify
the header format in the future, we'll need to modify the definition
of that type, and that will be a backward-compatibility break for
anyone using it.  Adding new system columns is probably less
intrusive.

Maybe it's best to just add pg_infomask and pg_infomask2 as system
columns and not worry about the inconsistency with the naming
convention for existing columns.

Other opinions?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: more psprintf() use
Next
From: Heikki Linnakangas
Date:
Subject: Re: more psprintf() use