Re: alter table drop column status - Mailing list pgsql-hackers

From Tom Lane
Subject Re: alter table drop column status
Date
Msg-id 28866.1013615847@sss.pgh.pa.us
Whole thread Raw
In response to Re: alter table drop column status  (Hiroshi Inoue <Inoue@tpf.co.jp>)
List pgsql-hackers
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> My trial implementation uses logical(for clients) and
> physical (for backend internal) attribute numbers but
> there were many places where I wasn't able to judge which
> to use immediately. I'm pretty suspicious if a developer
> could be careful about the choise when he is implementing
> an irrevant feature. (Un)fortunately the numbers have
> the same values mostly and he could hardly notice the
> mistake even if he chose the wrong attribute numbers.

I think this was the thing that really scared everyone about the trial
implementation: the near-certainty of bugs that might remain unnoticed
for a long time.


At the last OSDB conference I had an interesting discussion with
Ann Harrison about how Interbase (Firebird) deals with this problem.
Essentially, they mark every tuple with an identifier for the schema
that it follows.  Translated to Postgres terms, it'd work like this:

1. Composite types (row types) could exist independently of tables;
this is something we've wanted for awhile anyway.  A composite type
is identified by its OID in pg_type.  pg_attribute rows would have
to be considered to belong to pg_type entries not pg_class entries.

2. A relation in pg_class has a pointer to its current preferred schema
(row type).  This link exists already (reltype), but it would no longer
be necessarily fixed for the life of the relation.  To implement ADD,
DROP or ALTER COLUMN, you'd construct a new row type and update
pg_class.reltype to point to it.  And that's all you'd do --- you'd not
touch the stored data.

3. Tuples being inserted/updated would always be coerced to the current
preferred schema of the relation.  However, old tuples would remain
with their original schema, perhaps indefinitely.  (Or we could offer
a special command to forcibly update all tuples to current schema.)

4. Internally, we'd probably need to create a "row type cache" separate
from the existing relcache, so that the attribute structure shown by a
given tuple header could be looked up quickly, whether or not it is the
current preferred schema of the relation.

5. It'd no longer be possible to identify a particular column solely
by column number, since the column number might vary between schemas.
Nor would identification by name be reliable (think RENAME COLUMN).
I think what we'd have to do is go back to giving OIDs to individual
pg_attribute entries ... they wouldn't be true OIDs in the current sense
because not unique across all pg_attribute entries, but we could
generate them using the OID counter.  Perhaps call them serial numbers
not OIDs.  When constructing a new schema, the serial number would be
carried over from each column that is logically the same column as some
pre-existing column --- but the physical column numbers might be quite
different.  Then, initial construction of a query plan would resolve
column name to column serial number using the current schema of the
relation, and at runtime the serial number would have to be looked up
in the actual schema of each tuple.  If it's not found, use the default
value of the column as shown in the current schema (this supports ADD
COLUMN).  If it's found but does not have the same datatype as the Var
shows that the current schema expects, perform a runtime type coercion
(this supports ALTERing a column datatype).

The main thing that this supports that Hiroshi's trial implementation
didn't is altering column datatype.

It'd also considerably simplify processing of inheritance-tree table
scans: rather than the current kluge that translates parent to child
column numbers, you'd just make sure that a child table is created with
column serial numbers matching the parent for its inherited columns.
Then the above-described mechanism takes care of finding the child
columns for you: essentially, a child-table tuple can be treated just
like a tuple that's not of the current schema in the parent table.
(I'm not sure if the trial implementation could do that too.)

The weakest feature of the whole scheme is the per-tuple runtime lookups
implied by points 4 and 5.  We could probably avoid any noticeable
slowdown in normal cases by caching the results in Var nodes of
execution plans, but in cases where a relation has a wild mix of tuples
of different vintages a single-entry cache wouldn't help much.

Another objection is the need to add an OID field to tuple headers; 4
more bytes per tuple adds up (and on some platforms it'd be 8 bytes due
to alignment considerations).

Another problem is that the distinction between column positions and
column serial numbers has the same kind of potential for confusion as
between logical and physical numbers in the trial implementation.  It
wouldn't be as bad, because the values would be different in most cases.


This'd be a sufficiently big change that I'm not at all sure we'd want
to do it that way.  But I thought I'd sketch out the idea and see if
anyone likes it.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Jean-Paul ARGUDO
Date:
Subject: Re: feature request START WITH ... CONNECT BY
Next
From: Tom Lane
Date:
Subject: Re: "Bug" in statistics for v7.2?