> The sticky thing is dropping a column. There are two options, and
> postgresql developers just can't make up their mind :P)
>
> a) keep old column data in database (wasted space, but fast)
> b) immediately 'compress' table, removing old data (slow, needs a
lot of
> space for compression)
>
> Option a) was implemented once, but kludgy, and had a few kinks, and
it
> was removed. Option b) plain sucks :P)
Out of curiosity how was option a) implemented? I could envision
supporting multiple versions of a tuple style to be found within a
table (each described in pg_attribute). Gradually these would be
upgraded through normal use.
I'm personally not fond of the option b) due to the time involved in
completing the action. Not only is space an issue, but locking the
database up for a day while removing a column isn't the nicest thing
to do -- rename, make nullable, drop all constraints and try to ignore
it right?
One would expect that keeping multiple versions of a tuple structure
inside a single table to be slower than normal for selects, but I
don't think it would require marking the rows themselves -- just base
it on the max and min transactions in the table at that time. Vacuum
would have to push the issue (5k tuples at a time?) of upgrading some
of the tuples each time it's run in order to enfore that they were all
gone before XID wrap. Background vacuum is ideal for that (if
implemented). Drop all constraints, indexes and the name (change to
$1 or something) of the column immediatly. Vacuum can determine when
XID Min in a table is > XID Max of another version and drop the
information from pg_attribute.
Obviously affected:
- pg_attribute, and anything dealing with it (add XID Max, XID Min
wraps for known ranges)
- storage machanism. On read of a tuple attempt to make it fit latest
version (XID Max is NULL) by ignoring select fields.
I'll have to leave it up to the pros as to whether it can be done,
should be done, and what else it'll affect.
I suppose this was option a) that was removed due to it's kludgyness
:)