Re: [HACKERS] Happy column dropping - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: [HACKERS] Happy column dropping
Date
Msg-id Pine.LNX.4.21.0001242130430.525-100000@localhost.localdomain
Whole thread Raw
In response to Re: [HACKERS] Happy column dropping  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 2000-01-22, Tom Lane mentioned:

> AFAIK there is nothing particularly magic about OIDs.  You could
> perfectly well create the new table with the same OIDs as are in the
> old table (see COPY WITH OIDS if you are wondering how).

Okay, the oids of the user data are safe.

> > Is it possible/safe to change to oid of the new pg_class entry back to the
> > old one? In that case the trouble of moving over all the constraints, etc.
> > would be half the work.
> 
> Wrong way to think about it.  You should be doing a heap_update of the
> catalog tuples that need to change, ISTM.
> 
> You could almost get away with doing it like you describe, except that
> there is a unique index on pg_class OIDs these days (right, Bruce?)
> and that index will kick out an error.  But heap_update on the original
> table tuple will work.
> 
> I think what we may want here is something comparable to what's been
> discussed recently for VACUUM: build the new table as a new heap file
> and then rename the physical file into place, without really doing
> anything to the pg_class tuple --- except of course you'd need to
> heap_update it to adjust the number-of-attributes field.

I've tried to do that but it messed me up big time. If I drop the column
and do a select * from tablename it segfaults. The same happens on vacuum
analyze (but not plain vacuum). For the analyze part I traced it down to
the tuple that gets passed to nocachegetattr() still has t_data->t_natts
set at the old value put the data itself is no longer there, hence
segfault. Vacuum does some Page and MemoryContext things to get that
tuple, so I'm lost there, but I figure select has the same problem. Seems
like not all the buffers have been flushed, but there aren't any more I
could think of. Because when I reconnect, everything is fine. Help!

What more needs to be done?

newrel = heap_create(...);
{ copy data from oldrel to newrel }
{ update pg_class.relnatts }
{ update pg_attribute }
{ update pg_attrdef }
/* get rid of old one */
ReleaseRelationBuffers(oldrel);
smgrunlink(..., oldrelname);
RelationForgetRelation(oldrel_oid);
/* Rename the new one */
FlushRelationBuffers(tempname, (BlockNumber) 0, true);
smgrclose(..., tempname);
heap_close(newrel);
rename(tempname, oldrelname);

This is pretty much what it does. Not enough flushing? Too much flushing?

However, the more severe problem with this approach is this: If for some
reason whatsoever the rename() call fails you've already deleted the
original table on disk and now you're stuck with a heap file that you
can't rename plus a catalog entry to a table that doesn't exist on disk.
Panic. Any reordering of the above suggested? Of course, rename would
normally be able to atomically overwrite the old table, but since the
disk representation might be split over several files we lose. The table
renaming code makes some pretty dangerous assumptions in this regard, I
just noticed. Perhaps we should forbid this kind of messing with big
tables, at least until we come up with a better scheme. And that's leaving
alone the fact that calling rename in the first place is a pretty bad
violation of the storage manager encapsulation. Darn it.

> 
> > 2) how do I find out if the dropped column is referenced in a constraint,
> > trigger, rule (this is necessary for a correct RESTRICT/CASCADE
> > implementation)
> 
> Actually it's worse than that: you need to be prepared to renumber the
> columns after the dropped one, too.  Probably what you will need to do
> is read in and deparse all the relevant rules and triggers, then reparse
> them against the updated table schema.  Ugly.  And no, I have no idea
> how you even *find* all the relevant rules.  (Jan?)

I think this will be postponed. This is not even done on DROP TABLE, so it
will be consistent. ;)

> 
> > Oh, btw., heaven help you if you try this on tables that are inherited
> > from.
> 
> The whole thing should be done inside a recursive routine that applies
> the same change to all children of the target table.  See ALTER TABLE
> ADD COLUMN for an example.  (ADD COLUMN is pretty broken too, since it
> doesn't preserve consistency of column numbering across child tables ---
> want to reimplement it in this same style?)

Yes I saw that. I'm just going to forbid use of this on inheritance trees
until all of this gets fixed in one run.

-- 
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden




pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: RE: [HACKERS] Happy column dropping
Next
From: Peter Eisentraut
Date:
Subject: Re: [HACKERS] Happy column dropping