Thread: pg_migrator versus inherited columns
I was testing pg_migrator the other day on the regression database, and found out that it doesn't work: Restoring database schema psql:/home/postgres/pg_migrator_dump_db.sql:4545: ERROR: column"........pg.dropped.1........" of relation "dropcolumnchild" does not exist Some investigation found out the reason. pg_migrator needs to be able to create tables that exactly match the physical column layout of the old database, including dropped columns if any. The way that it does this is that the --binary-upgrade switch added to pg_dump causes pg_dump to go ahead and include dropped columns in CREATE TABLE commands, and then immediately drop them. That's okay for simple tables, but it's got exactly zero hope of working in inheritance scenarios. A column that was created and later dropped in a parent table might or might not exist (though now dropped) in child tables. More generally, the approach is fatally broken for inheritance children even if they contain no dropped columns. Adding a column to a parent table results in child-table column ordering that is different from what you get after a dump and reload, for example after create table p (f1 int, f2 int);create table c (f3 int) inherits (p);alter table p add column f4 int; the column order in c is f1,f2,f3,f4 ... but after dump and reload it will be f1,f2,f4,f3, because f4 will be part of the initial definition of p rather than getting tacked on later. We understood years ago that pg_dump has to take extra measures to deal with this --- that's why it has to use COPY with a column list. The only solution I can see that has a chance of working without a massive amount of new logic in pg_dump is to change the way that inheritance is managed. Instead of dumping the above situation as create table p (f1 int, f2 int, f4 int);create table c (f3 int) inherits (p); I propose that when --binary-upgrade is active, it should be dumped as create table p (f1 int, f2 int, f4 int);create table c (f1 int, f2 int, f4 int, f3 int);alter table c add inherit p; (It's a good thing we added ADD INHERIT in 8.4, or we'd be completely up the creek here.) In this way we can ensure that the physical order of columns really is what it needs to be in the child table. Dropped columns can then be managed in the same way as the current code does, but it'll actually work. (We have to drop them before doing the ADD INHERIT of course.) Comments? regards, tom lane
On Wed, Jul 1, 2009 at 11:36 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > (It's a good thing we added ADD INHERIT in 8.4, or we'd be completely > up the creek here.) In this way we can ensure that the physical order > of columns really is what it needs to be in the child table. Dropped > columns can then be managed in the same way as the current code does, > but it'll actually work. (We have to drop them before doing the > ADD INHERIT of course.) > > Comments? Uhm, we've had ADD INHERIT since 8.2 -- that was my first patch. This will result in all the columns being marked attislocal. Ie, if they're dropped from the parent they won't be dropped automatically from the children any longer. Frankly I never really liked attislocal -- it seems to me the user knows when he's dropping the column whether he wants it dropped from the children and should be able to explicitly request it to cascade or not. I can't imagine many use cases where you want it to cascade to some children but not others based on how they were originally created. -- greg http://mit.edu/~gsstark/resume.pdf
Greg Stark <gsstark@mit.edu> writes: > On Wed, Jul 1, 2009 at 11:36 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote: >> Comments? > Uhm, we've had ADD INHERIT since 8.2 -- that was my first patch. Hmm, 8.3 doesn't seem to recognize the syntax: regression=# alter table c add inherit p; ERROR: type "p" does not exist LINE 1: alter table c add inherit p; ^ > This will result in all the columns being marked attislocal. Ie, if > they're dropped from the parent they won't be dropped automatically > from the children any longer. Good point. We could have pg_dump fix that up, I suppose. On the other hand, I'm not entirely sure that the current dump methodology guarantees to preserve attislocal correctly anyway. > Frankly I never really liked attislocal -- it seems to me the user > knows when he's dropping the column whether he wants it dropped from > the children and should be able to explicitly request it to cascade or > not. The original discussions about attislocal/attinhcount came up with some cases that seemed to make it necessary, but I've long forgotten the details. regards, tom lane
On Thu, Jul 2, 2009 at 1:25 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote: > > Hmm, 8.3 doesn't seem to recognize the syntax: > > regression=# alter table c add inherit p; > ERROR: type "p" does not exist > LINE 1: alter table c add inherit p; > ^ Oh I remember being caught by this myself. The above is trying to add a new column named "inherit". The syntax to add an inheritance parent is just "alter table c inherit p" -- greg http://mit.edu/~gsstark/resume.pdf