Thread: issue, dumping and restoring tables with table inheritance can alter column order

issue, dumping and restoring tables with table inheritance can alter column order

From
Ingmar Brouns
Date:

Hi,

Im running:

 PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6), 64-bit

I make use of table inheritance. I've created a table, added a
child table, and after that added a column to the parent table
(example code below).

As you can see below the newly added column is listed last in the
child table. I would have expected column 'c' to come after
column 'a', but can imagine that you can argue that it is safest
to put it last as people may have code that depends on column
positions.

However, when I dump the schema using pg_dump, and then load the
resulting sql file, suddenly 'c' does follow 'a'. So restoring my
schema has changed my table's layout. I feel dumping and loading
should not alter column positions. Any thoughts?

create schema test_issue;
create table test_issue.foo ( a integer );
create table test_issue.bar ( b text ) inherits ( test_issue.foo );
alter table test_issue.foo add column c integer;

postgres=# \d test_issue.bar;
    Table "test_issue.bar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 b      | text    |
 c      | integer |
Inherits: test_issue.foo

]$ ~/bin/pg_dump -n test_issue > test_issue.sql;

postgres=# drop schema test_issue cascade;

]$ psql -f test_issue.sql;

postgres=# \d test_issue.bar;
    Table "test_issue.bar"
 Column |  Type   | Modifiers
--------+---------+-----------
 a      | integer |
 c      | integer |
 b      | text    |
Inherits: test_issue.foo


Ingmar

Re: issue, dumping and restoring tables with table inheritance can alter column order

From
"David G. Johnston"
Date:
On Mon, Sep 14, 2015 at 11:11 AM, Ingmar Brouns <swingi@gmail.com> wrote:

However, when I dump the schema using pg_dump, and then load the
resulting sql file, suddenly 'c' does follow 'a'. So restoring my
schema has changed my table's layout. I feel dumping and loading
should not alter column positions. Any thoughts?


​Any or all of the following:​

1) Help the community implement the outstanding concepts surrounding the separation and recording of separate data for physical and logical column order.
2) Suggest, and/or implement, ways that the current behavior could be more readily discovered and comprehended by users without having to discover it by accident.
3) Understand the problem and mitigate its impact in your specific work.

David J.

 
On 9/14/15 11:59 AM, David G. Johnston wrote:
> On Mon, Sep 14, 2015 at 11:11 AM, Ingmar Brouns <swingi@gmail.com
> <mailto:swingi@gmail.com>>wrote:
>
>
>     However, when I dump the schema using pg_dump, and then load the
>     resulting sql file, suddenly 'c' does follow 'a'. So restoring my
>     schema has changed my table's layout. I feel dumping and loading
>     should not alter column positions. Any thoughts?
>
>
> ​Any or all of the following:​
>
> 1) Help the community implement the outstanding concepts surrounding the
> separation and recording of separate data for physical and logical
> column order.
> 2) Suggest, and/or implement, ways that the current behavior could be
> more readily discovered and comprehended by users without having to
> discover it by accident.
> 3) Understand the problem and mitigate its impact in your specific work.

To elaborate... without looking at the code I'm pretty sure what's
happening here is that pg_dump simply dumps the entire parent table,
including the added column. In fact, it must be doing this because we
don't have any way to track when a column is added after table creation.

pg_dump then spits out CREATE TABLE child(...) INHERITS(parent), and the
database correctly puts all the parent fields first in the child.

I'm pretty certain that nothing here violates relational theory. It's
another example of why SELECT * is a bad idea. Hence why you should do
#3. (I've thought about adding a "chaos" setting where all row results
get ordered by random(). That wouldn't help this case until we get #1
though.)

There's basically 0 chance of this being changed until #1 is done. At
that point I'd expect pg_dump to start working correctly here, but it's
also possible that adding a field to a parent would no longer go to the
end of the children.

#2 could be as simple as a change to the documentation. Patches (or even
just re-written text) welcome.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com


Re: issue, dumping and restoring tables with table inheritance can alter column order

From
Michael Paquier
Date:
On Tue, Sep 15, 2015 at 7:19 AM, Jim Nasby wrote:
> On 9/14/15 11:59 AM, David G. Johnston wrote:
>>
>> On Mon, Sep 14, 2015 at 11:11 AM, Ingmar Brouns wrote:
>>     However, when I dump the schema using pg_dump, and then load the
>>     resulting sql file, suddenly 'c' does follow 'a'. So restoring my
>>     schema has changed my table's layout. I feel dumping and loading
>>     should not alter column positions. Any thoughts?

When using pg_dump --inserts, meaning that the column names are not
dumped in the INSERT query used, this would cause a problem. In other
cases, aka the use of COPY or --column-inserts the dump will be nicely
consistent. It is true that we have a problem though as pg_dump relies
on attnum to order its columns in its logic.

> To elaborate... without looking at the code I'm pretty sure what's happening
> here is that pg_dump simply dumps the entire parent table, including the
> added column. In fact, it must be doing this because we don't have any way
> to track when a column is added after table creation.
>
> pg_dump then spits out CREATE TABLE child(...) INHERITS(parent), and the
> database correctly puts all the parent fields first in the child.

Yep. Taking the problem the other way around, we could fix the backend
such as the column ordering is not messed up on the child table when
the new column is added on the parent instead of putting the blame in
pg_dump. FWIW, that's something I am investigating on this thread of
-hackers:
http://www.postgresql.org/message-id/20150926132237.GJ5702@alap3.anarazel.de
--
Michael