Re: Column storage positions - Mailing list pgsql-hackers

From Phil Currier
Subject Re: Column storage positions
Date
Msg-id c58979e50702211012i2d3c1963j8b5aae1b50a68b6c@mail.gmail.com
Whole thread Raw
In response to Re: Column storage positions  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: Column storage positions  ("Florian G. Pflug" <fgp@phlo.org>)
List pgsql-hackers
On 2/21/07, Martijn van Oosterhout <kleptog@svana.org> wrote:
> > don't see any good way to perform an upgrade between PG versions
> > without rewriting each table's data.  Maybe most people aren't doing
> > upgrades like this right now, but it seems like it will only become
> > more common in the future.  In my opinion, this is more important than
> > #1.
>
> I don't see this either. For all current tables, the storage position
> is the attribute number, no exception. You say:
>
> > because the version X table could
> > have dropped columns that might or might not be present in any given
> > tuple on disk.
>
> Whether they're there or not is irrelevent. Drop columns are not
> necesarily empty, but in any case they occupy a storage position until
> the table is rewritten. A dump/restore doesn't need to preserve this,
> but pg_migrator will need some smarts to handle it. The system will
> need to create a column of the appropriate type and drop it to get to
> the right state.

I agree, a dump/restore that rewrites all the table datafiles doesn't
need to handle this.  And I agree that the system will need to create
dropped columns and then drop them again, that's exactly what I
suggested in fact.  We're talking about pg_migrator-style upgrades
only here.

Say we do this in 8.2:

create table foo (a varchar(10), b int);
insert into foo ....
alter table foo add column c int;

At this point, the column storage order is (a, b, c) because 8.2 never
changes storage order.  Then you upgrade to 8.3.  pg_dump now wants to
write out some DDL that will create a table matching the existing
table datafile, since we don't want to have to rewrite it.  pg_dump
prints out:

create table foo (a varchar(10), b int, c int);

The 8.3 system will try to create the table with column order (b, c,
a), since it's trying to optimize storage order, and that won't match
the existing table datafile.  What we need is a way to make sure that
the table matches the original datafile.

Now say that it's not an 8.2 -> 8.3 upgrade, say it's an 8.3 -> 8.4
upgrade.  In this case, 8.3 would have the table with storage order
(b, a, c).  (Column c would have been added at the end since it was
added without a default, and didn't force a table rewrite.)  How do
you get pg_dump to print out table creation DDL that will result in a
table matching the existing (b, a, c) table datafile?

This is why I think pg_dump needs to be able to print an ALTER TABLE
statement that will explicitly assign storage positions.  This happens
to have the side-effect of being potentially useful to admins who
might want control over that.

If this only affected 8.2 -> 8.3 upgrades, then maybe it's not as
important an issue.  But I think it affects *all* future upgrades,
which is why I'm trying to raise the issue now.


> If you really want to use pg_dump I'd suggest an option to pg_dump
> --dump-dropped-columns which will include the dropped columns in the
> CREATE TABLE but drop them immediatly after. It's really more a corner
> case than anything else.


pgsql-hackers by date:

Previous
From: "Simon Riggs"
Date:
Subject: Re: HOT for PostgreSQL 8.3
Next
From: Andrew Dunstan
Date:
Subject: Re: Column storage positions