Thread: pg_migrator and handling dropped columns

pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
bruce wrote:
> Peter Eisentraut wrote:
> > Bruce Momjian wrote:
> > > Now that pg_migrator is BSD licensed, and already in C, I am going to
> > > spend my time trying to improve pg_migrator for 8.4:
> > > 
> > >     http://pgfoundry.org/projects/pg-migrator/
> > 
> > What is the plan now?  Get pg_upgrade working, get pg_migrator working, 
> > ship pg_migrator in core or separately?  Is there any essential 
> > functionality that we need to get into the server code before release? 
> > Should we try to get dropped columns working?  It's quite late to be 

I have thought about how to handle dumped columns and would like to get
some feedback on this.

It is easy to find the dropped columns with 'pg_attribute.attisdropped 
= true'.

The basic problem is that dropped columns do not appear in the pg_dump
output schema, but still exist in the data files.  While the missing
data is not a problem, the dropped column's existence affects all
subsequent columns, increasing their attno values and their placement in
the data files.

I can think of three possible solutions, all involve recreating and
dropping the dropped column in the new schema:
1  modify the pg_dumpall --schema-only output file before   loading to add the dropped column2  drop/recreate the table
afterloading to add the dropped   column3  modify the system tables directly to add the dropped column,   perhaps using
pg_dependinformation
 

#1 seems like the best option, though it requires parsing the pg_dump
file to some extent.  #2 is a problem because dropping/recreating the
table might be difficult because of foreign key relationships, even for
empty tables. #3 seems prone to maintenance requirements every time we
change system object relationships.

Once the dropped column is created in the new server, it can be dropped
to match the incoming data files.

Comments?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator and handling dropped columns

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I can think of three possible solutions, all involve recreating and
> dropping the dropped column in the new schema:

(4) add a switch to pg_dump to include dropped columns in its
schema output and then drop them.  This seems far more maintainable
than writing separate code that tries to parse the output.
        regards, tom lane


Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I can think of three possible solutions, all involve recreating and
> > dropping the dropped column in the new schema:
> 
> (4) add a switch to pg_dump to include dropped columns in its
> schema output and then drop them.  This seems far more maintainable
> than writing separate code that tries to parse the output.

That would certainly be the easiest.  I was going to have trouble
generating the exact column creation string anyway in pg_migrator.

I assume I would also drop the column in the pg_dump output.  

Is this acceptable to everyone?  We could name the option
-u/--upgrade-compatible.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator and handling dropped columns

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> (4) add a switch to pg_dump to include dropped columns in its
>> schema output and then drop them.  This seems far more maintainable
>> than writing separate code that tries to parse the output.

> I assume I would also drop the column in the pg_dump output.  

Right, that's what I meant --- do all the work within pg_dump.

> Is this acceptable to everyone?  We could name the option
> -u/--upgrade-compatible.

If the switch is specifically for pg_upgrade support (enabling this as
well as any other hacks we find necessary), which seems like a good
idea, then don't chew up a short option letter for it.  There should be
a long form only.  And probably not even list it in the user
documentation.
        regards, tom lane


Re: pg_migrator and handling dropped columns

From
"Joshua D. Drake"
Date:
On Thu, 2009-02-12 at 13:39 -0500, Tom Lane wrote:

> Right, that's what I meant --- do all the work within pg_dump.
> 
> > Is this acceptable to everyone?  We could name the option
> > -u/--upgrade-compatible.
> 
> If the switch is specifically for pg_upgrade support (enabling this as
> well as any other hacks we find necessary), which seems like a good
> idea, then don't chew up a short option letter for it.  There should be
> a long form only.  And probably not even list it in the user
> documentation.

Why wouldn't we want to list it?

Joshua D. Drake

> 
>             regards, tom lane
> 
-- 
PostgreSQL - XMPP: jdrake@jabber.postgresql.org  Consulting, Development, Support, Training  503-667-4564 -
http://www.commandprompt.com/ The PostgreSQL Company, serving since 1997
 



Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> (4) add a switch to pg_dump to include dropped columns in its
> >> schema output and then drop them.  This seems far more maintainable
> >> than writing separate code that tries to parse the output.
> 
> > I assume I would also drop the column in the pg_dump output.  
> 
> Right, that's what I meant --- do all the work within pg_dump.
> 
> > Is this acceptable to everyone?  We could name the option
> > -u/--upgrade-compatible.
> 
> If the switch is specifically for pg_upgrade support (enabling this as
> well as any other hacks we find necessary), which seems like a good
> idea, then don't chew up a short option letter for it.  There should be
> a long form only.  And probably not even list it in the user
> documentation.

OK, works for me;  any objections from anyone?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Joshua D. Drake wrote:
> On Thu, 2009-02-12 at 13:39 -0500, Tom Lane wrote:
> 
> > Right, that's what I meant --- do all the work within pg_dump.
> > 
> > > Is this acceptable to everyone?  We could name the option
> > > -u/--upgrade-compatible.
> > 
> > If the switch is specifically for pg_upgrade support (enabling this as
> > well as any other hacks we find necessary), which seems like a good
> > idea, then don't chew up a short option letter for it.  There should be
> > a long form only.  And probably not even list it in the user
> > documentation.
> 
> Why wouldn't we want to list it?

Because it is for internal use by upgrade utilities, not for user use.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator and handling dropped columns

From
Tom Lane
Date:
"Joshua D. Drake" <jd@commandprompt.com> writes:
> On Thu, 2009-02-12 at 13:39 -0500, Tom Lane wrote:
>> a long form only.  And probably not even list it in the user
>> documentation.

> Why wouldn't we want to list it?

Because it's for internal use only.  Although the effect we're
discussing here is relatively harmless, it seems possible that
further down the road we might find a need for hacks that would
render the output entirely unfit for ordinary dump purposes.
I don't see a need to encourage people to play with fire.

It's hardly unprecedented for us to have undocumented internal
options --- there are some in postgres.c for example.
        regards, tom lane


Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Joshua D. Drake" <jd@commandprompt.com> writes:
> > On Thu, 2009-02-12 at 13:39 -0500, Tom Lane wrote:
> >> a long form only.  And probably not even list it in the user
> >> documentation.
> 
> > Why wouldn't we want to list it?
> 
> Because it's for internal use only.  Although the effect we're
> discussing here is relatively harmless, it seems possible that
> further down the road we might find a need for hacks that would
> render the output entirely unfit for ordinary dump purposes.
> I don't see a need to encourage people to play with fire.
> 
> It's hardly unprecedented for us to have undocumented internal
> options --- there are some in postgres.c for example.

The important point is that we add comments in the source code about why
it is undocumented.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator and handling dropped columns

From
Peter Eisentraut
Date:
Tom Lane wrote:
>> Is this acceptable to everyone?  We could name the option
>> -u/--upgrade-compatible.
> 
> If the switch is specifically for pg_upgrade support (enabling this as
> well as any other hacks we find necessary), which seems like a good
> idea, then don't chew up a short option letter for it.  There should be
> a long form only.

Note that pg_dump's output is already upgrade compatible.  That's what 
pg_dump is often used for after all.  I believe what we are after here 
is something like "in-place upgrade compatible" or "upgrade binary 
compatible".

> And probably not even list it in the user documentation.

I think we should still list it somewhere and say it is for use by 
in-place upgrade utilities.  It will only confuse people if it is not 
documented at all.


Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane wrote:
> >> Is this acceptable to everyone?  We could name the option
> >> -u/--upgrade-compatible.
> >
> > If the switch is specifically for pg_upgrade support (enabling this as
> > well as any other hacks we find necessary), which seems like a good
> > idea, then don't chew up a short option letter for it.  There should be
> > a long form only.
>
> Note that pg_dump's output is already upgrade compatible.  That's what
> pg_dump is often used for after all.  I believe what we are after here
> is something like "in-place upgrade compatible" or "upgrade binary
> compatible".
>
> > And probably not even list it in the user documentation.
>
> I think we should still list it somewhere and say it is for use by
> in-place upgrade utilities.  It will only confuse people if it is not
> documented at all.

OK, I have completed the patch;  attached.

I ran into a little problem, as documented by this comment in
catalog/heap.c:

        /*
         * Set the type OID to invalid.  A dropped attribute's type link
         * cannot be relied on (once the attribute is dropped, the type might
         * be too). Fortunately we do not need the type row --- the only
         * really essential information is the type's typlen and typalign,
         * which are preserved in the attribute's attlen and attalign.  We set
         * atttypid to zero here as a means of catching code that incorrectly
         * expects it to be valid.
         */

Basically, drop column zeros pg_attribute.atttypid, and there doesn't
seem to be enough information left in pg_attribute to guess the typid
that, combined with atttypmod, would restore the proper values for
pg_attribute.atttypid and pg_attribute.attalign.  Therefore, I just
brute-forced an UPDATE into dump to set the values properly after
dropping the fake TEXT column.

I did a minimal documentation addition by adding something to the
"Notes" section of the manual pages.

Here is what a dump of a table with dropped columns looks like:

    --
    -- Name: test; Type: TABLE; Schema: public; Owner: postgres; Tablespace:
    --

    CREATE TABLE test (
        x integer,
        "........pg.dropped.2........" TEXT
    );
    ALTER TABLE ONLY test DROP COLUMN "........pg.dropped.2........";

    -- For binary upgrade, recreate dropped column's length and alignment.
    UPDATE pg_attribute
    SET attlen = -1, attalign = 'i'
    WHERE   attname = '........pg.dropped.2........'
            AND attrelid =
            (
                    SELECT oid
                    FROM pg_class
                    WHERE   relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = CURRENT_SCHEMA)
                            AND relname = 'test'
            );

    ALTER TABLE public.test OWNER TO postgres;

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/ref/pg_dump.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
retrieving revision 1.109
diff -c -c -r1.109 pg_dump.sgml
*** doc/src/sgml/ref/pg_dump.sgml    10 Feb 2009 00:55:21 -0000    1.109
--- doc/src/sgml/ref/pg_dump.sgml    17 Feb 2009 01:57:10 -0000
***************
*** 827,832 ****
--- 827,837 ----
     editing of the dump file might be required.
    </para>

+   <para>
+    <application>pg_dump</application> also supports a
+    <literal>--binary-upgrade</> option for upgrade utility usage.
+   </para>
+
   </refsect1>

   <refsect1 id="pg-dump-examples">
Index: doc/src/sgml/ref/pg_dumpall.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dumpall.sgml,v
retrieving revision 1.75
diff -c -c -r1.75 pg_dumpall.sgml
*** doc/src/sgml/ref/pg_dumpall.sgml    7 Feb 2009 14:31:30 -0000    1.75
--- doc/src/sgml/ref/pg_dumpall.sgml    17 Feb 2009 01:57:10 -0000
***************
*** 489,494 ****
--- 489,499 ----
     locations.
    </para>

+   <para>
+    <application>pg_dump</application> also supports a
+    <literal>--binary-upgrade</> option for upgrade utility usage.
+   </para>
+
   </refsect1>


Index: src/bin/pg_dump/pg_dump.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
retrieving revision 1.521
diff -c -c -r1.521 pg_dump.c
*** src/bin/pg_dump/pg_dump.c    16 Feb 2009 23:06:55 -0000    1.521
--- src/bin/pg_dump/pg_dump.c    17 Feb 2009 01:57:10 -0000
***************
*** 99,104 ****
--- 99,106 ----
  /* default, if no "inclusion" switches appear, is to dump everything */
  static bool include_everything = true;

+ static int    binary_upgrade = 0;
+
  char        g_opaque_type[10];    /* name for the opaque type */

  /* placeholders for the delimiters for comments */
***************
*** 236,242 ****
      static int  outputNoTablespaces = 0;
      static int    use_setsessauth = 0;

!     static struct option long_options[] = {
          {"data-only", no_argument, NULL, 'a'},
          {"blobs", no_argument, NULL, 'b'},
          {"clean", no_argument, NULL, 'c'},
--- 238,245 ----
      static int  outputNoTablespaces = 0;
      static int    use_setsessauth = 0;

!     struct option long_options[] = {
!         {"binary-upgrade", no_argument, &binary_upgrade, 1},    /* not documented */
          {"data-only", no_argument, NULL, 'a'},
          {"blobs", no_argument, NULL, 'b'},
          {"clean", no_argument, NULL, 'c'},
***************
*** 4611,4616 ****
--- 4614,4621 ----
      int            i_attnotnull;
      int            i_atthasdef;
      int            i_attisdropped;
+     int            i_attlen;
+     int            i_attalign;
      int            i_attislocal;
      PGresult   *res;
      int            ntups;
***************
*** 4655,4661 ****
              appendPQExpBuffer(q, "SELECT a.attnum, a.attname, a.atttypmod, "
                                   "a.attstattarget, a.attstorage, t.typstorage, "
                                     "a.attnotnull, a.atthasdef, a.attisdropped, "
!                                  "a.attislocal, "
                     "pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname "
               "FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t "
                                "ON a.atttypid = t.oid "
--- 4660,4666 ----
              appendPQExpBuffer(q, "SELECT a.attnum, a.attname, a.atttypmod, "
                                   "a.attstattarget, a.attstorage, t.typstorage, "
                                     "a.attnotnull, a.atthasdef, a.attisdropped, "
!                                    "a.attlen, a.attalign, a.attislocal, "
                     "pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname "
               "FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t "
                                "ON a.atttypid = t.oid "
***************
*** 4674,4680 ****
              appendPQExpBuffer(q, "SELECT a.attnum, a.attname, "
                                "a.atttypmod, -1 AS attstattarget, a.attstorage, "
                                "t.typstorage, a.attnotnull, a.atthasdef, "
!                               "false AS attisdropped, false AS attislocal, "
                                "format_type(t.oid,a.atttypmod) AS atttypname "
                                "FROM pg_attribute a LEFT JOIN pg_type t "
                                "ON a.atttypid = t.oid "
--- 4679,4686 ----
              appendPQExpBuffer(q, "SELECT a.attnum, a.attname, "
                                "a.atttypmod, -1 AS attstattarget, a.attstorage, "
                                "t.typstorage, a.attnotnull, a.atthasdef, "
!                               "false AS attisdropped, 0 AS attlen, "
!                               "' ' AS attalign, false AS attislocal, "
                                "format_type(t.oid,a.atttypmod) AS atttypname "
                                "FROM pg_attribute a LEFT JOIN pg_type t "
                                "ON a.atttypid = t.oid "
***************
*** 4690,4696 ****
                                "-1 AS attstattarget, attstorage, "
                                "attstorage AS typstorage, "
                                "attnotnull, atthasdef, false AS attisdropped, "
!                               "false AS attislocal, "
                                "(SELECT typname FROM pg_type WHERE oid = atttypid) AS atttypname "
                                "FROM pg_attribute a "
                                "WHERE attrelid = '%u'::oid "
--- 4696,4703 ----
                                "-1 AS attstattarget, attstorage, "
                                "attstorage AS typstorage, "
                                "attnotnull, atthasdef, false AS attisdropped, "
!                               "0 AS attlen, ' ' AS attalign, "
!                                "false AS attislocal, "
                                "(SELECT typname FROM pg_type WHERE oid = atttypid) AS atttypname "
                                "FROM pg_attribute a "
                                "WHERE attrelid = '%u'::oid "
***************
*** 4714,4719 ****
--- 4721,4728 ----
          i_attnotnull = PQfnumber(res, "attnotnull");
          i_atthasdef = PQfnumber(res, "atthasdef");
          i_attisdropped = PQfnumber(res, "attisdropped");
+         i_attlen = PQfnumber(res, "attlen");
+         i_attalign = PQfnumber(res, "attalign");
          i_attislocal = PQfnumber(res, "attislocal");

          tbinfo->numatts = ntups;
***************
*** 4724,4729 ****
--- 4733,4740 ----
          tbinfo->attstorage = (char *) malloc(ntups * sizeof(char));
          tbinfo->typstorage = (char *) malloc(ntups * sizeof(char));
          tbinfo->attisdropped = (bool *) malloc(ntups * sizeof(bool));
+         tbinfo->attlen = (int *) malloc(ntups * sizeof(int));
+         tbinfo->attalign = (char *) malloc(ntups * sizeof(char));
          tbinfo->attislocal = (bool *) malloc(ntups * sizeof(bool));
          tbinfo->notnull = (bool *) malloc(ntups * sizeof(bool));
          tbinfo->attrdefs = (AttrDefInfo **) malloc(ntups * sizeof(AttrDefInfo *));
***************
*** 4747,4752 ****
--- 4758,4765 ----
              tbinfo->attstorage[j] = *(PQgetvalue(res, j, i_attstorage));
              tbinfo->typstorage[j] = *(PQgetvalue(res, j, i_typstorage));
              tbinfo->attisdropped[j] = (PQgetvalue(res, j, i_attisdropped)[0] == 't');
+             tbinfo->attlen[j] = atoi(PQgetvalue(res, j, i_attlen));
+             tbinfo->attalign[j] = *(PQgetvalue(res, j, i_attalign));
              tbinfo->attislocal[j] = (PQgetvalue(res, j, i_attislocal)[0] == 't');
              tbinfo->notnull[j] = (PQgetvalue(res, j, i_attnotnull)[0] == 't');
              tbinfo->attrdefs[j] = NULL; /* fix below */
***************
*** 4760,4765 ****
--- 4773,4793 ----

          PQclear(res);

+
+         /*
+          *    ALTER TABLE DROP COLUMN clears pg_attribute.atttypid, so we
+          *    set the column data type to 'TEXT;  we will later drop the
+          *    column.
+          */
+         if (binary_upgrade)
+         {
+             for (j = 0; j < ntups; j++)
+             {
+                 if (tbinfo->attisdropped[j])
+                     tbinfo->atttypnames[j] = strdup("TEXT");
+             }
+         }
+
          /*
           * Get info about column defaults
           */
***************
*** 9680,9686 ****
          for (j = 0; j < tbinfo->numatts; j++)
          {
              /* Is this one of the table's own attrs, and not dropped ? */
!             if (!tbinfo->inhAttrs[j] && !tbinfo->attisdropped[j])
              {
                  /* Format properly if not first attr */
                  if (actual_atts > 0)
--- 9708,9715 ----
          for (j = 0; j < tbinfo->numatts; j++)
          {
              /* Is this one of the table's own attrs, and not dropped ? */
!             if (!tbinfo->inhAttrs[j] &&
!                 (!tbinfo->attisdropped[j] || binary_upgrade))
              {
                  /* Format properly if not first attr */
                  if (actual_atts > 0)
***************
*** 9786,9791 ****
--- 9815,9867 ----

          appendPQExpBuffer(q, ";\n");

+         /*
+          * For binary-compatible heap files, we create dropped columns
+          * above and drop them here.
+          */
+         if (binary_upgrade)
+         {
+             for (j = 0; j < tbinfo->numatts; j++)
+             {
+                 if (tbinfo->attisdropped[j])
+                 {
+                     appendPQExpBuffer(q, "ALTER TABLE ONLY %s ",
+                                       fmtId(tbinfo->dobj.name));
+                     appendPQExpBuffer(q, "DROP COLUMN %s;\n",
+                                       fmtId(tbinfo->attnames[j]));
+
+                     /*
+                      *    ALTER TABLE DROP COLUMN clears pg_attribute.atttypid,
+                      *    so we have to set pg_attribute.attlen and
+                      *    pg_attribute.attalign values because that is what
+                      *    is used to skip over dropped columns in the heap tuples.
+                      *    We have atttypmod, but it seems impossible to know the
+                      *    correct data type that will yield pg_attribute values
+                      *    that match the old installation.
+                      *    See comment in backend/catalog/heap.c::RemoveAttributeById()
+                      */
+                     appendPQExpBuffer(q, "\n-- For binary upgrade, recreate dropped column's length and
alignment.\n");
+                     appendPQExpBuffer(q, "UPDATE pg_attribute\n"
+                                          "SET attlen = %d, "
+                                          "attalign = '%c'\n"
+                                          "WHERE    attname = '%s'\n"
+                                          "    AND attrelid = \n"
+                                          "    (\n"
+                                          "        SELECT oid\n"
+                                          "        FROM pg_class\n"
+                                          "        WHERE    relnamespace = "
+                                          "(SELECT oid FROM pg_namespace "
+                                          "WHERE nspname = CURRENT_SCHEMA)\n"
+                                          "            AND relname = '%s'\n"
+                                          "    );",
+                                          tbinfo->attlen[j],
+                                          tbinfo->attalign[j],
+                                          tbinfo->attnames[j],
+                                          tbinfo->dobj.name);
+                 }
+             }
+         }
+
          /* Loop dumping statistics and storage statements */
          for (j = 0; j < tbinfo->numatts; j++)
          {
Index: src/bin/pg_dump/pg_dump.h
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.h,v
retrieving revision 1.150
diff -c -c -r1.150 pg_dump.h
*** src/bin/pg_dump/pg_dump.h    2 Feb 2009 19:31:39 -0000    1.150
--- src/bin/pg_dump/pg_dump.h    17 Feb 2009 01:57:10 -0000
***************
*** 245,250 ****
--- 245,252 ----
      char       *attstorage;        /* attribute storage scheme */
      char       *typstorage;        /* type storage scheme */
      bool       *attisdropped;    /* true if attr is dropped; don't dump it */
+     int           *attlen;            /* attribute length, used by binary_upgrade */
+     char       *attalign;        /* attribute align, used by binary_upgrade */
      bool       *attislocal;        /* true if attr has local definition */

      /*
Index: src/bin/pg_dump/pg_dumpall.c
===================================================================
RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dumpall.c,v
retrieving revision 1.113
diff -c -c -r1.113 pg_dumpall.c
*** src/bin/pg_dump/pg_dumpall.c    22 Jan 2009 20:16:08 -0000    1.113
--- src/bin/pg_dump/pg_dumpall.c    17 Feb 2009 01:57:10 -0000
***************
*** 90,97 ****
      const char *std_strings;
      int            c,
                  ret;

!     static struct option long_options[] = {
          {"data-only", no_argument, NULL, 'a'},
          {"clean", no_argument, NULL, 'c'},
          {"inserts", no_argument, NULL, 'd'},
--- 90,99 ----
      const char *std_strings;
      int            c,
                  ret;
+     int            binary_upgrade = 0;

!     struct option long_options[] = {
!         {"binary-upgrade", no_argument, &binary_upgrade, 1},    /* not documented */
          {"data-only", no_argument, NULL, 'a'},
          {"clean", no_argument, NULL, 'c'},
          {"inserts", no_argument, NULL, 'd'},
***************
*** 310,315 ****
--- 312,319 ----
      }

      /* Add long options to the pg_dump argument list */
+     if (binary_upgrade)
+         appendPQExpBuffer(pgdumpopts, " --binary-upgrade");
      if (disable_dollar_quoting)
          appendPQExpBuffer(pgdumpopts, " --disable-dollar-quoting");
      if (disable_triggers)

Re: pg_migrator and handling dropped columns

From
Bruce Momjian
Date:
Applied.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Peter Eisentraut wrote:
> > Tom Lane wrote:
> > >> Is this acceptable to everyone?  We could name the option
> > >> -u/--upgrade-compatible.
> > > 
> > > If the switch is specifically for pg_upgrade support (enabling this as
> > > well as any other hacks we find necessary), which seems like a good
> > > idea, then don't chew up a short option letter for it.  There should be
> > > a long form only.
> > 
> > Note that pg_dump's output is already upgrade compatible.  That's what 
> > pg_dump is often used for after all.  I believe what we are after here 
> > is something like "in-place upgrade compatible" or "upgrade binary 
> > compatible".
> > 
> > > And probably not even list it in the user documentation.
> > 
> > I think we should still list it somewhere and say it is for use by 
> > in-place upgrade utilities.  It will only confuse people if it is not 
> > documented at all.
> 
> OK, I have completed the patch;  attached.
> 
> I ran into a little problem, as documented by this comment in
> catalog/heap.c:
> 
>         /*
>          * Set the type OID to invalid.  A dropped attribute's type link
>          * cannot be relied on (once the attribute is dropped, the type might
>          * be too). Fortunately we do not need the type row --- the only
>          * really essential information is the type's typlen and typalign,
>          * which are preserved in the attribute's attlen and attalign.  We set
>          * atttypid to zero here as a means of catching code that incorrectly
>          * expects it to be valid.
>          */
> 
> Basically, drop column zeros pg_attribute.atttypid, and there doesn't
> seem to be enough information left in pg_attribute to guess the typid
> that, combined with atttypmod, would restore the proper values for
> pg_attribute.atttypid and pg_attribute.attalign.  Therefore, I just
> brute-forced an UPDATE into dump to set the values properly after
> dropping the fake TEXT column.
> 
> I did a minimal documentation addition by adding something to the
> "Notes" section of the manual pages.
> 
> Here is what a dump of a table with dropped columns looks like:
> 
>     --
>     -- Name: test; Type: TABLE; Schema: public; Owner: postgres; Tablespace:
>     --
>     
>     CREATE TABLE test (
>         x integer,
>         "........pg.dropped.2........" TEXT
>     );
>     ALTER TABLE ONLY test DROP COLUMN "........pg.dropped.2........";
>     
>     -- For binary upgrade, recreate dropped column's length and alignment.
>     UPDATE pg_attribute
>     SET attlen = -1, attalign = 'i'
>     WHERE   attname = '........pg.dropped.2........'
>             AND attrelid =
>             (
>                     SELECT oid
>                     FROM pg_class
>                     WHERE   relnamespace = (SELECT oid FROM pg_namespace WHERE nspname = CURRENT_SCHEMA)
>                             AND relname = 'test'
>             );
>     
>     ALTER TABLE public.test OWNER TO postgres;
> 
> -- 
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +

[ text/x-diff is unsupported, treating like TEXT/PLAIN ]

> Index: doc/src/sgml/ref/pg_dump.sgml
> ===================================================================
> RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dump.sgml,v
> retrieving revision 1.109
> diff -c -c -r1.109 pg_dump.sgml
> *** doc/src/sgml/ref/pg_dump.sgml    10 Feb 2009 00:55:21 -0000    1.109
> --- doc/src/sgml/ref/pg_dump.sgml    17 Feb 2009 01:57:10 -0000
> ***************
> *** 827,832 ****
> --- 827,837 ----
>      editing of the dump file might be required.
>     </para>
>   
> +   <para>
> +    <application>pg_dump</application> also supports a
> +    <literal>--binary-upgrade</> option for upgrade utility usage.
> +   </para>
> + 
>    </refsect1>
>   
>    <refsect1 id="pg-dump-examples">
> Index: doc/src/sgml/ref/pg_dumpall.sgml
> ===================================================================
> RCS file: /cvsroot/pgsql/doc/src/sgml/ref/pg_dumpall.sgml,v
> retrieving revision 1.75
> diff -c -c -r1.75 pg_dumpall.sgml
> *** doc/src/sgml/ref/pg_dumpall.sgml    7 Feb 2009 14:31:30 -0000    1.75
> --- doc/src/sgml/ref/pg_dumpall.sgml    17 Feb 2009 01:57:10 -0000
> ***************
> *** 489,494 ****
> --- 489,499 ----
>      locations.
>     </para>
>   
> +   <para>
> +    <application>pg_dump</application> also supports a
> +    <literal>--binary-upgrade</> option for upgrade utility usage.
> +   </para>
> + 
>    </refsect1>
>   
>   
> Index: src/bin/pg_dump/pg_dump.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.c,v
> retrieving revision 1.521
> diff -c -c -r1.521 pg_dump.c
> *** src/bin/pg_dump/pg_dump.c    16 Feb 2009 23:06:55 -0000    1.521
> --- src/bin/pg_dump/pg_dump.c    17 Feb 2009 01:57:10 -0000
> ***************
> *** 99,104 ****
> --- 99,106 ----
>   /* default, if no "inclusion" switches appear, is to dump everything */
>   static bool include_everything = true;
>   
> + static int    binary_upgrade = 0;
> + 
>   char        g_opaque_type[10];    /* name for the opaque type */
>   
>   /* placeholders for the delimiters for comments */
> ***************
> *** 236,242 ****
>       static int  outputNoTablespaces = 0;
>       static int    use_setsessauth = 0;
>   
> !     static struct option long_options[] = {
>           {"data-only", no_argument, NULL, 'a'},
>           {"blobs", no_argument, NULL, 'b'},
>           {"clean", no_argument, NULL, 'c'},
> --- 238,245 ----
>       static int  outputNoTablespaces = 0;
>       static int    use_setsessauth = 0;
>   
> !     struct option long_options[] = {
> !         {"binary-upgrade", no_argument, &binary_upgrade, 1},    /* not documented */
>           {"data-only", no_argument, NULL, 'a'},
>           {"blobs", no_argument, NULL, 'b'},
>           {"clean", no_argument, NULL, 'c'},
> ***************
> *** 4611,4616 ****
> --- 4614,4621 ----
>       int            i_attnotnull;
>       int            i_atthasdef;
>       int            i_attisdropped;
> +     int            i_attlen;
> +     int            i_attalign;
>       int            i_attislocal;
>       PGresult   *res;
>       int            ntups;
> ***************
> *** 4655,4661 ****
>               appendPQExpBuffer(q, "SELECT a.attnum, a.attname, a.atttypmod, "
>                                    "a.attstattarget, a.attstorage, t.typstorage, "
>                                      "a.attnotnull, a.atthasdef, a.attisdropped, "
> !                                  "a.attislocal, "
>                      "pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname "
>                "FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t "
>                                 "ON a.atttypid = t.oid "
> --- 4660,4666 ----
>               appendPQExpBuffer(q, "SELECT a.attnum, a.attname, a.atttypmod, "
>                                    "a.attstattarget, a.attstorage, t.typstorage, "
>                                      "a.attnotnull, a.atthasdef, a.attisdropped, "
> !                                    "a.attlen, a.attalign, a.attislocal, "
>                      "pg_catalog.format_type(t.oid,a.atttypmod) AS atttypname "
>                "FROM pg_catalog.pg_attribute a LEFT JOIN pg_catalog.pg_type t "
>                                 "ON a.atttypid = t.oid "
> ***************
> *** 4674,4680 ****
>               appendPQExpBuffer(q, "SELECT a.attnum, a.attname, "
>                                 "a.atttypmod, -1 AS attstattarget, a.attstorage, "
>                                 "t.typstorage, a.attnotnull, a.atthasdef, "
> !                               "false AS attisdropped, false AS attislocal, "
>                                 "format_type(t.oid,a.atttypmod) AS atttypname "
>                                 "FROM pg_attribute a LEFT JOIN pg_type t "
>                                 "ON a.atttypid = t.oid "
> --- 4679,4686 ----
>               appendPQExpBuffer(q, "SELECT a.attnum, a.attname, "
>                                 "a.atttypmod, -1 AS attstattarget, a.attstorage, "
>                                 "t.typstorage, a.attnotnull, a.atthasdef, "
> !                               "false AS attisdropped, 0 AS attlen, "
> !                               "' ' AS attalign, false AS attislocal, "
>                                 "format_type(t.oid,a.atttypmod) AS atttypname "
>                                 "FROM pg_attribute a LEFT JOIN pg_type t "
>                                 "ON a.atttypid = t.oid "
> ***************
> *** 4690,4696 ****
>                                 "-1 AS attstattarget, attstorage, "
>                                 "attstorage AS typstorage, "
>                                 "attnotnull, atthasdef, false AS attisdropped, "
> !                               "false AS attislocal, "
>                                 "(SELECT typname FROM pg_type WHERE oid = atttypid) AS atttypname "
>                                 "FROM pg_attribute a "
>                                 "WHERE attrelid = '%u'::oid "
> --- 4696,4703 ----
>                                 "-1 AS attstattarget, attstorage, "
>                                 "attstorage AS typstorage, "
>                                 "attnotnull, atthasdef, false AS attisdropped, "
> !                               "0 AS attlen, ' ' AS attalign, "
> !                                "false AS attislocal, "
>                                 "(SELECT typname FROM pg_type WHERE oid = atttypid) AS atttypname "
>                                 "FROM pg_attribute a "
>                                 "WHERE attrelid = '%u'::oid "
> ***************
> *** 4714,4719 ****
> --- 4721,4728 ----
>           i_attnotnull = PQfnumber(res, "attnotnull");
>           i_atthasdef = PQfnumber(res, "atthasdef");
>           i_attisdropped = PQfnumber(res, "attisdropped");
> +         i_attlen = PQfnumber(res, "attlen");
> +         i_attalign = PQfnumber(res, "attalign");
>           i_attislocal = PQfnumber(res, "attislocal");
>   
>           tbinfo->numatts = ntups;
> ***************
> *** 4724,4729 ****
> --- 4733,4740 ----
>           tbinfo->attstorage = (char *) malloc(ntups * sizeof(char));
>           tbinfo->typstorage = (char *) malloc(ntups * sizeof(char));
>           tbinfo->attisdropped = (bool *) malloc(ntups * sizeof(bool));
> +         tbinfo->attlen = (int *) malloc(ntups * sizeof(int));
> +         tbinfo->attalign = (char *) malloc(ntups * sizeof(char));
>           tbinfo->attislocal = (bool *) malloc(ntups * sizeof(bool));
>           tbinfo->notnull = (bool *) malloc(ntups * sizeof(bool));
>           tbinfo->attrdefs = (AttrDefInfo **) malloc(ntups * sizeof(AttrDefInfo *));
> ***************
> *** 4747,4752 ****
> --- 4758,4765 ----
>               tbinfo->attstorage[j] = *(PQgetvalue(res, j, i_attstorage));
>               tbinfo->typstorage[j] = *(PQgetvalue(res, j, i_typstorage));
>               tbinfo->attisdropped[j] = (PQgetvalue(res, j, i_attisdropped)[0] == 't');
> +             tbinfo->attlen[j] = atoi(PQgetvalue(res, j, i_attlen));
> +             tbinfo->attalign[j] = *(PQgetvalue(res, j, i_attalign));
>               tbinfo->attislocal[j] = (PQgetvalue(res, j, i_attislocal)[0] == 't');
>               tbinfo->notnull[j] = (PQgetvalue(res, j, i_attnotnull)[0] == 't');
>               tbinfo->attrdefs[j] = NULL; /* fix below */
> ***************
> *** 4760,4765 ****
> --- 4773,4793 ----
>   
>           PQclear(res);
>   
> + 
> +         /*
> +          *    ALTER TABLE DROP COLUMN clears pg_attribute.atttypid, so we
> +          *    set the column data type to 'TEXT;  we will later drop the
> +          *    column.
> +          */
> +         if (binary_upgrade)
> +         {
> +             for (j = 0; j < ntups; j++)
> +             {
> +                 if (tbinfo->attisdropped[j])
> +                     tbinfo->atttypnames[j] = strdup("TEXT");
> +             }
> +         }
> +             
>           /*
>            * Get info about column defaults
>            */
> ***************
> *** 9680,9686 ****
>           for (j = 0; j < tbinfo->numatts; j++)
>           {
>               /* Is this one of the table's own attrs, and not dropped ? */
> !             if (!tbinfo->inhAttrs[j] && !tbinfo->attisdropped[j])
>               {
>                   /* Format properly if not first attr */
>                   if (actual_atts > 0)
> --- 9708,9715 ----
>           for (j = 0; j < tbinfo->numatts; j++)
>           {
>               /* Is this one of the table's own attrs, and not dropped ? */
> !             if (!tbinfo->inhAttrs[j] &&
> !                 (!tbinfo->attisdropped[j] || binary_upgrade))
>               {
>                   /* Format properly if not first attr */
>                   if (actual_atts > 0)
> ***************
> *** 9786,9791 ****
> --- 9815,9867 ----
>   
>           appendPQExpBuffer(q, ";\n");
>   
> +         /*
> +          * For binary-compatible heap files, we create dropped columns
> +          * above and drop them here.
> +          */
> +         if (binary_upgrade)
> +         {
> +             for (j = 0; j < tbinfo->numatts; j++)
> +             {
> +                 if (tbinfo->attisdropped[j])
> +                 {
> +                     appendPQExpBuffer(q, "ALTER TABLE ONLY %s ",
> +                                       fmtId(tbinfo->dobj.name));
> +                     appendPQExpBuffer(q, "DROP COLUMN %s;\n",
> +                                       fmtId(tbinfo->attnames[j]));
> + 
> +                     /*
> +                      *    ALTER TABLE DROP COLUMN clears pg_attribute.atttypid,
> +                      *    so we have to set pg_attribute.attlen and
> +                      *    pg_attribute.attalign values because that is what
> +                      *    is used to skip over dropped columns in the heap tuples.
> +                      *    We have atttypmod, but it seems impossible to know the
> +                      *    correct data type that will yield pg_attribute values
> +                      *    that match the old installation.
> +                      *    See comment in backend/catalog/heap.c::RemoveAttributeById()
> +                      */
> +                     appendPQExpBuffer(q, "\n-- For binary upgrade, recreate dropped column's length and
alignment.\n");
> +                     appendPQExpBuffer(q, "UPDATE pg_attribute\n"
> +                                          "SET attlen = %d, "
> +                                          "attalign = '%c'\n"
> +                                          "WHERE    attname = '%s'\n"
> +                                          "    AND attrelid = \n"
> +                                          "    (\n"
> +                                          "        SELECT oid\n"
> +                                          "        FROM pg_class\n"
> +                                          "        WHERE    relnamespace = "
> +                                          "(SELECT oid FROM pg_namespace "
> +                                          "WHERE nspname = CURRENT_SCHEMA)\n"
> +                                          "            AND relname = '%s'\n"
> +                                          "    );",
> +                                          tbinfo->attlen[j],
> +                                          tbinfo->attalign[j],
> +                                          tbinfo->attnames[j],
> +                                          tbinfo->dobj.name);
> +                 }
> +             }
> +         }
> +     
>           /* Loop dumping statistics and storage statements */
>           for (j = 0; j < tbinfo->numatts; j++)
>           {
> Index: src/bin/pg_dump/pg_dump.h
> ===================================================================
> RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dump.h,v
> retrieving revision 1.150
> diff -c -c -r1.150 pg_dump.h
> *** src/bin/pg_dump/pg_dump.h    2 Feb 2009 19:31:39 -0000    1.150
> --- src/bin/pg_dump/pg_dump.h    17 Feb 2009 01:57:10 -0000
> ***************
> *** 245,250 ****
> --- 245,252 ----
>       char       *attstorage;        /* attribute storage scheme */
>       char       *typstorage;        /* type storage scheme */
>       bool       *attisdropped;    /* true if attr is dropped; don't dump it */
> +     int           *attlen;            /* attribute length, used by binary_upgrade */
> +     char       *attalign;        /* attribute align, used by binary_upgrade */
>       bool       *attislocal;        /* true if attr has local definition */
>   
>       /*
> Index: src/bin/pg_dump/pg_dumpall.c
> ===================================================================
> RCS file: /cvsroot/pgsql/src/bin/pg_dump/pg_dumpall.c,v
> retrieving revision 1.113
> diff -c -c -r1.113 pg_dumpall.c
> *** src/bin/pg_dump/pg_dumpall.c    22 Jan 2009 20:16:08 -0000    1.113
> --- src/bin/pg_dump/pg_dumpall.c    17 Feb 2009 01:57:10 -0000
> ***************
> *** 90,97 ****
>       const char *std_strings;
>       int            c,
>                   ret;
>   
> !     static struct option long_options[] = {
>           {"data-only", no_argument, NULL, 'a'},
>           {"clean", no_argument, NULL, 'c'},
>           {"inserts", no_argument, NULL, 'd'},
> --- 90,99 ----
>       const char *std_strings;
>       int            c,
>                   ret;
> +     int            binary_upgrade = 0;
>   
> !     struct option long_options[] = {
> !         {"binary-upgrade", no_argument, &binary_upgrade, 1},    /* not documented */
>           {"data-only", no_argument, NULL, 'a'},
>           {"clean", no_argument, NULL, 'c'},
>           {"inserts", no_argument, NULL, 'd'},
> ***************
> *** 310,315 ****
> --- 312,319 ----
>       }
>   
>       /* Add long options to the pg_dump argument list */
> +     if (binary_upgrade)
> +         appendPQExpBuffer(pgdumpopts, " --binary-upgrade");
>       if (disable_dollar_quoting)
>           appendPQExpBuffer(pgdumpopts, " --disable-dollar-quoting");
>       if (disable_triggers)

> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +