Thread: pg_migrator issue with contrib

pg_migrator issue with contrib

From
Brad Nicholson
Date:
I've been kicking the tires on this a bit, and I've found an issue when
dealing with contrib/ (specifically dblink, although I haven't looked
around anymore).

dblink_current_query() is not in the 8.4 version - when I run
pg_migrator on an 8.3 cluster that has dblink installed, I get the
following:                     
Restoring database schema
psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR:  could not find
function "dblink_current_query" in file
"/opt/dbs/pgsql84-beta2/lib/dblink.so"

There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
--dbname template1 >> "/dev/null"

pg_migrator exits leaving me with a corrupted 8.3 instance.

At the very least, a mention in the documentation of incompatible
contrib module(s) would be nice.  Even better would be a sanity check
added to prevent this.
-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.




Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Brad Nicholson wrote:
> I've been kicking the tires on this a bit, and I've found an issue when
> dealing with contrib/ (specifically dblink, although I haven't looked
> around anymore).
> 
> dblink_current_query() is not in the 8.4 version - when I run
> pg_migrator on an 8.3 cluster that has dblink installed, I get the
> following:
>                       
> Restoring database schema
> psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR:  could not find
> function "dblink_current_query" in file
> "/opt/dbs/pgsql84-beta2/lib/dblink.so"
> 
> There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
> ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
> --dbname template1 >> "/dev/null"

Yep, pg_migrator will exit on any restore error.  This really relates to
the problem of how we handle /contrib schema migration from one
release to the other.  Good thing you posted to hackers because that is
really the group that can address this.  pg_migrator is really just
restoring the schema that pg_dump is producing.

> pg_migrator exits leaving me with a corrupted 8.3 instance.

When you say "corrupted", I assume you mean you have remove the _old
suffixes to restart your 8.3 instance, right?  I hope that is the only
corruption issue --- please confirm.

> At the very least, a mention in the documentation of incompatible
> contrib module(s) would be nice.  Even better would be a sanity check
> added to prevent this.

OK, I am looking to the hackers group for recommentations on this.  I
wonder if I should recommend uninstalling /contrib modules before the
upgrade, but that is not possible for custom data types that have
columns already defined in the old cluster.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Andrew Dunstan
Date:

Bruce Momjian wrote:
>> At the very least, a mention in the documentation of incompatible
>> contrib module(s) would be nice.  Even better would be a sanity check
>> added to prevent this.
>>     
>
> OK, I am looking to the hackers group for recommentations on this.  I
> wonder if I should recommend uninstalling /contrib modules before the
> upgrade, but that is not possible for custom data types that have
> columns already defined in the old cluster.
>
>   

There is no nice answer to this. It goes way beyond data types: you 
could be using the module stuff in indexes, functions, views etc. You 
can't just drop the stuff. The best I have been able to do in similar 
cases is to install the updated module in the database before restoring, 
and ignore any restoration errors about "foo already exists" or "foo not 
found in .so file". Not sure how well that translates to pg_migrator, 
though.

cheers

andrew


Re: pg_migrator issue with contrib

From
Brad Nicholson
Date:
On Fri, 2009-06-05 at 15:50 -0400, Bruce Momjian wrote:
> Brad Nicholson wrote:
> > I've been kicking the tires on this a bit, and I've found an issue when
> > dealing with contrib/ (specifically dblink, although I haven't looked
> > around anymore).
> > 
> > dblink_current_query() is not in the 8.4 version - when I run
> > pg_migrator on an 8.3 cluster that has dblink installed, I get the
> > following:
> >                       
> > Restoring database schema
> > psql:/home/postgres/pg_migrator_dump_db.sql:271: ERROR:  could not find
> > function "dblink_current_query" in file
> > "/opt/dbs/pgsql84-beta2/lib/dblink.so"
> > 
> > There were problems executing "/opt/dbs/pgsql84-beta2/bin/psql" --set
> > ON_ERROR_STOP=on --port 5432 -f "/home/postgres/pg_migrator_dump_db.sql"
> > --dbname template1 >> "/dev/null"
> 
> Yep, pg_migrator will exit on any restore error.  This really relates to
> the problem of how we handle /contrib schema migration from one
> release to the other.  Good thing you posted to hackers because that is
> really the group that can address this.  pg_migrator is really just
> restoring the schema that pg_dump is producing.
> 
> > pg_migrator exits leaving me with a corrupted 8.3 instance.
> 
> When you say "corrupted", I assume you mean you have remove the _old
> suffixes to restart your 8.3 instance, right?  I hope that is the only
> corruption issue --- please confirm.

Unfortunately no - when I try and start the old version, I get:

pg_ctl -D pgsql83/ start
postgres: could not find the database system
Expected to find it in the directory "/opt/DATA/pgsql83",
but could not open file "/opt/DATA/pgsql83/global/pg_control": No such
file or directory

I am specifying both new and old paths to pg_migrator, as I don't have
my data directories in standard locations.

Test case is pretty straight forward.

Create an 8.3 instance
Add dblink
follow through the pg_migrator upgrade path
try and start your 8.3 instance after the failure

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.




Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Brad Nicholson wrote:
>> At the very least, a mention in the documentation of incompatible
>> contrib module(s) would be nice.  Even better would be a sanity check
>> added to prevent this.

> OK, I am looking to the hackers group for recommentations on this.

One thing I was going to suggest is that pg_migrator check that all the
.so's in the old installation also exist in the new one.  However that
could be overkill since you don't know for sure if the old .so is
referenced anywhere in the database.

As for the specific problem at hand, it might've been a mistake to
replace dblink_current_query() with a SQL function instead of changing
the internal implementation of the C function.  We could still fix that.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Brad Nicholson wrote:
> > When you say "corrupted", I assume you mean you have remove the _old
> > suffixes to restart your 8.3 instance, right?  I hope that is the only
> > corruption issue --- please confirm.
> 
> Unfortunately no - when I try and start the old version, I get:
> 
> pg_ctl -D pgsql83/ start
> postgres: could not find the database system
> Expected to find it in the directory "/opt/DATA/pgsql83",
> but could not open file "/opt/DATA/pgsql83/global/pg_control": No such
> file or directory
> 
> I am specifying both new and old paths to pg_migrator, as I don't have
> my data directories in standard locations.
> 
> Test case is pretty straight forward.
> 
> Create an 8.3 instance
> Add dblink
> follow through the pg_migrator upgrade path
> try and start your 8.3 instance after the failure

Uh, I assume you read all of the INSTALL instructions, including the
last item:
10.  Reverting to old cluster
...

-->    If you ran pg_migrator _without_ --link or did not start the new server,the old cluster was not modified except
thatan ".old" suffix wasappended to $PGDATA/global/pg_control and tablespaces directories.  Toreuse the old cluster,
removethe tablespace directories created by thenew cluster and remove the ".old" suffix from the old cluster
tablespacedirectorynames and $PGDATA/global/pg_control; then you can restart theold cluster.
 

I just modified the arrow text to be clearer.  The rename of
pg_controldata is done to prevent accidental starting of the old cluster
in case of migration success.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> 
> 
> Bruce Momjian wrote:
> >> At the very least, a mention in the documentation of incompatible
> >> contrib module(s) would be nice.  Even better would be a sanity check
> >> added to prevent this.
> >>     
> >
> > OK, I am looking to the hackers group for recommentations on this.  I
> > wonder if I should recommend uninstalling /contrib modules before the
> > upgrade, but that is not possible for custom data types that have
> > columns already defined in the old cluster.
> >
> >   
> 
> There is no nice answer to this. It goes way beyond data types: you 
> could be using the module stuff in indexes, functions, views etc. You 
> can't just drop the stuff. The best I have been able to do in similar 
> cases is to install the updated module in the database before restoring, 
> and ignore any restoration errors about "foo already exists" or "foo not 
> found in .so file". Not sure how well that translates to pg_migrator, 
> though.

I suspected this answer but figured I would get a definative answer
rather than guessing.

Based on the way pg_migrator works I am going to suggest that if
/contrib restore generates an error that they uninstall the /contrib
from the old cluster and retry.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> > There is no nice answer to this. It goes way beyond data types: you 
> > could be using the module stuff in indexes, functions, views etc. You 
> > can't just drop the stuff. The best I have been able to do in similar 
> > cases is to install the updated module in the database before restoring, 
> > and ignore any restoration errors about "foo already exists" or "foo not 
> > found in .so file". Not sure how well that translates to pg_migrator, 
> > though.
> 
> I suspected this answer but figured I would get a definative answer
> rather than guessing.
> 
> Based on the way pg_migrator works I am going to suggest that if
> /contrib restore generates an error that they uninstall the /contrib
> from the old cluster and retry.

I have added the following paragraph to the pg_migrator INSTALL docs:
If an error occurs while restoring the database schema, pg_migrator willexit and you will have to revert to the old
clusteras outlined in step#10 below.  To try pg_migrator again, you will need to modify the oldcluster so the
pg_migratorschema restore succeeds.  If the problem is a/contrib module, you might need to uninstall the /contrib
modulefromthe old cluster and install it in the new cluster after migration.
 

The only other idea I have would be to add a switch that allows
schema restore errors to be ignored, but I would like to see if the
above text is enough first.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Brad Nicholson wrote:
> >> At the very least, a mention in the documentation of incompatible
> >> contrib module(s) would be nice.  Even better would be a sanity check
> >> added to prevent this.
> 
> > OK, I am looking to the hackers group for recommentations on this.
> 
> One thing I was going to suggest is that pg_migrator check that all the
> .so's in the old installation also exist in the new one.  However that
> could be overkill since you don't know for sure if the old .so is
> referenced anywhere in the database.

Or in this case that the new *.so has the same functions.

> As for the specific problem at hand, it might've been a mistake to
> replace dblink_current_query() with a SQL function instead of changing
> the internal implementation of the C function.  We could still fix that.

I am afraid /contrib is going to be a mine field for this type of
problem so I am going to recommend uninstaling the /contrib module if
possible and retry the migration.  That should work in this case.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Dimitri Fontaine
Date:
Hi,

Ah, we need module/extension/package/plugin so badly...

Le 5 juin 09 à 22:19, Bruce Momjian a écrit :
> I am afraid /contrib is going to be a mine field for this type of
> problem so I am going to recommend uninstaling the /contrib module if
> possible and retry the migration.  That should work in this case.

You can't seriously recommend that, I'm afraid.
As Andrew (Dunstan) was saying up-thread, the faulty module (from
contrib or pgfoundry) could hold some indexes (btree, gist, gin) and/
or data types in the cluster relations.

So if you uninstall the module (drop type cascade, drop operator
class, ...) you lose  data.

Some example modules that I can think of and are wildspread in the
field, as far as I know, are ip4r (data type and indexes), orafce
(functions, views, tables), and some less spread are prefix (data type
and indexes) or temporal (period data type, indexes).

Please do not recommend people to lose their precious data to be able
to upgrade. You could tell them pg_migrator isn't an option in their
case, though. At least we're left with a faster (multi-threaded)
pg_restore :)

Regards,
--
dim

Re: pg_migrator issue with contrib

From
Josh Berkus
Date:
Bruce,

Assuming a contrib module *hasn't* changed its API, does pg_migrator 
link against the 8.4 version of the module's .so, or the old 8.3 version?

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: pg_migrator issue with contrib

From
David Blewett
Date:
On Fri, Jun 5, 2009 at 6:11 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote:
> Some example modules that I can think of and are wildspread in the field, as
> far as I know, are ip4r (data type and indexes), orafce (functions, views,
> tables), and some less spread are prefix (data type and indexes) or temporal
> (period data type, indexes).

And hstore...

David Blewett


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Josh Berkus wrote:
> Bruce,
> 
> Assuming a contrib module *hasn't* changed its API, does pg_migrator 
> link against the 8.4 version of the module's .so, or the old 8.3 version?

8.4 version, or whatever is in the 8.4 lib, which should be 8.4.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> Hi,
> 
> Ah, we need module/extension/package/plugin so badly...
> 
> Le 5 juin 09 ? 22:19, Bruce Momjian a ?crit :
> > I am afraid /contrib is going to be a mine field for this type of
> > problem so I am going to recommend uninstaling the /contrib module if
> > possible and retry the migration.  That should work in this case.
> 
> You can't seriously recommend that, I'm afraid.
> As Andrew (Dunstan) was saying up-thread, the faulty module (from  
> contrib or pgfoundry) could hold some indexes (btree, gist, gin) and/ 
> or data types in the cluster relations.
> 
> So if you uninstall the module (drop type cascade, drop operator  
> class, ...) you lose  data.
> 
> Some example modules that I can think of and are wildspread in the  
> field, as far as I know, are ip4r (data type and indexes), orafce  
> (functions, views, tables), and some less spread are prefix (data type  
> and indexes) or temporal (period data type, indexes).
> 
> Please do not recommend people to lose their precious data to be able  
> to upgrade. You could tell them pg_migrator isn't an option in their  
> case, though. At least we're left with a faster (multi-threaded)  
> pg_restore :)

Very good point, and something I had not considered.  I tried
uninstalling hstore and it gladly dropped any hstore columns!

I have updated the INSTALL instructions:
If an error occurs while restoring the database schema, pg_migrator willexit and you will have to revert to the old
clusteras outlined in step#10 below.  To try pg_migrator again, you will need to modify the oldcluster so the
pg_migratorschema restore succeeds.  If the problem is a/contrib module, you might need to uninstall the /contrib
modulefromthe old cluster and install it in the new cluster after the migration,assuming the module is not being used
tostore user data.
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Josh Berkus
Date:
On 6/5/09 6:27 PM, Bruce Momjian wrote:
> Josh Berkus wrote:
>> Bruce,
>>
>> Assuming a contrib module *hasn't* changed its API, does pg_migrator
>> link against the 8.4 version of the module's .so, or the old 8.3 version?
>
> 8.4 version, or whatever is in the 8.4 lib, which should be 8.4.

So, here's what we need for 8.3 --> 8.4 for contrib modules:

1) make a list of contrib modules which do not convert cleanly (testers?)
2) document these.
3) give pg_migrator some hackish way to install 8.4 contrib modules from 
source before copying over the database.
4) set pg_migrator to ignore duplicate object warnings if it does the above.

Note that we expect NOT to have this issue for 8.4-->8.5, since we'll 
have a full module infrastructure by then.  Really!

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> So, here's what we need for 8.3 --> 8.4 for contrib modules:

> 1) make a list of contrib modules which do not convert cleanly (testers?)
> 2) document these.
> 3) give pg_migrator some hackish way to install 8.4 contrib modules from 
> source before copying over the database.
> 4) set pg_migrator to ignore duplicate object warnings if it does the above.

1A) consider whether we can reduce/eliminate the conversion problems.
We have five days.

I don't think we need testing, per se.  The first step should be to diff
the 8.3 and 8.4 versions of the various contrib .sql.in files and
determine what changed.  Any module whose .sql.in file hasn't changed
is definitely safe.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Grzegorz Jaskiewicz
Date:
On 6 Jun 2009, at 19:50, Tom Lane wrote:
> We have five days.
>
> I don't think we need testing, per se.  The first step should be to  
> diff
> the 8.3 and 8.4 versions of the various contrib .sql.in files and
> determine what changed.  Any module whose .sql.in file hasn't changed
> is definitely safe.

I can tell you already that dblink has changed, and I had to drop it  
before migration, otherwise everything went fine. Migration of 57GB  
data in place took about 1 minute.




Re: pg_migrator issue with contrib

From
Dimitri Fontaine
Date:
Le 6 juin 09 à 20:45, Josh Berkus a écrit :
> So, here's what we need for 8.3 --> 8.4 for contrib modules:

That does nothing for external modules whose code isn't in PostgreSQL
control. I'm thinking of those examples I cited up-thread --- and some
more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

Could pg_migrator detect usage of "objects" oids (data types in
relation, index opclass, ...) that are unknown to be in the standard -
core + contrib distribution, and quit trying to upgrade the cluster in
this case, telling the user his database is not supported?

> Note that we expect NOT to have this issue for 8.4-->8.5, since
> we'll have a full module infrastructure by then.  Really!

Note: added in-place upgrade to requirements of the first version of
the feature.
--
dim



Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> Le 6 juin 09 ? 20:45, Josh Berkus a ?crit :
> > So, here's what we need for 8.3 --> 8.4 for contrib modules:
> 
> That does nothing for external modules whose code isn't in PostgreSQL  
> control. I'm thinking of those examples I cited up-thread --- and some  
> more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

Agreed, that's why the new INSTALL text addresses this clearly.  You
might want to read my blog entry about why the INSTALL file is so
important for pg_migrator:
http://momjian.us/main/blogs/pgblog.html#June_6_2009

> Could pg_migrator detect usage of "objects" oids (data types in  
> relation, index opclass, ...) that are unknown to be in the standard - 
> core + contrib distribution, and quit trying to upgrade the cluster in  
> this case, telling the user his database is not supported?

Well, they will get an error and see the INSTALL file --- I don't see
having pg_migrator go around looking for things as a fruitful effort.

> > Note that we expect NOT to have this issue for 8.4-->8.5, since  
> > we'll have a full module infrastructure by then.  Really!
> 
> Note: added in-place upgrade to requirements of the first version of  
> the feature.

Yes, this will certainly spur development of a better /contrib install
system.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Grzegorz Jaskiewicz wrote:
> 
> On 6 Jun 2009, at 19:50, Tom Lane wrote:
> > We have five days.
> >
> > I don't think we need testing, per se.  The first step should be to  
> > diff
> > the 8.3 and 8.4 versions of the various contrib .sql.in files and
> > determine what changed.  Any module whose .sql.in file hasn't changed
> > is definitely safe.
> 
> I can tell you already that dblink has changed, and I had to drop it  
> before migration, otherwise everything went fine. Migration of 57GB  
> data in place took about 1 minute.

The good news is that the INSTALL instructions where clear enough for
the tester to understand that uninstalling dblink from the old cluster
and reinstalling it in the new cluster would work.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > So, here's what we need for 8.3 --> 8.4 for contrib modules:
> 
> > 1) make a list of contrib modules which do not convert cleanly (testers?)
> > 2) document these.
> > 3) give pg_migrator some hackish way to install 8.4 contrib modules from 
> > source before copying over the database.
> > 4) set pg_migrator to ignore duplicate object warnings if it does the above.
> 
> 1A) consider whether we can reduce/eliminate the conversion problems.
> We have five days.
> 
> I don't think we need testing, per se.  The first step should be to diff
> the 8.3 and 8.4 versions of the various contrib .sql.in files and
> determine what changed.  Any module whose .sql.in file hasn't changed
> is definitely safe.

I am a little hesistant to do one-off adjustments for /contrib changes
between versions.  The INSTALL instructions are pretty clear of how to
fix problems like dblink.

A more complex case would be if hstore couldn't be migrated, and there
is no way to drop that type and reinstall on the new system without
dropping your data too, but again, I don't see how we have any way to
fix that for 8.4 anyway.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Josh Berkus wrote:
> On 6/5/09 6:27 PM, Bruce Momjian wrote:
> > Josh Berkus wrote:
> >> Bruce,
> >>
> >> Assuming a contrib module *hasn't* changed its API, does pg_migrator
> >> link against the 8.4 version of the module's .so, or the old 8.3 version?
> >
> > 8.4 version, or whatever is in the 8.4 lib, which should be 8.4.
> 
> So, here's what we need for 8.3 --> 8.4 for contrib modules:
> 
> 1) make a list of contrib modules which do not convert cleanly (testers?)
> 2) document these.
> 3) give pg_migrator some hackish way to install 8.4 contrib modules from 
> source before copying over the database.
> 4) set pg_migrator to ignore duplicate object warnings if it does the above.
> 
> Note that we expect NOT to have this issue for 8.4-->8.5, since we'll 
> have a full module infrastructure by then.  Really!

I think the cleanest solution is to document that these issues might
happen and suggest solutions.  This has already worked for our tester
today so I am hopeful people will just figure out how to fix this.

I have added dblink as a specific migration example to the INSTALL file:

---------------------------------------------------------------------------

If an error occurs while restoring the database schema, pg_migrator will
exit and you will have to revert to the old cluster as outlined in step
#10 below.  To try pg_migrator again, you will need to modify the old
cluster so the pg_migrator schema restore succeeds.  If the problem is a
/contrib module, you might need to uninstall the /contrib module from
the old cluster and install it in the new cluster after the migration,
assuming the module is not being used to store user data.

For example, /contrib/dblink changed its API from Postgres 8.3 to 8.4 so
sites using it must uninstall dblink from their Postgres 8.3, cluster,
do the migration, then install dblink in Postgres 8.4.  More complex
cases might require individual objects to be migrated manually.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> I think the cleanest solution is to document that these issues might
> happen and suggest solutions.

No, the cleanest solution is to fix it before people ever see a
problem.  This is trivial to do in the case of dblink and I don't
see why you think it would be better to make users work around it.

Also, dblink is one of the easiest cases because (a) it doesn't have
anything but functions, and thus it's possible to drop it from the
old DB without data loss, and (b) the inconsistency that it has is
something that will result in a clean, readily understandable failure
during migration.  As opposed to some other cases that will migrate
just fine and then dump core during use.

I've just finished running through a diff of 8.3 vs 8.4 contrib
SQL files.  It looks like we have these cases:

* Quite a few modules have differences in the signatures of GIST and
GIN opclass support functions.  As was discussed earlier, we don't
really have to fix those, because the SQL-level signatures don't
affect anything at runtime.  Most of these modules also have RECHECK
flags that were removed, but we have a solution for that too.

* dblink has an entirely unnecessary change in the definition of
dblink_current_query(), which will cause it to fail to migrate.

* fuzzystrmatch has added a new levenshtein() function.  The worst
case if this is migrated without any extra effort is the user
doesn't have access to that function.  But it's a functions-only
module, so the user could fix it by just rerunning the SQL file
afterwards.

* intarray has removed its internal @> and <@ operators.  As I was
mentioning the other day, it might be the best thing to just revert
that change anyway, until we can get a better handle on the behavior
of GIN for empty arrays.

* isn has got the nastiest problems of the lot: since it piggybacks
on type bigint, a migrated database might work, or might crash
miserably, depending on whether bigint has become pass-by-value
in the new database.  I'm not very sure if we can fix this reasonably.

* pageinspect has changed the ABI of get_raw_page() in a way that will
likely make it dump core if the function definition is migrated from
an old DB.  This needs to be fixed.  It's also added a new function
fsm_page_contents(bytea), which is no big problem, since the SQL
file can be re-executed without harm.

* pg_buffercache has changed the view pg_buffercache, which is
definitely going to be a migration issue.  Need to test whether
it represents a crash risk if the old definition is migrated.

* pg_freespacemap has made *major* changes in its ABI.  There's
probably no hope of this working either, but we need to be sure
it's not a crash risk.

* pgstattuple has made changes in the output types of its functions.
This is a serious crash risk, and I'm not immediately sure how to
fix it.  Note that simply migrating the module will not draw any
errors.

* seg has added a new function seg_center(seg).  This can be fixed
by reloading the SQL file, but not by dropping the module beforehand
since that would risk data loss.

* tsearch2 has opclass support function changes, but unlike other
cases of this, it will fail to migrate to 8.4 because the functions
are references to core functions instead of being declared in the
module.  Also, "drop it first" isn't a very useful recommendation
since the domains it defines might be used somewhere.

So we've definitely got some work here.  If we do nothing, several
of these issues are going to end up with CVE numbers.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Grzegorz Jaskiewicz
Date:
On 7 Jun 2009, at 03:27, Bruce Momjian wrote:

> Grzegorz Jaskiewicz wrote:
>>
>> On 6 Jun 2009, at 19:50, Tom Lane wrote:
>>> We have five days.
>>>
>>> I don't think we need testing, per se.  The first step should be to
>>> diff
>>> the 8.3 and 8.4 versions of the various contrib .sql.in files and
>>> determine what changed.  Any module whose .sql.in file hasn't  
>>> changed
>>> is definitely safe.
>>
>> I can tell you already that dblink has changed, and I had to drop it
>> before migration, otherwise everything went fine. Migration of 57GB
>> data in place took about 1 minute.
>
> The good news is that the INSTALL instructions where clear enough for
> the tester to understand that uninstalling dblink from the old cluster
> and reinstalling it in the new cluster would work.
Yes, but I forgot about one database, and had to do it all over again,  
luckily I first tested it without --link...



Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Sun, Jun 7, 2009 at 12:11 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> * intarray has removed its internal @> and <@ operators.  As I was
> mentioning the other day, it might be the best thing to just revert
> that change anyway, until we can get a better handle on the behavior
> of GIN for empty arrays.

+1.

> So we've definitely got some work here.  If we do nothing, several
> of these issues are going to end up with CVE numbers.

I think it's becoming increasingly clear that pg_migrator is not for
the faint of heart; in fact, I think we should hedge references to it
with words like "experimental".  If we set an expectation that this
tool has the same level of maturity and reliability that people
associate with PostgreSQL in general, we are going to have a lot of
disappointed users.  Just to recall the history, the first pgfoundry
commit messages for this tool were on February 9th, three months after
the start of the final CommitFest and feature freeze for 8.4.  Since
then, development has proceeded at an amazingly rapid pace, but
there's only so much you can do in four months, especially when it's
too late to rearchitect previously-committed 8.4 features to play more
nicely with the new feature being added.

It seems to me that if we keep plugging at this problem, 8.4->8.5
migration has the potential to be considerably smoother than 8.3 to
8.4 migration (and, yes, a good module facility will help), but it
wouldn't be surprising to me if we're well into 9.x territory before
we really get all of the issues hammered out.  I don't think even a
major feature like Hot Standby has the far-reaching implications on
how the system needs to be designed that upgrade-in-place does.

So while I agree with Tom that we should fix as many of these issues
as we reasonably well can for 8.4, I also agree with Bruce that a lot
of this comes down to setting appropriate expectations for our users:
this is a potentially useful tool, especially for users with huge
databases, but it's new, and it's not perfect, and use with caution.

...Robert


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I think it's becoming increasingly clear that pg_migrator is not for
> the faint of heart; in fact, I think we should hedge references to it
> with words like "experimental".

Probably ...

> Just to recall the history, the first pgfoundry
> commit messages for this tool were on February 9th, three months after
> the start of the final CommitFest and feature freeze for 8.4.  Since
> then, development has proceeded at an amazingly rapid pace, but
> there's only so much you can do in four months,

... but the above is a *complete* misrepresentation of the thing's
history (apparently you failed to look in the Attic?).  EDB have been
using previous versions of this tool for some years, and the basic
premise is the same as contrib/pg_upgrade that existed as far back as
7.1/7.2 timeframe.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Jun 7, 2009, at 10:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> I think it's becoming increasingly clear that pg_migrator is not for
>> the faint of heart; in fact, I think we should hedge references to it
>> with words like "experimental".
>
> Probably ...
>
>> Just to recall the history, the first pgfoundry
>> commit messages for this tool were on February 9th, three months  
>> after
>> the start of the final CommitFest and feature freeze for 8.4.  Since
>> then, development has proceeded at an amazingly rapid pace, but
>> there's only so much you can do in four months,
>
> ... but the above is a *complete* misrepresentation of the thing's
> history (apparently you failed to look in the Attic?).  EDB have been
> using previous versions of this tool for some years, and the basic
> premise is the same as contrib/pg_upgrade that existed as far back as
> 7.1/7.2 timeframe.

I did know that EDB had been using the tool for a while, but I admit  
I'm not familiar with the whole history.  I had the impression that  
we'd gotten a lot more serious about really making this rock solid  
since Bruce took it in February.  But maybe that's not the case?

...Robert


Re: pg_migrator issue with contrib

From
Stefan Kaltenbrunner
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think it's becoming increasingly clear that pg_migrator is not for
>> the faint of heart; in fact, I think we should hedge references to it
>> with words like "experimental".
> 
> Probably ...

I'm with Robert on that one - while pg_migrator is extremely inmportant 
for us to go forward. I really think we need to tag it "experimental" 
for this release at least. pg_migrator is complex and we are still 
discovering new issues every day I don't think rushing it as _THE_ 
solution will do any good for our reputation. A lot of our code(as 
software in general) took years to mature and pg_migrator is likely not 
an exception.

> 
>> Just to recall the history, the first pgfoundry
>> commit messages for this tool were on February 9th, three months after
>> the start of the final CommitFest and feature freeze for 8.4.  Since
>> then, development has proceeded at an amazingly rapid pace, but
>> there's only so much you can do in four months,
> 
> ... but the above is a *complete* misrepresentation of the thing's
> history (apparently you failed to look in the Attic?).  EDB have been
> using previous versions of this tool for some years, and the basic
> premise is the same as contrib/pg_upgrade that existed as far back as
> 7.1/7.2 timeframe.

well - how much field exposure has pg_migrator/edb_migrator seen 
actually? given the large number of "breaks your database" bugs and 
issues that got found during the last few weeks I have a hard time 
imaging that edb really used the current code for their customers.


Stefan


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> I did know that EDB had been using the tool for a while, but I admit  
> I'm not familiar with the whole history.  I had the impression that  
> we'd gotten a lot more serious about really making this rock solid  
> since Bruce took it in February.  But maybe that's not the case?

I don't actually know the EDB end of the history either; maybe someone
can educate us about that.  But it's true that the core developers,
at least, weren't taking it seriously until this year.  That's because
it really can only handle catalog changes, not changes to the contents
of user tables; and it's been quite a long time since we've had a
release where we didn't change tuple header layout or packing rules or
something that made it a nonstarter.  It wasn't clear till early this
year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
being useful in production ... so we got serious about it.

(I do not know whether EDB ever really used it in production.  If they
did, it must have been for private updates that changed catalogs and
not user data.)
        regards, tom lane


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
I wrote:
> * pageinspect has changed the ABI of get_raw_page() in a way that will
> likely make it dump core if the function definition is migrated from
> an old DB.  This needs to be fixed.
> [ and similarly for some other contrib modules ]

After thinking about this some more, I think that there is a fairly
simple coding rule we can adopt to prevent post-migration crashes
of the sort I'm worrying about above.  That is:

* If you change the ABI of a C-language function, change its C name.

This will ensure that if someone tries to migrate the old function
definition from an old database, they will get a pg_migrator failure,
or at worst a clean runtime failure when they attempt to use the old
definition.  They won't get a core dump or some worse form of security
problem.

As an example, the problem in pageinspect is this diff:

***************
*** 6,16 **** -- -- get_raw_page() --
! CREATE OR REPLACE FUNCTION get_raw_page(text, int4) RETURNS bytea AS 'MODULE_PATHNAME', 'get_raw_page' LANGUAGE C
STRICT; -- -- page_header() --
 
--- 6,21 ---- -- -- get_raw_page() --
! CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4) RETURNS bytea AS 'MODULE_PATHNAME', 'get_raw_page' LANGUAGE
CSTRICT; 
 
+ CREATE OR REPLACE FUNCTION get_raw_page(text, int4) 
+ RETURNS bytea
+ AS $$ SELECT get_raw_page($1, 'main', $2); $$
+ LANGUAGE SQL STRICT;
+  -- -- page_header() --
***************

The underlying C-level get_raw_page function is still there, but
it now expects three arguments not two, and will crash if it's
passed an int4 where it's expecting a text argument.  But the old
function definition will migrate without error --- there's no way
for pg_migrator to realize it's installing a security hazard.

The way we should have done this, which I intend to go change it to,
is something like

CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page'
LANGUAGE C STRICT;

CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
RETURNS bytea
AS 'MODULE_PATHNAME', 'get_raw_page_3'
LANGUAGE C STRICT;

so that the old function's ABI is preserved.  Migration of the old
contrib module will then lead to the 3-argument function not being
immediately available, but the 2-argument one still works.  Had we not
wanted to keep the 2-argument form for some reason, we would have
provided only get_raw_page_3 in the .so file, and attempts to use the
old function definition would fail safely.

(We have actually seen similar problems before with people trying
to dump and reload database containing contrib modules.  pg_migrator
is not creating a problem that wasn't there before, it's just making
it worse.)

Comments, better ideas?
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Grzegorz Jaskiewicz wrote:
> 
> On 7 Jun 2009, at 03:27, Bruce Momjian wrote:
> 
> > Grzegorz Jaskiewicz wrote:
> >>
> >> On 6 Jun 2009, at 19:50, Tom Lane wrote:
> >>> We have five days.
> >>>
> >>> I don't think we need testing, per se.  The first step should be to
> >>> diff
> >>> the 8.3 and 8.4 versions of the various contrib .sql.in files and
> >>> determine what changed.  Any module whose .sql.in file hasn't  
> >>> changed
> >>> is definitely safe.
> >>
> >> I can tell you already that dblink has changed, and I had to drop it
> >> before migration, otherwise everything went fine. Migration of 57GB
> >> data in place took about 1 minute.
> >
> > The good news is that the INSTALL instructions where clear enough for
> > the tester to understand that uninstalling dblink from the old cluster
> > and reinstalling it in the new cluster would work.
> Yes, but I forgot about one database, and had to do it all over again,  
> luckily I first tested it without --link...

Any failure would have allowed you to revert to the old server.  It is
only when you _start_ the new server that the old server cannot be used;
I have clarified the INSTALL file on that point.
If you use linking, the migration will be much faster (no data copying),but you will no longer be able to access your
oldcluster once you startthe new cluster after the upgrade. 
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Jun 7, 2009, at 12:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Robert Haas <robertmhaas@gmail.com> writes:
>> I did know that EDB had been using the tool for a while, but I admit
>> I'm not familiar with the whole history.  I had the impression that
>> we'd gotten a lot more serious about really making this rock solid
>> since Bruce took it in February.  But maybe that's not the case?
>
> I don't actually know the EDB end of the history either; maybe someone
> can educate us about that.  But it's true that the core developers,
> at least, weren't taking it seriously until this year.  That's because
> it really can only handle catalog changes, not changes to the contents
> of user tables; and it's been quite a long time since we've had a
> release where we didn't change tuple header layout or packing rules or
> something that made it a nonstarter.  It wasn't clear till early this
> year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
> being useful in production ... so we got serious about it.

OK, that's more or less what I thought, and what I intended to convey  
upthread.  As far as core Postgres is concerned this is a new feature,  
and we haven't worked out all the kinks yet.  As long as we set  
expectations accordingly, I think that's OK.  You mention CVEs for  
these contrib issues, but will CVEs still be issued if we make clear  
that this is experimental only?  I would hope not, since that would  
amount to a policy that any half-baked code anywhere on pgfoundry is  
just as much our responsibility as core Postgres.  Surely we're  
allowed to say "good progress has been made here, but we harbor no  
illusions that it's bullet-proof."

...Robert


Re: pg_migrator issue with contrib

From
Josh Berkus
Date:
On 6/7/09 10:56 AM, Robert Haas wrote:
> OK, that's more or less what I thought, and what I intended to convey
> upthread.  As far as core Postgres is concerned this is a new feature,
> and we haven't worked out all the kinks yet.

Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it. 
AFAIC, until we have these sorts of issues worked out, it's still a beta.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


Re: pg_migrator issue with contrib

From
Stefan Kaltenbrunner
Date:
Josh Berkus wrote:
> On 6/7/09 10:56 AM, Robert Haas wrote:
>> OK, that's more or less what I thought, and what I intended to convey
>> upthread.  As far as core Postgres is concerned this is a new feature,
>> and we haven't worked out all the kinks yet.
> 
> Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it. 
> AFAIC, until we have these sorts of issues worked out, it's still a beta.

afaiks bruce stated he is going to remove the BETA tag from pg_migrator 
soon so I guess calling it beta in the main project docs will confuse 
the hell out of people(or causing them to think that it is not beta any 
more).
So maybe calling it experimental(from the POV of the main project) or 
something similiar might still be the better solution.


Stefan


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I think the cleanest solution is to document that these issues might
> > happen and suggest solutions.
> 
> No, the cleanest solution is to fix it before people ever see a
> problem.  This is trivial to do in the case of dblink and I don't
> see why you think it would be better to make users work around it.
> 
> Also, dblink is one of the easiest cases because (a) it doesn't have
> anything but functions, and thus it's possible to drop it from the
> old DB without data loss, and (b) the inconsistency that it has is
> something that will result in a clean, readily understandable failure
> during migration.  As opposed to some other cases that will migrate
> just fine and then dump core during use.
> 
> I've just finished running through a diff of 8.3 vs 8.4 contrib
> SQL files.  It looks like we have these cases:

[ list removed]

Certainly if you can fix /contrib problems at the source, please do so. 
I was commenting on the idea of having pg_migrator somehow skip specific
items to try to make it more failure-proof.  While I can do that for a
few cases, such as suppress the creation of specific functions by
filtering the schema dump file,  I will never be able to get them all,
and doing it extensively could destabilize pg_migrator.

You are suggesting improving /contrib itself, which is better.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> The underlying C-level get_raw_page function is still there, but
> it now expects three arguments not two, and will crash if it's
> passed an int4 where it's expecting a text argument.  But the old
> function definition will migrate without error --- there's no way
> for pg_migrator to realize it's installing a security hazard.

FYI, there is nothing pg_migrator specific here.  Someone doing a
dump/reload from 8.3 to 8.4 would have the same security issue. 
pg_migrator is using the same pg_dump output as a dump restore, except
it uses --schema.  pg_migrator would actually be more secure because it
will exit on the restore error rather than having the error possibly
ignored by the user.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Robert Haas wrote:
> OK, that's more or less what I thought, and what I intended to convey  
> upthread.  As far as core Postgres is concerned this is a new feature,  
> and we haven't worked out all the kinks yet.  As long as we set  
> expectations accordingly, I think that's OK.  You mention CVEs for  
> these contrib issues, but will CVEs still be issued if we make clear  
> that this is experimental only?  I would hope not, since that would  
> amount to a policy that any half-baked code anywhere on pgfoundry is  
> just as much our responsibility as core Postgres.  Surely we're  
> allowed to say "good progress has been made here, but we harbor no  
> illusions that it's bullet-proof."

Again, there is nothing pg_migrator-specific about these security
issues.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Stefan Kaltenbrunner wrote:
> Josh Berkus wrote:
> > On 6/7/09 10:56 AM, Robert Haas wrote:
> >> OK, that's more or less what I thought, and what I intended to convey
> >> upthread.  As far as core Postgres is concerned this is a new feature,
> >> and we haven't worked out all the kinks yet.
> > 
> > Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it. 
> > AFAIC, until we have these sorts of issues worked out, it's still a beta.
> 
> afaiks bruce stated he is going to remove the BETA tag from pg_migrator 
> soon so I guess calling it beta in the main project docs will confuse 
> the hell out of people(or causing them to think that it is not beta any 
> more).
> So maybe calling it experimental(from the POV of the main project) or 
> something similar might still be the better solution.

This all sounds very discouraging.  It is like, "Oh, my, there is a
migration tool and it might have bugs.  How do we prevent people from
using it?"

Right now nothing in the project is referring to pg_migrator except in
the press release, and it is marked as beta there.  How do you want to
deemphasize it more than that?  Why did I bother working on this if the
community reaction is to try to figure out how to make people avoid
using it?

I am now thinking I need to my own PR for pg_migrator because obviously
the community is only worried it might have a bug.  Instead of testing
it, looking at the code, submitting bug reports, or anything
constructive, you sit around figuring out how to put a disparaging label
on it!

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > I did know that EDB had been using the tool for a while, but I admit  
> > I'm not familiar with the whole history.  I had the impression that  
> > we'd gotten a lot more serious about really making this rock solid  
> > since Bruce took it in February.  But maybe that's not the case?
> 
> I don't actually know the EDB end of the history either; maybe someone
> can educate us about that.  But it's true that the core developers,
> at least, weren't taking it seriously until this year.  That's because
> it really can only handle catalog changes, not changes to the contents
> of user tables; and it's been quite a long time since we've had a
> release where we didn't change tuple header layout or packing rules or
> something that made it a nonstarter.  It wasn't clear till early this
> year that 8.3->8.4 would be a cycle where pg_migrator had a chance of
> being useful in production ... so we got serious about it.
> 
> (I do not know whether EDB ever really used it in production.  If they
> did, it must have been for private updates that changed catalogs and
> not user data.)

pg_migrator verion 0.5 is still on the pg_migrator web site, and that is
the version I started from.  It had this line in the intro:
PG_migrator is a tool (not a complete solution) that performs anin-place upgrade of existing data.

Of course no one wants a toolkit, they want a full solution, so I
modified the code to be easier to use and more robust.  I am not sure
how much EDB used it.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Sun, Jun 7, 2009 at 11:50 PM, Bruce Momjian<bruce@momjian.us> wrote:
> Stefan Kaltenbrunner wrote:
>> Josh Berkus wrote:
>> > On 6/7/09 10:56 AM, Robert Haas wrote:
>> >> OK, that's more or less what I thought, and what I intended to convey
>> >> upthread.  As far as core Postgres is concerned this is a new feature,
>> >> and we haven't worked out all the kinks yet.
>> >
>> > Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
>> > AFAIC, until we have these sorts of issues worked out, it's still a beta.
>>
>> afaiks bruce stated he is going to remove the BETA tag from pg_migrator
>> soon so I guess calling it beta in the main project docs will confuse
>> the hell out of people(or causing them to think that it is not beta any
>> more).
>> So maybe calling it experimental(from the POV of the main project) or
>> something similar might still be the better solution.
>
> This all sounds very discouraging.  It is like, "Oh, my, there is a
> migration tool and it might have bugs.  How do we prevent people from
> using it?"
>
> Right now nothing in the project is referring to pg_migrator except in
> the press release, and it is marked as beta there.  How do you want to
> deemphasize it more than that?  Why did I bother working on this if the
> community reaction is to try to figure out how to make people avoid
> using it?

Because Rome wasn't built in a day.

It seems to me that you yourself placed a far more disparaging label
on it than anything that anyone has proposed today; this was a week
ago:

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

I don't think it's anyone's intention to disparage your work on this
tool.  It certainly isn't mine.  But it seems obvious to me that it
has some pretty severe limitations and warts.  The fact that those
limitations and warts are well-documented doesn't negate their
existence.  I also don't think calling the software "beta" or
"experimental" is a way of deemphasizing it.  I think it's a way of
being clear that this software is not the bullet-proof, rock-solid,
handles-all-cases-and-keeps-on-trucking level of robustness that
people have come to expect from PostgreSQL.

FWIW, I have no problem at all with mentioning pg_migrator in the
release notes or the documentation; my failure to respond to your last
emails on this topic was due to being busy and having already spent
too much time responding to other emails, not due to thinking it was a
bad idea.  I actually think it's a good idea.  But I also think those
references should describe it as experimental, because I think it is.
I really hope it won't remain experimental forever, but I think that's
an accurate characterization of where it is now.

You, or others, may disagree, of course.

...Robert


Re: pg_migrator issue with contrib

From
Stefan Kaltenbrunner
Date:
Robert Haas wrote:
> On Sun, Jun 7, 2009 at 11:50 PM, Bruce Momjian<bruce@momjian.us> wrote:
>> Stefan Kaltenbrunner wrote:
>>> Josh Berkus wrote:
>>>> On 6/7/09 10:56 AM, Robert Haas wrote:
>>>>> OK, that's more or less what I thought, and what I intended to convey
>>>>> upthread.  As far as core Postgres is concerned this is a new feature,
>>>>> and we haven't worked out all the kinks yet.
>>>> Yes, I'm calling it "pg_migrator beta" in any advocacy/PR about it.
>>>> AFAIC, until we have these sorts of issues worked out, it's still a beta.
>>> afaiks bruce stated he is going to remove the BETA tag from pg_migrator
>>> soon so I guess calling it beta in the main project docs will confuse
>>> the hell out of people(or causing them to think that it is not beta any
>>> more).
>>> So maybe calling it experimental(from the POV of the main project) or
>>> something similar might still be the better solution.
>> This all sounds very discouraging.  It is like, "Oh, my, there is a
>> migration tool and it might have bugs.  How do we prevent people from
>> using it?"
>>
>> Right now nothing in the project is referring to pg_migrator except in
>> the press release, and it is marked as beta there.  How do you want to
>> deemphasize it more than that?  Why did I bother working on this if the
>> community reaction is to try to figure out how to make people avoid
>> using it?
> 
> Because Rome wasn't built in a day.

indeed

> 
> It seems to me that you yourself placed a far more disparaging label
> on it than anything that anyone has proposed today; this was a week
> ago:
> 
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php

well that is way more discouraging than what I wanted to say :)

> 
> I don't think it's anyone's intention to disparage your work on this
> tool.  It certainly isn't mine.  But it seems obvious to me that it
> has some pretty severe limitations and warts.  The fact that those
> limitations and warts are well-documented doesn't negate their
> existence.  I also don't think calling the software "beta" or
> "experimental" is a way of deemphasizing it.  I think it's a way of
> being clear that this software is not the bullet-proof, rock-solid,
> handles-all-cases-and-keeps-on-trucking level of robustness that
> people have come to expect from PostgreSQL.

Exactly my point. pg_migrator gained a lot of momentum in the last weeks 
an months, but imho it has still way to go. I do think that binary 
upgrades are extremely important for us(that's why I did a fair amount 
of testing on it) but I don't think that we should  go too far for this 
release.
A lot of the code that makes postgresql what it is now took years to 
mature on pgfoundry or in contrib. So some of the questions to ask would be:
* is pg_migrator ready for contrib/? Probably not - it is still a way 
too moving target so pgfoundry is good
* is pg_migrator ready for /src/bin?

Realistically I think we need to get at least one full cycle to see what  happens in the field with something as
complexas pg_migrator to 
 
really get a grasp on what else comes up.


> 
> FWIW, I have no problem at all with mentioning pg_migrator in the
> release notes or the documentation; my failure to respond to your last
> emails on this topic was due to being busy and having already spent
> too much time responding to other emails, not due to thinking it was a
> bad idea.  I actually think it's a good idea.  But I also think those
> references should describe it as experimental, because I think it is.
> I really hope it won't remain experimental forever, but I think that's
> an accurate characterization of where it is now.

yep - I was not against mentioning it either. We just should do it in a 
sane way(ie it is not part of the core project yet but endorsed and 
might get added in the future or such) so we don't confuse people(like 
we call it beta, the homepage does not) and yet get valuable feedback 
which we certainly need to go forward.


Stefan


Re: pg_migrator issue with contrib

From
Magnus Hagander
Date:
Dimitri Fontaine wrote:
> Le 6 juin 09 à 20:45, Josh Berkus a écrit :
>> So, here's what we need for 8.3 --> 8.4 for contrib modules:
> 
> That does nothing for external modules whose code isn't in PostgreSQL
> control. I'm thinking of those examples I cited up-thread --- and some
> more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).

For me and I know several others, the big question would be PostGIS.
Unfortunately I haven't had the time to run any tests myself, so I'll
join the line of people being worried, but I have a number of customers
with somewhere between pretty and very large PostGIS databases that
could really benefit from pg_migrator.

As long as PostGIS is the same version in both of them, is pg_migrator
is likely to work? (one can always run the PostGIS upgrade as a separate
step)


> Could pg_migrator detect usage of "objects" oids (data types in
> relation, index opclass, ...) that are unknown to be in the standard
> -core + contrib distribution, and quit trying to upgrade the cluster in
> this case, telling the user his database is not supported?

+1 on this.

Or at least, have it exit and say "if you know that these things are
reasonably safe, run pg_migrator again with --force" or something like that.


-- Magnus HaganderSelf: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Re: pg_migrator issue with contrib

From
Merlin Moncure
Date:
On Sun, Jun 7, 2009 at 12:36 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> * pageinspect has changed the ABI of get_raw_page() in a way that will
>> likely make it dump core if the function definition is migrated from
>> an old DB.  This needs to be fixed.
>> [ and similarly for some other contrib modules ]
>
> After thinking about this some more, I think that there is a fairly
> simple coding rule we can adopt to prevent post-migration crashes
> of the sort I'm worrying about above.  That is:
>
> * If you change the ABI of a C-language function, change its C name.
>
> This will ensure that if someone tries to migrate the old function
> definition from an old database, they will get a pg_migrator failure,
> or at worst a clean runtime failure when they attempt to use the old
> definition.  They won't get a core dump or some worse form of security
> problem.
>
> As an example, the problem in pageinspect is this diff:
>
> ***************
> *** 6,16 ****
>  --
>  -- get_raw_page()
>  --
> ! CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
>  RETURNS bytea
>  AS 'MODULE_PATHNAME', 'get_raw_page'
>  LANGUAGE C STRICT;
>
>  --
>  -- page_header()
>  --
> --- 6,21 ----
>  --
>  -- get_raw_page()
>  --
> ! CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
>  RETURNS bytea
>  AS 'MODULE_PATHNAME', 'get_raw_page'
>  LANGUAGE C STRICT;
>
> + CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
> + RETURNS bytea
> + AS $$ SELECT get_raw_page($1, 'main', $2); $$
> + LANGUAGE SQL STRICT;
> +
>  --
>  -- page_header()
>  --
> ***************
>
> The underlying C-level get_raw_page function is still there, but
> it now expects three arguments not two, and will crash if it's
> passed an int4 where it's expecting a text argument.  But the old
> function definition will migrate without error --- there's no way
> for pg_migrator to realize it's installing a security hazard.
>
> The way we should have done this, which I intend to go change it to,
> is something like
>
> CREATE OR REPLACE FUNCTION get_raw_page(text, int4)
> RETURNS bytea
> AS 'MODULE_PATHNAME', 'get_raw_page'
> LANGUAGE C STRICT;
>
> CREATE OR REPLACE FUNCTION get_raw_page(text, text, int4)
> RETURNS bytea
> AS 'MODULE_PATHNAME', 'get_raw_page_3'
> LANGUAGE C STRICT;
>
> so that the old function's ABI is preserved.  Migration of the old
> contrib module will then lead to the 3-argument function not being
> immediately available, but the 2-argument one still works.  Had we not
> wanted to keep the 2-argument form for some reason, we would have
> provided only get_raw_page_3 in the .so file, and attempts to use the
> old function definition would fail safely.
>
> (We have actually seen similar problems before with people trying
> to dump and reload database containing contrib modules.  pg_migrator
> is not creating a problem that wasn't there before, it's just making
> it worse.)
>
> Comments, better ideas?

maybe, get_raw_page_v2, etc?  I suppose you could run into situation
with multiple versions of the function w/same # of arguments?

merlin


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> As long as PostGIS is the same version in both of them, is pg_migrator
> is likely to work? (one can always run the PostGIS upgrade as a separate
> step)

There was just some discussion about that on postgis-devel.  I think the
conclusion was that you would have to do the PostGIS update as a
separate step.  They intend to support both 1.3.x and 1.4.x on current
versions of Postgres for some time, so in principle you could do it in
either order.

>> Could pg_migrator detect usage of "objects" oids (data types in
>> relation, index opclass, ...) that are unknown to be in the standard
>> -core + contrib distribution, and quit trying to upgrade the cluster in
>> this case, telling the user his database is not supported?

> +1 on this.

> Or at least, have it exit and say "if you know that these things are
> reasonably safe, run pg_migrator again with --force" or something like that.

I don't think that anything in that line is going to be helpful.
What it will lead to is people mindlessly using --force (cf our
bad experiences with -i for pg_dump).  If you can't give a *useful*
ie trustworthy warning/error, issuing a useless one is not a good
substitute.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Magnus Hagander
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> As long as PostGIS is the same version in both of them, is pg_migrator
>> is likely to work? (one can always run the PostGIS upgrade as a separate
>> step)
> 
> There was just some discussion about that on postgis-devel.  I think the
> conclusion was that you would have to do the PostGIS update as a
> separate step.  They intend to support both 1.3.x and 1.4.x on current
> versions of Postgres for some time, so in principle you could do it in
> either order.

Doing them as two steps is totally fine with me, because IIRC the
PostGIS upgrades generally don't require hours and hours of downtime.


>>> Could pg_migrator detect usage of "objects" oids (data types in
>>> relation, index opclass, ...) that are unknown to be in the standard
>>> -core + contrib distribution, and quit trying to upgrade the cluster in
>>> this case, telling the user his database is not supported?
> 
>> +1 on this.
> 
>> Or at least, have it exit and say "if you know that these things are
>> reasonably safe, run pg_migrator again with --force" or something like that.
> 
> I don't think that anything in that line is going to be helpful.
> What it will lead to is people mindlessly using --force (cf our
> bad experiences with -i for pg_dump).  If you can't give a *useful*
> ie trustworthy warning/error, issuing a useless one is not a good
> substitute.

Well, in that case, error out would be a better option than doing it and
probably fail later. And have a --force option available, but don't
suggest it.

//Magnus



Re: pg_migrator issue with contrib

From
Dimitri Fontaine
Date:
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>>> Could pg_migrator detect usage of "objects" oids (data types in
>>>> relation, index opclass, ...) that are unknown to be in the standard
>>>> -core + contrib distribution, and quit trying to upgrade the cluster in
>>>> this case, telling the user his database is not supported?
>> 
>>> +1 on this.
>> 
>>> Or at least, have it exit and say "if you know that these things are
>>> reasonably safe, run pg_migrator again with --force" or something like that.
>> 
>> I don't think that anything in that line is going to be helpful.
>> What it will lead to is people mindlessly using --force (cf our
>> bad experiences with -i for pg_dump).  If you can't give a *useful*
>> ie trustworthy warning/error, issuing a useless one is not a good
>> substitute.
>
> Well, in that case, error out would be a better option than doing it and
> probably fail later. And have a --force option available, but don't
> suggest it.

My vote would go to detect and error out without recovering option. If
the tool is not able to handle a situation and knows it, I don't see
what would be good about it leting the user lose data on purpose.

The --force option should be for the user to manually drop his columns
and indexes (etc) and try pg_migrator again, which will stop listing
faulty objects but care about the now compatible cluster.

Restoring the lost data is not the job of pg_migrator, of course.

Regards,
-- 
dim


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> My vote would go to detect and error out without recovering option. If
> the tool is not able to handle a situation and knows it, I don't see
> what would be good about it leting the user lose data on purpose.

No, that's not what's being discussed.  The proposal is to have it error
out when it *does not* know whether there is a real problem; and, in
fact, when there's only a rather low probability of there being a real
problem.  My view is that that's basically counterproductive.  It leads
directly to having to have a --force switch and then to people using
that switch carelessly.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Robert Haas wrote:
> > Right now nothing in the project is referring to pg_migrator except in
> > the press release, and it is marked as beta there. ?How do you want to
> > deemphasize it more than that? ?Why did I bother working on this if the
> > community reaction is to try to figure out how to make people avoid
> > using it?
> 
> Because Rome wasn't built in a day.
> 
> It seems to me that you yourself placed a far more disparaging label
> on it than anything that anyone has proposed today; this was a week
> ago:
> 
> http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php
> 
> I don't think it's anyone's intention to disparage your work on this
> tool.  It certainly isn't mine.  But it seems obvious to me that it
> has some pretty severe limitations and warts.  The fact that those
> limitations and warts are well-documented doesn't negate their
> existence.  I also don't think calling the software "beta" or
> "experimental" is a way of deemphasizing it.  I think it's a way of
> being clear that this software is not the bullet-proof, rock-solid,
> handles-all-cases-and-keeps-on-trucking level of robustness that
> people have come to expect from PostgreSQL.
> 
> FWIW, I have no problem at all with mentioning pg_migrator in the
> release notes or the documentation; my failure to respond to your last
> emails on this topic was due to being busy and having already spent
> too much time responding to other emails, not due to thinking it was a
> bad idea.  I actually think it's a good idea.  But I also think those
> references should describe it as experimental, because I think it is.
> I really hope it won't remain experimental forever, but I think that's
> an accurate characterization of where it is now.

pg_migrator should be looked at critically here, and I agree we should
avoid letting pg_migrator failures reflect badly on Postgres.

Let me list the problems with pg_migrator:
o  /contrib and plugin migration (not unique to pg_migrator)o  you must read/follow the install instructionso  might
requirepost-migration table/index rebuildso  new so serious bugs might exist
 

and let me list its benefits:
o  first in-place upgrade capability in yearso  tested by some users, all successful (since late alpha)o  removes major
impedimentto adoptiono  includes extensive error checking and reportingo  contains detailed installation/usage
instructions

So let's look at pg_migrator as an opportunity and a risk.  As far as I
know, only Hiroshi Saito and I have have looked at the code.  Why don't
others read the pg_migrator source code looking for bugs?  Why have more
people not test it?

I think "experimental" is the wrong label.  Experimental assumes its
usefulness is uncertain and that it is still being researched ---
neither is true.  Once I release pg_migrator 8.4 final at the end of
this week (assuming no bugs are reported), I consider it done, or at
least advanced as far as I can go until I get more feedback from users.

I think we can say:  "pg_migrator is designed for experienced users with
large databases, for whom the typical dump/restore required for major
version upgrades is a hardship".

I assume this will be the same adoption pattern we had with the Win32
port, where it was a new platform in 8.0 and we dealt with some issues
as it was deployed, and that people who want it will find it and
hopefully it will be useful for them.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> Dimitri Fontaine wrote:
> > Le 6 juin 09 ? 20:45, Josh Berkus a ?crit :
> >> So, here's what we need for 8.3 --> 8.4 for contrib modules:
> > 
> > That does nothing for external modules whose code isn't in PostgreSQL
> > control. I'm thinking of those examples I cited up-thread --- and some
> > more. (ip4r, temporal, prefix, hstore-new, oracfe, etc).
> 
> For me and I know several others, the big question would be PostGIS.
> Unfortunately I haven't had the time to run any tests myself, so I'll
> join the line of people being worried, but I have a number of customers
> with somewhere between pretty and very large PostGIS databases that
> could really benefit from pg_migrator.
> 
> As long as PostGIS is the same version in both of them, is pg_migrator
> is likely to work? (one can always run the PostGIS upgrade as a separate
> step)

Yes, it should work with the same version of PostGIS, but I have not
tested it.  There is nothing special about PostGIS that would cause it
not work work --- we use the same pg_dump as you would for a major
upgrade --- we just move the files around instead of dumping the data.

> > Could pg_migrator detect usage of "objects" oids (data types in
> > relation, index opclass, ...) that are unknown to be in the standard
> > -core + contrib distribution, and quit trying to upgrade the cluster in
> > this case, telling the user his database is not supported?
> 
> +1 on this.
> 
> Or at least, have it exit and say "if you know that these things are
> reasonably safe, run pg_migrator again with --force" or something like that.

Right now pg_migrator throws an error if the schema load doesn't work. 
Assuming you use the same version on the old and new clusters, it should
work fine.  I am unclear what checking oids would do.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > As long as PostGIS is the same version in both of them, is pg_migrator
> > is likely to work? (one can always run the PostGIS upgrade as a separate
> > step)
> 
> There was just some discussion about that on postgis-devel.  I think the
> conclusion was that you would have to do the PostGIS update as a
> separate step.  They intend to support both 1.3.x and 1.4.x on current
> versions of Postgres for some time, so in principle you could do it in
> either order.

Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
pg_migrator upgrade.  It has to be done either before or after
pg_migrator is run.  I wonder how I could prevent someone from trying
that trick.

> >> Could pg_migrator detect usage of "objects" oids (data types in
> >> relation, index opclass, ...) that are unknown to be in the standard
> >> -core + contrib distribution, and quit trying to upgrade the cluster in
> >> this case, telling the user his database is not supported?
> 
> > +1 on this.
> 
> > Or at least, have it exit and say "if you know that these things are
> > reasonably safe, run pg_migrator again with --force" or something like that.
> 
> I don't think that anything in that line is going to be helpful.
> What it will lead to is people mindlessly using --force (cf our
> bad experiences with -i for pg_dump).  If you can't give a *useful*
> ie trustworthy warning/error, issuing a useless one is not a good
> substitute.

Yep.  The install instructions explain how you have to get around this,
and if they don't understand it, they shouldn't be using pg_migrator and
should just do the traditional dump/restore.  It is too tempting to give
them a force flag.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Magnus Hagander wrote:
> >>> Could pg_migrator detect usage of "objects" oids (data types in
> >>> relation, index opclass, ...) that are unknown to be in the standard
> >>> -core + contrib distribution, and quit trying to upgrade the cluster in
> >>> this case, telling the user his database is not supported?
> > 
> >> +1 on this.
> > 
> >> Or at least, have it exit and say "if you know that these things are
> >> reasonably safe, run pg_migrator again with --force" or something like that.
> > 
> > I don't think that anything in that line is going to be helpful.
> > What it will lead to is people mindlessly using --force (cf our
> > bad experiences with -i for pg_dump).  If you can't give a *useful*
> > ie trustworthy warning/error, issuing a useless one is not a good
> > substitute.
> 
> Well, in that case, error out would be a better option than doing it and
> probably fail later. And have a --force option available, but don't
> suggest it.

Uh, what doesn't error out now in pg_migrator?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Dimitri Fontaine <dfontaine@hi-media.com> writes:
>> My vote would go to detect and error out without recovering option. If
>> the tool is not able to handle a situation and knows it, I don't see
>> what would be good about it leting the user lose data on purpose.
>
> No, that's not what's being discussed.  The proposal is to have it error
> out when it *does not* know whether there is a real problem; and, in
> fact, when there's only a rather low probability of there being a real
> problem.  My view is that that's basically counterproductive.  It leads
> directly to having to have a --force switch and then to people using
> that switch carelessly.

True, it could be that the data type representation has not changed
between 8.3 and 8.4, nor the index content format. In this case
pg_migrator will work fine on the cluster as soon as you installed the
new .so...

So the case where pg_migrator still fails is when the .sql file of the
module has changed in a way that restoring what pg_dump gives no longer
match what the .so exposes, or if the new .so is non backward
compatible?

Ok, maybe there's a way it'll just work. I withdraw my vote.

Thanks for your patience, regards,
-- 
dim


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> >> I don't think that anything in that line is going to be helpful.
> >> What it will lead to is people mindlessly using --force (cf our
> >> bad experiences with -i for pg_dump).  If you can't give a *useful*
> >> ie trustworthy warning/error, issuing a useless one is not a good
> >> substitute.
> >
> > Well, in that case, error out would be a better option than doing it and
> > probably fail later. And have a --force option available, but don't
> > suggest it.
> 
> My vote would go to detect and error out without recovering option. If
> the tool is not able to handle a situation and knows it, I don't see
> what would be good about it leting the user lose data on purpose.
> 
> The --force option should be for the user to manually drop his columns
> and indexes (etc) and try pg_migrator again, which will stop listing
> faulty objects but care about the now compatible cluster.
> 
> Restoring the lost data is not the job of pg_migrator, of course.

Agreed.  Right now pg_migrator never modifies the old cluster except for
renaming pg_control (documented) so the old cluster is not accidentally
restarted.  I don't want to change that behavior.

I would filter the dump --schema file, but again, it is best to let the
administrator do it, and if they can't, they should just do
dump/restore.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Let me list the problems with pg_migrator:

>     o  /contrib and plugin migration (not unique to pg_migrator)
>     o  you must read/follow the install instructions
>     o  might require post-migration table/index rebuilds
>     o  new so serious bugs might exist

I think that #1 and #4 could be substantially alleviated if the
instructions recommended doing a trial run with a schema-only dump
of the database.  That is,

* pg_dumpall -s
* load that into a test installation (of the *old* PG version)
* migrate the test installation to new PG version
* do the same sorts of applications compatibility checks you'd want to do anyway before a major version upgrade

This would certainly catch migration-time failures caused by plugins,
and the followup testing would probably catch any large post-migration
issue.

Somebody who is not willing to do this type of testing should not be
using pg_migrator (yet), and probably has not got a database large
enough to need it anyway.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> 
> > Dimitri Fontaine <dfontaine@hi-media.com> writes:
> >> My vote would go to detect and error out without recovering option. If
> >> the tool is not able to handle a situation and knows it, I don't see
> >> what would be good about it leting the user lose data on purpose.
> >
> > No, that's not what's being discussed.  The proposal is to have it error
> > out when it *does not* know whether there is a real problem; and, in
> > fact, when there's only a rather low probability of there being a real
> > problem.  My view is that that's basically counterproductive.  It leads
> > directly to having to have a --force switch and then to people using
> > that switch carelessly.
> 
> True, it could be that the data type representation has not changed
> between 8.3 and 8.4, nor the index content format. In this case
> pg_migrator will work fine on the cluster as soon as you installed the
> new .so...

Yes.

> So the case where pg_migrator still fails is when the .sql file of the
> module has changed in a way that restoring what pg_dump gives no longer
> match what the .so exposes, or if the new .so is non backward
> compatible?

Yes, that is a problem.  It is not a pg_migrator-specific problem
because people traditionally bring the /contrib schema over from the old
install (I think).  The only pg_migrator-specific failure is when the
data format changed and dump/restore would fix it, but pg_migrator would
migrate corrupt data.  :-(

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> There was just some discussion about that on postgis-devel.  I think the
>> conclusion was that you would have to do the PostGIS update as a
>> separate step.  They intend to support both 1.3.x and 1.4.x on current
>> versions of Postgres for some time, so in principle you could do it in
>> either order.

> Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
> pg_migrator upgrade.  It has to be done either before or after
> pg_migrator is run.  I wonder how I could prevent someone from trying
> that trick.

You don't need to, because it will fail automatically.  They are using
version-numbered .so files, so C-language functions referencing the 1.3
.so will fail to load if only the 1.4 .so is in the new installation.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Dimitri Fontaine <dfontaine@hi-media.com> writes:
> So the case where pg_migrator still fails is when the .sql file of the
> module has changed in a way that restoring what pg_dump gives no longer
> match what the .so exposes, or if the new .so is non backward
> compatible?

Exactly.  And note that this is not pg_migrator's fault: a pg_dump
dump and reload of the database exposes the user to the same risks,
if the module author has not been careful about compatibility.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Let me list the problems with pg_migrator:
> 
> >     o  /contrib and plugin migration (not unique to pg_migrator)
> >     o  you must read/follow the install instructions
> >     o  might require post-migration table/index rebuilds
> >     o  new so serious bugs might exist
> 
> I think that #1 and #4 could be substantially alleviated if the
> instructions recommended doing a trial run with a schema-only dump
> of the database.  That is,
> 
> * pg_dumpall -s
> * load that into a test installation (of the *old* PG version)
> * migrate the test installation to new PG version
> * do the same sorts of applications compatibility checks you'd want to
>   do anyway before a major version upgrade

But you have no data in the database --- can any meaningful testing be
done?

FYI, pg_migrator will do the schema load pretty early (even before the
file copy) and fail on errors.  Retrying pg_migrator is pretty easy and
is now well documented in the INSTALL file.

> This would certainly catch migration-time failures caused by plugins,
> and the followup testing would probably catch any large post-migration
> issue.
> 
> Somebody who is not willing to do this type of testing should not be
> using pg_migrator (yet), and probably has not got a database large
> enough to need it anyway.

Agreed.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> There was just some discussion about that on postgis-devel.  I think the
> >> conclusion was that you would have to do the PostGIS update as a
> >> separate step.  They intend to support both 1.3.x and 1.4.x on current
> >> versions of Postgres for some time, so in principle you could do it in
> >> either order.
> 
> > Oh, yea, you can't go from PostGIS version 1.3 to 1.4 _while_ you do the
> > pg_migrator upgrade.  It has to be done either before or after
> > pg_migrator is run.  I wonder how I could prevent someone from trying
> > that trick.
> 
> You don't need to, because it will fail automatically.  They are using
> version-numbered .so files, so C-language functions referencing the 1.3
> .so will fail to load if only the 1.4 .so is in the new installation.

Nice.  Something to be learned there.  ;-)

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Dimitri Fontaine <dfontaine@hi-media.com> writes:
> > So the case where pg_migrator still fails is when the .sql file of the
> > module has changed in a way that restoring what pg_dump gives no longer
> > match what the .so exposes, or if the new .so is non backward
> > compatible?
> 
> Exactly.  And note that this is not pg_migrator's fault: a pg_dump
> dump and reload of the database exposes the user to the same risks,
> if the module author has not been careful about compatibility.

Agreed.  In many ways pg_migrator failures will be caused by failures of
the many tools it relies upon, as I mentioned a few days ago.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Dimitri Fontaine wrote:
>> So the case where pg_migrator still fails is when the .sql file of the
>> module has changed in a way that restoring what pg_dump gives no longer
>> match what the .so exposes, or if the new .so is non backward
>> compatible?

> Yes, that is a problem.  It is not a pg_migrator-specific problem
> because people traditionally bring the /contrib schema over from the old
> install (I think).  The only pg_migrator-specific failure is when the
> data format changed and dump/restore would fix it, but pg_migrator would
> migrate corrupt data.  :-(

There is a different problem though: sometimes the recommended fix for
contrib-module incompatibilities is to load the new contrib module into
the target database before trying to load your old dump file.  (We told
people to do that for 8.2->8.3 tsearch2, for example.)  In the
pg_migrator context there is no way to insert the new contrib module
first, and also no way to ignore the duplicate-object errors that you
typically get while loading the old dump.

It would probably be a relatively simple feature addition to teach
pg_migrator to load such-and-such modules into the new databases before
loading the old dump.  But I'm still scared to death by the idea of
letting it ignore errors, so there doesn't seem to be any good solution
to this type of migration scenario.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> I think that #1 and #4 could be substantially alleviated if the
>> instructions recommended doing a trial run with a schema-only dump
>> of the database.  That is,
>> 
>> * pg_dumpall -s
>> * load that into a test installation (of the *old* PG version)
>> * migrate the test installation to new PG version
>> * do the same sorts of applications compatibility checks you'd want to
>> do anyway before a major version upgrade

> But you have no data in the database --- can any meaningful testing be
> done?

Well, you'd have to insert some.  But this is something you'd have to do
*anyway*, unless you are willing to just pray that your apps don't need
any changes.  The only new thing I'm suggesting here is incorporating
use of pg_migrator into your normal compatibility testing.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Dimitri Fontaine
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Exactly.  And note that this is not pg_migrator's fault: a pg_dump
> dump and reload of the database exposes the user to the same risks,
> if the module author has not been careful about compatibility.

It seems to me the dump will contain text string representation of data
and pg_restore will run the input function of the type on this, so that
maintaining backward compatibility of the data type doesn't sound
hard. As far as the specific index support goes, pg_restore creates the
index from scratch.

So, from my point of view, supporting backward compatibility by means of
dump and restore is the easy way. Introducing pg_migrator in the game,
the data type and index internals upgrade are now faced to the same
problem as the -core in-place upgrade project.

Maybe we'll be able to provide the extension authors (not only contrib)
a specialized API to trigger in case of in-place upgrade of PG version
or the extension itself, ala Erlang code upgrade facility e.g.:
http://erlang.org/doc/reference_manual/code_loading.html#12.3

This part of the extension design will need help from C dynamic module
experts around, because it's terra incognita as far as I'm concerned.

Regards,
-- 
dim


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
I wrote:
> * pg_buffercache has changed the view pg_buffercache, which is
> definitely going to be a migration issue.  Need to test whether
> it represents a crash risk if the old definition is migrated.

I checked this, and there is not a crash risk: the function successfully
creates its result tuplestore, and then the main executor notices it's
not compatible with the old view.  So you get

regression=# select * from pg_buffercache;
ERROR:  function return row and query-specified return row do not match
DETAIL:  Returned row contains 8 attributes, but query expects 7.

You can fix it by dropping and recreating the view (eg, run the module's
uninstall and then install scripts).  I suppose that might be a bit
annoying if you've built additional views atop this one, but overall
it doesn't sound too bad.

So I don't plan to do anything about this module.  Still working on
the others.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Dimitri Fontaine wrote:
> >> So the case where pg_migrator still fails is when the .sql file of the
> >> module has changed in a way that restoring what pg_dump gives no longer
> >> match what the .so exposes, or if the new .so is non backward
> >> compatible?
> 
> > Yes, that is a problem.  It is not a pg_migrator-specific problem
> > because people traditionally bring the /contrib schema over from the old
> > install (I think).  The only pg_migrator-specific failure is when the
> > data format changed and dump/restore would fix it, but pg_migrator would
> > migrate corrupt data.  :-(
> 
> There is a different problem though: sometimes the recommended fix for
> contrib-module incompatibilities is to load the new contrib module into
> the target database before trying to load your old dump file.  (We told
> people to do that for 8.2->8.3 tsearch2, for example.)  In the
> pg_migrator context there is no way to insert the new contrib module
> first, and also no way to ignore the duplicate-object errors that you
> typically get while loading the old dump.
> 
> It would probably be a relatively simple feature addition to teach
> pg_migrator to load such-and-such modules into the new databases before
> loading the old dump.  But I'm still scared to death by the idea of
> letting it ignore errors, so there doesn't seem to be any good solution
> to this type of migration scenario.

Ah, OK, interesting.  We could have pg_migrator detect this issue and
fail right away with a message indicating pg_migrator cannot be used
unless those objects are dumped manually and the /contrib removed.

As long as pg_migrator is clear, I don't think people will complain.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> I think that #1 and #4 could be substantially alleviated if the
> >> instructions recommended doing a trial run with a schema-only dump
> >> of the database.  That is,
> >> 
> >> * pg_dumpall -s
> >> * load that into a test installation (of the *old* PG version)
> >> * migrate the test installation to new PG version
> >> * do the same sorts of applications compatibility checks you'd want to
> >> do anyway before a major version upgrade
> 
> > But you have no data in the database --- can any meaningful testing be
> > done?
> 
> Well, you'd have to insert some.  But this is something you'd have to do
> *anyway*, unless you are willing to just pray that your apps don't need
> any changes.  The only new thing I'm suggesting here is incorporating
> use of pg_migrator into your normal compatibility testing.

Ah, I see.  Interesting.  I have added the second sentence to the
pg_migrator README:
Installation------------See the INSTALL file for detailed installation instructions.  Fordeployment testing, create a
schema-onlycopy of the old cluster, insertdummy data, and migrate that.
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Dimitri Fontaine wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Exactly.  And note that this is not pg_migrator's fault: a pg_dump
> > dump and reload of the database exposes the user to the same risks,
> > if the module author has not been careful about compatibility.
> 
> It seems to me the dump will contain text string representation of data
> and pg_restore will run the input function of the type on this, so that
> maintaining backward compatibility of the data type doesn't sound
> hard. As far as the specific index support goes, pg_restore creates the
> index from scratch.
> 
> So, from my point of view, supporting backward compatibility by means of
> dump and restore is the easy way. Introducing pg_migrator in the game,
> the data type and index internals upgrade are now faced to the same
> problem as the -core in-place upgrade project.
> 
> Maybe we'll be able to provide the extension authors (not only contrib)
> a specialized API to trigger in case of in-place upgrade of PG version
> or the extension itself, ala Erlang code upgrade facility e.g.:
>   http://erlang.org/doc/reference_manual/code_loading.html#12.3
> 
> This part of the extension design will need help from C dynamic module
> experts around, because it's terra incognita as far as I'm concerned.

At a minimum it would be great if items could mark themselves as
non-binary-upgradable.   Perhaps the existence of a symbol in the *.so
file could indicate that, or a function call could be made and you pass
in the old and new major version numbers and it would return true/false
based on binary upgradeability.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> There is a different problem though: sometimes the recommended fix for
>> contrib-module incompatibilities is to load the new contrib module into
>> the target database before trying to load your old dump file.  (We told
>> people to do that for 8.2->8.3 tsearch2, for example.)  In the
>> pg_migrator context there is no way to insert the new contrib module
>> first, and also no way to ignore the duplicate-object errors that you
>> typically get while loading the old dump.

> Ah, OK, interesting.  We could have pg_migrator detect this issue and
> fail right away with a message indicating pg_migrator cannot be used
> unless those objects are dumped manually and the /contrib removed.

How would it detect it?  I think the only thing you could do is
hard-wire tests for specific objects, which is klugy and doesn't
extend to third-party modules that you don't know about.

In most cases where there are major incompatibilities, attempting to
load the old dump file would fail anyway, so I don't think pg_migrator
really needs any hard-wired test.  It's the minor changes that are
risky.  I think ultimately the burden for those has to be on the module
author: he has to either avoid cross-version incompatibilities or make
sure they will fail safely.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Mon, Jun 8, 2009 at 11:08 AM, Bruce Momjian<bruce@momjian.us> wrote:
> Robert Haas wrote:
>> > Right now nothing in the project is referring to pg_migrator except in
>> > the press release, and it is marked as beta there. ?How do you want to
>> > deemphasize it more than that? ?Why did I bother working on this if the
>> > community reaction is to try to figure out how to make people avoid
>> > using it?
>>
>> Because Rome wasn't built in a day.
>>
>> It seems to me that you yourself placed a far more disparaging label
>> on it than anything that anyone has proposed today; this was a week
>> ago:
>>
>> http://archives.postgresql.org/pgsql-hackers/2009-05/msg01470.php
>>
>> I don't think it's anyone's intention to disparage your work on this
>> tool.  It certainly isn't mine.  But it seems obvious to me that it
>> has some pretty severe limitations and warts.  The fact that those
>> limitations and warts are well-documented doesn't negate their
>> existence.  I also don't think calling the software "beta" or
>> "experimental" is a way of deemphasizing it.  I think it's a way of
>> being clear that this software is not the bullet-proof, rock-solid,
>> handles-all-cases-and-keeps-on-trucking level of robustness that
>> people have come to expect from PostgreSQL.
>>
>> FWIW, I have no problem at all with mentioning pg_migrator in the
>> release notes or the documentation; my failure to respond to your last
>> emails on this topic was due to being busy and having already spent
>> too much time responding to other emails, not due to thinking it was a
>> bad idea.  I actually think it's a good idea.  But I also think those
>> references should describe it as experimental, because I think it is.
>> I really hope it won't remain experimental forever, but I think that's
>> an accurate characterization of where it is now.
>
> pg_migrator should be looked at critically here, and I agree we should
> avoid letting pg_migrator failures reflect badly on Postgres.
>
> Let me list the problems with pg_migrator:
>
>        o  /contrib and plugin migration (not unique to pg_migrator)
>        o  you must read/follow the install instructions
>        o  might require post-migration table/index rebuilds
>        o  new so serious bugs might exist

I pretty much agree with this list.  With respect to #2, I don't think
that it's asking a lot for people to read/follow the install
instructions, so I don't consider that a serious problem.

> and let me list its benefits:
>
>        o  first in-place upgrade capability in years
>        o  tested by some users, all successful (since late alpha)
>        o  removes major impediment to adoption
>        o  includes extensive error checking and reporting
>        o  contains detailed installation/usage instructions
>
> So let's look at pg_migrator as an opportunity and a risk.  As far as I
> know, only Hiroshi Saito and I have have looked at the code.  Why don't
> others read the pg_migrator source code looking for bugs?  Why have more
> people not test it?

No reason at all - I very much hope that happens.

> I think "experimental" is the wrong label.  Experimental assumes its
> usefulness is uncertain and that it is still being researched ---
> neither is true.  Once I release pg_migrator 8.4 final at the end of
> this week (assuming no bugs are reported), I consider it done, or at
> least advanced as far as I can go until I get more feedback from users.

Oh, to me "experimental" does not imply that usefulness is uncertain;
rather, it implies that usefulness has been established but that the
code is new (item #4 above) and may be not be 100% feature-complete
(items #1 and #3 above).

> I think we can say:  "pg_migrator is designed for experienced users with
> large databases, for whom the typical dump/restore required for major
> version upgrades is a hardship".

Precisely.  In other words, if you are an INEXPERIENCED user (that is
to say, most of them) or you don't have a particular large database,
dump + reload is probably the safest option.  We're not discouraging
you from use pg_migrator, but please be careful and observe that it is
new and has some limitations.

> I assume this will be the same adoption pattern we had with the Win32
> port, where it was a new platform in 8.0 and we dealt with some issues
> as it was deployed, and that people who want it will find it and
> hopefully it will be useful for them.

Completely agree.  And like the Windows port, hopefully after a
release or two, we'll figure out what we can improve and do so.  I am
interested in this problem but all of my free time lately has been
going into the EXPLAIN patch I'm working on, so I haven't had time to
dig into it much.  The problems of being a hobbyist...

...Robert


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> There is a different problem though: sometimes the recommended fix for
> >> contrib-module incompatibilities is to load the new contrib module into
> >> the target database before trying to load your old dump file.  (We told
> >> people to do that for 8.2->8.3 tsearch2, for example.)  In the
> >> pg_migrator context there is no way to insert the new contrib module
> >> first, and also no way to ignore the duplicate-object errors that you
> >> typically get while loading the old dump.
> 
> > Ah, OK, interesting.  We could have pg_migrator detect this issue and
> > fail right away with a message indicating pg_migrator cannot be used
> > unless those objects are dumped manually and the /contrib removed.
> 
> How would it detect it?  I think the only thing you could do is
> hard-wire tests for specific objects, which is klugy and doesn't
> extend to third-party modules that you don't know about.

Yep, that is the only way.  The good news is that we are not modifying
any data so it is just a detection issue, i.e. it would not destabilize
pg_migrator's operation.

> In most cases where there are major incompatibilities, attempting to
> load the old dump file would fail anyway, so I don't think pg_migrator
> really needs any hard-wired test.  It's the minor changes that are
> risky.  I think ultimately the burden for those has to be on the module
> author: he has to either avoid cross-version incompatibilities or make
> sure they will fail safely.

Yep.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> At a minimum it would be great if items could mark themselves as
> non-binary-upgradable.

It's hardly difficult to make that happen --- just change the C name of
some function, or the name of the whole .so file.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Robert Haas wrote:
> > Let me list the problems with pg_migrator:
> >
> > ? ? ? ?o ?/contrib and plugin migration (not unique to pg_migrator)
> > ? ? ? ?o ?you must read/follow the install instructions
> > ? ? ? ?o ?might require post-migration table/index rebuilds
> > ? ? ? ?o ?new so serious bugs might exist
> 
> I pretty much agree with this list.  With respect to #2, I don't think
> that it's asking a lot for people to read/follow the install
> instructions, so I don't consider that a serious problem.

My point was that I think someday pg_migrator will be point-and-click,
but it is not now.

> Oh, to me "experimental" does not imply that usefulness is uncertain;
> rather, it implies that usefulness has been established but that the
> code is new (item #4 above) and may be not be 100% feature-complete
> (items #1 and #3 above).
> 
> > I think we can say: ?"pg_migrator is designed for experienced users with
> > large databases, for whom the typical dump/restore required for major
> > version upgrades is a hardship".
> 
> Precisely.  In other words, if you are an INEXPERIENCED user (that is
> to say, most of them) or you don't have a particular large database,
> dump + reload is probably the safest option.  We're not discouraging
> you from use pg_migrator, but please be careful and observe that it is
> new and has some limitations.

Agreed.  There is no reason for most users to need pg_migrator;  it is
not worth the risk for them, however small.  There are some people who
really need it, and hopefully they are experienced users, while there is
a larger group who want to know such an option _exists_, so if they ever
need it, it is available.

> > I assume this will be the same adoption pattern we had with the Win32
> > port, where it was a new platform in 8.0 and we dealt with some issues
> > as it was deployed, and that people who want it will find it and
> > hopefully it will be useful for them.
> 
> Completely agree.  And like the Windows port, hopefully after a
> release or two, we'll figure out what we can improve and do so.  I am
> interested in this problem but all of my free time lately has been
> going into the EXPLAIN patch I'm working on, so I haven't had time to
> dig into it much.  The problems of being a hobbyist...

One difference in risk is that the Windows port usually had _new_ data
meaning you were not risking as much as using pg_migrator on an
estabilished database installation.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > At a minimum it would be great if items could mark themselves as
> > non-binary-upgradable.
> 
> It's hardly difficult to make that happen --- just change the C name of
> some function, or the name of the whole .so file.

Yes, but it needs to happen.  ;-)  PostGIS has done this, which is good.
The problem is that if they don't do it it is out of the control of
pg_migrator.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> > Oh, to me "experimental" does not imply that usefulness is uncertain;
> > rather, it implies that usefulness has been established but that the
> > code is new (item #4 above) and may be not be 100% feature-complete
> > (items #1 and #3 above).
> > 
> > > I think we can say: ?"pg_migrator is designed for experienced users with
> > > large databases, for whom the typical dump/restore required for major
> > > version upgrades is a hardship".
> > 
> > Precisely.  In other words, if you are an INEXPERIENCED user (that is
> > to say, most of them) or you don't have a particular large database,
> > dump + reload is probably the safest option.  We're not discouraging
> > you from use pg_migrator, but please be careful and observe that it is
> > new and has some limitations.
> 
> Agreed.  There is no reason for most users to need pg_migrator;  it is
> not worth the risk for them, however small.  There are some people who
> really need it, and hopefully they are experienced users, while there is
> a larger group who want to know such an option _exists_, so if they ever
> need it, it is available.

I think this "larger group" is where my problem with the word
"experimental" come in.  I think pg_migrator is far enough along that we
know it works, and that it will probably work for future major releases.
By calling it "experimental" we are not conveying confidence in the tool
for people who are making deployment decisions based on the existence of
such tool, even if they aren't going to use it initially.  And by not
conveying confidence, we will lose the adoption advantage we can get
from pg_migrator.

Now, we don't want to over-promise, but at the same time we shouldn't
downplay the tool either.  For a sufficiently-experienced administrator,
pg_migrator is a useful migration tool, and we need to convey that. 
Even if you have to hire a consultant to manage the migration, if it
saves days of downtime, it is worth it.  Consultants don't often use
experimental tools, but they do use complex, powerful tools that are
often rough around the edges in terms of usability, e.g. read the
INSTALL file carefully.

For longterm strategy, let me list the challenges for pg_migrator from
any major upgrade (listed in the DEVELOPERS file):
Change                                  Conversion
Method------------------------------------------------------------------------------clog
  noneheap page header, including bitmask     convert to new page format on readtuple header, including bitmask
convertto new page format on readdata value format                       create old data type in new clusterindex page
format                      reindex, or recreate old index methods
 

These are the issue we will have to address for 8.5 and beyond if
pg_migrator is to remain useful.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Mon, Jun 8, 2009 at 1:06 PM, Bruce Momjian<bruce@momjian.us> wrote:
> Tom Lane wrote:
>> Bruce Momjian <bruce@momjian.us> writes:
>> > At a minimum it would be great if items could mark themselves as
>> > non-binary-upgradable.
>>
>> It's hardly difficult to make that happen --- just change the C name of
>> some function, or the name of the whole .so file.
>
> Yes, but it needs to happen.  ;-)  PostGIS has done this, which is good.
> The problem is that if they don't do it it is out of the control of
> pg_migrator.

I think it might be possible to implement a system that can't be
broken by accident.  Firefox (at least AIUI) requires plugin authors
to explicitly flag their modules as compatible with new versions of
Firefox.  When you upgrade your firefox installation in place (heh,
heh) it goes off to the web site and checks whether all of your
extensions have been so flagged.  Any that have not been get disabled
automatically.

Obviously we don't want to get into connecting to a web site, but we
could probably come up with some other API for .so files to indicate
which versions of PG they're compatible with.  If they don't implement
that API, we assume the predate its introduction and are not
upgradeable.  I'm fuzzy on the details here but the point is that if
you implement an opt-in system rather than an opt-out system then
people have to deliberately circumvent it to break things, rather than
just needing to be lazy.

...Robert


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Mon, Jun 8, 2009 at 1:20 PM, Bruce Momjian<bruce@momjian.us> wrote:
> Bruce Momjian wrote:
>> > Oh, to me "experimental" does not imply that usefulness is uncertain;
>> > rather, it implies that usefulness has been established but that the
>> > code is new (item #4 above) and may be not be 100% feature-complete
>> > (items #1 and #3 above).
>> >
>> > > I think we can say: ?"pg_migrator is designed for experienced users with
>> > > large databases, for whom the typical dump/restore required for major
>> > > version upgrades is a hardship".
>> >
>> > Precisely.  In other words, if you are an INEXPERIENCED user (that is
>> > to say, most of them) or you don't have a particular large database,
>> > dump + reload is probably the safest option.  We're not discouraging
>> > you from use pg_migrator, but please be careful and observe that it is
>> > new and has some limitations.
>>
>> Agreed.  There is no reason for most users to need pg_migrator;  it is
>> not worth the risk for them, however small.  There are some people who
>> really need it, and hopefully they are experienced users, while there is
>> a larger group who want to know such an option _exists_, so if they ever
>> need it, it is available.
>
> I think this "larger group" is where my problem with the word
> "experimental" come in.  I think pg_migrator is far enough along that we
> know it works, and that it will probably work for future major releases.
> By calling it "experimental" we are not conveying confidence in the tool
> for people who are making deployment decisions based on the existence of
> such tool, even if they aren't going to use it initially.  And by not
> conveying confidence, we will lose the adoption advantage we can get
> from pg_migrator.
>
> Now, we don't want to over-promise, but at the same time we shouldn't
> downplay the tool either.  For a sufficiently-experienced administrator,
> pg_migrator is a useful migration tool, and we need to convey that.
> Even if you have to hire a consultant to manage the migration, if it
> saves days of downtime, it is worth it.  Consultants don't often use
> experimental tools, but they do use complex, powerful tools that are
> often rough around the edges in terms of usability, e.g. read the
> INSTALL file carefully.

Fair enough.  I'm game to use a different word.  I spent approximately
30 seconds coming up with that suggestion.  :-)

> For longterm strategy, let me list the challenges for pg_migrator from
> any major upgrade (listed in the DEVELOPERS file):
>
>  Change                                  Conversion Method
>  ------------------------------------------------------------------------------
>  clog                                    none
>  heap page header, including bitmask     convert to new page format on read
>  tuple header, including bitmask         convert to new page format on read
>  data value format                       create old data type in new cluster
>  index page format                       reindex, or recreate old index methods
>
> These are the issue we will have to address for 8.5 and beyond if
> pg_migrator is to remain useful.

No arguments here, sounds like interesting stuff.

...Robert


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> Obviously we don't want to get into connecting to a web site, but we
> could probably come up with some other API for .so files to indicate
> which versions of PG they're compatible with.

You mean like PG_MODULE_MAGIC?
        regards, tom lane


Re: pg_migrator issue with contrib

From
Robert Haas
Date:
On Mon, Jun 8, 2009 at 1:32 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Obviously we don't want to get into connecting to a web site, but we
>> could probably come up with some other API for .so files to indicate
>> which versions of PG they're compatible with.
>
> You mean like PG_MODULE_MAGIC?

Hey, how about that.  Why doesn't that solve our problem here?

[ thinks ... ] I guess it's because there's no guarantee that the
function is installed on the SQL level with the signature that is
appropriate on the C level.  To fix that, I suppose we'd need to
version the contents of the .sql file that installs the definitions,
which gets back to the problem of building a general-purpose module
facility.

...Robert


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Robert Haas wrote:
> On Mon, Jun 8, 2009 at 1:06 PM, Bruce Momjian<bruce@momjian.us> wrote:
> > Tom Lane wrote:
> >> Bruce Momjian <bruce@momjian.us> writes:
> >> > At a minimum it would be great if items could mark themselves as
> >> > non-binary-upgradable.
> >>
> >> It's hardly difficult to make that happen --- just change the C name of
> >> some function, or the name of the whole .so file.
> >
> > Yes, but it needs to happen. ?;-) ?PostGIS has done this, which is good.
> > The problem is that if they don't do it it is out of the control of
> > pg_migrator.
> 
> I think it might be possible to implement a system that can't be
> broken by accident.  Firefox (at least AIUI) requires plugin authors
> to explicitly flag their modules as compatible with new versions of
> Firefox.  When you upgrade your firefox installation in place (heh,
> heh) it goes off to the web site and checks whether all of your
> extensions have been so flagged.  Any that have not been get disabled
> automatically.

Interesting it allows the flagging to happen in real-time, rather than
requiring the system to know at install time whether it is compatible
with future verions (almost an impossibility).  I am afraid we would
need some kind of real-time check, or at least have major versions flag
which external stuff cannot be upgraded, which we have discussed here
already.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Robert Haas wrote:
> > Now, we don't want to over-promise, but at the same time we shouldn't
> > downplay the tool either. ?For a sufficiently-experienced administrator,
> > pg_migrator is a useful migration tool, and we need to convey that.
> > Even if you have to hire a consultant to manage the migration, if it
> > saves days of downtime, it is worth it. ?Consultants don't often use
> > experimental tools, but they do use complex, powerful tools that are
> > often rough around the edges in terms of usability, e.g. read the
> > INSTALL file carefully.
> 
> Fair enough.  I'm game to use a different word.  I spent approximately
> 30 seconds coming up with that suggestion.  :-)

I think the text I already posted is appropriate:
pg_migrator is designed for experienced users with large databases, forwhom the typical dump/restore required for major
versionupgrades is ahardship.
 

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
I wrote:
> [ concerning various migration hazards in contrib/ ]

> * isn has got the nastiest problems of the lot: since it piggybacks
> on type bigint, a migrated database might work, or might crash
> miserably, depending on whether bigint has become pass-by-value
> in the new database.  I'm not very sure if we can fix this reasonably.

I've spent some time thinking about possible workarounds for this, and
not really come up with any.  The only feasible thing I can think of
to do is teach pg_migrator to refuse to migrate if (a) the old DB
contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
can be checked in pg_control).  One question here is how you decide
if the old DB contains contrib/isn.  I don't think looking for the
type name per se is a hot idea.  The best plan that has come to mind
is to look through pg_proc to see if there are any C-language functions
that reference "$libdir/isn".

> * pg_freespacemap has made *major* changes in its ABI.  There's
> probably no hope of this working either, but we need to be sure
> it's not a crash risk.

This turns out not to be a problem: the set of exposed C function names
changed, so the function definitions will fail to migrate.  Dropping
the module before migrating and reinstalling it afterwards is an easy
workaround.

> * pgstattuple has made changes in the output types of its functions.
> This is a serious crash risk, and I'm not immediately sure how to
> fix it.  Note that simply migrating the module will not draw any
> errors.

This doesn't seem to create serious problems either.  pgstatindex() has
changed some output columns from int4 to int8, but because it creates
the result tuple from text strings, it manages to just work anyway.
(In principle you could get some overflow problems with very large
indexes, but I doubt that's an issue in practice; and it couldn't cause
a crash anyway.)  pg_relpages() likewise changed its return type, but
in this particular case you could only get garbage answers not a crash.
So I think we can just tell people to reinstall the SQL file after
migration.

> * tsearch2 has opclass support function changes, but unlike other
> cases of this, it will fail to migrate to 8.4 because the functions
> are references to core functions instead of being declared in the
> module.  Also, "drop it first" isn't a very useful recommendation
> since the domains it defines might be used somewhere.

It would be nice if this migrated cleanly, but it doesn't and there's
not much we can do about it.  At least it will fail safely.

So, other than the suggested pg_migrator hack for contrib/isn, the only
thing left to do here is fix dblink_current_query().  I'll take care of
that, but not till after Joe commits his remaining patch, so as not to
risk creating merge hazards for him.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Jun 8, 2009 at 1:32 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> You mean like PG_MODULE_MAGIC?

> Hey, how about that.  Why doesn't that solve our problem here?

> [ thinks ... ] I guess it's because there's no guarantee that the
> function is installed on the SQL level with the signature that is
> appropriate on the C level.

Yeah.  And it's more than just the function itself.  For example,
in the contrib/isn mess, the function definitions didn't change.
The problem is the passbyval flag (or lack of it) on the type
definition.

I think we've speculated in the past about having ways of embedding
per-function data into the .so libraries so that these sorts of
things could be caught automatically.  But it'd be a lot of work
for rather limited reward.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> I wrote:
> > [ concerning various migration hazards in contrib/ ]
> 
> > * isn has got the nastiest problems of the lot: since it piggybacks
> > on type bigint, a migrated database might work, or might crash
> > miserably, depending on whether bigint has become pass-by-value
> > in the new database.  I'm not very sure if we can fix this reasonably.
> 
> I've spent some time thinking about possible workarounds for this, and
> not really come up with any.  The only feasible thing I can think of
> to do is teach pg_migrator to refuse to migrate if (a) the old DB
> contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
> can be checked in pg_control).  One question here is how you decide
> if the old DB contains contrib/isn.  I don't think looking for the
> type name per se is a hot idea.  The best plan that has come to mind
> is to look through pg_proc to see if there are any C-language functions
> that reference "$libdir/isn".

Sure, pg_migrator is good at checking.  Please confirm you want this
added to pg_migrator.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Alvaro Herrera
Date:
Bruce Momjian escribió:

> For longterm strategy, let me list the challenges for pg_migrator from
> any major upgrade (listed in the DEVELOPERS file):
> 
>  Change                                  Conversion Method
>  ------------------------------------------------------------------------------
>  clog                                    none
>  heap page header, including bitmask     convert to new page format on read
>  tuple header, including bitmask         convert to new page format on read
>  data value format                       create old data type in new cluster
>  index page format                       reindex, or recreate old index methods

TOAST changes?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: pg_migrator issue with contrib

From
Tom Lane
Date:
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> I've spent some time thinking about possible workarounds for this, and
>> not really come up with any.  The only feasible thing I can think of
>> to do is teach pg_migrator to refuse to migrate if (a) the old DB
>> contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
>> can be checked in pg_control).  One question here is how you decide
>> if the old DB contains contrib/isn.  I don't think looking for the
>> type name per se is a hot idea.  The best plan that has come to mind
>> is to look through pg_proc to see if there are any C-language functions
>> that reference "$libdir/isn".

> Sure, pg_migrator is good at checking.  Please confirm you want this
> added to pg_migrator.

Yeah, I'd suggest it.  Even if we later come up with a workaround for
contrib/isn, you're going to want to have the infrastructure in place
for this type of check, because there will surely be cases that need it.

Note that I think the FLOAT8PASSYVAL check is a must.  There is no
reason to forbid migrating isn on 32-bit machines, for example.
        regards, tom lane


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Alvaro Herrera wrote:
> Bruce Momjian escribi?:
> 
> > For longterm strategy, let me list the challenges for pg_migrator from
> > any major upgrade (listed in the DEVELOPERS file):
> > 
> >  Change                                  Conversion Method
> >  ------------------------------------------------------------------------------
> >  clog                                    none
> >  heap page header, including bitmask     convert to new page format on read
> >  tuple header, including bitmask         convert to new page format on read
> >  data value format                       create old data type in new cluster
> >  index page format                       reindex, or recreate old index methods
> 
> TOAST changes?

Good point, added.

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + If your life is a hard drive, Christ can be your backup. +


Re: pg_migrator issue with contrib

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> I've spent some time thinking about possible workarounds for this, and
> >> not really come up with any.  The only feasible thing I can think of
> >> to do is teach pg_migrator to refuse to migrate if (a) the old DB
> >> contains contrib/isn, and (b) the new DB has FLOAT8PASSYVAL (which
> >> can be checked in pg_control).  One question here is how you decide
> >> if the old DB contains contrib/isn.  I don't think looking for the
> >> type name per se is a hot idea.  The best plan that has come to mind
> >> is to look through pg_proc to see if there are any C-language functions
> >> that reference "$libdir/isn".
>
> > Sure, pg_migrator is good at checking.  Please confirm you want this
> > added to pg_migrator.
>
> Yeah, I'd suggest it.  Even if we later come up with a workaround for
> contrib/isn, you're going to want to have the infrastructure in place
> for this type of check, because there will surely be cases that need it.
>
> Note that I think the FLOAT8PASSYVAL check is a must.  There is no
> reason to forbid migrating isn on 32-bit machines, for example.

Done, with patch attached, and pg_migrator beta6 released.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
? tools
? log
? src/pg_migrator
Index: src/controldata.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/controldata.c,v
retrieving revision 1.14
diff -c -r1.14 controldata.c
*** src/controldata.c    13 May 2009 15:19:16 -0000    1.14
--- src/controldata.c    8 Jun 2009 21:29:31 -0000
***************
*** 23,29 ****
   */
  void
  get_control_data(migratorContext *ctx, const char *bindir,
!                  const char *datadir, ControlData *ctrl)
  {
      char        cmd[MAXPGPATH];
      char        bufin[MAX_STRING];
--- 23,30 ----
   */
  void
  get_control_data(migratorContext *ctx, const char *bindir,
!                  const char *datadir, ControlData *ctrl,
!                  const uint32 pg_version)
  {
      char        cmd[MAXPGPATH];
      char        bufin[MAX_STRING];
***************
*** 43,48 ****
--- 44,50 ----
      bool        got_index = false;
      bool        got_toast = false;
      bool        got_date_is_int = false;
+     bool        got_float8_pass_by_value = false;
      char       *lang = NULL;

      /* Because we test the pg_resetxlog output strings, it has to be in English. */
***************
*** 65,71 ****
      /* Only pre-8.4 has these so if they are not set below we will check later */
      ctrl->lc_collate = NULL;
      ctrl->lc_ctype = NULL;
!
      /* we have the result of cmd in "output". so parse it line by line now */
      while (fgets(bufin, sizeof(bufin), output))
      {
--- 67,80 ----
      /* Only pre-8.4 has these so if they are not set below we will check later */
      ctrl->lc_collate = NULL;
      ctrl->lc_ctype = NULL;
!
!     /* Not in pre-8.4 */
!     if (pg_version < 80400)
!     {
!         ctrl->float8_pass_by_value = false;
!         got_float8_pass_by_value = true;
!     }
!
      /* we have the result of cmd in "output". so parse it line by line now */
      while (fgets(bufin, sizeof(bufin), output))
      {
***************
*** 249,254 ****
--- 258,275 ----
              ctrl->date_is_int = strstr(p, "64-bit integers") != NULL;
              got_date_is_int = true;
          }
+         else if ((p = strstr(bufin, "Float8 argument passing:")) != NULL)
+         {
+             p = strchr(p, ':');
+
+             if (p == NULL || strlen(p) <= 1)
+                 pg_log(ctx, PG_FATAL, "%d: pg_resetxlog  problem\n", __LINE__);
+
+             p++;                /* removing ':' char */
+             /* used later for /contrib check */
+             ctrl->float8_pass_by_value = strstr(p, "by value") != NULL;
+             got_float8_pass_by_value = true;
+         }
          /* In pre-8.4 only */
          else if ((p = strstr(bufin, "LC_COLLATE:")) != NULL)
          {
***************
*** 305,311 ****
      if (!got_xid || !got_oid || !got_log_id || !got_log_seg || !got_tli ||
          !got_align || !got_blocksz || !got_largesz || !got_walsz ||
          !got_walseg || !got_ident || !got_index || !got_toast ||
!         !got_date_is_int)
      {
          pg_log(ctx, PG_REPORT,
              "Some required control information is missing;  cannot find:\n");
--- 326,332 ----
      if (!got_xid || !got_oid || !got_log_id || !got_log_seg || !got_tli ||
          !got_align || !got_blocksz || !got_largesz || !got_walsz ||
          !got_walseg || !got_ident || !got_index || !got_toast ||
!         !got_date_is_int || !got_float8_pass_by_value)
      {
          pg_log(ctx, PG_REPORT,
              "Some required control information is missing;  cannot find:\n");
***************
*** 352,357 ****
--- 373,382 ----
          if (!got_date_is_int)
              pg_log(ctx, PG_REPORT, "  dates/times are integers?\n");

+         /* value added in Postgres 8.4 */
+         if (!got_float8_pass_by_value)
+             pg_log(ctx, PG_REPORT, "  float8 argument passing method\n");
+
          pg_log(ctx, PG_FATAL,
                 "Unable to continue without required control information, terminating\n");
      }
Index: src/pg_migrator.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/pg_migrator.c,v
retrieving revision 1.42
diff -c -r1.42 pg_migrator.c
*** src/pg_migrator.c    4 Jun 2009 22:01:55 -0000    1.42
--- src/pg_migrator.c    8 Jun 2009 21:29:31 -0000
***************
*** 71,76 ****
--- 71,77 ----
      {
          v8_3_check_for_name_data_type_usage(&ctx, CLUSTER_OLD);
          v8_3_check_for_tsquery_usage(&ctx, CLUSTER_OLD);
+         v8_3_check_for_isn_and_int8_passing_mismatch(&ctx, CLUSTER_OLD);
          if (ctx.check)
          {
              v8_3_rebuild_tsvector_tables(&ctx, true, CLUSTER_OLD);
***************
*** 560,568 ****

      /* get/check pg_control data of servers */
      get_control_data(ctx, ctx->old.bindir, ctx->old.pgdata,
!                      &ctx->old.controldata);
      get_control_data(ctx, ctx->new.bindir, ctx->new.pgdata,
!                      &ctx->new.controldata);
      check_control_data(ctx, &ctx->old.controldata, &ctx->new.controldata);
  }

--- 561,569 ----

      /* get/check pg_control data of servers */
      get_control_data(ctx, ctx->old.bindir, ctx->old.pgdata,
!                      &ctx->old.controldata, ctx->old.pg_version);
      get_control_data(ctx, ctx->new.bindir, ctx->new.pgdata,
!                      &ctx->new.controldata, ctx->new.pg_version);
      check_control_data(ctx, &ctx->old.controldata, &ctx->new.controldata);
  }

Index: src/pg_migrator.h
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/pg_migrator.h,v
retrieving revision 1.51
diff -c -r1.51 pg_migrator.h
*** src/pg_migrator.h    8 Jun 2009 18:33:32 -0000    1.51
--- src/pg_migrator.h    8 Jun 2009 21:29:31 -0000
***************
*** 145,150 ****
--- 145,151 ----
      uint32        index;
      uint32        toast;
      bool        date_is_int;
+     bool        float8_pass_by_value;
      char       *lc_collate;
      char       *lc_ctype;
      char       *encoding;
***************
*** 244,250 ****
  /* controldata.c */

  void get_control_data(migratorContext *ctx, const char *bindir,
!                  const char *datadir, ControlData *ctrl);
  void check_control_data(migratorContext *ctx, ControlData *oldctrl,
                     ControlData *newctrl);
  void set_locale_and_encoding(migratorContext *ctx, Cluster whichCluster);
--- 245,252 ----
  /* controldata.c */

  void get_control_data(migratorContext *ctx, const char *bindir,
!                  const char *datadir, ControlData *ctrl,
!                  const uint32 pg_version);
  void check_control_data(migratorContext *ctx, ControlData *oldctrl,
                     ControlData *newctrl);
  void set_locale_and_encoding(migratorContext *ctx, Cluster whichCluster);
***************
*** 375,380 ****
--- 377,384 ----
                              Cluster whichCluster);
  void        v8_3_check_for_tsquery_usage(migratorContext *ctx,
                              Cluster whichCluster);
+ void        v8_3_check_for_isn_and_int8_passing_mismatch(migratorContext *ctx,
+                             Cluster whichCluster);
  void        v8_3_rebuild_tsvector_tables(migratorContext *ctx,
                              bool check_mode, Cluster whichCluster);
  void        v8_3_invalidate_hash_gin_indexes(migratorContext *ctx,
Index: src/version.c
===================================================================
RCS file: /cvsroot/pg-migrator/pg_migrator/src/version.c,v
retrieving revision 1.17
diff -c -r1.17 version.c
*** src/version.c    31 May 2009 20:25:48 -0000    1.17
--- src/version.c    8 Jun 2009 21:29:31 -0000
***************
*** 89,95 ****
                  "| Your installation uses the \"name\" data type in user tables.\n"
                  "| This data type changed its internal alignment between your old and\n"
                  "| new clusters so this cluster cannot currently be upgraded.\n"
!                 "| You can remove the problem tables and restart this program.\n"
                  "| Simply dropping the offending columns will not work.\n"
                  "| A list of the problem columns is in the file\n"
                  "| \"%s\".\n\n", output_path);
--- 89,95 ----
                  "| Your installation uses the \"name\" data type in user tables.\n"
                  "| This data type changed its internal alignment between your old and\n"
                  "| new clusters so this cluster cannot currently be upgraded.\n"
!                 "| You can remove the problem tables and restart the migration.\n"
                  "| Simply dropping the offending columns will not work.\n"
                  "| A list of the problem columns is in the file\n"
                  "| \"%s\".\n\n", output_path);
***************
*** 178,184 ****
                  "| Your installation uses the \"tsquery\" data type.\n"
                  "| This data type added a new internal field between your old and\n"
                  "| new clusters so this cluster cannot currently be upgraded.\n"
!                 "| You can remove the problem columns and restart this program.\n"
                  "| A list of the problem columns is in the file\n"
                  "| \"%s\".\n\n", output_path);
      }
--- 178,184 ----
                  "| Your installation uses the \"tsquery\" data type.\n"
                  "| This data type added a new internal field between your old and\n"
                  "| new clusters so this cluster cannot currently be upgraded.\n"
!                 "| You can remove the problem columns and restart the migration.\n"
                  "| A list of the problem columns is in the file\n"
                  "| \"%s\".\n\n", output_path);
      }
***************
*** 188,193 ****
--- 188,286 ----


  /*
+  * v8_3_check_for_isn_and_int8_passing_mismatch()
+  *
+  *    /contrib/isn relies on data type bigint, and the CREATE TYPE
+  *  PASSEDBYVALUE setting might not match in the old and new servers,
+  *    so the new shared object file would work unreliably.  Passing int8
+  *    by value is new in Postgres 8.4.
+  */
+ void
+ v8_3_check_for_isn_and_int8_passing_mismatch(migratorContext *ctx, Cluster whichCluster)
+ {
+     ClusterInfo    *active_cluster = (whichCluster == CLUSTER_OLD) ?
+                     &ctx->old : &ctx->new;
+     int            dbnum;
+     FILE        *script = NULL;
+     bool        found = false;
+     char        output_path[MAXPGPATH];
+
+     prep_status(ctx, "Checking for /contrib/isn with bigint-passing mismatch");
+
+     if (ctx->old.controldata.float8_pass_by_value ==
+         ctx->new.controldata.float8_pass_by_value)
+     {
+         /* no mismatch */
+         check_ok(ctx);
+         return;
+     }
+
+     snprintf(output_path, sizeof(output_path), "%s/contrib_isn_and_int8_pass_by_value.txt",
+             ctx->home_dir);
+
+     for (dbnum = 0; dbnum < active_cluster->dbarr.ndbs; dbnum++)
+     {
+         PGresult   *res;
+          bool        db_used = false;
+         int            ntups;
+         int            rowno;
+         int            i_nspname, i_proname;
+         DbInfo       *active_db = &active_cluster->dbarr.dbs[dbnum];
+         PGconn       *conn = connectToServer(ctx, active_db->db_name, whichCluster);
+
+         /* Find any user-defined tsquery columns */
+         res = executeQueryOrDie(ctx, conn,
+                                 "SELECT n.nspname, p.proname "
+                                 "FROM    pg_catalog.pg_proc p, "
+                                 "        pg_catalog.pg_namespace n "
+                                 "WHERE    p.pronamespace = n.oid AND "
+                                 "        p.probin = '$libdir/isn' AND "
+                                 "        n.nspname != 'pg_catalog' AND "
+                                 "        n.nspname != 'information_schema'");
+
+         ntups = PQntuples(res);
+         i_nspname = PQfnumber(res, "nspname");
+         i_proname = PQfnumber(res, "proname");
+         for (rowno = 0; rowno < ntups; rowno++)
+         {
+             found = true;
+             if (script == NULL && (script = fopen(output_path, "w")) == NULL)
+                     pg_log(ctx, PG_FATAL, "Could not create necessary file:  %s\n", output_path);
+             if (!db_used)
+             {
+                 fprintf(script, "Database:  %s\n", active_db->db_name);
+                 db_used = true;
+             }
+             fprintf(script, "  %s.%s\n",
+                     PQgetvalue(res, rowno, i_nspname),
+                     PQgetvalue(res, rowno, i_proname));
+         }
+
+         PQclear(res);
+
+         PQfinish(conn);
+     }
+
+     if (found)
+     {
+         fclose(script);
+         pg_log(ctx, PG_REPORT, "fatal\n");
+         pg_log(ctx, PG_FATAL,
+                 "| Your installation uses \"/contrib/isn\" functions\n"
+                 "| which rely on the bigint data type.  Your old and new\n"
+                 "| clusters pass bigint values differently so this cluster\n"
+                 "| cannot currently be upgraded.  You can manually migrate\n"
+                 "| data that use \"/contrib/isn\" facilities and remove\n"
+                 "| \"/contrib/isn\" from the old cluster and restart the\n"
+                 "| migration.  A list of the problem functions is in the\n"
+                 "| file \"%s\".\n\n", output_path);
+     }
+     else
+         check_ok(ctx);
+ }
+
+
+ /*
   * v8_3_rebuild_tsvector_tables()
   *
   * 8.3 sorts lexemes by its length and if lengths are the same then it uses