9.2 to 9.5 pg_upgrade losing data - Mailing list pgsql-general

From Pete Fuller
Subject 9.2 to 9.5 pg_upgrade losing data
Date
Msg-id 643C2793-2349-44F5-BF6F-E5C4173FAF0E@3sitracking.com
Whole thread Raw
Responses Re: 9.2 to 9.5 pg_upgrade losing data  (Adrian Klaver <adrian.klaver@aklaver.com>)
List pgsql-general
Hello all,

We are attempting to upgrade a production 9.2 postgres cluster to 9.5. The server is running Centos 7 with the Centos
version- currently 9.2.15 We are installing the postgresql provided rpm of postgresql9.5 from the postgresql repo,
currently9.5.4. In testing and on our development cluster, the upgrade worked without issue but something is happening
onour production DB that we can not figure out. 

The 9.2 cluster is in streaming replication controlled by pacemaker. Steps we have been using are the following.

    • verify the replica server is in a good state with up to date data, then put the machine in standby, stopping the
9.2db 
    • Removing the recovery.conf file and starting the 9.2 server manually to get some stats to verify later (counts of
rowsfrom several key tables) 
    • Stop the 9.2 server, (/bin/pg_ctl stop -D /var/lib/pgsql/data ) and install the 9.5 binaries.
    • init the 9.5 db
    • run a pg_upgrade with the link option. The full command we are using is
/usr/pgsql-9.5/bin/pg_upgrade -d /var/lib/pgsql/data -D /var/lib/pgsql/9.5/data -b /bin -B /usr/pgsql-9.5/bin/ --link
-r-v  
    • run the full vacuum analyze script that pg_upgrade generated

What we are finding in testing is we get thru the upgrade process, start the 9.5 server manually to compare the row
countswe collected immediately before the upgrade and we are seeing different data, sometimes drastically. This data
includesmissing user accounts, etc, that is not perishable data. 

For example - 9.2 standalone before running pg_upgrade gives this information
tracks=# select count(*) from users;
 count
-------
 13945
(1 row)

tracks=# select count(*) from sessions;
 count
-------
   559
(1 row)

PG 9.5 after the upgrade - same machine, upgraded database

tracks=# select count(*) from sessions;
-------
   155
(1 row)

tracks=# select count(*) from users;
 count
-------
 13157
(1 row)


We have enabled verbose logging and have reviewed it but are seeing no errors in the migration. The 9.5 server starts
withoutcomplaint and if we did not do these manual checks we would not see a problem. We have attempted this upgrade on
thismachine repeatedly in prepping for our maintenance windows, and on occasion our manual checks have shown good
numbers. At this point, though, it has failed silently enough times that I have little confidence in the data even if
ourrudimentary checks do show identical counts after the upgrade.  

Has anyone else seen this issue?

The only 'hack' during the upgrade that isn't straight from the postgres_upgrade man page is to handle a renamed option
inpostgresql that cents back ported - unix_socket_directories. Using the rename of pg_ctl that is referenced in the top
answerhere to force a couple variables into the centos provided pg_ctl binary -
http://dba.stackexchange.com/questions/50135/pg-upgrade-unrecognized-configuration-parameter-unix-socket-directory


Any information or suggestions would be helpful at this point.






pgsql-general by date:

Previous
From: Sameer Kumar
Date:
Subject: Re: RowExclusiveLock timeout while autovacuum
Next
From: Михаил
Date:
Subject: regexp_replace double quote