Re: 9.2 to 9.5 pg_upgrade losing data - Mailing list pgsql-general

From Adrian Klaver
Subject Re: 9.2 to 9.5 pg_upgrade losing data
Date
Msg-id 45fd2e61-f13c-d99b-f61e-e2f0db1b9a62@aklaver.com
Whole thread Raw
In response to Re: 9.2 to 9.5 pg_upgrade losing data  (Pete Fuller <pfuller@3sitracking.com>)
Responses Re: 9.2 to 9.5 pg_upgrade losing data  (Pete Fuller <pfuller@3sitracking.com>)
List pgsql-general
On 08/15/2016 07:40 AM, Pete Fuller wrote:
> Directories are correct.  We do not utilize tablespaces.

Anything obviously different in the setup between your production
servers and the testing and development clusters?

>
>
>
>
>> On Aug 15, 2016, at 10:06 AM, Adrian Klaver <adrian.klaver@aklaver.com
>> <mailto:adrian.klaver@aklaver.com>> wrote:
>>
>> On 08/15/2016 06:20 AM, Pete Fuller wrote:
>>> Hello all,
>>>
>>> We are attempting to upgrade a production 9.2 postgres cluster to
>>> 9.5. The server is running Centos 7 with the Centos version -
>>> currently 9.2.15 We are installing the postgresql provided rpm of
>>> postgresql9.5 from the postgresql repo, currently 9.5.4. In testing
>>> and on our development cluster, the upgrade worked without issue but
>>> something is happening on our production DB that we can not figure out.
>>>
>>> The 9.2 cluster is in streaming replication controlled by pacemaker.
>>> Steps we have been using are the following.
>>>
>>> • verify the replica server is in a good state with up to date data,
>>> then put the machine in standby, stopping the 9.2 db
>>> • Removing the recovery.conf file and starting the 9.2 server
>>> manually to get some stats to verify later (counts of rows from
>>> several key tables)
>>> • Stop the 9.2 server, (/bin/pg_ctl stop -D /var/lib/pgsql/data ) and
>>> install the 9.5 binaries.
>>> • init the 9.5 db
>>> • run a pg_upgrade with the link option. The full command we are using is
>>> /usr/pgsql-9.5/bin/pg_upgrade -d /var/lib/pgsql/data -D
>>> /var/lib/pgsql/9.5/data -b /bin -B /usr/pgsql-9.5/bin/ --link -r -v
>>
>> Are you sure you are linking the appropriate directories?
>>
>> Are there any tablespaces in the mix?
>>
>>> • run the full vacuum analyze script that pg_upgrade generated
>>>
>>> What we are finding in testing is we get thru the upgrade process,
>>> start the 9.5 server manually to compare the row counts we collected
>>> immediately before the upgrade and we are seeing different data,
>>> sometimes drastically. This data includes missing user accounts, etc,
>>> that is not perishable data.
>>>
>>> For example - 9.2 standalone before running pg_upgrade gives this
>>> information
>>> tracks=# select count(*) from users;
>>> count
>>> -------
>>> 13945
>>> (1 row)
>>>
>>> tracks=# select count(*) from sessions;
>>> count
>>> -------
>>>   559
>>> (1 row)
>>>
>>> PG 9.5 after the upgrade - same machine, upgraded database
>>>
>>> tracks=# select count(*) from sessions;
>>> -------
>>>   155
>>> (1 row)
>>>
>>> tracks=# select count(*) from users;
>>> count
>>> -------
>>> 13157
>>> (1 row)
>>>
>>>
>>> We have enabled verbose logging and have reviewed it but are seeing
>>> no errors in the migration. The 9.5 server starts without complaint
>>> and if we did not do these manual checks we would not see a problem.
>>> We have attempted this upgrade on this machine repeatedly in prepping
>>> for our maintenance windows, and on occasion our manual checks have
>>> shown good numbers.  At this point, though, it has failed silently
>>> enough times that I have little confidence in the data even if our
>>> rudimentary checks do show identical counts after the upgrade.
>>>
>>> Has anyone else seen this issue?
>>>
>>> The only 'hack' during the upgrade that isn't straight from the
>>> postgres_upgrade man page is to handle a renamed option in postgresql
>>> that cents back ported - unix_socket_directories. Using the rename of
>>> pg_ctl that is referenced in the top answer here to force a couple
>>> variables into the centos provided pg_ctl binary -
>>> http://dba.stackexchange.com/questions/50135/pg-upgrade-unrecognized-configuration-parameter-unix-socket-directory
>>>
>>>
>>> Any information or suggestions would be helpful at this point.
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> Adrian Klaver
>> adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: ERROR: MultiXactId XXXXX has not been created yet -- apparent wraparound
Next
From: Pete Fuller
Date:
Subject: Re: 9.2 to 9.5 pg_upgrade losing data