Thread: pg_upgrade problem as locale difference in data centers

pg_upgrade problem as locale difference in data centers

From
Yi Sun
Date:
Hello,

We want to upgrade some PG9.6 DB to PG13, the databases locale are different in data centers, some are C, some are ru_RU.UTF-8 and so on... as below

postgres=# select datname,datcollate,datctype from pg_database;
            datname            | datcollate  |  datctype
-------------------------------+-------------+-------------
 postgres                      | C           | C
 template1                     | C           | C
 template0                     | C           | C
 aaa_service                  | C           | C

postgres=# select datname,datcollate,datctype from pg_database;
            datname             | datcollate  |  datctype
--------------------------------+-------------+-------------
 postgres                       | ru_RU.UTF-8 | ru_RU.UTF-8
 template1                      | ru_RU.UTF-8 | ru_RU.UTF-8
 template0                      | ru_RU.UTF-8 | ru_RU.UTF-8
 bbb_service                   | ru_RU.UTF-8 | ru_RU.UTF-8

We use 1 ansible script to proceed the upgrade in data centers, as pg_upgrade will only work once postgres, template1, template0 3 databases locale are same between PG9.6 and PG13, so we test before pg_upgrade, update the 3 databases locale to 'en_US.UTF-8' both in PG9.6 and PG13 as below script

update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8' where datname in ('postgres','template1','template0') and (datcollate<>'en_US.UTF-8' or datctype<>'en_US.UTF-8')

Then test pg_upgrade completed with no error, but as pg documentation says like"collation can't be changed after database creation. Only drop and recreate", the question is if we can update and upgrade this way, will it have problems? 

Thanks and best regards
Sun Yi

Re: pg_upgrade problem as locale difference in data centers

From
Tom Lane
Date:
Yi Sun <yinan81@gmail.com> writes:
> update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8'
> where datname in ('postgres','template1','template0') and
> (datcollate<>'en_US.UTF-8' or datctype<>'en_US.UTF-8')

This seems like a pretty bad idea.  You might get away with it if you've
not added any user-defined indexes in any of those databases, but it's
definitely a case where if things go wrong you'll be told it's your
own fault.

The context of "moving to a new datacenter and also updating across four
Postgres versions" suggests strongly to me that you are also planning a
move to a new OS version, in which case you had better read

https://wiki.postgresql.org/wiki/Locale_data_changes

In short, it seems to me that you are at very great risk of ending
up with corrupt (incorrectly ordered) indexes on textual columns.
If you don't mind reindexing all of those after the update, you
could proceed with this plan.  Otherwise, pg_dump-and-restore might
be a safer idea.

            regards, tom lane



Re: pg_upgrade problem as locale difference in data centers

From
Yi Sun
Date:
As we use ansible to deploy the upgrade, so mentioned the data centers situation. The PostgreSQL is single node and the upgrade will be in the same data center and same Linux server(Centos 7), just will run the ansible to upgrade PG in all Data centers.

For example, in our aaa data center

postgres=# select datname,datcollate,datctype from pg_database;
            datname            | datcollate  |  datctype
-------------------------------+-------------+-------------
 postgres                      |  ru_RU.UTF-8            |  ru_RU.UTF-8
 template1                     |  ru_RU.UTF-8             |  ru_RU.UTF-8
 template0                     |  ru_RU.UTF-8             |  ru_RU.UTF-8
 aaa_service                  |  ru_RU.UTF-8             | ru_RU.UTF-8

we test before pg_upgrade, update the postgres, template1, template0 3 databases locale to 'en_US.UTF-8' both in PG9.6 and PG13 as below script

update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8' where datname in ('postgres','template1','template0') and (datcollate<>'en_US.UTF-8' or datctype<>'en_US.UTF-8')

Then test pg_upgrade completed with no error and Locale is like below:

postgres=# select datname,datcollate,datctype from pg_database;
            datname            | datcollate  |  datctype
-------------------------------+-------------+-------------
 postgres                       |  en_US.UTF-8             |  en_US.UTF-8
 template1                     |  en_US.UTF-8             |  en_US.UTF-8
 template0                     |  en_US.UTF-8             |  en_US.UTF-8
 aaa_service                  |  ru_RU.UTF-8             |  ru_RU.UTF-8

The aaa_service application database Locale is no change, and there is no user-defined table or index in postgres, template1, template0 3 databases. So the question is like our case, if update the postgres, template1, template0 3 databases locale and upgrade this way has problems? If we have to get away with it and choose pg_dump-and-restore or pglogical? Thanks 

Thanks and best regards
Sun Yi

Re: pg_upgrade problem as locale difference in data centers

From
Yi Sun
Date:
Hi Tom,

Thank you for your help.

As we use ansible to deploy the upgrade, so mentioned the data centers situation. The PostgreSQL is single node and the upgrade will be in the same data center and same Linux server(Centos 7), just will run the ansible to upgrade PG in all Data centers.

For example, in our aaa data center

postgres=# select datname,datcollate,datctype from pg_database;
            datname            | datcollate  |  datctype
-------------------------------+-------------+-------------
 postgres                       |  ru_RU.UTF-8             |  ru_RU.UTF-8
 template1                     |  ru_RU.UTF-8             |  ru_RU.UTF-8
 template0                     |  ru_RU.UTF-8             |  ru_RU.UTF-8
 aaa_service                  |  ru_RU.UTF-8             |  ru_RU.UTF-8

we test before pg_upgrade, update the postgres, template1, template0 3 databases locale to 'en_US.UTF-8' both in PG9.6 and PG13 as below script

update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8' where datname in ('postgres','template1','template0') and (datcollate<>'en_US.UTF-8' or datctype<>'en_US.UTF-8')

Then test pg_upgrade completed with no error and Locale is like below:

postgres=# select datname,datcollate,datctype from pg_database;
            datname            | datcollate  |  datctype
-------------------------------+-------------+-------------
 postgres                       |  en_US.UTF-8             |  en_US.UTF-8
 template1                     |  en_US.UTF-8             |  en_US.UTF-8
 template0                     |  en_US.UTF-8             |  en_US.UTF-8
 aaa_service                  |  ru_RU.UTF-8             |  ru_RU.UTF-8

The aaa_service application database Locale is no change, and there is no user-defined table or index in postgres, template1, template0 3 databases. So the question is like our case, if update the postgres, template1, template0 3 databases locale and upgrade this way has problems and risk? If we have to get away with it and choose pg_dump-and-restore or pglogical? Thanks

Best regards
Sun Yi