We recently ran into an issue that I think is fairly severe if you plan
either an OS upgrade with CentOS or a binary move of an existing
PostgreSQL instance to a new machine when the database uses locale
de_DE.UTF-8, so I'd like to share it here.
Here are the details:
Originally, PostgreSQL 9.4 was running on CentOS 5.11/x86_64. The database
in question was initialized with locale de_DE.UTF-8, had previously been
upgraded from 9.2 via pg_upgrade, and then ran without any issues for a while.
After that the customer migrated to new hardware with an OS upgrade to
CentOS 6.6/x86_64. This was done by just remounting the SAN LUN on the new
machine. So far so good, no issues.
However, after a while the developers noticed duplicate values in unique
keys for certain kinds of string values (the format is shown in the
examples below). Since B-tree indexes on text columns rely on the sort
order provided by the operating system's locale, the suspicion was that
this had to do with locales. And yes, the German locale's collation order
changed:
CentOS 5.11 has:
echo -e '156\n1-5-6\n110\n1-1-0' | LANG=de_DE.UTF-8 sort
110
1-1-0
156
1-5-6
CentOS 6.6 does:
echo -e '159\n1-5-9\n110\n1-1-0' | LANG=de_DE.UTF-8 sort
1-1-0
110
1-5-9
159
Interestingly, CentOS 7.1 restores the behavior from CentOS 5.11:
echo -e '159\n1-5-9\n110\n1-1-0' | LANG=de_DE.UTF-8 sort
110
1-1-0
159
1-5-9
There are entries in the CentOS bugtracker regarding other locales:
https://bugs.centos.org/view.php?id=7009
https://bugs.centos.org/view.php?id=6210
So users are encouraged to carefully test their platforms when upgrading.
Checks show that at least RHEL6 and RHEL7 have the same issue, too.
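To compare two hosts before moving a data directory between them, the sort
test above can be reduced to a checksum. This is only a minimal sketch: the
function names are made up for illustration, the four sample strings are the
ones from the examples above, and a matching checksum for this small sample
does not prove that the whole collation is identical. It also assumes the
locale is actually installed (check with `locale -a`), since otherwise the
result will not reflect the de_DE.UTF-8 ordering.

```shell
#!/bin/sh
# Print the sample strings sorted under the given locale
# (defaults to de_DE.UTF-8 when no argument is given).
collation_order() {
    printf '%s\n' 159 1-5-9 110 1-1-0 | LC_ALL="${1:-de_DE.UTF-8}" sort
}

# Reduce the sorted output to a checksum that is easy to compare by eye.
collation_fingerprint() {
    collation_order "$1" | cksum
}

collation_fingerprint de_DE.UTF-8
```

Run it on the old and the new machine; if the two checksums differ, the
sort order of these strings differs between the hosts, and indexes on text
columns should be treated as suspect after the move.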
--
Thanks
Bernd