Thread: Collations and codepages

Collations and codepages

From
Raimo Jormakka
Date:
Hi all,

In Windows 7, and using PostgreSQL 9.4.5, the collation gets set to "English_United States.1252" when I select the "English, United States" locale in the installer. In Linux, the collation is set to "en_US.UTF-8". The encoding is set to UTF-8 in both instances.

Will these two instances behave identically in terms of collation logic? And if not, is there something I can do about it? In general, what's the impact of the codepage part of a collation to begin with?

Cheers,
Raimo

Re: Collations and codepages

From
Albe Laurenz
Date:
Raimo Jormakka wrote:
> In Windows 7, and using PostgreSQL 9.4.5, the collation gets set to "English_United States.1252" when
> I select the "English, United States" locale in the installer. In Linux, the collation is set to
> "en_US.UTF-8". The encoding is set to UTF-8 in both instances.
>
> Will these two instances behave identically in terms of collation logic? And if not, is there
> something I can do about it? In general, what's the impact of the codepage part of a collation to
> begin with?

The two collations will probably not behave identically, since PostgreSQL uses the
operating system's collations instead of having its own, and odds are that Microsoft's
collations and glibc's are slightly different.

I don't know if the impact will be large; maybe run a couple of tests to see if the
ordering is similar enough for your purposes.
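
A minimal sketch of such a test, in Python rather than SQL. It asks the operating
system's C library to sort a small sample, which is only a rough proxy: it does not go
through PostgreSQL, so treat any difference it shows as a hint, not as proof. The names
sorted_with_locale and SAMPLE are made up for illustration, and the sample strings are
just guesses at cases where the two platforms might disagree.

```python
import locale

# Hypothetical sample: mixed case, punctuation, spaces, accents and digits.
SAMPLE = ["apple", "Apple", "a-b", "ab", "a b", "Zebra", "zebra",
          "résumé", "resume", "1", "10", "2"]


def sorted_with_locale(words, locale_name=""):
    """Return words sorted by the OS collation for locale_name.

    An empty locale_name means "whatever the environment is set to".
    Raises locale.Error if the OS does not know the requested locale.
    """
    # Remember the current LC_COLLATE setting so it can be restored afterwards.
    previous = locale.setlocale(locale.LC_COLLATE)
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
        # strxfrm turns each string into a key that compares in collation order.
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, previous)
```

Calling sorted_with_locale(SAMPLE, "en_US.UTF-8") on the Linux machine and
sorted_with_locale(SAMPLE, "English_United States.1252") on the Windows machine, then
comparing the two lists, should give a first impression. The more direct check is still
to load the same rows into both databases and compare the output of an ORDER BY.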

I don't think that the actual encoding (UTF-8 or Windows-1252) has any impact on the ordering,
but I am not certain.

Yours,
Laurenz Albe