Re: utf8 vs UTF-8 - Mailing list pgsql-general
From | Adrian Klaver |
---|---|
Subject | Re: utf8 vs UTF-8 |
Date | |
Msg-id | f510e041-7e9b-4745-847b-06b9dcce6281@aklaver.com |
In response to | Re: utf8 vs UTF-8 (Troels Arvin <troels@arvin.dk>) |
Responses | Re: utf8 vs UTF-8 |
List | pgsql-general |
On 5/18/24 07:48, Troels Arvin wrote:
> Hello,
>
> Tom Lane wrote:
> >>   test1 | loc_test | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8
> >>   test3 | troels   | UTF8 | libc | en_US.utf8  | en_US.utf8
> >
> > On most if not all platforms, both those spellings of the locale names
> > will be taken as valid.  You might try running "locale -a" to get an
> > idea of which one is preferred according to your current libc
> > installation
>
> "locale -a" on the Ubuntu system outputs this:
>
>    C
>    C.utf8
>    en_US.utf8
>    POSIX

If you expand that to locale -v -a you get:

locale: en_US.utf8      archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------------------
    title | English locale for the USA
   source | Free Software Foundation, Inc.
  address | https://www.gnu.org/software/libc/
    email | bug-glibc-locales@gnu.org
 language | American English
territory | United States
 revision | 1.0
     date | 2000-06-24
  codeset | UTF-8

> So at first, I thought en_US.utf8 would be the most correct locale
> identifier. However, when I look at Postgres' own databases, they have
> the slightly different locale string:
>
> psql --list | grep -E 'postgres|template'
>   postgres  | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>   template0 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>   template1 | postgres | UTF8 | libc | en_US.UTF-8 | en_US.UTF-8 | ...
>
> Also, when I try to create a database with "en_US.utf8" as locale
> without specifying a template:
>
> troels=# create database test4 locale 'en_US.utf8';
> ERROR:  new collation (en_US.utf8) is incompatible with the collation of
> the template database (en_US.UTF-8)
> HINT:  Use the same collation as in the template database, or use
> template0 as template.

I'm going to say that is Postgres being exact to a fault.

>
> Given the locale of Postgres' own databases and Postgres' error message,
> I'm leaning to en_US.UTF-8 being the most correct locale to use. Because
> why would Postgres care about it, if utf8/UTF-8 doesn't matter?
>
>
> >> but TBH, I doubt it's worth worrying about.
>
> But couldn't there be an issue, if for example the client's locale and
> the server's locale aren't exactly the same? I'm thinking maybe the
> client library has to perform unneeded translation of the stream of data
> to/from the database?

-- 
Adrian Klaver
adrian.klaver@aklaver.com
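
To illustrate the difference between an exact string comparison (what the CREATE DATABASE error above reflects) and a comparison that treats the two codeset spellings as the same locale, here is a minimal Python sketch. It assumes the codeset part of a locale name can be compared by lowercasing it and dropping punctuation, which matches the statement quoted above that both spellings are taken as valid on most platforms; it is not a description of what Postgres or any particular libc does internally. The helper names normalize_codeset, normalize_locale and same_locale are hypothetical.

```python
import re


def normalize_codeset(codeset: str) -> str:
    """Lowercase the codeset and drop anything that is not a letter or digit.

    Assumption for illustration only: 'UTF-8' and 'utf8' name the same codeset.
    """
    return re.sub(r"[^a-z0-9]", "", codeset.lower())


def normalize_locale(name: str) -> str:
    """Normalize only the codeset part of 'lang_TERRITORY.codeset@modifier'."""
    base, at, modifier = name.partition("@")
    lang, dot, codeset = base.partition(".")
    if dot:
        base = f"{lang}.{normalize_codeset(codeset)}"
    return base + at + modifier


def same_locale(a: str, b: str) -> bool:
    """Compare two locale names, ignoring the spelling of the codeset."""
    return normalize_locale(a) == normalize_locale(b)


if __name__ == "__main__":
    template = "en_US.UTF-8"
    requested = "en_US.utf8"
    # Exact comparison: the two strings differ.
    print(template == requested)
    # Codeset-insensitive comparison: the two names are treated as equal.
    print(same_locale(template, requested))
```

Under that assumption the two names differ only as strings, which fits the reading that the error comes from an exact comparison of the names rather than from any difference in the locale itself. As the HINT says, using template0 as the template avoids the check.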