Thread: Re: pgsql: Add option to use ICU as global locale provider
Re: Peter Eisentraut > Since some (legacy) code still uses the libc locale facilities > directly, we still need to set the libc global locale settings even if > ICU is otherwise selected. So pg_database now has three > locale-related fields: the existing datcollate and datctype, which are > always set, and a new daticulocale, which is only set if ICU is > selected. A similar change is made in pg_collation for consistency, > but in that case, only the libc-related fields or the ICU-related > field is set, never both. Since the intended usage seems to be that databases should either be using libc, or the ICU locales, but probably not both at the same time, does it make sense to clutter the already very wide `psql -l` output with two new extra columns? This hardly fits in normal-size terminals: =# \l List of databases Name │ Owner │ Encoding │ Collate │ Ctype │ ICU Locale │ Locale Provider │ Access privileges ───────────┼───────┼──────────┼────────────┼────────────┼────────────┼─────────────────┼─────────────────── postgres │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ template0 │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ =c/myon ↵ │ │ │ │ │ │ │ myon=CTc/myon template1 │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ =c/myon ↵ │ │ │ │ │ │ │ myon=CTc/myon (3 rows) (Even longer if the username is "postgres") It also makes \l+ even harder to read when the most often only relevant new column, the database size, is even more to the far right. Couldn't that be a single "Locale" column, possibly extended by more info in parentheses if the values differ? Locale de_DE.utf8 de-x-icu-whatever de_DE.utf8 (Ctype: C.UTF-8) SQL_ASCII (ICU Locale: en-x-something) Christoph
On 19.03.22 18:53, Christoph Berg wrote: > Re: Peter Eisentraut >> Since some (legacy) code still uses the libc locale facilities >> directly, we still need to set the libc global locale settings even if >> ICU is otherwise selected. So pg_database now has three >> locale-related fields: the existing datcollate and datctype, which are >> always set, and a new daticulocale, which is only set if ICU is >> selected. A similar change is made in pg_collation for consistency, >> but in that case, only the libc-related fields or the ICU-related >> field is set, never both. > > Since the intended usage seems to be that databases should either be > using libc, or the ICU locales, but probably not both at the same > time, does it make sense to clutter the already very wide `psql -l` > output with two new extra columns? Good point, let me think about that.
Re: Peter Eisentraut > > Since the intended usage seems to be that databases should either be > > using libc, or the ICU locales, but probably not both at the same > > time, does it make sense to clutter the already very wide `psql -l` > > output with two new extra columns? > > Good point, let me think about that. A possible solution might be to rip out all the locale columns except "Encoding" from \l, and leave them in place for \l+. For \l+, I'd suggest moving the database size and the tablespace to the front, after owner. Christoph
Christoph Berg <myon@debian.org> writes: > A possible solution might be to rip out all the locale columns except > "Encoding" from \l, and leave them in place for \l+. I'd rather see a single column summarizing the locale situation. Perhaps it could be COALESCE(daticulocale, datcollate), or something using a CASE on datlocprovider? Then \l+ could replace that with all the underlying columns. > For \l+, I'd suggest moving the database size and the tablespace to > the front, after owner. I think it's confusing if the + and non-+ versions of a command present their columns in inconsistent orders. I'm not dead set against this, but -0.5 or so. regards, tom lane
Re: To Peter Eisentraut > This hardly fits in normal-size terminals: > > =# \l > List of databases > Name │ Owner │ Encoding │ Collate │ Ctype │ ICU Locale │ Locale Provider │ Access privileges > ───────────┼───────┼──────────┼────────────┼────────────┼────────────┼─────────────────┼─────────────────── > postgres │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ > template0 │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ =c/myon ↵ > │ │ │ │ │ │ │ myon=CTc/myon > template1 │ myon │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ │ libc │ =c/myon ↵ > │ │ │ │ │ │ │ myon=CTc/myon > (3 rows) Another gripe here: The above is the output when run against a PG15 cluster, created without an ICU locale set. When running psql 15 against PG 14, the output is this: $ psql -l List of databases Name │ Owner │ Encoding │ Collate │ Ctype │ ICU Locale │ Locale Provider │ Access privileges ───────────┼──────────┼──────────┼────────────┼────────────┼────────────┼─────────────────┼─────────────────────── postgres │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │ template0 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │ =c/postgres ↵ │ │ │ │ │ │ │ postgres=CTc/postgres template1 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │ =c/postgres ↵ │ │ │ │ │ │ │ postgres=CTc/postgres (3 rows) The "ICU Locale" column is now populated, that seems wrong. The problem is in the else branch in src/bin/psql/describe.c around line 900: + if (pset.sversion >= 150000) + appendPQExpBuffer(&buf, + " d.daticulocale as \"%s\",\n" + " CASE d.datlocprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\",\ + gettext_noop("ICU Locale"), + gettext_noop("Locale Provider")); + else + appendPQExpBuffer(&buf, + " d.datcollate as \"%s\",\n" <--- there + " 'libc' AS \"%s\",\n", + gettext_noop("ICU Locale"), + gettext_noop("Locale Provider")); I'd think this should rather be + " '' as \"%s\",\n" Christoph
On Fri, Apr 15, 2022, at 11:58 AM, Christoph Berg wrote:
When running psql 15 against PG 14, the output is this:$ psql -lList of databasesName │ Owner │ Encoding │ Collate │ Ctype │ ICU Locale │ Locale Provider │ Access privileges───────────┼──────────┼──────────┼────────────┼────────────┼────────────┼─────────────────┼───────────────────────postgres │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │template0 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │ =c/postgres ↵│ │ │ │ │ │ │ postgres=CTc/postgrestemplate1 │ postgres │ UTF8 │ de_DE.utf8 │ de_DE.utf8 │ de_DE.utf8 │ libc │ =c/postgres ↵│ │ │ │ │ │ │ postgres=CTc/postgres(3 rows)The "ICU Locale" column is now populated, that seems wrong.
Good catch!
The problem is in the else branch in src/bin/psql/describe.c aroundline 900:+ if (pset.sversion >= 150000)+ appendPQExpBuffer(&buf,+ " d.daticulocale as \"%s\",\n"+ " CASE d.datlocprovider WHEN 'c' THEN 'libc' WHEN 'i' THEN 'icu' END AS \"%s\",\+ gettext_noop("ICU Locale"),+ gettext_noop("Locale Provider"));+ else+ appendPQExpBuffer(&buf,+ " d.datcollate as \"%s\",\n" <--- there+ " 'libc' AS \"%s\",\n",+ gettext_noop("ICU Locale"),+ gettext_noop("Locale Provider"));I'd think this should rather be+ " '' as \"%s\",\n"
Since dataiculocale allows NULL, my suggestion is to use NULL instead of an
empty string. It is consistent with a cluster whose locale provider is libc.
"Euler Taveira" <euler@eulerto.com> writes: > On Fri, Apr 15, 2022, at 11:58 AM, Christoph Berg wrote: >> When running psql 15 against PG 14, the output is this: >> The "ICU Locale" column is now populated, that seems wrong. > Good catch! Indeed. > Since dataiculocale allows NULL, my suggestion is to use NULL instead of an > empty string. It is consistent with a cluster whose locale provider is libc. Yeah, I agree. We should make the pre-v15 output match what you'd see if looking at a non-ICU v15 database. regards, tom lane
psql -l and locales (Re: pgsql: Add option to use ICU as global locale provider)
From
Christoph Berg
Date:
Re: Tom Lane > Christoph Berg <myon@debian.org> writes: > > A possible solution might be to rip out all the locale columns except > > "Encoding" from \l, and leave them in place for \l+. > > I'd rather see a single column summarizing the locale situation. > Perhaps it could be COALESCE(daticulocale, datcollate), or > something using a CASE on datlocprovider? > Then \l+ could replace that with all the underlying columns. Fwiw I still think the default psql -l output should be more concise. Any chance to have that happen for PG15? I can try creating a patch if it has chances of getting through. Christoph