Re: ICU for global collation - Mailing list pgsql-hackers
From | Julien Rouhaud |
---|---|
Subject | Re: ICU for global collation |
Date | |
Msg-id | 20220817165359.mcco6ufe2vchokqo@jrouhaud Whole thread Raw |
In response to | Re: ICU for global collation (Marina Polyakova <m.polyakova@postgrespro.ru>) |
Responses |
Re: ICU for global collation
|
List | pgsql-hackers |
Hi, On Mon, Aug 15, 2022 at 03:06:32PM +0300, Marina Polyakova wrote: > > 1.1) It looks like there's a bug in the function get_db_infos > (src/bin/pg_upgrade/info.c), where the version of the old cluster is always > checked: > > if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500) > snprintf(query + strlen(query), sizeof(query) - strlen(query), > "'c' AS datlocprovider, NULL AS daticulocale, "); > else > snprintf(query + strlen(query), sizeof(query) - strlen(query), > "datlocprovider, daticulocale, "); > > With the simple patch > > diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c > index df374ce4b362b4c6c87fc1fd0e476e5d6d353d9e..53ea348e211d3ac38334292bc16cb814bc13bb87 > 100644 > --- a/src/bin/pg_upgrade/info.c > +++ b/src/bin/pg_upgrade/info.c > @@ -319,7 +319,7 @@ get_db_infos(ClusterInfo *cluster) > > snprintf(query, sizeof(query), > "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, "); > - if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500) > + if (GET_MAJOR_VERSION(cluster->major_version) < 1500) > snprintf(query + strlen(query), sizeof(query) - strlen(query), > "'c' AS datlocprovider, NULL AS daticulocale, "); > else > > I got the expected error during the upgrade: > > locale providers for database "template1" do not match: old "libc", new > "icu" > Failure, exiting Good catch. There's unfortunately not a lot of regression tests for ICU-initialized clusters. I'm wondering if the build-farm client could be taught about the locale provider rather than assuming libc. Clearly that wouldn't have caught this issue, but it should still increase the coverage a bit (I'm thinking of the recent problem with the abbreviated keys). > 1.2) It looks like the mentioned asserion in dbcommands.c conflicts with the > following lines earlier: > > if (dbiculocale == NULL) > dbiculocale = src_iculocale; > > The following patch works for me: > > diff --git a/src/backend/commands/dbcommands.c > b/src/backend/commands/dbcommands.c > index b31a30550b025d48ba3cc250dc4c15f41f9a80be..17a2942341e528c01182fb6d4878580f2706bec9 > 100644 > --- a/src/backend/commands/dbcommands.c > +++ b/src/backend/commands/dbcommands.c > @@ -1048,6 +1048,8 @@ createdb(ParseState *pstate, const CreatedbStmt *stmt) > (errcode(ERRCODE_INVALID_PARAMETER_VALUE), > errmsg("ICU locale must be specified"))); > } > + else > + dbiculocale = NULL; > > if (dblocprovider == COLLPROVIDER_ICU) > check_icu_locale(dbiculocale); I think it would be better to do that in the various variable initialization. Maybe switch the dblocprovider and dbiculocale initialization, and only initialize dbiculocale if dblocprovider is COLLPROVIDER_ICU. > 2) CREATE DATABASE does not always require the icu locale unlike initdb and > createdb: > > $ createdb mydb --locale en_US.UTF-8 --template template0 --locale-provider > icu > createdb: error: database creation failed: ERROR: ICU locale must be > specified > > $ psql -c "CREATE DATABASE mydb LOCALE \"en_US.UTF-8\" TEMPLATE template0 > LOCALE_PROVIDER icu" postgres > CREATE DATABASE > > I'm wondering if this is not a fully-supported feature (because createdb > creates an SQL command with LC_COLLATE and LC_CTYPE options instead of > LOCALE option) or is it a bug in CREATE DATABASE?.. From > src/backend/commands/dbcommands.c: > > if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale) > { > if (dlocale && dlocale->arg) > dbiculocale = defGetString(dlocale); > } This discrepancy between createdb and CREATE DATABASE looks like an oversight, as createdb indeed interprets --locale as: if (locale) { if (lc_ctype) pg_fatal("only one of --locale and --lc-ctype can be specified"); if (lc_collate) pg_fatal("only one of --locale and --lc-collate can be specified"); lc_ctype = locale; lc_collate = locale; } AFAIK the fallback in the CREATE DATABASE case is expected as POSIX locale names should be accepted by icu, so this should work for createdb too.
pgsql-hackers by date: