Re: ICU for global collation - Mailing list pgsql-hackers

From Marina Polyakova
Subject Re: ICU for global collation
Date
Msg-id f385ba25e7f8be427b8c582e5cca7d79@postgrespro.ru
Whole thread Raw
In response to Re: ICU for global collation  (Justin Pryzby <pryzby@telsasoft.com>)
Responses Re: ICU for global collation
Re: ICU for global collation
List pgsql-hackers
Hello everyone in this thread!

While reading and testing the patch that adds ICU for global collations 
[1] I noticed on master (1c5818b9c68e5c2ac8f19d372f24cce409de1a26) and 
REL_15_STABLE (63b64d8270691894a9a8f2d4e929e7780020edb8) that:

1) pg_upgrade from REL_14_STABLE 
(63b64d8270691894a9a8f2d4e929e7780020edb8) does not always work:

For REL_14_STABLE:

$ initdb -D data_old

For REL_15_STABLE or master:

$ initdb -D data_new --locale-provider icu --icu-locale ru-RU
$ pg_upgrade -d .../data_old -D data_new -b ... -B ...
...
Restoring database schemas in the new cluster
   template1
*failure*

Consult the last few lines of 
"data_new/pg_upgrade_output.d/20220815T142454.223/log/pg_upgrade_dump_1.log" 
for
the probable cause of the failure.
Failure, exiting

In 
data_new/pg_upgrade_output.d/20220815T142454.223/log/pg_upgrade_dump_1.log:

pg_restore: error: could not execute query: server closed the connection 
unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
Command was: CREATE DATABASE "template1" WITH TEMPLATE = template0 OID = 
1 ENCODING = 'UTF8' LOCALE_PROVIDER = libc LOCALE = 'en_US.UTF-8';

In 
data_new/pg_upgrade_output.d/20220815T142454.223/log/pg_upgrade_server.log:

TRAP: FailedAssertion("(dblocprovider == COLLPROVIDER_ICU && 
dbiculocale) || (dblocprovider != COLLPROVIDER_ICU && !dbiculocale)", 
File: "dbcommands.c", Line: 1292, PID: 69247)
postgres: marina postgres [local] CREATE 
DATABASE(ExceptionalCondition+0xb9)[0xb4d8ec]
postgres: marina postgres [local] CREATE 
DATABASE(createdb+0x1abc)[0x68ca99]
postgres: marina postgres [local] CREATE 
DATABASE(standard_ProcessUtility+0x651)[0x9b1d82]
postgres: marina postgres [local] CREATE 
DATABASE(ProcessUtility+0x122)[0x9b172a]
postgres: marina postgres [local] CREATE DATABASE[0x9b01cf]
postgres: marina postgres [local] CREATE DATABASE[0x9b0433]
postgres: marina postgres [local] CREATE 
DATABASE(PortalRun+0x2fe)[0x9af95d]
postgres: marina postgres [local] CREATE DATABASE[0x9a953b]
postgres: marina postgres [local] CREATE 
DATABASE(PostgresMain+0x733)[0x9ada6b]
postgres: marina postgres [local] CREATE DATABASE[0x8ec632]
postgres: marina postgres [local] CREATE DATABASE[0x8ebfbb]
postgres: marina postgres [local] CREATE DATABASE[0x8e8653]
postgres: marina postgres [local] CREATE 
DATABASE(PostmasterMain+0x1226)[0x8e7f26]
postgres: marina postgres [local] CREATE DATABASE[0x7bbccb]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3)[0x7eff082f90b3]
postgres: marina postgres [local] CREATE DATABASE(_start+0x2e)[0x49c29e]
2022-08-15 14:24:56.124 MSK [69231] LOG:  server process (PID 69247) was 
terminated by signal 6: Aborted
2022-08-15 14:24:56.124 MSK [69231] DETAIL:  Failed process was running: 
CREATE DATABASE "template1" WITH TEMPLATE = template0 OID = 1 ENCODING = 
'UTF8' LOCALE_PROVIDER = libc LOCALE = 'en_US.UTF-8';

1.1) It looks like there's a bug in the function get_db_infos 
(src/bin/pg_upgrade/info.c), where the version of the old cluster is 
always checked:

if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
    snprintf(query + strlen(query), sizeof(query) - strlen(query),
             "'c' AS datlocprovider, NULL AS daticulocale, ");
else
    snprintf(query + strlen(query), sizeof(query) - strlen(query),
             "datlocprovider, daticulocale, ");

With the simple patch

diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 
df374ce4b362b4c6c87fc1fd0e476e5d6d353d9e..53ea348e211d3ac38334292bc16cb814bc13bb87 
100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -319,7 +319,7 @@ get_db_infos(ClusterInfo *cluster)

      snprintf(query, sizeof(query),
               "SELECT d.oid, d.datname, d.encoding, d.datcollate, d.datctype, ");
-    if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
+    if (GET_MAJOR_VERSION(cluster->major_version) < 1500)
          snprintf(query + strlen(query), sizeof(query) - strlen(query),
                   "'c' AS datlocprovider, NULL AS daticulocale, ");
      else

I got the expected error during the upgrade:

locale providers for database "template1" do not match:  old "libc", new 
"icu"
Failure, exiting

1.2) It looks like the mentioned asserion in dbcommands.c conflicts with 
the following lines earlier:

if (dbiculocale == NULL)
    dbiculocale = src_iculocale;

The following patch works for me:

diff --git a/src/backend/commands/dbcommands.c 
b/src/backend/commands/dbcommands.c
index 
b31a30550b025d48ba3cc250dc4c15f41f9a80be..17a2942341e528c01182fb6d4878580f2706bec9 
100644
--- a/src/backend/commands/dbcommands.c
+++ b/src/backend/commands/dbcommands.c
@@ -1048,6 +1048,8 @@ createdb(ParseState *pstate, const CreatedbStmt 
*stmt)
                      (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                       errmsg("ICU locale must be specified")));
      }
+    else
+        dbiculocale = NULL;

      if (dblocprovider == COLLPROVIDER_ICU)
          check_icu_locale(dbiculocale);

2) CREATE DATABASE does not always require the icu locale unlike initdb 
and createdb:

$ initdb -D data --locale en_US.UTF-8 --locale-provider icu
...
initdb: error: ICU locale must be specified

$ initdb -D data --locale en_US.UTF-8
$ pg_ctl -D data -l logfile start

$ createdb mydb --locale en_US.UTF-8 --template template0 
--locale-provider icu
createdb: error: database creation failed: ERROR:  ICU locale must be 
specified

$ psql -c "CREATE DATABASE mydb LOCALE \"en_US.UTF-8\" TEMPLATE 
template0 LOCALE_PROVIDER icu" postgres
CREATE DATABASE

$ psql -c "CREATE DATABASE mydb TEMPLATE template0 LOCALE_PROVIDER icu" 
postgres
ERROR:  ICU locale must be specified

I'm wondering if this is not a fully-supported feature (because createdb 
creates an SQL command with LC_COLLATE and LC_CTYPE options instead of 
LOCALE option) or is it a bug in CREATE DATABASE?.. From 
src/backend/commands/dbcommands.c:

if (dblocprovider == COLLPROVIDER_ICU && !dbiculocale)
{
    if (dlocale && dlocale->arg)
        dbiculocale = defGetString(dlocale);
}

[1] 
https://github.com/postgres/postgres/commit/f2553d43060edb210b36c63187d52a632448e1d2

-- 
Marina Polyakova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



pgsql-hackers by date:

Previous
From: Melih Mutlu
Date:
Subject: Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication
Next
From: torikoshia
Date:
Subject: Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)