Re: ICU locale validation / canonicalization - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: ICU locale validation / canonicalization
Date
Msg-id 7ee414ad-deb5-1144-8a0e-b34ae3b71cd5@enterprisedb.com
Whole thread Raw
In response to Re: ICU locale validation / canonicalization  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: ICU locale validation / canonicalization
List pgsql-hackers
On 22.03.23 19:05, Jeff Davis wrote:
> On Tue, 2023-03-21 at 10:35 +0100, Peter Eisentraut wrote:
>> [PATCH v6 1/6] Support language tags in older ICU versions (53 and
>>    earlier).
>>
>> In pg_import_system_collations(), this is now redundant and can be
>> simplified:
>>
>> -               if (!pg_is_ascii(langtag) || !pg_is_ascii(iculocstr))
>> +               if (!pg_is_ascii(langtag) || !pg_is_ascii(langtag))
>>
>> icu_set_collation_attributes() needs more commenting about what is
>> going
>> on.  My guess is that uloc_canonicalize() converts from language tag
>> to
>> ICU locale ID, and then the existing logic to parse that apart would
>> apply.  Is that how it works?
> 
> Fixed the redundancy, added some comments, and committed 0001.

So, does uloc_canonicalize() always convert to ICU locale IDs?  What if 
you pass a language tag, does it convert it to ICU locale ID as well?

>> [PATCH v6 2/6] Wrap ICU ucol_open().

> So I'm inclined to just leave initdb alone in patches 0002 and 0003.

0002 and 0003 look ok to me now.

In 0002, the error "opening default collator is not supported", should 
that be an assert or an elog?  Is it reachable by the user?

You might want to check the declarations at the top of pg_ucol_open(). 
0003 reformats them after they were just added in 0002.  Maybe check 
that they are pgindent'ed in 0002 properly.

I don't understand patch 0004.  It seems to do two things, handle 
C/POSIX locale specifications and add an SQL-callable function.  Are 
those connected?




pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Allow logical replication to copy tables in binary format
Next
From: Alexandr Nikulin
Date:
Subject: Improve the performance of nested loop join in the case of partitioned inner table