Re: Built-in CTYPE provider - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: Built-in CTYPE provider
Date
Msg-id a8804ef9-fda6-4660-9f98-ecd1315f958c@eisentraut.org
In response to Re: Built-in CTYPE provider  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Built-in CTYPE provider
Re: Built-in CTYPE provider
List pgsql-hackers

On 21.03.24 01:13, Jeff Davis wrote:
> The v26 patch was not quite complete, so I didn't commit it yet.
> Attached v27-0001 and 0002.
> 
> 0002 is necessary because otherwise lc_collate_is_c() short-circuits
> the version check in pg_newlocale_from_collation(). With 0002, the code
> is simpler and all paths go through pg_newlocale_from_collation(), and
> the version check happens even when lc_collate_is_c().
> 
> But perhaps there was a reason the code was the way it was, so
> submitting for review in case I missed something.
> 
>> 0005 and 0006 don't contain any test cases.  So I guess they are
>> really only usable via 0007.  Is that understanding correct?
> 0005 is not a functional change, it's just a refactoring to use a
> callback, which is preparation for 0007.
> 
>> Are there any test cases that illustrate the word boundary changes in
>> patch 0005?  It might be useful to test those against Oracle as well.
> The tests include initcap('123abc') which is '123abc' in the PG_C_UTF8
> collation vs '123Abc' in PG_UNICODE_FAST.
> 
> The reason for the latter behavior is that the Unicode Default Case
> Conversion algorithm for toTitlecase() advances to the next Cased
> character before mapping to titlecase, and digits are not Cased. ICU
> has a configurable adjustment, and defaults in a way that produces
> '123abc'.
> 
> New rebased series attached.

The patch set v27 is ok with me, modulo (a) discussion about initcap 
semantics, and (b) what collation to assign to ucs_basic, which can be 
revisited later.
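
For readers following the initcap discussion, here is a minimal Python sketch of the two titlecasing behaviors described in the quoted text. It is not the PostgreSQL or ICU implementation: the function names are hypothetical, it handles a single word only (no word-boundary detection), and is_cased() only approximates the Unicode "Cased" property by general category (Lu, Ll, Lt).

```python
import unicodedata


def is_cased(ch: str) -> bool:
    # Assumption: approximate the Unicode "Cased" property using only the
    # general categories Lu, Ll and Lt. The real property covers more.
    return unicodedata.category(ch) in ("Lu", "Ll", "Lt")


def titlecase_first_char(word: str) -> str:
    # Titlecase whatever character starts the word, lowercase the rest.
    # A leading digit has no case mapping, so it is left unchanged.
    if not word:
        return word
    return word[0].title() + word[1:].lower()


def titlecase_first_cased(word: str) -> str:
    # Advance to the first Cased character, titlecase it, and lowercase
    # everything after it. Characters before it are left as they are.
    for i, ch in enumerate(word):
        if is_cased(ch):
            return word[:i] + ch.title() + word[i + 1:].lower()
    return word  # no cased character at all


print(titlecase_first_char("123abc"))   # 123abc
print(titlecase_first_cased("123abc"))  # 123Abc
```

The first function reproduces the '123abc' result reported above for PG_C_UTF8 (and for ICU's default adjustment); the second reproduces the '123Abc' result reported for PG_UNICODE_FAST, because the digits are skipped before the titlecase mapping is applied.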



