Re: Upgrading locale issues - Mailing list pgsql-general

From rihad
Subject Re: Upgrading locale issues
Date
Msg-id c8ceefea-88af-08a9-08ed-a4fc6ed223c7@mail.ru
In response to Re: Upgrading locale issues  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Upgrading locale issues
List pgsql-general
On 05/02/2019 12:26 AM, Peter Geoghegan wrote:
> On Mon, Apr 29, 2019 at 7:45 AM rihad <rihad@mail.ru> wrote:
>> Hi. Today we ran pg_ctl promote on a slave server (10.7) and started
>> using it as a master. The OS also got upgraded FreeBSD 10.4 -> FreeBSD
>> 11.2. And you guessed it, most varchar indexes got corrupted because
>> system locale changed in subtle ways. So I created the extension amcheck
>> and reindexed all bad indexes one by one. Is there any way to prevent
>> such things in the future? Will switching to ICU fix all such issues?
> Not necessarily, but it will detect the incompatibility more or less
> automatically, making it far more likely that the problem will be
> caught before it does any harm. ICU versions collations, giving
> Postgres a way to reason about their compatibility over time. The libc
> collations are not versioned, though (at least not in any standard way
> that Postgres can take advantage of).
>
>> The problem with it is that ICU collations are absent in pg_collation,
>> initdb should be run to create them, but pg_basebackup only runs on an
>> empty base directory, so I couldn't run initdb + pg_basebackup to
>> prepare the replica server. I believe I can run the create collation
>> command manually, but what would it look like for en-x-icu?
> It is safe to call pg_import_system_collations() directly, which is
> all that initdb does. This is documented, so you wouldn't be relying
> on a hack.
>
Thanks for the reply. Do you know what a "decent" ICU collation would be
to attach to a column definition so that it mimics what we get from a
UTF-8 locale for a multilingual column? Maybe und-x-icu? In most cases we
aren't much concerned about sort order; we just want the indexes to
cope better with future PG/ICU upgrades. But what does und(efined) even
mean with respect to collations? With a UTF-8 locale, at least some
concrete collation is specified, like en_US.UTF-8. Will ORDER BY
"icu_und_column" return results in a completely undefined order?
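
For concreteness, here is a minimal sketch of what trying this out might look like on 10.x. It assumes the server was built with ICU support and that the statements run on the primary, since they write to the catalogs. The table and index names (icu_demo, icu_demo_title_idx) are hypothetical and only for illustration. As far as the PostgreSQL documentation describes it, und-x-icu is the ICU "root" collation, a language-agnostic order, so the ORDER BY below should be deterministic rather than arbitrary; treat that as something to verify against your own data.

```sql
-- Import any OS/ICU collations that are missing from pg_collation.
-- This is the documented function mentioned in the quoted reply;
-- it returns the number of collations it created.
SELECT pg_import_system_collations('pg_catalog');

-- Hypothetical table: bind the ICU root collation to a multilingual column.
CREATE TABLE icu_demo (
    id    serial PRIMARY KEY,
    title varchar(255) COLLATE "und-x-icu"
);

-- The index inherits the column's collation.
CREATE INDEX icu_demo_title_idx ON icu_demo (title);

-- Sorting uses the column collation unless overridden with COLLATE.
SELECT title FROM icu_demo ORDER BY title;

-- After an OS or ICU upgrade: list ICU collations whose recorded version
-- differs from the version the installed library now reports.
-- Indexes that depend on any collation listed here are candidates for REINDEX.
SELECT collname,
       collversion,
       pg_collation_actual_version(oid) AS actual_version
FROM pg_collation
WHERE collprovider = 'i'
  AND collversion IS DISTINCT FROM pg_collation_actual_version(oid);
```

The last query only reports mismatches; it does not repair anything. Once the affected indexes have been rebuilt, ALTER COLLATION ... REFRESH VERSION updates the recorded version.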



