Re: Upgrading locale issues - Mailing list pgsql-general

From Thomas Munro
Subject Re: Upgrading locale issues
Date
Msg-id CA+hUKG+sMX1BEhWoWHHcmM2mf4BiJgONjDTnY+-oY9qj073HhA@mail.gmail.com
Whole thread Raw
In response to Re: Upgrading locale issues  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: Upgrading locale issues
List pgsql-general
On Thu, May 2, 2019 at 8:26 AM Peter Geoghegan <pg@bowt.ie> wrote:
> On Mon, Apr 29, 2019 at 7:45 AM rihad <rihad@mail.ru> wrote:
> > Hi. Today we run pg_ctl promote on a slave server (10.7) and started
> > using it as a master. The OS also got upgraded FreeBSD 10.4 -> FreeBSD
> > 11.2. And you guessed it, most varchar indexes got corrupted because
> > system local changed in subtle ways. So I created the extension amcheck
> > and reindexed all bad indexes one by one. Is there any way to prevent
> > such things in the future? Will switching to ICU fix all such issues?
>
> Not necessarily, but it will detect the incompatibility more or less
> automatically, making it far more likely that the problem will be
> caught before it does any harm. ICU versions collations, giving
> Postgres a way to reason about their compatibility over time. The libc
> collations are not versioned, though (at least not in any standard way
> that Postgres can take advantage of).

As discussed over on -hackers[1], I think it's worth pursuing that
though.  FWIW I've proposed locale versioning for FreeBSD's libc[2].
The reason I haven't gone further with that yet even though the code
change has been accepted in principle by FreeBSD reviewers is because
I got stuck on the question of how exactly to model the versions.  If,
say, just Turkish changes, I don't want to be rebuilding my French
indexes, which means that I don't think you can use the CLDR version
string.  Frustratingly, you could probably do a good job of that by
just checksumming the collation definition files, but that seems a bit
too crude.

There is also the question of how PostgreSQL should model versions,
and as I've argued in [1], I think we should track them at the level
of database object dependencies.

I'm hoping to reopen this can of worms for PostgreSQL 13 (and the
corresponding support could in theory be in FreeBSD 13... coincidence,
or a sign!?)

> > The problem with it is that ICU collations are absent in pg_collation,
> > initdb should be run to create them, but pg_basebackup only runs on an
> > empty base directory, so I couldn't run initdb + pg_basebackup to
> > prepare the replica server. I believe I can run the create collation
> > command manually, but what would it look like for en-x-icu?
>
> It is safe to call pg_import_system_collations() directly, which is
> all that initdb does. This is documented, so you wouldn't be relying
> on a hack.

Unfortunately you can't use ICU collations as a database default yet
(though there was some WIP code[3]), so ICU only saves you from
versioning problems if you explicitly set collations for columns or
expressions, and even then the version tracking is currently just a
warning that you clear manually with a command, not a mechanism that
really tracks which database objects were last rebuilt/validated with
a given version.

[1]
https://www.postgresql.org/message-id/flat/CAEepm%3D0uEQCpfq_%2BLYFBdArCe4Ot98t1aR4eYiYTe%3DyavQygiQ%40mail.gmail.com
[2] https://reviews.freebsd.org/D17166
[3] https://www.postgresql.org/message-id/flat/3366.1498183854%40sss.pgh.pa.us

-- 
Thomas Munro
https://enterprisedb.com



pgsql-general by date:

Previous
From: Mark Zellers
Date:
Subject: Migrating an application with Oracle temporary tables
Next
From: Igal Sapir
Date:
Subject: Starting Postgres when there is no disk space