Re: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values - Mailing list pgsql-bugs

From	Peter Geoghegan
Subject	Re: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values
Date	August 8, 2017 02:21:19
Msg-id	CAH2-Wzm=HJ6_TXjftfXv+Nk69xBvRd=Pc8N0BPy+oHzjq-Gw=Q@mail.gmail.com
In response to	Re: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
List	pgsql-bugs

On Mon, Aug 7, 2017 at 12:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Well, the fact that they're "redundant" doesn't really help you if
> you can't pg_upgrade because the collation name you chose in v10 is
> not present in initdb's results in v11.  So this is still a serious
> issue to my mind.

I agree.

Even MongoDB has ICU support these days. They specifically document
which collations are supported. It's just the same for DB2, and other
systems that build their collations on ICU. Users do not "use the ICU
collations" on these other systems. They simply use the collations
that are available, choosing from a list in the documentation, or
possibly create their own collations with their own customization.

The ICU collations are based on the CLDR data and an IETF standard's
idea of a locale identifier [1], so in an important sense they're
supposed to be universal; they're not tied to ICU in particular. This
is probably why ICU is ridiculously forgiving of alternate collation
names, and will not throw an error if you specify an ICU collation
name that is total garbage within CREATE COLLATION (there is a
Postgres regression test that proves this for ICU, actually). As far
as ICU is concerned, the name may come from end-user input over the
web, where it makes sense to be so forgiving.

Even stuff like the names for emoji collations, or phonebook
collations, are covered by a standard, though it's not quite an IETF
standard. RFC 6067 says that the CLDR data is the authoritative source
of which variant subtags are allowed, and ICU uses CLDR, from the
Unicode Consortium.
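
To make the naming scheme concrete, here is a minimal sketch of how a
collation type such as "phonebk" or "emoji" sits inside a BCP 47 locale
identifier, in the "-u-" extension that RFC 6067 defines. The function
name collation_type is hypothetical, and this is not how ICU or
Postgres parse locale names. It only splits the tag; it does not check
the result against CLDR, and it does not handle a private-use "x"
section that happens to contain a "u" subtag.

```python
def collation_type(tag):
    """Return the type of the 'co' (collation) keyword in a BCP 47 tag.

    Illustrative only: performs no validation against CLDR data.
    Returns None when the tag has no 'u' extension or no 'co' keyword.
    """
    subtags = tag.lower().split("-")  # BCP 47 tags are case-insensitive
    if "u" not in subtags:
        return None

    key = None
    keywords = {}
    for sub in subtags[subtags.index("u") + 1:]:
        if len(sub) == 1:
            # Another singleton (e.g. 't' or 'x') ends the 'u' extension.
            break
        if len(sub) == 2:
            # Two-character subtags inside the extension are keys.
            key = sub
            keywords[key] = []
        elif key is not None:
            # Longer subtags after a key are that key's type.
            keywords[key].append(sub)
        # Longer subtags before any key are attributes; they are skipped.

    if "co" not in keywords:
        return None
    return "-".join(keywords["co"]) or None


print(collation_type("de-u-co-phonebk"))
print(collation_type("und-u-co-emoji"))
print(collation_type("en-US"))
```

Because nothing here consults CLDR, a made-up type after "co" would be
returned just as readily as a real one, which is much like the
forgiving behavior described above.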

We need to move further away from the idea that there are ICU
collations just like there are libc collations.

[1] https://www.rfc-editor.org/rfc/bcp/bcp47.txt
-- 
Peter Geoghegan

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
