Re: [HACKERS] ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values) - Mailing list pgsql-hackers

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> On 8/6/17 20:07, Peter Geoghegan wrote:
>> I've looked into this. I'll give an example of what keyword variants
>> there are for Greek, and then discuss what I think each is.

> I'm not sure why we want to get into editorializing this.  We query ICU
> for the names of distinct collations and use that.  It's more than most
> people need, sure, but it doesn't cost us anything.

Yes, *it does*.  The cost will be borne by users who get screwed at update
time, not by developers, but that doesn't make it insignificant.

> The alternatives are hand-maintaining a list of collations, or
> installing no collations by default.  Both of those are arguably worse
> for users or for future code maintenance or both.

I'm not (yet) convinced that we need a hand-maintained whitelist.  But
I am wondering why we're expending extra code to import keyword variants.
Who is that catering to, really?

The thing that I'm particularly thinking about is that if someone wants
an ICU variant collation that we didn't make initdb provide, they'll do
a CREATE COLLATION and go use it.  At update time, pg_dump or pg_upgrade
will export/import that via CREATE COLLATION, and the only way it fails
is if ICU rejects the collation name as garbage.  (Which, as we already
established upthread, it's quite unlikely to do.)  On the other hand,
if someone relies on an ICU variant collation that initdb did import,
and then in the next release that collation doesn't get imported because
ICU changed their minds on what to advertise, the update situation is not
pretty at all.  Certainly it won't get handled transparently.  This line
of thinking leads me to believe that we ought to be pretty conservative
about what we import during initdb.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: [HACKERS] ICU collation variant keywords and pg_collation entries(Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)
Next
From: Noah Misch
Date:
Subject: Re: [HACKERS] possible encoding issues with libxml2 functions