Re: ICU locale validation / canonicalization - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: ICU locale validation / canonicalization
Msg-id 5293249a-a361-5a5a-a61e-4e8049a75837@enterprisedb.com
In response to Re: ICU locale validation / canonicalization  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: ICU locale validation / canonicalization  (Jeff Davis <pgsql@j-davis.com>)
Re: ICU locale validation / canonicalization  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On 30.03.23 04:33, Jeff Davis wrote:
> Attached is a new version of the final patch, which performs
> canonicalization. I'm not 100% sure that it's wanted, but it still
> seems like a good idea to get the locales into a standard format in the
> catalogs, and if a lot more people start using ICU in v16 (because it's
> the default), then it would be a good time to do it. But perhaps there
> are risks?

I say, let's do it.


I don't think we should show the notice when the canonicalization 
doesn't change anything.  This is not useful:

+NOTICE:  using language tag "und-u-kf-upper" for locale "und-u-kf-upper"

Also, the message should be phrased from the user's perspective rather
than in ICU jargon, for example:

NOTICE:  using canonicalized form "%s" for locale specification "%s"

(Still too many big words?)
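
To make the two suggestions concrete, here is a minimal standalone sketch, not code from the patch. The helper name format_canonicalization_notice and its buffer-based interface are assumptions made for illustration; in the backend the message would presumably go through ereport(NOTICE, ...) instead. It suppresses the notice when the canonical form equals the input and otherwise uses the proposed wording.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (illustration only, not PostgreSQL code).
 *
 * Formats the proposed notice into buf only when canonicalization
 * actually changed the locale string.  Returns 1 if a notice was
 * written, 0 if it was suppressed because nothing changed.
 */
int
format_canonicalization_notice(const char *given, const char *canonical,
                               char *buf, size_t buflen)
{
    /* Nothing changed: stay silent and leave an empty message. */
    if (strcmp(given, canonical) == 0)
    {
        if (buflen > 0)
            buf[0] = '\0';
        return 0;
    }

    /* Wording follows the message proposed above. */
    snprintf(buf, buflen,
             "NOTICE:  using canonicalized form \"%s\" for locale specification \"%s\"",
             canonical, given);
    return 1;
}
```

With this sketch, passing "und-u-kf-upper" as both arguments produces no message, while a pair of differing strings (an old-style locale ID and its language-tag form, say) produces the reworded notice.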


I don't think the special handling of IsBinaryUpgrade is needed or 
wanted.  I would hope that with this feature, all old-style locale IDs 
would go away, but this way we would keep them forever.  If we believe 
that canonicalization is safe, then I don't see why we cannot apply it 
during binary upgrade.


Needs documentation updates in doc/src/sgml/charset.sgml.



