Re: [HACKERS] Can ICU be used for a database's default sort order? - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Can ICU be used for a database's default sort order?
Msg-id 17334.1498259653@sss.pgh.pa.us
In response to Re: [HACKERS] Can ICU be used for a database's default sort order?  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Peter Geoghegan <pg@bowt.ie> writes:
> On Fri, Jun 23, 2017 at 11:32 AM, Peter Eisentraut
> <peter.eisentraut@2ndquadrant.com> wrote:
>> 1) Associate by name only.  That is, you can create a database with any
>> COLLATION "foo" that you want, and it's only checked when you first
>> connect to or do anything in the database.
>> 
>> 2) Create shared collations.  Then we'd need a way to manage having a
>> mix of shared and non-shared collations around.
>> 
>> There are significant pros and cons to all of these ideas.  Some people
>> I talked to appeared to prefer the shared collations approach.

> I strongly prefer the second approach. The only downside that occurs
> to me is that that approach requires more code. Is there something
> that I've missed?

I'm not very clear on how you'd bootstrap template1 into anything
other than C locale in the second approach.  With our existing
libc-based stuff, it's possible to define what the database's locale
is before there are any catalogs.  It's not apparent how to do that with
a collation-based solution.

In my mind, collations are just a SQL-syntax wrapper for locales that
are really defined one level down.  I think we'd be well advised to
carry that same approach into the database properties, because otherwise
we have circularities to deal with. So I'm imagining something more like

create database encoding 'utf8' lc_collate 'icu-en_US' lc_ctype ...

where lc_collate is just a string that we know how to interpret, the
same as now.
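
To make "a string that we know how to interpret" concrete, here is a minimal sketch in Python rather than backend C. It assumes a convention in which a leading "icu-" selects ICU and anything else is passed through as a libc locale name. The names parse_locale_string, LocaleSpec and KNOWN_PREFIXES are hypothetical and are not part of any existing PostgreSQL API. The point it illustrates is that the provider can be worked out from the string alone, without a catalog lookup.

```python
from typing import NamedTuple

# Hypothetical prefix table; only "icu-" appears in the proposal above.
KNOWN_PREFIXES = {"icu": "icu"}


class LocaleSpec(NamedTuple):
    provider: str  # "icu" or "libc"
    name: str      # locale name to hand to that provider


def parse_locale_string(value: str) -> LocaleSpec:
    """Split a string such as 'icu-en_US' into (provider, locale name).

    A string without a recognized prefix is treated as a plain libc
    locale name and returned unchanged.
    """
    prefix, sep, rest = value.partition("-")
    if sep and rest and prefix in KNOWN_PREFIXES:
        return LocaleSpec(KNOWN_PREFIXES[prefix], rest)
    return LocaleSpec("libc", value)
```

Under these assumptions, parse_locale_string("icu-en_US") yields provider "icu" with name "en_US", while "en_US.UTF-8" and "C" fall through to libc unchanged.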

We could optionally reduce the amount of notation involved by merging the
lc_collate and lc_ctype parameters into one, say

create database encoding 'utf8' locale 'icu-en_US' ...
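
As a rough illustration of the merged parameter, the sketch below (Python, not backend code) shows one way a single locale value could supply both settings. Two things here are assumptions and not part of the proposal: an explicitly given lc_collate or lc_ctype overrides locale, and "C" is the fallback when nothing is given. The function name resolve_locale_options is hypothetical.

```python
from typing import Dict, Optional


def resolve_locale_options(locale: Optional[str] = None,
                           lc_collate: Optional[str] = None,
                           lc_ctype: Optional[str] = None,
                           default: str = "C") -> Dict[str, str]:
    """Expand a merged 'locale' option into lc_collate and lc_ctype.

    Assumption: an explicit lc_collate or lc_ctype wins over 'locale',
    and 'default' is used when neither is supplied.
    """
    base = locale if locale is not None else default
    return {
        "lc_collate": lc_collate if lc_collate is not None else base,
        "lc_ctype": lc_ctype if lc_ctype is not None else base,
    }
```

With these rules, passing only locale='icu-en_US' sets both values to 'icu-en_US', and the longer two-parameter spelling keeps working as before.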

I'm not too clear on how this would play with other libc locale
functionality (lc_monetary and so on), but we'd have to deal with
that question anyway.
        regards, tom lane


