Re: WIP patch: Collation support - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: WIP patch: Collation support
Date
Msg-id 48D8B4F1.6040309@enterprisedb.com
In response to Re: WIP patch: Collation support  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP patch: Collation support  ("Dave Page" <dpage@pgadmin.org>)
List pgsql-hackers

Committed.

Tom Lane wrote:
> * You should try to get rid of LOCALE_NAME_BUFLEN altogether.  Definitely
> the comment about it in pg_control.h is now obsolete.

Yep. I removed LOCALE_NAME_BUFLEN. The real max length of a locale name 
is now NAMEDATALEN, because it's stored in a name field in pg_database. 
NAMEDATALEN is only 64 bytes, whereas LOCALE_NAME_BUFLEN was 128. 64 
bytes should be enough for "en_GB.UTF8" style locale names, but I wonder 
if it's enough for the longer names used on Windows? Could someone 
confirm that, please?
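
To make the length question concrete, here is a minimal sketch of the check involved. It assumes the default NAMEDATALEN of 64, where a name field holds at most NAMEDATALEN - 1 bytes plus the terminating NUL. The helper locale_name_fits() is hypothetical and not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Default value in PostgreSQL; redefined here only so the sketch is
 * self-contained. A "name" column stores at most NAMEDATALEN - 1 bytes
 * followed by a terminating NUL.
 */
#define NAMEDATALEN 64

/*
 * Hypothetical helper (not from the patch): returns true if the locale
 * name can be stored in a name field without truncation.
 */
bool
locale_name_fits(const char *locale_name)
{
    assert(locale_name != NULL);
    return strlen(locale_name) < NAMEDATALEN;
}
```

For instance, locale_name_fits("en_GB.UTF8") and locale_name_fits("English_United Kingdom.1252") both return true. Whether every Windows locale name stays under 64 bytes is exactly the open question above; the sketch does not answer it.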

>     An important restriction, however, is that each database's character set
>     must be compatible with the database's <envar>LC_CTYPE</> setting.
> 
> Also I wonder whether we shouldn't say that it must be compatible with
> LC_CTYPE *and* LC_COLLATE.

I think we should, but that's in fact not what is tested. Even before the 
patch, we only tested that the encoding matches LC_CTYPE; you could set 
LC_COLLATE to anything. I'll work on a subsequent patch to tighten that.
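
As a rough illustration of what such a tightened check could look like, the sketch below asks the C library which codeset a locale name implies and compares it with the database encoding name. It would be called once with the LC_CTYPE value and once with the LC_COLLATE value. Both function names are hypothetical, this is not the code in the tree, and it assumes a POSIX system with nl_langinfo() (so not Windows). It also ignores special cases such as the "C" locale.

```c
#include <assert.h>
#include <ctype.h>
#include <langinfo.h>
#include <locale.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Compare two codeset names, ignoring case and any '-' or '_', so that
 * "UTF8", "utf-8" and "UTF_8" are treated as the same encoding.
 */
bool
codeset_names_match(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;)
    {
        while (*a == '-' || *a == '_')
            a++;
        while (*b == '-' || *b == '_')
            b++;
        if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
            return false;
        if (*a == '\0')
            return true;        /* both strings ended together */
        a++;
        b++;
    }
}

/*
 * Hypothetical check: does the named locale use the given encoding?
 * Temporarily switches LC_CTYPE to the locale, reads its codeset with
 * nl_langinfo(CODESET), then restores the previous setting.
 * Returns false if the locale is unknown or memory runs out.
 */
bool
locale_matches_encoding(const char *locale_name, const char *encoding)
{
    const char *current = setlocale(LC_CTYPE, NULL);
    char       *saved;
    bool        ok;

    if (current == NULL)
        return false;
    saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return false;
    strcpy(saved, current);     /* setlocale's result may be overwritten */

    if (setlocale(LC_CTYPE, locale_name) == NULL)
    {
        free(saved);
        return false;
    }
    ok = codeset_names_match(nl_langinfo(CODESET), encoding);

    setlocale(LC_CTYPE, saved); /* restore the caller's locale */
    free(saved);
    return ok;
}
```

The name comparison is deliberately loose because the C library and the database may spell the same encoding differently.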

> * This makes sense, but then shouldn't we make the identical restriction
> for encoding?
> 
> +    The <literal>COLLATE</> and <literal>CTYPE</> settings must match
> +    those of the template database, except when template0 is used as
> +    template. This is because <literal>COLLATE</> and <literal>CTYPE</>

It wouldn't be as bullet-proof for encoding, because we'd still have the 
problem that the encoding in the shared system tables would be 
ill-defined. That's a pre-existing problem, though. We could simply 
remove support for per-database encodings altogether and fix the 
encoding at initdb time, as Martijn suggested earlier, but now that we 
have per-database locales, per-database encodings are a lot more useful 
as well.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

