Re: Per-column collation, work in progress - Mailing list pgsql-hackers

From	Peter Eisentraut
Subject	Re: Per-column collation, work in progress
Date	September 23, 2010 06:03:22
Msg-id	1285232583.27917.11.camel@vanquo.pezone.net
In response to	Re: Per-column collation, work in progress (Pavel Stehule <pavel.stehule@gmail.com>)
Responses	Re: Per-column collation, work in progress
List	pgsql-hackers

On Thu, 2010-09-23 at 10:12 +0200, Pavel Stehule wrote:
> 1. It's doesn't work with SQL 92 rules for sortby list. I can
> understand so explicit COLLATE using doesn't work, but the implicit
> using doesn't work too:
> 
> CREATE TABLE foo(a text, b text COLLATE "cs_CZ.UTF8")
> 
> SELECT * FROM foo ORDER BY 1 -- produce wrong order

I can't reproduce that.  Please provide more details.

> 2. Why default encoding for collate is static? There are latin2 for
> czech, cs_CZ and cs_CZ.iso88592. So any user with UTF8 has to write
> encoding explicitly. But the more used and preferred encoding is UTF8
> now. I am thinking so cs_CZ on utf8 database should mean cs_CS.UTF8.

That's tweakable.  One idea I had is to strip the ".utf8" suffix from
locale names when populating the pg_collation catalog, or create both
versions.  I agree that the current way is a bit cumbersome.
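
To make the suffix-stripping idea concrete, here is a rough sketch in
Python rather than backend C. It is not taken from the patch:
collation_names and keep_both are hypothetical names, and it assumes
collation names are registered per database encoding, so that a stripped
"cs_CZ" in a UTF8 database does not collide with the latin2 locale of
the same name.

```python
import re

# Hypothetical helper, not PostgreSQL code: derive the collation names to
# register for one OS locale name such as "cs_CZ.utf8".
_UTF8_SUFFIX = re.compile(r"\.utf-?8$", re.IGNORECASE)


def collation_names(locale_name, keep_both=True):
    """Return the catalog names to create for a single locale name."""
    stripped = _UTF8_SUFFIX.sub("", locale_name)
    if stripped == locale_name:
        # No UTF-8 suffix (e.g. "cs_CZ" or "cs_CZ.iso88592"): keep as is.
        return [locale_name]
    # Either create both versions, or only the short one.
    return [stripped, locale_name] if keep_both else [stripped]


for name in ("cs_CZ.utf8", "cs_CZ.iso88592", "cs_CZ"):
    print(name, "->", collation_names(name))
```

With keep_both=True, existing references to "cs_CZ.utf8" would keep
working while the short name also becomes available; with
keep_both=False only the short name is created.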

> 3. postgres=# select to_char(current_date,'tmday') collate "cs_CZ.utf8";
>  to_char
> ──────────
>  thursday -- bad result
> (1 row)

As was already pointed out, collation only covers lc_collate and
lc_ctype.  (It could cover other things, for example an application to
the money type was briefly discussed, but that's outside the current
mandate.)

As a point of order, what you wrote above attaches a collation to the
result of the function call.  To get the collation to apply to the
function call itself, you have to put the collate clause on one of the
arguments, e.g.,

select to_char(current_date,'tmday' collate "cs_CZ.utf8");

> 4. is somewhere ToDo for collation implementation?

At the moment it's mostly in the source code.  I have a list of notes
locally that I can clean up and put in the wiki once we agree on the
general direction.

> 5.
> 
> postgres=# create table xy(a text, b text collate "cs_CZ");
> ERROR:  collation "cs_CZ" for current database encoding "UTF8" does not exist
> 
> can be there some more friendly message or hint ? like "you cannot to
> use a different encoding". This collate is in pg_collates table.

That can surely be polished.
