Re: Per-column collation, proof of concept - Mailing list pgsql-hackers

From Jaime Casanova
Subject Re: Per-column collation, proof of concept
Msg-id AANLkTin5aq1Xq-eetWSM6OjK88j7b_+STWUTa4P8xL8Y@mail.gmail.com
In response to Re: Per-column collation, proof of concept  (Peter Eisentraut <peter_e@gmx.net>)
Responses Re: Per-column collation, proof of concept
List pgsql-hackers
Hi,

Sorry for the delay.
By the way, the patch no longer applies cleanly. Most of the rejects are
minor hunks; the worst one is in src/backend/catalog/namespace.c, because
FindConversionByName() is now called get_conversion_oid(). So maybe the
new function should be named get_collation_oid(), I guess.

On Tue, Aug 3, 2010 at 11:32 AM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On mån, 2010-08-02 at 01:43 -0500, Jaime Casanova wrote:
>> Nowadays, CREATE DATABASE has an lc_collate clause. Is the new collate
>> clause similar to lc_collate?
>> I mean, is lc_collate what we will use as a default?
>
> Yes, if you do not specify anything per column, the database default is
> used.
>
> How to integrate the per-database or per-cluster configuration with the
> new system is something to figure out in the future.
>

Well, at least pg_collation should be a shared catalog, no?
And I think we shouldn't move ahead with this without first thinking
about how to integrate it with at least the per-database configuration.

>> if yes, then probably we need to use pg_collation there too because
>> lc_collate and the new collate clause use different collation names.
>> """
>> postgres=# create database test with lc_collate 'en_US.UTF-8';
>> CREATE DATABASE
>> test=# create table t1 (col1 text collate "en_US.UTF-8");
>> ERROR:  collation "en_US.UTF-8" does not exist
>> test=# create table t1 (col1 text collate "en_US.utf8");
>> CREATE TABLE
>> """
>
> This is something that libc does for you.  The locale as listed by
> locale -a is called "en_US.utf8", but apparently libc takes
> "en_US.UTF-8" as well.
>

ok, but at least this is confusing
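
To illustrate why the two spellings end up meaning the same locale, here
is a minimal Python sketch of codeset normalization. It assumes the rule
is "lowercase the codeset and drop anything that is not a letter or a
digit", which is how glibc is generally understood to match locale names.
The helper names normalize_codeset() and canonical_locale_name() are
hypothetical and are not part of the patch or of PostgreSQL.

```python
def normalize_codeset(codeset: str) -> str:
    """Lowercase the codeset and drop non-alphanumeric characters."""
    return "".join(ch.lower() for ch in codeset if ch.isalnum())


def canonical_locale_name(name: str) -> str:
    """Hypothetical helper: give 'en_US.UTF-8' and 'en_US.utf8' one spelling."""
    base, dot, rest = name.partition(".")
    if not dot:
        # No codeset part at all, e.g. "C" or "en_US": leave it untouched.
        return name
    # Keep an optional "@modifier" suffix (e.g. "@euro") as it is.
    codeset, at, modifier = rest.partition("@")
    return base + "." + normalize_codeset(codeset) + at + modifier


if __name__ == "__main__":
    for spelling in ("en_US.UTF-8", "en_US.utf8", "C"):
        print(spelling, "->", canonical_locale_name(spelling))
```

If collation names were compared after a normalization like this, both
CREATE TABLE statements in the session quoted above would refer to the
same collation. Whether the lookup should behave that way is a design
question for the patch, not something this sketch settles.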

Also, it doesn't recognize the C collation, although it is listed in
locales.txt:
"""
test3=# create database test4 with template=template0 encoding 'utf-8'
lc_collate='C';
CREATE DATABASE
test3=# create table t3 (col1 text collate "C" );
ERROR:  collation "C" does not exist
"""

BTW, why the double quotes?

>> Also, I got errors from the regression tests when MULTIBYTE=UTF8
>> (attached). It seems the tests were trying to create locales that
>> weren't defined in locales.txt (where is that file generated from?).
>> I added a line to that file (for es_EC.utf8), then created a table
>> with a column using that collation and executed
>> "select * from t2 where col1 > 'n'; "
>> and I got this error: "ERROR:  could not create locale "es_EC.utf8""
>> (of course, that last part was me messing things up, but it shows
>> we shouldn't be using a locales.txt file, I think)
>
> It might be that you don't have those locales installed in your system.
> locales.txt is created by using locale -a.  Check what that gives you.
>

Sorry to state the obvious, but this doesn't work on Windows, does it?
And for some reason it also didn't work on a CentOS 5 machine (this
error occurred during initdb):
"""
loading system objects' descriptions ... ok
creating collations ...FATAL:  invalid byte sequence for encoding
"UTF8": 0xe56c09
CONTEXT:  COPY tmp_pg_collation, line 86
STATEMENT:  COPY tmp_pg_collation FROM
E'/usr/local/pgsql/9.1/share/locales.txt';
"""
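
The reported bytes 0xe5 0x6c 0x09 are not valid UTF-8: 0xe5 opens a
three-byte sequence, but 0x6c ("l") is not a continuation byte. Read as
ISO-8859-1 they would be "ål" followed by a tab, so one plausible cause is
a locale name in the list that is spelled in Latin-1. The following Python
sketch shows one way to locate such lines before the file is loaded. The
function name find_non_utf8_lines() and the sample data are made up for
illustration; this is not part of the patch.

```python
def find_non_utf8_lines(data: bytes):
    """Return (line_number, raw_line) pairs for lines that are not valid UTF-8."""
    bad = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError:
            bad.append((lineno, line))
    return bad


if __name__ == "__main__":
    # Made-up sample: the second line contains a Latin-1 encoded 0xe5 byte.
    sample = b"en_US.utf8\nbokm\xe5l\tbokm\xe5l\nC\n"
    for lineno, raw in find_non_utf8_lines(sample):
        print(lineno, raw)
```

To check a real file, read it in binary mode, for example
open(path, "rb").read(), and pass the bytes to the function. Line 86 of
the locales.txt from the failing initdb would be the first place to look.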

--
Jaime Casanova         www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

