Re: Proposal - Collation at database level - Mailing list pgsql-hackers

From Zdenek Kotala
Subject Re: Proposal - Collation at database level
Msg-id 483EC9F9.8040608@sun.com
In response to Proposal - Collation at database level  (Radek Strnad <radek.strnad@gmail.com>)
Responses Re: Proposal - Collation at database level  (Radek Strnad <radek.strnad@gmail.com>)
List pgsql-hackers
Radek Strnad wrote:

<snip>

> 
> I'm thinking of dividing the problem into two parts - in beginning
> pg_collation will contain two functions. One will have hard-coded rules
> for these basic collations (SQL_CHARACTER, GRAPHIC_IRV, LATIN1, ISO8BIT,
> UCS_BASIC). It will compare each string character bitwise and guarantee
> that the implementation will meet the SQL standard implemented in
> PostgreSQL. 
> 
> Second one will allow the user to use installed system locales. The set
> of these collations will obviously vary between systems. Catalogs will
> contain encoding and collation for calling the system locale function.
> This will allow us to use collations such as en_US.utf8, cs_CZ.iso88592
> etc. if they are available.
> 
> We will also need to change the way how strings are compared. Regarding
> the set database collation the right function will be used.
> http://doxygen.postgresql.org/varlena_8c.html#4c7af81f110f9be0bd8eb2bd99525675
> 
> This design will make possible switch to ICU or any other implementation
> quite simple and will not cause any major rewriting of what I'm coding
> right now.


The collation function is the main point here. As you mentioned, one will be
only a wrapper around strcmp and the second one a wrapper around strcoll
(maybe you need four, to cover char/wchar). Which function is used is defined
in the pg_collation catalog by the CREATE COLLATION command. But for system
locales you also need to specify the name of the locale, which means you need
an attribute for storing the locale name.

<snip>

> CATALOG(pg_collations, ###)
> {
>     NameData    colname;        /* collation name */
>     Oid        colschema;        /* collation schema */
>     bool        colpadattribute;    /* pad attribute */
>     bool        colcasesensitive;    /* case sensitive */
>     bool        colaccent;        /* accent sensitive */
>     regproc        colfunc;        /* used collation function */
>     Oid        colrepertoire;        /* collation repertoire */
> 
> } FormData_pg_collations;
> 

It would be good to send a list of the new and modified SQL commands (like
CREATE COLLATION) for wider discussion.

    Zdenek

