Re: [HACKERS] Implications of multi-byte support in a distribution - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: [HACKERS] Implications of multi-byte support in a distribution
Msg-id 199909030055.JAA01575@ext16.sra.co.jp
In response to Re: [HACKERS] Implications of multi-byte support in a distribution  (Thomas Lockhart <lockhart@alumni.caltech.edu>)
List pgsql-hackers
> > > Each encoding/character set can behave however you want. You can reuse
> > > collation and sorting code from another character set, or define a
> > > unique one.
> > Is it really inside one postmaster instance ?
> > If so, then is the character encoding defined at the create table /
> > create index process (maybe even separately for each field ?) or can I 
> > specify it when sort'ing ?
> 
> Yes, yes, and yes ;)

But we still can't avoid calling strcoll() and the other code surrounded
by #ifdef LOCALE, can we? I think what he actually wants is to define his
own collation *and* not to use locale if the column is ASCII only.

> I would propose that we implement the explicit collation features of
> SQL92 using implicit type conversion. So if you want to use a
> different sorting order on a *compatible* character set, then (looking
> up in Date and Darwen for the syntax...):
> 
>   'test string' COLLATE CASE_INSENSITIVITY
> 
> becomes internally
> 
>   case_insensitivity('test string'::text)
> 
> and
> 
>   c1 < c2 COLLATE CASE_INSENSITIVITY
> 
> becomes
> 
>   case_insensitivity(c1) < case_insensitivity(c2)

This idea seems great and elegant. OK, what about throwing away #ifdef
LOCALE? The same thing can be achieved by defining a special collation,
LOCALE_AWARE. This seems much more consistent to me. Or even better,
we could explicitly have a predefined COLLATION for each language (these
could be generated automatically from existing locale data). This would
avoid some platform-specific locale problems.
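
A rough sketch of how this could look, written in Python for brevity
rather than backend C. The names COLLATIONS, collate and less_than are
made up for illustration, and nothing here is existing PostgreSQL code.
Each collation is just a function from a string to a sort key, so
LOCALE_AWARE is one more entry in the table instead of an #ifdef:

```python
import locale

# Hypothetical registry: each collation maps a string to a sort key,
# in the same spirit as case_insensitivity() in the quoted rewrite.
COLLATIONS = {
    "DEFAULT": lambda s: s,              # plain code-point order, no locale
    "CASE_INSENSITIVITY": str.casefold,  # compare ignoring case
    "LOCALE_AWARE": locale.strxfrm,      # order given by the current locale
}


def collate(value, name="DEFAULT"):
    """'x' COLLATE NAME becomes name(x)."""
    return COLLATIONS[name](value)


def less_than(c1, c2, name="DEFAULT"):
    """c1 < c2 COLLATE NAME becomes name(c1) < name(c2)."""
    return collate(c1, name) < collate(c2, name)
```

With DEFAULT, less_than("B", "a") is true because uppercase letters come
first in code-point order; with CASE_INSENSITIVITY it is false. A column
known to be ASCII only could stay on DEFAULT and never touch the locale
code at all.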
---
Tatsuo Ishii

