Thread: unicode regular insensitive matching

From: Jan Poslusny

-------- Original Message --------
Subject: unicode regular insensitive matching
Date: Thu, 28 Jun 2001 20:32:08 +0200
From: Jan Poslusny <pajout@gingerall.cz>
Organization: Ginger Alliance
To: pgsql-general@postgresql

I am a newbie with PostgreSQL and I have this problem with version 7.1.2.
I configured it via
./configure
--enable-locale
--enable-multibyte=UNICODE
--enable-unicode-conversion
--enable-recode
then I successfully ran gmake, gmake check and gmake install,
then I initdb -E UNICODE,
then I createdb -E UNICODE.

but

select myfield from mytable where myfield ~* 'MiXeD national-specific
characters' order by myfield

is _NOT_ case insensitive and not ordered according to locales (if I
create another db with LATIN2 charset, all is OK)

Can anybody give me a hint?
Regards,
pajout



Re: unicode regular insensitive matching

From: Peter Eisentraut

Jan Poslusny writes:

> then I initdb -E UNICODE,
> then I createdb -E UNICODE.

> select myfield from mytable where myfield ~* 'MiXeD national-specific
> characters' order by myfield
>
> is _NOT_ case insensitive and not ordered according to locales (if I
> create another db with LATIN2 charset, all is OK)

Unicode is only a character set.  Issues like sorting and letter-case are
determined by the locale.  You didn't say which locale you used or wanted
to use, what your input was and what ordering you expected, so there's not
a lot we can do for you.
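
A small illustration of this point, as a sketch rather than a diagnosis of the setup above: the same input sorts according to the collation locale in effect, independently of the character set it is stored in. The helper name `sort_in_locale` is made up for this example, and only the `C` locale is assumed to exist on every system.

```shell
#!/bin/sh
# sort_in_locale LOCALE: sort stdin using the collation rules of LOCALE.
# (Hypothetical helper for illustration; not part of PostgreSQL.)
sort_in_locale() {
    LC_ALL=$1 sort
}

# Under the C locale, comparison is by byte value, so all ASCII uppercase
# letters come before all lowercase ones.
printf 'b\nA\na\nB\n' | sort_in_locale C
```

With a Czech locale installed, `sort_in_locale cs_CZ` would apply that locale's rules instead; which codeset that locale implies depends on the system.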

--
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



unicode regular insensitive matching 2.

From: Jan Poslusny

I used Czech locales, described exactly in the attached pg_bash_profile;
briefly:
LC_ALL=cs_CZ
LC_COLLATE=cs_CZ
LC_CTYPE=cs_CZ
LC_MONETARY=cs_CZ
LC_NUMERIC=cs_CZ
LC_TIME=cs_CZ

I used the unicodeSQL script for the database with the UNICODE charset and
the latin2SQL script for the database with the LATIN2 charset. I hope the
attached scripts are self-explanatory.

I don't know what is misconfigured or badly used.

Thanks for any hint.
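
One thing worth checking here, stated as an assumption since the thread does not confirm it: on many systems a bare `cs_CZ` locale implies the ISO-8859-2 codeset, while a UNICODE database stores UTF-8. The sketch below prints the bytes of Č (U+010C) and č (U+010D) in both encodings; the helper `hex_bytes` is made up for this illustration.

```shell
#!/bin/sh
# hex_bytes: print the bytes read from stdin as lowercase hex, no separators.
# (Hypothetical helper for illustration.)
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
}

# The characters are written as octal byte escapes so that the script stays
# pure ASCII and does not depend on the terminal or editor encoding.
utf8_upper=$(printf '\304\214' | hex_bytes)   # U+010C in UTF-8
utf8_lower=$(printf '\304\215' | hex_bytes)   # U+010D in UTF-8
latin2_upper=$(printf '\310' | hex_bytes)     # U+010C in ISO-8859-2
latin2_lower=$(printf '\350' | hex_bytes)     # U+010D in ISO-8859-2

printf 'UTF-8:      upper=%s lower=%s\n' "$utf8_upper" "$utf8_lower"
printf 'ISO-8859-2: upper=%s lower=%s\n' "$latin2_upper" "$latin2_lower"
```

If the case and collation tables of the active locale interpret the stored bytes as ISO-8859-2, the two-byte UTF-8 sequences would not be recognized as an upper/lower pair, which would be consistent with the LATIN2 database working and the UNICODE one not. Running `LC_ALL=cs_CZ locale charmap` shows which codeset the locale actually uses on a given machine.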

Peter Eisentraut wrote:

> Jan Poslusny writes:
>
>> then I initdb -E UNICODE,
>> then I createdb -E UNICODE.
>>
>> select myfield from mytable where myfield ~* 'MiXeD national-specific
>> characters' order by myfield
>>
>> is _NOT_ case insensitive and not ordered according to locales (if I
>> create another db with LATIN2 charset, all is OK)
>
> Unicode is only a character set.  Issues like sorting and letter-case are
> determined by the locale.  You didn't say which locale you used or wanted
> to use, what your input was and what ordering you expected, so there's not
> a lot we can do for you.

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
    . ~/.bashrc
fi

# User specific environment and startup programs

PGHOME=/usr/local/pgsql
PATH=$PATH:$HOME/bin:$PGHOME/bin
BASH_ENV=$HOME/.bashrc
USERNAME=""
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PGHOME/lib
PGDATA=/var/pgdata
PGDATESTYLE=German

LC_ALL=cs_CZ
LC_COLLATE=cs_CZ
LC_CTYPE=cs_CZ
LC_MONETARY=cs_CZ
LC_NUMERIC=cs_CZ
LC_TIME=cs_CZ

# following values have the same effect for postgreSQL sorting, matching
#LC_ALL=cs_CZ.ISO8859-2
#LC_COLLATE=C
#LC_CTYPE=cs_CZ.ISO8859-2
#LC_MONETARY=cs_CZ.ISO8859-2
#LC_NUMERIC=cs_CZ.ISO8859-2
#LC_TIME=cs_CZ.ISO8859-2

export PGHOME USERNAME BASH_ENV PATH LD_LIBRARY_PATH PGDATA
export PGDATESTYLE
export LC_ALL LC_COLLATE LC_CTYPE LC_MONETARY LC_NUMERIC LC_TIME
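
A note on the settings above: per POSIX, a non-empty `LC_ALL` overrides every individual `LC_*` variable, and those in turn override `LANG`. That may be part of why the commented-out variant with `LC_COLLATE=C` has the same effect. The function below is only a sketch of that lookup order, with a made-up name `resolve_category`; it compares strings and does not call the C library.

```shell
#!/bin/sh
# resolve_category LC_ALL_VALUE CATEGORY_VALUE LANG_VALUE
# Print the locale name that wins for one category, following the POSIX
# precedence: LC_ALL first, then the category's own variable, then LANG.
# (Hypothetical helper for illustration.)
resolve_category() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"
    elif [ -n "$2" ]; then
        printf '%s\n' "$2"
    elif [ -n "$3" ]; then
        printf '%s\n' "$3"
    else
        # Nothing set: the default is implementation-defined; "C" is assumed here.
        printf '%s\n' "C"
    fi
}

# The commented-out variant from the profile: LC_ALL wins over LC_COLLATE=C.
resolve_category "cs_CZ.ISO8859-2" "C" ""
# Without LC_ALL, the category's own value is used.
resolve_category "" "C" "cs_CZ"
```

To give `LC_COLLATE` a value different from the other categories, `LC_ALL` would have to be left unset.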
