Thread: case insensitive regex clause with some latin1 characters fails

case insensitive regex clause with some latin1 characters fails

From

"Ragnar Österlund"

Date:

11 September 2006, 19:09:26

Hi,

I'm not sure if this is a bug or if I'm doing something wrong. I have
a database encoded with ISO-8859-1, aka LATIN1. When I do something
like:

SELECT 'Ä' ~* 'ä';

it returns false. If I do:

SELECT 'A' ~* 'a';

I get true. According to the specification, both should return true.
Does anyone know what the problem might be?

/Ragnar

Re: case insensitive regex clause with some latin1 characters fails

From

Tom Lane

Date:

11 September 2006, 19:41:28

"Ragnar Österlund" <ragoster@gmail.com> writes:
> I'm not sure if this is a bug or if I'm doing something wrong. I have
> a database encoded with ISO-8859-1, aka LATIN1. When I do something
> like:

> SELECT 'Ä' ~* 'ä';

> it returns false.

Check the database's locale setting (LC_CTYPE).  It has to be one that
expects LATIN1 encoding.

The current regex code is generally not able to deal with locale-specific
behaviors in UTF8 encoding, but it should work for single-byte encodings
as long as you've got the locale setting right.
        regards, tom lane
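
The check suggested above can be run from psql. This is only a minimal sketch, assuming a release of that era in which lc_ctype is exposed as a read-only setting (as in the show lc_ctype output elsewhere in this thread). The server_encoding and client_encoding settings are standard, but they are not mentioned in the original messages and are included here as an assumption about what is worth checking.

```sql
-- Encoding the database stores its text in (LATIN1 in the original report).
SHOW server_encoding;

-- Locale used for character classification and case mapping.
-- For 'Ä' and 'ä' to match case-insensitively in a LATIN1 database,
-- this should name a locale that expects LATIN1 (ISO-8859-1).
SHOW lc_ctype;

-- Encoding the client session uses; if it disagrees with the terminal,
-- non-ASCII literals can be garbled before they reach the server.
SHOW client_encoding;

-- The test case from the original report.
SELECT 'Ä' ~* 'ä';
```

If lc_ctype reports C or a UTF-8 locale while server_encoding is LATIN1, a false result from the last query would be consistent with the explanation above.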

Re: case insensitive regex clause with some latin1 characters

From

Emi Lu

Date:

11 September 2006, 20:14:19

My environment is set up as:

 show lc_ctype;
  lc_ctype
-------------
 fr_CA.UTF-8
(1 row)

fis=> SELECT 'Ä' ~* 'ä';
 ?column?
----------
 f
(1 row)

fis=> SELECT 'Ä' ilike 'ä';
 ?column?
----------
 f
(1 row)


I got the same result: false

> "Ragnar Österlund" <ragoster@gmail.com> writes:
>> I'm not sure if this is a bug or if I'm doing something wrong. I have
>> a database encoded with ISO-8859-1, aka LATIN1. When I do something
>> like:
> 
>> SELECT 'Ä' ~* 'ä';
> 
>> it returns false.
> 
> Check the database's locale setting (LC_CTYPE).  It has to be one that
> expects LATIN1 encoding.
> 
> The current regex code is generally not able to deal with locale-specific
> behaviors in UTF8 encoding, but it should work for single-byte encodings
> as long as you've got the locale setting right.
> 
>             regards, tom lane