Unicode collation error. - Mailing list pgsql-bugs

From Peter Figuli
Subject Unicode collation error.
Date
Msg-id 1025626266.23160.17.camel@peposh
List pgsql-bugs
Dear Postgres team,
I'm not a member of any of your lists, but I decided to send this bug report
because I have been facing the same problem for more than 2 releases of PGSQL.
I'm running a Linux box with a 2.4.18 kernel and Postgres built from your
7.2.1 sources.
Steps:
1. Set the locale to any UTF-8 one. Do not forget LC_COLLATE, because an 8-bit
collation does not trigger the bug.
2. Run initdb, create any table containing a text field, and try
this:
SELECT name from state WHERE name like 'z%';
With any UTF-8 locale I got 'Invalid UNICODE character message...'

I tried to track down the bug, and this is my simple description.
The error occurs while testing whether a string is really multibyte.
Going deeper, I found that in
/src/backend/utils/adt/selfuncs.c on line 2985 (make_greater_string)
there is a loop that tries to create a greater string by incrementing the
last byte. This actually works fine until 0xC0 is reached; then the multibyte
checker fails. A simple hack of setting the margin value to 128 for multibyte
now works fine, but I understand that the problem is more complex and
probably needs a deeper look and a proper solution.
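
To illustrate the failure mode described above, here is a minimal Python sketch. It is not the backend code: the helper names (is_valid_utf8, increment_last_byte, first_invalid_after) are hypothetical, and Python's strict UTF-8 decoder stands in for the multibyte checker. It only shows that bumping the last byte of a prefix such as 'z' eventually leaves the ASCII range and produces a byte string that is no longer valid UTF-8. The exact byte at which a checker complains depends on how strict it is: Python already rejects a lone 0x80, while the failure reported above was seen at 0xC0.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Stand-in for a multibyte checker: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def increment_last_byte(prefix: bytes) -> bytes:
    """Naive 'greater string': return prefix with its final byte bumped by one."""
    if not prefix or prefix[-1] == 0xFF:
        raise ValueError("prefix is empty or its last byte cannot be incremented")
    return prefix[:-1] + bytes([prefix[-1] + 1])


def first_invalid_after(prefix: bytes):
    """Keep bumping the last byte; return the first candidate that is not
    valid UTF-8, or None if every candidate up to 0xFF stays valid."""
    candidate = prefix
    while candidate and candidate[-1] < 0xFF:
        candidate = increment_last_byte(candidate)
        if not is_valid_utf8(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Prefix taken from the query in this report: name LIKE 'z%'
    print(first_invalid_after(b"z"))
```

A byte-wise increment is only safe while the result stays inside single-byte (ASCII) territory, which is consistent with the observation that limiting the margin to 128 avoids the error.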

Nice day

Peposh
