Thread: problems with lower() and unicode-databases

problems with lower() and unicode-databases

From
peter pilsl
Date:
postgres 7.4 on linux, glibc 2.2.4-6

I have a table containing Unicode data, and the lower() function does not
work properly. While it lowercases standard letters like A->a, B->b ..., it
fails on special letters like German umlauts (Ä, Ö ...), which are simply
left untouched.

Everything else (sorting etc.) is working fine, and LC_COLLATE, LC_CTYPE
and all the other locale settings were properly set to 'de_AT.UTF-8' when
running initdb. (That's what my Mandrake system calls the needed locale; on
most other systems it's called 'de_AT.utf8'.)

The database encoding is UNICODE, but I also tried SQL_ASCII (just to
give it a try) and got the same problem.

What's the problem here?

The following output was copied from a Unicode terminal and pasted into the
newsreader. It looks fine here, so I think you can all read it.

# select oid,t,lower(t),length(t) from test order by t;
  oid  |   t   | lower | length
-------+-------+-------+--------
 17257 | a     | a     |      1
 17268 | A     | a     |      1
 17291 | ä     | ä     |      1
 17265 | Ä     | Ä     |      1
 17269 | B     | b     |      1
 17275 | ñ     | ñ     |      1
 17277 | Ñ     | Ñ     |      1
 17262 | ö     | ö     |      1
 17266 | Ö     | Ö     |      1
 17267 | Ü     | Ü     |      1


# /usr/local/pgsql/bin/pg_controldata /data/postgresql_de/ | grep LC
LC_COLLATE:                           de_AT.UTF-8
LC_CTYPE:                             de_AT.UTF-8



I would be very happy to get a "solution", but a workaround would be
better than nothing ;)  Perl on the same system can read the data from
the database and lowercase it without any problems, but this is
too much of a *WORK* *AROUND* :)

thnx a lot,
peter

ps: of course upper() does not work either !!
pps: I looked through the changes in newer PostgreSQL versions, but my topic
did not appear in the list, so I didn't try the new 7.4.5. I think it's
more a problem with my settings than with PostgreSQL. (at least I hope so ...)


--
mag. peter pilsl
goldfisch.at
IT-management
tel +43 699 1 3574035
fax +43 699 4 3574035
pilsl@goldfisch.at

Re: problems with lower() and unicode-databases

From
Tom Lane
Date:
peter pilsl <pilsl@goldfisch.at> writes:
> postgres 7.4 on linux, glibc 2.2.4-6

> I've a table containing unicode-data and the lower()-function does not
> work proper. While it lowers standard letters like A->a,B->b ... it
> fails on special letters like german umlauts (Ä , Ö ...) that are simply
> keeped untouched.

upper() and lower() didn't support multibyte character sets before 8.0.

            regards, tom lane
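
To illustrate the reply above, here is a minimal Python sketch (not
PostgreSQL code) of why a case conversion without multibyte support leaves
umlauts untouched. It assumes the conversion looks at one byte at a time and
only knows the ASCII mapping; the helper names lower_bytewise and
lower_charwise are made up for this example. In UTF-8, 'Ä' is stored as two
bytes, neither of which is an ASCII letter, so a byte-wise pass changes
nothing, while a character-aware pass maps it to 'ä'.

```python
# Illustration only: this mimics the symptom, it is not how PostgreSQL is written.

def lower_bytewise(text: str) -> str:
    """Lowercase the UTF-8 bytes one at a time (ASCII mapping only)."""
    raw = text.encode("utf-8")
    # bytes.lower() only converts the ASCII letters A-Z; every byte of a
    # multibyte sequence is >= 0x80 and is passed through unchanged.
    return raw.lower().decode("utf-8")


def lower_charwise(text: str) -> str:
    """Lowercase whole Unicode characters."""
    return text.lower()


if __name__ == "__main__":
    for sample in ["A", "B", "Ä", "Ö", "Ü", "Ñ"]:
        print(sample, lower_bytewise(sample), lower_charwise(sample))
```

The byte-wise variant reproduces the pattern in the query output: A and B
are lowercased, while Ä, Ö, Ü and Ñ come back unchanged. Until a server
upgrade is possible, doing the conversion in the client (as the Perl
approach mentioned in the original post does) avoids the problem.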