Re: BUG #3638: UTF8 Character encoding does NOT work - Mailing list pgsql-bugs
From | Tatsuo Ishii |
---|---|
Subject | Re: BUG #3638: UTF8 Character encoding does NOT work |
Date | |
Msg-id | 20070928.102329.76083990.t-ishii@sraoss.co.jp |
In response to | BUG #3638: UTF8 Character encoding does NOT work ("Fil Matthews" <fil@internetmediapro.com>) |
List | pgsql-bugs |
> Tatsuo Ishii wrote:
> > Why do you think that an UTF-8 encoded string starting with 0x92 is
> > valid?
> >
> > 0x92 can appear in the second, third or fourth octet, but should never
> > appear in the first octet.
> > --
> > Tatsuo Ishii
> > SRA OSS, Inc. Japan
> >
> >
> >> The following bug has been logged online:
> >>
> >> Bug reference: 3638
> >> Logged by: Fil Matthews
> >> Email address: fil@internetmediapro.com
> >> PostgreSQL version: 8-1 , 8-2
> >> Operating system: Linux Debian - Windows XP
> >> Description: UTF8 Character encoding does NOT work
> >> Details:
> >>
> >> Judging from the amount of Google page hits with the exact same problem I am
> >> surprised and mystified by this obvious flaw in Postgres Technology..
> >>
> >> Just how is one expected to work with UTF8 character sets when all and
> >> every attempt at using even Postgres clients produces the SAME problem
> >> every time ???
> >>
> >> "invalid byte sequence for encoding "UTF8": 0x92"
> >>
> >> In Short A Postgres UTF8 database .. PGCLIENENCODING=UTF8
> >>
> >> Tables test.text -> (Chararcter varying 10)
> >>
> >> In any Postgres Client ie psql , dbadmin III
> >>
> >> Insert into test values ( chr(146));;
> >>
> >>
> >> Query returned successfully: 1 rows affected, 32 ms execution time.
> >>
> >> copy test to '/tmp/testfile.txt';
> >>
> >>
> >> Query returned successfully: 1 rows affected, 15 ms execution time.
> >>
> >> copy test from '/tmp/testfile.txt';
> >>
> >>
> >> Come on are you serious?? .. Just how does one work with completly valid
> >> data that has an ascii 128 + value ??
> >>
> >> Currently this flaw make Postgres an un-useable database technology .. Or
> >> can some-one please explain this and a possible work around .. ??
> >>
> >> Thank You
> >>
> >> ---------------------------(end of broadcast)---------------------------
> >> TIP 1: if posting/reading through Usenet, please send an appropriate
> >> subscribe-nomail command to majordomo@postgresql.org so that your
> >> message can get through to the mailing list cleanly
> >>
> >
> >
>
> Sorry But I don't agree.. Why can't Postgres store a legitimate 8 bit
> byte value that is below 255?? and treat it as text ..
> Not being able to do this this makes Postgres unusable.. for storing
> TEXT values..
>
> I do not know ANY other database technology that doesn't allow some form
> of storing a legitimate 8 bit byte ...
>
> Even the most simplest open -source database in the world (and most
> popular) can do this..
>
> The biggest and best (Thank you Larry) can do this ...
>
> Postgres can't.
>
> In other words You are claiming that UTF8 is actually UTF7 ....

No.

> There are 8 bits in a byte.. not 7 .. If UTF8 can't by definition
> store 8 bits then what standard can??

UTF-8 does not accept arbitrary 8 bit characters. The byte ranges
UTF-8 accepts are precisely defined in the standard. If our
implementation is different from it, please let us know.

> The technology is wrong and it is incorrect... If one looks at the
> output of the copy file
> od -c then QUITE correctly the 8 bit value is stored as the value
> given..
>
> What then is the problem in putting this value back in the text field it
> came from ??

PostgreSQL needs to follow the standard. That's it.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
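
The point about 0x92 can be checked outside the database. The following is a minimal sketch using Python's built-in codecs, not PostgreSQL's own validator, and the helper name is_valid_utf8 is made up for illustration. It shows that a lone 0x92 byte is rejected as UTF-8, and what the valid encodings look like under two possible readings of that byte. Reading it as Windows-1252 is only an assumption about where such data commonly comes from; the original report does not say.

```python
# Illustrative sketch only: Python's codecs stand in for a UTF-8 validator.
# is_valid_utf8 is a hypothetical helper, not a PostgreSQL function.

def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

lone_byte = b"\x92"

# 0x92 is 1001 0010 in binary: the 10xxxxxx pattern marks a continuation
# octet, so it cannot start a UTF-8 sequence.
lone_byte_ok = is_valid_utf8(lone_byte)

# Assumption: the byte came from Windows-1252 text, where 0x92 is the
# right single quotation mark (U+2019).
as_cp1252 = lone_byte.decode("cp1252")
utf8_quote = as_cp1252.encode("utf-8")   # three octets; 0x80 and 0x99 follow a lead byte

# Alternative reading: code point 146 (U+0092, a C1 control character).
utf8_c1 = chr(146).encode("utf-8")       # two octets; 0x92 appears second

print(lone_byte_ok, utf8_quote, utf8_c1)
```

Either way, the stored character needs more than one octet in UTF-8, and 0x92 only ever appears after a lead byte.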