Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence - Mailing list pgsql-bugs

From Tatsuo Ishii
Subject Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence
Date
Msg-id 20100820.082957.113300986.t-ishii@sraoss.co.jp
In response to Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
> We generally assume that in server-safe encodings, the ctype.h functions
> will behave sanely on any single-byte value.

I think this "wisdom" is only true for the C locale.  I'm not surprised
at all that it does not work with non-C locales.

From arrayfuncs.c:
    while (isspace((unsigned char) *p))
        p++;

IMO this should be something like:
    while (isspace((unsigned char) *p))
        p += pg_mblen(p);
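
To make the idea concrete outside the backend, here is a minimal, self-contained sketch of the same loop. It is only an illustration: utf8_mblen() is a hypothetical stand-in for pg_mblen() that assumes UTF-8 and looks at the lead byte only, and skip_space_mb() is a name made up for this example. The guard against stepping past the terminating NUL is my addition, not something the original loop has.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for pg_mblen(): byte length of the UTF-8
 * character starting at s, judged from the lead byte only.
 */
static int
utf8_mblen(const unsigned char *s)
{
    if (*s < 0x80)
        return 1;               /* ASCII */
    if ((*s & 0xE0) == 0xC0)
        return 2;               /* 110xxxxx */
    if ((*s & 0xF0) == 0xE0)
        return 3;               /* 1110xxxx */
    if ((*s & 0xF8) == 0xF0)
        return 4;               /* 11110xxx */
    return 1;                   /* stray or invalid byte: step one byte */
}

/*
 * Skip leading whitespace, advancing one whole character at a time so
 * that isspace() is only ever asked about the first byte of a character.
 */
static const char *
skip_space_mb(const char *p)
{
    assert(p != NULL);

    while (isspace((unsigned char) *p))
    {
        int         len = utf8_mblen((const unsigned char *) p);

        /* Do not step past the NUL if a sequence is truncated. */
        while (len-- > 0 && *p != '\0')
            p++;
    }
    return p;
}
```

With this shape the pointer can only land on character boundaries, so a trailing byte of a multibyte sequence is never handed to isspace() by itself. Whether isspace() gives a sane answer for a lead byte still depends on the locale.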
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

