Re: Multibyte still broken - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: Multibyte still broken
Msg-id 20000511100719Q.t-ishii@sra.co.jp
In response to Multibyte still broken  (Michael Robinson <robinson@netrinsics.com>)
List pgsql-hackers
> More robust code may always be good, but "good" apparently doesn't always go
> into the tree.  Imagine my surprise, while upgrading a production server
> from 6.5.3 to 7.0, when the data dumped from the old database failed to load
> into the new database (well, crashed the backend, to be specific).
> 
> Apparently the "validate your own damn data" sentiment of the first excerpt
> above has prevailed, because, on inspection, the MB code is just as fragile
> as it was five months ago.
> 
> I was forced to perform emergency repairs to my database dump file to fool a 
> non-multibyte 7.0 into accepting it.  Since EUC_CN is compatible with 
> Latin-1, and since the benefits of multibyte are small compared to the 
> risks, I intend to stick with unibyte Postgres henceforth.
> 
> I would, though, recommend a warning in the "INSTALL" file along the lines of:
> 
>   "WARNING: Use of improperly-encoded text with multi-byte support enabled
>    WILL lead to data corruption and/or loss.  Do not enable multi-byte support
>    unless you intend to fully validate your own damn data."

Sorry for the problem. I forgot about this issue :-<

What I'm thinking of now, to fix the problem you found, is doing data
validation in the text/varchar/char input functions, rather than
tweaking the mb functions. If a corrupted MB string is found, the
input function would call elog(ERROR) to abort the transaction. This
will appear in 7.0.1 unless someone objects.
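
As a rough illustration of the kind of check described above (a
standalone sketch, not the actual 7.0.1 change), validation of EUC_CN
input might look like the following. The function name
euc_cn_is_valid is hypothetical, and the byte ranges assume the usual
EUC_CN layout: ASCII as single bytes, and two-byte characters whose
bytes both fall in 0xA1-0xFE. A real input function would call
elog(ERROR) where this sketch returns false.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a PostgreSQL function.
 * Returns true if the len bytes at s form well-formed EUC_CN text,
 * assuming single-byte ASCII (0x00-0x7F) and two-byte characters
 * with both bytes in the range 0xA1-0xFE.
 */
bool
euc_cn_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len)
    {
        unsigned char c = s[i];

        if (c < 0x80)
        {
            /* plain ASCII, one byte */
            i++;
            continue;
        }

        /* lead byte must be in range and must not be the last byte */
        if (c < 0xA1 || c > 0xFE || i + 1 >= len)
            return false;

        /* trail byte must be in the same range */
        if (s[i + 1] < 0xA1 || s[i + 1] > 0xFE)
            return false;

        i += 2;
    }
    return true;
}
```

The point of checking at input time is that a truncated or
out-of-range sequence is rejected before it is stored, so the mb
functions never have to walk past the end of a malformed string.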
--
Tatsuo Ishii

