On Mon, 15/11/2004 at 16:00 -0500, Tom Lane wrote:
> The real problem is that you've got invalid unicode data in the database
> (I'm not an expert, but I think that #bf is a 1-byte UTF8 sequence and
> then #ef starts a 3-byte sequence, so if this comes within 2 characters
> of end-of-line that would explain your dump problem).
FWIW, 1-byte UTF-8 sequences are always < 128 (0x80). 0xBF is a
continuation byte: it can only appear after the first byte of a
multi-byte UTF-8 sequence, never at the beginning of one.
See
http://www.cl.cam.ac.uk/~mgk25/unicode.html#utf-8
It has a table that gives anyone who can tell bits from bytes a quick
understanding of how the UTF-8 encoding works.
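
As a rough illustration of the byte ranges mentioned above, here is a
small Python sketch that classifies a single byte by the role it can
play in UTF-8. The function name and the returned labels are made up
for this example, and the ranges follow the current 4-byte limit of
RFC 3629 rather than the older 6-byte scheme.

```python
def utf8_byte_role(b: int) -> str:
    """Classify one byte value (0..255) by its possible role in UTF-8.

    Hypothetical helper for illustration only; labels are arbitrary.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("not a byte value")
    if b < 0x80:
        return "single"        # 0xxxxxxx: complete 1-byte sequence
    if b < 0xC0:
        return "continuation"  # 10xxxxxx: only valid after a lead byte
    if 0xC2 <= b <= 0xDF:
        return "lead2"         # 110xxxxx: starts a 2-byte sequence
    if 0xE0 <= b <= 0xEF:
        return "lead3"         # 1110xxxx: starts a 3-byte sequence
    if 0xF0 <= b <= 0xF4:
        return "lead4"         # 11110xxx: starts a 4-byte sequence
    return "invalid"           # 0xC0, 0xC1, 0xF5..0xFF never appear


def starts_valid_utf8(data: bytes) -> bool:
    """Return True if the whole byte string decodes as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

With this, utf8_byte_role(0xBF) gives "continuation" and
utf8_byte_role(0xEF) gives "lead3", so a lone 0xBF followed by 0xEF
is not valid on its own, and starts_valid_utf8(b"\xbf") is False.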
--
Markus Bertheau <twanger@bluetwanger.de>