Thread: BUG #4253: to_tsvector: error with some configurations

BUG #4253: to_tsvector: error with some configurations

From: "Giorgio Valoti"
Date:
The following bug has been logged online:

Bug reference:      4253
Logged by:          Giorgio Valoti
Email address:      giorgio_v@mac.com
PostgreSQL version: 8.3.3
Operating system:   Mac OS X 10.5.3
Description:        to_tsvector: error with some configurations
Details:

Using the configuration for any language that contains the "a grave" letter
(à, UTF-8 bytes c3 a0) causes an error when the function "to_tsvector" is
invoked.

test=> select to_tsvector('italian','prova');
ERROR:  invalid byte sequence for encoding "UTF8": 0xc3
HINT:  This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".

test=> select to_tsvector('french','prova');
ERROR:  invalid byte sequence for encoding "UTF8": 0xc3
HINT:  This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".

test=> select to_tsvector('portuguese','prova');
ERROR:  invalid byte sequence for encoding "UTF8": 0xc3
HINT:  This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".

Re: BUG #4253: to_tsvector: error with some configurations

From: Tom Lane
Date:
"Giorgio Valoti" <giorgio_v@mac.com> writes:
> Using the configuration for any language that contains the "a grave" letter
> (à, UTF-8 bytes c3 a0) causes an error when the function "to_tsvector" is
> invoked.

> test=> select to_tsvector('italian','prova');
> ERROR:  invalid byte sequence for encoding "UTF8": 0xc3

Hmm, works for me:

z=# select to_tsvector('italian','prova');
 to_tsvector
-------------
 'prov':1
(1 row)

What database encoding (server_encoding) are you using?  Is it possible
that the text search configuration files have been rewritten into a
non-UTF8 encoding?

            regards, tom lane
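
One way to follow up on the second question above is to check whether a text search data file still decodes as UTF-8. The Python below is a minimal sketch and not part of the original thread: the helper name "find_invalid_utf8" is hypothetical, and the file location mentioned afterwards is an assumption about a typical installation.

```python
def find_invalid_utf8(path):
    """Return a list of (line_number, raw_bytes) for lines that are not valid UTF-8.

    Hypothetical helper for illustration: it reads the file in binary mode and
    tries to decode each line separately, so one bad line does not hide others.
    """
    bad_lines = []
    with open(path, "rb") as handle:
        for line_number, raw in enumerate(handle, start=1):
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                # e.g. a Latin-1 "a grave" (0xe0) or a lone 0xc3 lead byte
                bad_lines.append((line_number, raw))
    return bad_lines
```

For example, assuming the Italian stop words live in "$SHAREDIR/tsearch_data/italian.stop" (with $SHAREDIR as reported by "pg_config --sharedir"), a non-empty result for that path would suggest the file was saved in a non-UTF8 encoding such as Latin-1, where "à" is the single byte 0xe0. The database encoding itself can be checked from psql with "SHOW server_encoding;".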