Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8" - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8"
Date
Msg-id 20070815185514.GC28485@svana.org
In response to Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8"  ("Phoenix Kiula" <phoenix.kiula@gmail.com>)
List pgsql-general
On Thu, Aug 16, 2007 at 01:56:52AM +0800, Phoenix Kiula wrote:
> This is very useful, thanks. This would be "bytea"? Quick questions:
>
> 1. Even if it were bytea, would it work with regular SQL operators
> such as regexp and LIKE?

bytea is specifically designed for binary data, so it has all
sorts of quoting rules for dealing with embedded nulls and such. It's
not quite a drop-in replacement.

The earlier suggestion of SQL_ASCII is probably closer to what you
want. It does do regexes and LIKE; however, postgres will treat all your
data as bytes. If you want your regexes to match Unicode character
classes, that's too bad; you can't have it both ways. Sorting goes in
byte order; you don't have a lot of choice there either.
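
To make "treated as bytes" a bit more concrete, here is a rough sketch in
Python. It is only an analogy, not PostgreSQL itself: Python bytes objects
stand in for strings stored in a SQL_ASCII database, and the sample words
are made up for illustration. It shows a word-character class stopping at
a non-ASCII letter, "." matching one byte rather than one character, and
sorting falling back to byte order.

```python
import re

# Illustration only: UTF-8 encoded bytes stand in for data that the
# server would handle as uninterpreted bytes. The words are hypothetical.
words = ["zebra", "école", "apple"]
raw = [w.encode("utf-8") for w in words]

# Character-aware matching: "é" counts as a word character.
char_match = re.match(r"\w+", "école").group()

# Byte-wise matching: "é" is the two bytes 0xC3 0xA9, and neither is an
# ASCII word character, so the match fails at the very first byte.
byte_match = re.match(rb"\w+", "école".encode("utf-8"))

# "." consumes one byte at a time, so a single "é" yields two matches.
dot_count = len(re.findall(rb".", "é".encode("utf-8")))

# Byte-order sorting: 0xC3 sorts after every ASCII letter, so "école"
# lands last instead of between "apple" and "zebra".
byte_sorted = [b.decode("utf-8") for b in sorted(raw)]

print(char_match, byte_match, dot_count, byte_sorted)
```

If that byte-wise behaviour is acceptable for your data, SQL_ASCII fits;
if you need the character-aware behaviour, it doesn't.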

> 2. Would tsearch2 work with bytea in the future as long as the stuff
> in it was text?

Doubt it. SQL_ASCII would work, though.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
