Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8" - Mailing list pgsql-general

From	Martijn van Oosterhout
Subject	Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8"
Date	August 15, 2007 15:55:41
Msg-id	20070815185514.GC28485@svana.org
In response to	Re: Best practice for: ERROR: invalid byte sequence for encoding "UTF8" ("Phoenix Kiula" <phoenix.kiula@gmail.com>)
List	pgsql-general

On Thu, Aug 16, 2007 at 01:56:52AM +0800, Phoenix Kiula wrote:
> This is very useful, thanks. This would be "bytea"? Quick questions:
>
> 1. Even if it were bytea, would it work with regular SQL operators
> such as regexp and LIKE?

bytea is specifically designed for binary data; as such, it has all
sorts of quoting rules for dealing with embedded nulls and such. It's
not quite a drop-in replacement.

The earlier suggestion of SQL_ASCII is probably closer to what you
want. It does do regexes and LIKE; however, postgres will treat all your
data as bytes. If you want your regexes to match Unicode character
classes, that's too bad; you can't have it both ways. Sorting goes in
byte order; you don't have a lot of choice there either.
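
To make the byte-versus-character trade-off concrete, here is a small
sketch in Python. It is only an analogy, not PostgreSQL itself: Python's
bytes patterns stand in for how a byte-oriented (SQL_ASCII-style) store
sees text, and the sample words are made up for illustration.

```python
import re

# Hypothetical sample data; "\u00e9t\u00e9" is the French word for "summer".
words = ["zebra", "\u00e9t\u00e9", "apple"]

# The same words as raw UTF-8 bytes, roughly what a byte-oriented
# store would hold without knowing the encoding.
raw = [w.encode("utf-8") for w in words]

# Bytes pattern: \w only covers ASCII letters, digits and underscore.
byte_word = re.compile(rb"^\w+$")
# Text pattern: \w understands Unicode letters.
text_word = re.compile(r"^\w+$")

byte_matches = [bool(byte_word.match(b)) for b in raw]
text_matches = [bool(text_word.match(w)) for w in words]

# Sorting raw bytes compares byte values, so every ASCII word sorts
# before anything that starts with a multi-byte character.
byte_sorted = [b.decode("utf-8") for b in sorted(raw)]

# Lengths differ too: 3 characters, but 5 bytes.
char_len = len(words[1])
byte_len = len(raw[1])

print(byte_matches, text_matches, byte_sorted, char_len, byte_len)
```

The accented word fails the byte-level word-character match and sorts
after "zebra", which is the kind of behaviour to expect when the
database only sees bytes.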

> 2. Would tsearch2 work with bytea in the future as long as the stuff
> in it was text?

Doubt it; SQL_ASCII would work though.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
