Re: encoding advice requested - Mailing list pgsql-general

From Daniel Verite
Subject Re: encoding advice requested
Date
Msg-id 20061114000055.5440910@localhost
In response to Re: encoding advice requested  (Rick Schumeyer <rschumeyer@ieee.org>)
Responses Re: encoding advice requested  (Martijn van Oosterhout <kleptog@svana.org>)
List pgsql-general
    Rick Schumeyer wrote:

> I will have to try the WIN1252 encoding.
>
> On the client side, my application is a web browser.  On the server
> side, it is php scripts on a linux box.  The data comes from copying
> data from a browser window (pointing to another web site) and pasting it
> into an html textarea, which is then submitted.
>
> Given this, would you still suggest the WIN1252 encoding?

No, sticking to utf-8 is safer. In the context you describe, it's the browser
that decides the character set and encoding of the textarea data it submits to
the HTTP server. There's a problem when, for example, the page that contains
the textarea is US-ASCII but the user pastes some non-US-ASCII characters. The
browser then has to choose a non-US-ASCII encoding for the data, possibly one
that the server-side script doesn't expect. I assume this is what happens in
your case and is the reason for the error you're getting. An easy solution is
to use utf-8 for the web page, so the browser won't have to switch to another
encoding, since every character is supposed to have a representation in utf-8,
"fancy quotes" and everything else.

This is also explained more extensively, and better, in this article, for
example:
http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html
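
To make the mismatch concrete, here is a minimal sketch of what happens to
pasted "fancy quotes" under the encodings discussed above. It is written in
Python purely for illustration (the scripts in question are PHP), and the
sample string and helper names (can_encode, decodes_as_utf8) are assumptions
made up for this example. It does not reproduce what any particular browser
does. It only shows why bytes in a Windows-1252-style encoding are rejected by
a consumer that expects utf-8.

```python
# Sample text with curly quotes (U+201C and U+201D), as might be pasted
# from another web page. The string itself is a hypothetical example.
text = "\u201cfancy quotes\u201d"


def can_encode(s, encoding):
    """Return True if every character of s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def decodes_as_utf8(data):
    """Return True if the byte string is valid utf-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# US-ASCII has no representation for the curly quotes, so a US-ASCII page
# forces the submitter to pick some other encoding for this text.
ascii_ok = can_encode(text, "ascii")

# The same text encoded two different ways.
utf8_bytes = text.encode("utf-8")      # opening quote becomes 3 bytes
cp1252_bytes = text.encode("cp1252")   # opening quote becomes 1 byte

# A consumer that expects utf-8 accepts the first and rejects the second.
utf8_accepted = decodes_as_utf8(utf8_bytes)
cp1252_accepted = decodes_as_utf8(cp1252_bytes)

print("representable in US-ASCII:", ascii_ok)
print("utf-8 bytes valid as utf-8:", utf8_accepted)
print("cp1252 bytes valid as utf-8:", cp1252_accepted)
```

If the page and the form submission are both utf-8, only the first case
arises, which is the point of the suggestion above.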

--
 Daniel
 PostgreSQL-powered mail user agent and storage: http://www.manitou-mail.org

