Re: Issue when inserting Slovak characters in database via PHP code - Mailing list pgsql-general

From Albe Laurenz
Subject Re: Issue when inserting Slovak characters in database via PHP code
Date
Msg-id 52EF20B2E3209443BC37736D00C3C1380B46B378@EXADV1.host.magwien.gv.at
In response to Issue when inserting Slovak characters in database via PHP code  ("Alain Roger" <raf.news@gmail.com>)
List pgsql-general
> I have a postgreSQL database in UNICODE (UTF-8 in v8.1.4 and
> UNICODE in v8.0.1).
>
> Via my web application i type a sentence in Slovak language
> and it is stored into DB without any slovak characters.
> Instead of that, all particular characters are replace with
> \303\251 or \303\206 or \304\314 and so on...
>
> I was thinking that issue was coming from DB encryption but
> on 2 different versions of DB (see above) i get the same result.
> after, i was thinking that it was coming from my web browser,
> but even if i setup character mode in central europe and
> Slovak language as default coding...nothing change...i tried
> on IE and Firefox.
>
> Last step, i tried to type directly from my PhpPgAdmin
> (direct typing sentence there to DB), and i realize that when
> i click on save...the changes appear in DB as written above
> (e.g. : \303\251,...)
>
> My latest test was to write via PhpPgAdmin (directly to DB)
> the UNICODE of slovak character contained within my
> sentence...so i used &yacute;, &iacute; and so on...
> if i do that, those code are correctly saved into DB and when
> my PHP code show web pages, all sentences are correct.
>
> I can not imagine to write a special interface to convert
> slovak characters to unicode everytime that user would like
> to type something new.
> Something else must be badly setup...

I looked up \303\251: these are the octal escapes for the bytes
0xC3 0xA9, which is the correct UTF-8 encoding of 'é'. Is that
the character you wanted to store?

If yes, then everything is correct, and your application and
phpPgAdmin work as expected. They do store the Slovak characters
in the database; what you see is just the escaped display of
their UTF-8 bytes.
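
To check this yourself, here is a minimal Python sketch (Python is
used only for illustration, it is not part of the original setup, and
the helper name octal_escapes is made up for this example). It decodes
the two escaped bytes as UTF-8 and converts a character back into the
same escaped form:

```python
# The octal escapes from the original message, written as raw bytes.
raw = b"\303\251"

# Octal 303 is 0xC3 and octal 251 is 0xA9.
assert raw == bytes([0xC3, 0xA9])

# Decoding the two bytes as UTF-8 yields one character, U+00E9.
char = raw.decode("utf-8")
print(char, hex(ord(char)))


def octal_escapes(text: str) -> str:
    """Format the UTF-8 bytes of text as backslash-octal escapes."""
    return "".join(f"\\{byte:03o}" for byte in text.encode("utf-8"))


# Round trip: the character maps back to the escaped form seen above.
print(octal_escapes("é"))
```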

UNICODE and UTF-8 as database encoding are the same thing in
different PostgreSQL versions, so the result is the same in
both cases.

I guess (you don't say) that your real problem is that the
Slovak characters don't show up properly in the web browser
when you use your web application.
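
To illustrate what such a display problem could look like, here is
a small Python sketch. It assumes that the page is delivered as
UTF-8 bytes but interpreted by the browser as ISO-8859-2; that
encoding is only a guess for a Central European setup, not something
stated in the original message:

```python
# The bytes as they are stored in (and returned from) a UTF-8 database.
stored = "é".encode("utf-8")

# Interpreted correctly, the two bytes form one character.
assert stored.decode("utf-8") == "é"

# Interpreted as ISO-8859-2 (assumed here), each byte becomes its own
# character, so the reader sees two wrong characters instead of one.
misread = stored.decode("iso8859_2")
print(repr(misread), len(misread))
```

The stored data is the same in both cases; only the interpretation on
the receiving side differs.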

If that is the case, there are several possible solutions:
- configure your web server so that it sends a correct HTTP header
  that tells the web browser that the page is in UTF-8.
- use the SQL command 'set client_encoding = <whatever>' in your
  application code after you connect to the database to have PostgreSQL
  translate the Slovak characters into whatever encoding your HTTP
  server expects.
- This is the best solution: use encoded entities (like &eacute;) in your
  HTML code to represent characters other than the lower 128 ASCII
  characters.
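
The three options above can be sketched roughly as follows. This is
Python for illustration only (the application in question is PHP), the
names CONTENT_TYPE_HEADER, SET_ENCODING_SQL and to_entities are
hypothetical, and LATIN2 is just an example value for the client
encoding:

```python
# Option 1: the HTTP response header that declares the page as UTF-8.
CONTENT_TYPE_HEADER = ("Content-Type", "text/html; charset=UTF-8")

# Option 2: SQL to run right after connecting, so that PostgreSQL
# converts text to the client's encoding (LATIN2 is an example only).
SET_ENCODING_SQL = "SET client_encoding = 'LATIN2'"


def to_entities(text: str) -> str:
    """Option 3: replace each non-ASCII character with a numeric
    character reference, leaving plain ASCII untouched."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_entities("ý í é"))  # &#253; &#237; &#233;
```

Numeric references such as &#233; are used here instead of named
entities such as &eacute; because they work for every character,
including those without a named entity.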

See http://www.w3.org/TR/html401/charset.html#spec-char-encoding
for the specification of character sets in web pages.

Yours,
Laurenz Albe
