Re: Unicode, php and postgresql - Mailing list pgsql-php

From Michael Glaesemann
Subject Re: Unicode, php and postgresql
Date
Msg-id 0BC6CE32-2A26-11D8-AAF1-0005029FC1A7@myrealbox.com
In response to Unicode, php and postgresql  (Didier Bretin <dbr@informactis.com>)
List pgsql-php
Hi Didier!


On Tuesday, December 9, 2003, at 05:34 PM, Didier Bretin wrote:

> Hi,
>
> I'm trying to install 7.4.0 + PHP to develop an application in
> Unicode.
> Apparently I have no problem ;).
>
> But I don't understand enough the documentation of php. My postgresql
> server is configured in unicode, and my database is entirely in
> unicode.
> In my php.ini file I set no mbstring variables. When I'm connecting
> to the database, I SELECT the data and then I print them, with the
> charset utf-8, to the browser and all the characters are correctly
> displayed.
>
> My question is : is it the right way I don't have to configure anything
> in php for dealing with unicode :) ?

In my (admittedly limited) experience with PHP 4, Unicode, and
PostgreSQL, you can go a long way with the setup you describe, i.e.,
without using the multi-byte string functions. However, all I do is
move data in and out of the database: I'm not doing any fancy-pants
parsing of the data in PHP, and that includes data sanity checking
(besides preventing SQL injection). I would *not* recommend doing it as
I've done, though it does work for me. It's something I'm working on
rectifying in my own code, and rather than have to fix it later, I'd
recommend doing it right the first time.

The reason it works is that PHP (at least as of PHP 4) is agnostic
about the contents of strings: it treats them as plain sequences of
bytes. It just takes the data from the database and hands it to your
code, without trying to read, parse, or check it unless you explicitly
do so in the code.
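
To illustrate the byte-agnostic behaviour, here is a minimal sketch. It
assumes a PHP build with the mbstring extension loaded; the sample
string (three Japanese characters written as UTF-8 byte escapes) is my
own and is not taken from any real application.

```php
<?php
// Three characters, nine bytes in UTF-8 (written as escapes so the
// example does not depend on the encoding of this source file).
$s = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

$bytes = strlen($s);              // plain strlen() counts bytes
$chars = mb_strlen($s, 'UTF-8');  // mb_strlen() counts characters

echo $bytes, "\n";  // prints 9
echo $chars, "\n";  // prints 3

// A byte-oriented function can cut a character in half, leaving a
// string that is no longer well-formed UTF-8.
$cut = substr($s, 0, 4);
$cutIsValid = mb_check_encoding($cut, 'UTF-8');  // false
```

As long as the bytes are only passed through untouched, none of this
matters, which is why the setup appears to work.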

Again, I don't recommend this (though I've been doing it myself)
because I don't believe you'll be able to do proper data checking,
especially if you're using code points outside the ASCII range. For me,
this means the Japanese that moves into my database is completely
unchecked, and like I said, that's Not Good. To do proper checking of
the Japanese, I'd need to use the mbstring (mb_*) functions.
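
Something along these lines is what I have in mind for the checking.
Treat it as a rough sketch rather than tested production code: it
assumes the mbstring extension is loaded, and the function name
is_acceptable_text() and the length limit are hypothetical, made up for
this example.

```php
<?php
// Hypothetical helper: accept $input only if it is well-formed UTF-8
// and no longer than $maxChars characters (not bytes).
function is_acceptable_text($input, $maxChars)
{
    // Reject byte sequences that are not valid UTF-8.
    if (!mb_check_encoding($input, 'UTF-8')) {
        return false;
    }
    // Measure the length in characters, so a multi-byte character
    // counts once.
    return mb_strlen($input, 'UTF-8') <= $maxChars;
}

// Two Japanese characters (six bytes) fit within a 5-character limit.
$ok = is_acceptable_text("\xE6\x97\xA5\xE6\x9C\xAC", 5);

// A truncated multi-byte sequence is rejected.
$broken = is_acceptable_text("\xE6\x97", 5);
```

A check like this would run before the value goes anywhere near the
query, and it is in addition to the usual SQL escaping, not a
replacement for it.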

I'm interested in hearing others' opinions on this as well,
particularly if they think I'm wrong. I can always learn something!

hth

Michael Glaesemann
grzm myrealbox com

