Thread: Unicode, php and postgresql

Unicode, php and postgresql

From: Didier Bretin

Hi,

I'm trying to set up PostgreSQL 7.4.0 + PHP to develop a Unicode
application. So far I seem to have no problems ;).

But I don't understand the PHP documentation well enough. My PostgreSQL
server is configured for Unicode, and my database is entirely in Unicode.
In my php.ini file I set no mbstring variables. When I connect to the
database, I SELECT the data and print it to the browser with the charset
set to utf-8, and all the characters are displayed correctly.

My question is: is this the right way to do it? Do I really not have to
configure anything in PHP to deal with Unicode :) ?

And has anybody else worked with the same configuration as mine?

Regards.
--
            .------------------------------------------------.
    .^.     | Didier Bretin, France | dbr@informactis.com    |
    /V\     |-----------------------| www.informactis.com    |
   // \\    |                       `------------------------|
  /(   )\   | Visit: http://www.vim.org/                     |
   ^^-^^    `------------------------------------------------'


Re: Unicode, php and postgresql

From: Michael Glaesemann

Hi Didier!


On Tuesday, December 9, 2003, at 05:34 PM, Didier Bretin wrote:

> Hi,
>
> I'm trying to set up PostgreSQL 7.4.0 + PHP to develop a Unicode
> application. So far I seem to have no problems ;).
>
> But I don't understand the PHP documentation well enough. My PostgreSQL
> server is configured for Unicode, and my database is entirely in Unicode.
> In my php.ini file I set no mbstring variables. When I connect to the
> database, I SELECT the data and print it to the browser with the charset
> set to utf-8, and all the characters are displayed correctly.
>
> My question is: is this the right way to do it? Do I really not have to
> configure anything in PHP to deal with Unicode :) ?

In my (admittedly limited) experience with PHP 4, Unicode, and
PostgreSQL, you can go a long way with the setup you describe, i.e.,
without using the multi-byte string functions. However, all I do is move
data in and out of the database: I'm not doing any fancy-pants parsing of
the data in PHP, and that includes data sanity checking (beyond
preventing SQL injection). I would *not* recommend doing it the way I've
done it, though it does work for me. It's something I'm working on
rectifying in my own code, and rather than having to fix it later, I'd
recommend doing it right the first time.

The reason it works is that PHP (at least as of PHP 4) is agnostic about
the contents of strings: it treats them as plain sequences of bytes. It
just takes them from the database and hands them to your code, without
trying to read, parse, or check them unless you explicitly do so in the
code.
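
A minimal sketch of that byte-level behaviour, assuming the mbstring
extension is loaded and that mb_check_encoding() is available in your PHP
version. The sample string and the variable names are made up for
illustration:

```php
<?php
// Three Japanese characters written as raw UTF-8 bytes (9 bytes in total),
// standing in for a value fetched from a Unicode database.
$name = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

// Core string functions work on bytes, not characters.
$byteLength = strlen($name);

// The mbstring equivalents count characters once told the encoding.
$charLength = mb_strlen($name, 'UTF-8');

// Cutting at a byte offset can split a character in the middle,
// leaving a string that is no longer valid UTF-8.
$brokenCut = substr($name, 0, 4);
$brokenIsValid = mb_check_encoding($brokenCut, 'UTF-8');

// Cutting by character keeps the result valid.
$safeCut = mb_substr($name, 0, 1, 'UTF-8');
```

Passing $name straight from the database to the browser never touches
the bytes, which is why the untouched setup displays correctly. Problems
only start once the code measures, cuts, or compares the string.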

Again, I don't recommend this (though I've been doing it myself)
because I don't believe you'll be able to do proper data checking,
especially if you're using code points outside the ASCII range. For me,
this means the Japanese that moves into my database is completely
unchecked, and like I said, that's Not Good. To do proper checking of
the Japanese, I'd need to use the mbstring (mb_*) functions.
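
A rough sketch of what such a check might look like, assuming the
mbstring extension is enabled and that mb_check_encoding() exists in
your PHP version (it is not present in every PHP 4 release). The helper
name is_acceptable_text and the length limit are hypothetical, not
something from my own code:

```php
<?php
// Hypothetical helper: accept a value only if it is well-formed UTF-8
// and no longer than $maxChars characters (not bytes).
function is_acceptable_text($input, $maxChars)
{
    // Reject byte sequences that are not valid UTF-8.
    if (!mb_check_encoding($input, 'UTF-8')) {
        return false;
    }

    // Measure the length in characters so multi-byte text is counted fairly.
    return mb_strlen($input, 'UTF-8') <= $maxChars;
}
```

A check like this would run on incoming form data before it is sent to
the database; it complements, and does not replace, escaping the value
for SQL.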

I'm interested in hearing others' opinions on this as well,
particularly if they think I'm wrong. I can always learn something!

hth

Michael Glaesemann
grzm myrealbox com