Thread: UNICODE

UNICODE

From
Jean-Michel POURE
Date:
Dear all,

I am running PostgreSQL 7.1.2 with UNICODE support in production.
Maybe I miss something about UNICODE:

CREATE TABLE "test" (  "source_oid" serial,  "source_timestamp" timestamp,  "source_creation" date DEFAULT 'now',
"source_modification"date DEFAULT 'now',  "source_content" text
 
);

INSERT INTO test (source_content) VALUES ('Photocopie du permis de 
construire accepté.');

Now, when trying :
SELECT * FROM test WHERE source_content ILIKE '%accept%';  ---> returns the 
record;
SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns nothing
SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns nothing

The same happens from ODBC, PHP and psql. Can you reproduce this?

I have tried
Best regards,
Jean-Michel POURE


Re: UNICODE

From
Marko Kreen
Date:
On Sun, Oct 28, 2001 at 09:22:24AM +0100, Jean-Michel POURE wrote:
> Dear all,
> 
> I am running PostgreSQL 7.1.2 with UNICODE support in production.
> Maybe I miss something about UNICODE:

> SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns 
> nothing
> SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns 
> nothing
> 
> The same happens from ODBC, PHP and psql. Can you reproduce this?

1) Did you compile PostgreSQL with --enable-locale
2) Did you set correct locale for postmaster (LANG=xxx)

-- 
marko



Re: UNICODE

From
Jean-Michel POURE
Date:
>1) Did you compile PostgreSQL with --enable-locale
Yes.

>2) Did you set correct locale for postmaster (LANG=xxx)
Database was create using CREATE db WITH ENCODING='UNICODE'.
pgsql: \encoding returns UNICODE.

The db stores multiple languages (French, English, Japanese).
Why should I define a *single* locale for postmaster?
Do I miss something?

Best regards,
Jean-Michel POURE



Re: UNICODE

From
Marko Kreen
Date:
On Sun, Oct 28, 2001 at 09:22:24AM +0100, Jean-Michel POURE wrote:
> 
> I am running PostgreSQL 7.1.2 with UNICODE support in production.
> Maybe I miss something about UNICODE:
> 
> CREATE TABLE "test" (
>   "source_oid" serial,
>   "source_timestamp" timestamp,
>   "source_creation" date DEFAULT 'now',
>   "source_modification" date DEFAULT 'now',
>   "source_content" text
> );
> 
> INSERT INTO test (source_content) VALUES ('Photocopie du permis de 
> construire accepté.');
> 
> Now, when trying :
> SELECT * FROM test WHERE source_content ILIKE '%accept%';  ---> returns the 
> record;
> SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns 
> nothing
> SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns 
> nothing
> 
> The same happens from ODBC, PHP and psql. Can you reproduce this?

Sorry, I misinterpreted what your problem is.  I somehow thought
you want the 'é' and 'e' produce same result - for that you need
to mess with locale, but LIKE does not use locale anyway...

Now I reread you message and here's hint:

* If client_encoding == server_encoding, the bytes are put into DB as-is - no conversion is done.

So are you abslutely sure you have on client side UTF8 strings?
Unfortunately you cant use client_encoding=latin1 as PostgreSQL 
refuses the do the conversion between them.  (I am with 7.1.3)

Eg. I did the following:

* created db with encoding = UNICODE
* Put your example into test.sql
* iconv -f latin1 -t utf8 test.sql > test2.sql
* psql < test2.sql

and it worked as it should...


-- 
marko



Re: UNICODE

From
Jean-Michel POURE
Date:
I only want this query to work under Unicode:
SELECT * FROM test WHERE source_content ILIKE '%accepté%'.

>* If client_encoding == server_encoding, the bytes are put into
>   DB as-is - no conversion is done.
>
>So are you absolutely sure you have on client side UTF8 strings?
PostgreSQL is compiled with UNICODE and LOCALE support.
Unicode is used on both ends (PostgreSQL and psql).

>Unfortunately you cant use client_encoding=latin1 as PostgreSQL
>refuses the do the conversion between them.  (I am with 7.1.3)
According to the on-line manual, only MULE provides instant transcoding.

>Eg. I did the following:
>
>* created db with encoding = UNICODE
>* Put your example into test.sql
>* iconv -f latin1 -t utf8 test.sql > test2.sql
>* psql < test2.sql
>
>and it worked as it should...

Nice to hear it works when transcoding files to UTF-8. It shows it is not a 
back-end problem.

As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire 
accepté') in Psql.
Psql does not insert the data and I have to kill it manually. Can you 
reproduce this?

Best regards,
Jean-Michel POURE


Re: UNICODE

From
Marko Kreen
Date:
On Sun, Oct 28, 2001 at 12:44:26PM +0100, Jean-Michel POURE wrote:
> I only want this query to work under Unicode:
> SELECT * FROM test WHERE source_content ILIKE '%accepté%'.

As I showed it works, if data in db is in UTF-8 and the query
string 'accepté' is in UTF8

> >* If client_encoding == server_encoding, the bytes are put into
> >  DB as-is - no conversion is done.
> >
> >So are you absolutely sure you have on client side UTF8 strings?
> PostgreSQL is compiled with UNICODE and LOCALE support.
> Unicode is used on both ends (PostgreSQL and psql).

psql uses your input literally - so is your console/xterm in
UNICODE/UTF8?

> >Eg. I did the following:
> >
> >* created db with encoding = UNICODE
> >* Put your example into test.sql
> >* iconv -f latin1 -t utf8 test.sql > test2.sql
> >* psql < test2.sql
> >
> >and it worked as it should...
> 
> Nice to hear it works when transcoding files to UTF-8. It shows it is not a 
> back-end problem.
> 
> As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire 
> accepté') in Psql.

As I said - psql does not do any conversion.

> Psql does not insert the data and I have to kill it manually. Can you 
> reproduce this?

No.  If it hangs this is serious problem.  Or did you simply
forgot final ';' ?   It btw does not seem valid sql to me,
considering you previously provided table structure.

In the end: are the strings/queries you give to psql/pg_exec
UTF-8 - this is now main thing, as you have _configured_
everything correctly.

-- 
marko



Re: UNICODE

From
Jean-Michel POURE
Date:
>psql uses your input literally - so is your console/xterm in
>UNICODE/UTF8?
Client: \encoding returns 'UNICODE'.
Server: \list show databases. All databases are UNICODE (except TEMPLATE0 
and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 distribution 
and think my console is UNICODE.

> > As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
> > accepté') in Psql.
>As I said - psql does not do any conversion.
The faulty query is: INSERT INTO test (source_content) VALUES ('Permis de 
conduire accepté');

I just can't believe that Psql is not UTF-8 compatible. It seems unreal as 
Psql is PostgreSQL #1 helper application. Should I use PostgreSQL MULE 
encoding to have automatic trans coding. What are the guidelines, I am 
completely lost.

> > Psql does not insert the data and I have to kill it manually. Can you
> > reproduce this?
>No.  If it hangs this is serious problem.  Or did you simply
>forgot final ';' ?   It btw does not seem valid sql to me,
>considering you previously provided table structure.
Is it possible that my database is corrupted? I have used pg_dump several 
times to dump data from production server to development servers and 
conversely. Does pg_dump produce UTF8 output? What are the guidelines when 
using UTF-8: forget psql and pg_dump?

>In the end: are the strings/queries you give to psql/pg_exec
>UTF-8 - this is now main thing, as you have _configured_
>everything correctly.
Everything is configured correctly server-side (PostgreSQL, Psql).

Thank you very much for your support Marko,
Best regards,
Jean-Michel




Re: UNICODE

From
Marko Kreen
Date:
On Sun, Oct 28, 2001 at 02:34:49PM +0100, Jean-Michel POURE wrote:
> 
> >psql uses your input literally - so is your console/xterm in
> >UNICODE/UTF8?
> Client: \encoding returns 'UNICODE'.
> Server: \list show databases. All databases are UNICODE (except TEMPLATE0 
> and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 distribution 
> and think my console is UNICODE.

You think?  Try this:
$ echo "accepté" | od -c

If your term is in utf you should get:
0000000   a   c   c   e   p   t 303 251  \n0000011

If in iso-8859-1:
0000000   a   c   c   e   p   t 351  \n0000010

It may be in some other 8bit encoding too, then the last number
may be different.

> >> As for me, I typed INSERT INTO source_content VALUES ('Permis de conduire
> >> accepté') in Psql.
> >As I said - psql does not do any conversion.
> The faulty query is: INSERT INTO test (source_content) VALUES ('Permis de 
> conduire accepté');

Hmm.  It may be a bug in input routines.  You give PostgreSQL a
1byte 'é', it expects 2 byte char and overflows somewhere.  Can
you reproduce it on 7.1.3?  Maybe its fixed there, I cant
reproduce it.

> I just can't believe that Psql is not UTF-8 compatible. It seems unreal as 
> Psql is PostgreSQL #1 helper application. Should I use PostgreSQL MULE 
> encoding to have automatic trans coding. What are the guidelines, I am 
> completely lost.

psql & pg_dump are fine.  Your problem is that you dont give to
psql and pg_exec/PHP utf-8 strings, but some iso-8859-*.

> >> Psql does not insert the data and I have to kill it manually. Can you
> >> reproduce this?
> >No.  If it hangs this is serious problem.  Or did you simply
> >forgot final ';' ?   It btw does not seem valid sql to me,
> >considering you previously provided table structure.
> Is it possible that my database is corrupted? I have used pg_dump several 
> times to dump data from production server to development servers and 
> conversely. Does pg_dump produce UTF8 output? What are the guidelines when 
> using UTF-8: forget psql and pg_dump?

As I said, psql & pg_dump are fine, they do not touch your data
when it passes through them.

It may be that all of your database is in latin1, as you
inserted strings in this encoding, not utf8.  Basically
PostgreSQL server also does not touch your data, only its
compare functions does not work, as the strings are not in
encoding you tell they are.

Solution to this is to dump your data, use the iconv utility
to convert it to utf8 and reload.


To see this you should do:
$ psql -c "SELECT source_contect FROM table where ..." \    | od -c

And then look whether the weird characters are represented in
1 or 2 bytes.

-- 
marko



Re: UNICODE

From
Jean-Michel POURE
Date:
At 17:09 28/10/01 +0200, you wrote:
>On Sun, Oct 28, 2001 at 02:34:49PM +0100, Jean-Michel POURE wrote:
> >
> > >psql uses your input literally - so is your console/xterm in
> > >UNICODE/UTF8?
> > Client: \encoding returns 'UNICODE'.
> > Server: \list show databases. All databases are UNICODE (except TEMPLATE0
> > and TEMPLATE1 which are ASCII of course). I use a Mandrake 8.1 
> distribution
> > and think my console is UNICODE.
>
>You think?  Try this:
>
>         $ echo "accepté" | od -c
>
>If your term is in utf you should get:
>
>         0000000   a   c   c   e   p   t 303 251  \n
>         0000011
>
>If in iso-8859-1:
>
>         0000000   a   c   c   e   p   t 351  \n
>         0000010
>
>It may be in some other 8bit encoding too, then the last number
>may be different.
It is:        0000000   a   c   c  e   p   t     é \n        0000010


>Hmm.  It may be a bug in input routines.  You give PostgreSQL a
>1byte 'é', it expects 2 byte char and overflows somewhere.  Can
>you reproduce it on 7.1.3?  Maybe its fixed there, I cant
>reproduce it.

I noticed some longer routines with "é" worked without any problem.
I cannot reproduce it as I converted my database to plain ASCII.
Will try UNICODE on 7.2 beta when adding Japanese text to my database.

Thank you very much for your help.
Best regards, Jean-Michel POURE




Re: UNICODE

From
Marko Kreen
Date:
On Sun, Oct 28, 2001 at 04:37:48PM +0100, Jean-Michel POURE wrote:
> At 17:09 28/10/01 +0200, you wrote:
> >        $ echo "accepté" | od -c
> It is:
>         0000000   a   c   c  e   p   t     é \n
>         0000010

Huh.  Then try 'od -t x1'.  Also what the commend 'locale'
prints.


> >Hmm.  It may be a bug in input routines.  You give PostgreSQL a
> >1byte 'é', it expects 2 byte char and overflows somewhere.  Can
> >you reproduce it on 7.1.3?  Maybe its fixed there, I cant
> >reproduce it.
> 
> I noticed some longer routines with "é" worked without any problem.
> I cannot reproduce it as I converted my database to plain ASCII.
> Will try UNICODE on 7.2 beta when adding Japanese text to my database.

Ok.  I still suggest you try to understand what was going on,
otherwise you will be in trouble again.  The logic around
encodings will be same in 7.2.

-- 
marko



Ultimate DB Server

From
"Mike Rogers"
Date:
I'm questioning whether anyone has done benchmarks on various hardware for
PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
I'm looking for benchmarks of large queries on striped -vs- non-striped
volumes, different processor speeds, etc.

Any thoughts people?


Re: UNICODE

From
Patrice Hédé
Date:
Hi Jean-Micehl,

* Jean-Michel POURE <jm.poure@freesurf.fr> [011028 18:23]:
> 
> >psql uses your input literally - so is your console/xterm in
> >UNICODE/UTF8?
> Client: \encoding returns 'UNICODE'.
> Server: \list show databases. All databases are UNICODE (except
> TEMPLATE0 and TEMPLATE1 which are ASCII of course). I use a Mandrake
> 8.1 distribution and think my console is UNICODE.

I don't know the details for the Mandrake distribution, but I would
rather think the default terminal to be iso-8859-15 or iso-8859-1
encoded (I use myself a linux debian sid, customised to be mixed
iso-8859-15/utf-8 :) ).

In that case, it's likely to cause problems.
One thing is to check your current locale (before running psql), by
typing "locale charmap" on your terminal :

Unicode :

asterix:~$ locale charmap
UTF-8

latin-9 (fr_FR@euro) :

asterix:~$ locale charmap
ISO-8859-15

Then, if you really have a Unicode term, then you may run into other
problems. Psql uses readline, and readline is not yet "utf-8" enabled
by default. There are patches for that, but I don't know why they
don't integrate the support into the code... whatever the reason, it
means that for example Backspace won't work over characters with more
than one byte, and that includes everything which is not ASCII.

So, if while typing in psql, you try to do some text editing over the
"é", then it's likely to mangle your input to psql (without
necessarily be visible in your terminal), and anything from a bad
commandline, to psql waiting for more input... When you've finished
typing your line, check if psql prompt is displaying an "=" sign :

tests=#

Third, depending on how your data is entered vs queried, it may have
some differences. For example, if you use an application which
converts UTF-8 data to D-normalisation before submitting to
PostgreSQL, then the "é" will be stored as "e"+"combining mark acute
accent". Then, when you do your query, you have to submit in the same
format, as "é" (directly typed from the keyboard) and "e"+"comb.acute
accent" are two different things (I plan to add support in PostgreSQL
for this kind of stuff for 7.3, if I manage to go a bit faster on my
other projects...).

Anyway, I have been trying a query like yours, using a UTF-8 xterm,
with a UNICODE encoding, both psql and database :

my table :

tests=# insert into matable values ('un texte accentué', 12);
INSERT 70197 1
tests=# insert into matable values ('ça accentue le problème', 14);
INSERT 70198 1

tests=# select * from matable;        montext         | valeur 
-------------------------+--------un texte accentué       |     12ça accentue le problème |     14
(2 rows)

[note that the "é", "ç" and "è" are not combining forms here...]

tests=# select * from matable where montext ilike '%accentué%';     montext      | valeur 
-------------------+--------un texte accentué |     12
(1 row)

It works fine for me.

> >> As for me, I typed INSERT INTO source_content VALUES ('Permis de
> >> conduire accepté') in Psql.
> >As I said - psql does not do any conversion.
> The faulty query is: INSERT INTO test (source_content) VALUES
> ('Permis de conduire accepté');
> 
> I just can't believe that Psql is not UTF-8 compatible. It seems
> unreal as Psql is PostgreSQL #1 helper application. Should I use
> PostgreSQL MULE encoding to have automatic trans coding. What are
> the guidelines, I am completely lost.

Psql is UTF-8 compatible. However, the terminal support of UTF-8 may
be a little shaky for now (no dead keys, no compose key) and that will
be fixed in Xfree-4.2, and readline support of UTF-8 is deficient (as
is bash's, where readline comes from). I don't know when *that* will
be fixed. I know http://www.li18nux.org/ has some patches, but I
haven't tried them yet.

> >> Psql does not insert the data and I have to kill it manually. Can
> >> you reproduce this?
> >No.  If it hangs this is serious problem.  Or did you simply forgot
> >final ';' ?   It btw does not seem valid sql to me, considering you
> >previously provided table structure.
> Is it possible that my database is corrupted? I have used pg_dump
> several times to dump data from production server to development
> servers and conversely. Does pg_dump produce UTF8 output? What are
> the guidelines when using UTF-8: forget psql and pg_dump?

One thing you really have to be careful about is the locale you're
running your terminal into (cf above with "locale charmap"). A lot of
tools are sensitive to that, as soon as they set the locale, and also
the terminal itself is sensitive to that (if you run an xterm, a
gnome-terminal or other, make sure they are started themselves with
the correct locale, rather than the locale being set by a .bashrc or
.profile AFTER the xterm is launched. One way to be sure is to launch
an Xterm from the command line in an other xterm ;) ).

> >In the end: are the strings/queries you give to psql/pg_exec UTF-8
> >- this is now main thing, as you have _configured_ everything
> >correctly.
> Everything is configured correctly server-side (PostgreSQL, Psql).
> 
> Thank you very much for your support Marko,
> Best regards,
> Jean-Michel

It's possible to work with psql and UTF-8, I'm using it :) But support
for utf-8 is not complete yet, and it's not seamless. Also, support in
Postgresql is not yet complete for UTF-8 (normalisation forms,
collation, regexes...), but it'll come :)

Patrice.

-- 
Patrice Hédé
email: patrice hede à islande org
www  : http://www.islande.org/


Re: Ultimate DB Server

From
Jean-Michel POURE
Date:
At 13:07 28/10/01 -0400, you wrote:
>I'm questioning whether anyone has done benchmarks on various hardware for
>PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual AMD-1200's, etc.
>I'm looking for benchmarks of large queries on striped -vs- non-striped
>volumes, different processor speeds, etc.

Hello Mike,

IMHO, you should consider *simple* software optimization first.

Hardware can bring a 2x gain whereas software optimization can boost an 
application by 10x. Until now, I never heard or read about a real *software 
optimization* benchmark between MySQL and PostgreSQL.

Software optimization includes the use of views, triggers, rules, PL/pgSQL 
server side programming. By definition, it is hard to compare MySQL with 
PostgreSQL because MySQL *does not include* these important features (and 
probably will never do).

I see at least two easy cases where PostgreSQL beats MySQL:
1) Create a simple relational DB with triggers storing values instead of 
performing LEFT JOINS. Increase the number of simultaneous queries. MySQL 
will die at x queries and PostgreSQL will still be working at 5x queries.
2) Use PL/pgSQL to perform complex jobs normally devoted to an application 
server (Java, PHP) on a separate platform. In some case (recursive loops 
for example), network traffic can be divided by 100. As a result, 
PostgreSQL can be 10x faster because everything is performed server-side.

This is to say that, in some circomstances, PostgreSQL running on an i586 
with IDE drive beats MySQL on a double Pentium. In real life, applications 
are always optimized at software level first before hardware level. This is 
why PostsgreSQL is *by nature* better than MySQL.

Unless MySQL gets better, there is no real challenge in comparing both systems.

Cheers,
Jean-Michel POURE



Re: Ultimate DB Server

From
mlw
Date:
> Hardware can bring a 2x gain whereas software optimization can boost an
> application by 10x. Until now, I never heard or read about a real *software
> optimization* benchmark between MySQL and PostgreSQL.

It has been my experience that a knowledgeable, SQL savvy engineer can not use
MySQL. You have to have no basic knowledge of SQL to be able to work within its
limitations. Every project with which I have tried MySQL, I have always found
myself trying to work around what I can't do with it. In that respect, it is
like working on Windows.

> I see at least two easy cases where PostgreSQL beats MySQL:
> 1) Create a simple relational DB with triggers storing values instead of
> performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
> will die at x queries and PostgreSQL will still be working at 5x queries.
> 2) Use PL/pgSQL to perform complex jobs normally devoted to an application
> server (Java, PHP) on a separate platform. In some case (recursive loops
> for example), network traffic can be divided by 100. As a result,
> PostgreSQL can be 10x faster because everything is performed server-side.

Server side programming is a double edged sword. PostgreSQL is not a
distributed database, thus you are limited to the throughput of a single
system. Moving processing off to PHP or Java on a different system can reduce
the load on your server by distributing processing to other systems. If you can
cut query execution time by moving work off to other systems, you can
effectively increase the capacity of your database server.

Typically, on a heavily used database, you should try to limit server side
programming to that which reduces the database work load. If you are moving
work, which can be done on the client, back to the server, you will bottleneck
at the server while the client is sitting idle.

> 
> This is to say that, in some circomstances, PostgreSQL running on an i586
> with IDE drive beats MySQL on a double Pentium. In real life, applications
> are always optimized at software level first before hardware level. This is
> why PostsgreSQL is *by nature* better than MySQL.

One of the reasons why PostgreSQL beats MySQL, IMHO, is that it has the SQL
features that allow you to control and reduce the database work load by doing
things smarter.

> 
> Unless MySQL gets better, there is no real challenge in comparing both systems.

It is funny, I know guys that love MySQL. Even when I show them the cool things
they can do with Postgres, they just don't seem to get it. It is sort of like
talking to an Amiga user.


Re: Ultimate DB Server

From
"Mike Rogers"
Date:
MySQL and PostgreSQL are starting to move together as far as I can see.
MySQL has the _option_ of transactional database formats (you can use both
normal MyISAM tables and transactional tables).  MySQL 4.0 has all those
various features you speak of.  On all too many applications, MySQL kicks
ass.  Admitedly, if you do massive complex database applications, PostgreSQL
can smoke it when done right, but MySQL works great for most tasks.  It's
not even a matter of which is better or how to compare them.  It is a
question of 'what is your purpose for the database' and then deciding based
on the intended purpose.   I did mention that it would be running BOTH MySQL and PostgreSQL, and
not just one.  I use them both for various purposes, depending on the need
and am trying to move it to a seperate server to increase the speed of
queries on BOTH database systems.  It's not a question of which is better,
but a question of what will maximize output for cost.   I think you may have misinterpreted the question
--
Mike

----- Original Message -----
From: "Jean-Michel POURE" <jm.poure@freesurf.fr>
To: <pgsql-hackers@postgresql.org>
Cc: "Mike Rogers" <temp6453@hotmail.com>
Sent: Sunday, October 28, 2001 3:18 PM
Subject: Re: [HACKERS] Ultimate DB Server


> At 13:07 28/10/01 -0400, you wrote:
> >I'm questioning whether anyone has done benchmarks on various hardware
for
> >PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual AMD-1200's,
etc.
> >I'm looking for benchmarks of large queries on striped -vs- non-striped
> >volumes, different processor speeds, etc.
>
> Hello Mike,
>
> IMHO, you should consider *simple* software optimization first.
>
> Hardware can bring a 2x gain whereas software optimization can boost an
> application by 10x. Until now, I never heard or read about a real
*software
> optimization* benchmark between MySQL and PostgreSQL.
>
> Software optimization includes the use of views, triggers, rules, PL/pgSQL
> server side programming. By definition, it is hard to compare MySQL with
> PostgreSQL because MySQL *does not include* these important features (and
> probably will never do).
>
> I see at least two easy cases where PostgreSQL beats MySQL:
> 1) Create a simple relational DB with triggers storing values instead of
> performing LEFT JOINS. Increase the number of simultaneous queries. MySQL
> will die at x queries and PostgreSQL will still be working at 5x queries.
> 2) Use PL/pgSQL to perform complex jobs normally devoted to an application
> server (Java, PHP) on a separate platform. In some case (recursive loops
> for example), network traffic can be divided by 100. As a result,
> PostgreSQL can be 10x faster because everything is performed server-side.
>
> This is to say that, in some circomstances, PostgreSQL running on an i586
> with IDE drive beats MySQL on a double Pentium. In real life, applications
> are always optimized at software level first before hardware level. This
is
> why PostsgreSQL is *by nature* better than MySQL.
>
> Unless MySQL gets better, there is no real challenge in comparing both
systems.
>
> Cheers,
> Jean-Michel POURE
>
>


Re: UNICODE

From
Tatsuo Ishii
Date:
> In that case, it's likely to cause problems.
> One thing is to check your current locale (before running psql), by
> typing "locale charmap" on your terminal :
> 
> Unicode :
> 
> asterix:~$ locale charmap
> UTF-8

Just curious. Are there any working charmap for UTF-8? I mean, the
charmap contains not only ISO 8859-* but also other languages defined
in UNICODE 2.0 at least. I coudn't find such a thing around me. Also,
does it handle Unicode combined characters?
--
Tatsuo Ishii


Re: UNICODE

From
Tatsuo Ishii
Date:
I'm not sure what you are expecting but...

> SELECT * FROM test WHERE source_content ILIKE '%accept%'; ---> returns a record

> SELECT * FROM test WHERE source_content ILIKE '%accepté%' ---> returns a record

> SELECT * FROM test WHERE source_content ILIKE '%accepte%' ---> returns 0 record

So all of above seem to work fine for me.

$ pg_config --configure
--prefix=/usr/local/pgsql --enable-multibyte=EUC_JP --enable-unicode-conversion --with-tcl --with-perl --enable-syslog
--enable-debug--with-CXX --with-java
 
$ pg_config --version
PostgreSQL 7.1.3
--
Tatsuo Ishii


Re: Ultimate DB Server

From
"Christopher Kings-Lynne"
Date:
> MySQL and PostgreSQL are starting to move together as far as I can see.
> MySQL has the _option_ of transactional database formats (you can use both
> normal MyISAM tables and transactional tables).  MySQL 4.0 has all those
> various features you speak of.

Not it doesn't.

It supports the UNION statement (thank god!)

And this is for 4.1:

-------
MySQL 4.1, the following development release

Internally, through a new .frm file format for table definitions, MySQL 4.0
lays the foundation for the new features of MySQL 4.1, such as nested
subqueries, stored procedures, and foreign key integrity rules, which form
the top of the wish list for many of our customers. Along with those, we
will also include simpler additions, such as multi-table UPDATE statements.

After those additions, critics of MySQL have to be more imaginative than
ever in pointing out deficiencies in the MySQL Database Management System.
For long already known for its stability, speed, and ease of use, MySQL will
then match the requirement checklist of very demanding buyers.
--------

I don't get how you can have different tables being transactional in your
database??

ie. What on earth does this do? (pseudo)

create table blah not_transactional;
create table hum not_transactional;

begin;
insert into blah values (1);
insert into hum values (2);
rollback;

?????


On all too many applications, MySQL kicks
> ass.  Admitedly, if you do massive complex database applications,
> PostgreSQL
> can smoke it when done right, but MySQL works great for most tasks.  It's
> not even a matter of which is better or how to compare them.  It is a
> question of 'what is your purpose for the database' and then
> deciding based
> on the intended purpose.
>     I did mention that it would be running BOTH MySQL and PostgreSQL, and
> not just one.  I use them both for various purposes, depending on the need
> and am trying to move it to a seperate server to increase the speed of
> queries on BOTH database systems.  It's not a question of which is better,
> but a question of what will maximize output for cost.
>     I think you may have misinterpreted the question
> --
> Mike
>
> ----- Original Message -----
> From: "Jean-Michel POURE" <jm.poure@freesurf.fr>
> To: <pgsql-hackers@postgresql.org>
> Cc: "Mike Rogers" <temp6453@hotmail.com>
> Sent: Sunday, October 28, 2001 3:18 PM
> Subject: Re: [HACKERS] Ultimate DB Server
>
>
> > At 13:07 28/10/01 -0400, you wrote:
> > >I'm questioning whether anyone has done benchmarks on various hardware
> for
> > >PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual AMD-1200's,
> etc.
> > >I'm looking for benchmarks of large queries on striped -vs- non-striped
> > >volumes, different processor speeds, etc.
> >
> > Hello Mike,
> >
> > IMHO, you should consider *simple* software optimization first.
> >
> > Hardware can bring a 2x gain whereas software optimization can boost an
> > application by 10x. Until now, I never heard or read about a real
> *software
> > optimization* benchmark between MySQL and PostgreSQL.
> >
> > Software optimization includes the use of views, triggers,
> rules, PL/pgSQL
> > server side programming. By definition, it is hard to compare MySQL with
> > PostgreSQL because MySQL *does not include* these important
> features (and
> > probably will never do).
> >
> > I see at least two easy cases where PostgreSQL beats MySQL:
> > 1) Create a simple relational DB with triggers storing values instead of
> > performing LEFT JOINS. Increase the number of simultaneous
> queries. MySQL
> > will die at x queries and PostgreSQL will still be working at
> 5x queries.
> > 2) Use PL/pgSQL to perform complex jobs normally devoted to an
> application
> > server (Java, PHP) on a separate platform. In some case (recursive loops
> > for example), network traffic can be divided by 100. As a result,
> > PostgreSQL can be 10x faster because everything is performed
> server-side.
> >
> > This is to say that, in some circomstances, PostgreSQL running
> on an i586
> > with IDE drive beats MySQL on a double Pentium. In real life,
> applications
> > are always optimized at software level first before hardware level. This
> is
> > why PostsgreSQL is *by nature* better than MySQL.
> >
> > Unless MySQL gets better, there is no real challenge in comparing both
> systems.
> >
> > Cheers,
> > Jean-Michel POURE
> >
> >
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



Re: Ultimate DB Server

From
"Christopher Kings-Lynne"
Date:
Doh! I messed up my example!

The first table was supposed to be transactional.

> I don't get how you can have different tables being transactional in your
> database??
> 
> ie. What on earth does this do? (pseudo)
> 
create table blah transactional;
> create table hum not_transactional;
> 
> begin;
> insert into blah values (1);
> insert into hum values (2);
> rollback;
> 
> ?????

Chris



Re: Ultimate DB Server

From
bpalmer
Date:
> Not it doesn't.
>
> It supports the UNION statement (thank god!)
>
> And this is for 4.1:

This crap shouldn't be on the hackers list,  please take it else where.
The hackers lists is for people developing postgresql,  not for people
auguing about the merits of postgresql vs mysql.


Please go elsewhere.

- Brandon


----------------------------------------------------------------------------c: 646-456-5455
              h: 201-798-4983b. palmer,  bpalmer@crimelabs.net           pgp:crimelabs.net/bpalmer.pgp5
 



Re: Ultimate DB Server

From
"Mike Rogers"
Date:
What that does is very simple: it rolls back the one that is keeping track
of it's transactions.  Think of the overhead if someone doesn't have
transactional statements.  The idea is, in PGSQL, all inserts and updates
are essentially logged so that they can be rolled back.  Here is the MySQL
concept:   Have a log table that logs all transactions (lets say, failed or not)       1.    begin transaction       2.
  insert into non-transactional table 'user did this, status-
 
unprocessed'       3.    insert into payment table       4.    insert into product table       5.    update to
processed      6.    insert into shipping       7.    update to 'pending shipping'   Perfectly common transaction that
happens. Now!  What if you want the
 
entry inserted and dealt with as a status and what happens, but you don't
want all the evidence of that to disappear when you hit rollback.  It means
you can have some things roll back and others don't.  In PGSQL, that would
have to be begin/rollback for only transactional entries.
--
Mike

----- Original Message -----
From: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
To: "Mike Rogers" <temp6453@hotmail.com>; <pgsql-hackers@postgresql.org>;
"Jean-Michel POURE" <jm.poure@freesurf.fr>
Sent: Sunday, October 28, 2001 9:52 PM
Subject: RE: [HACKERS] Ultimate DB Server


> > MySQL and PostgreSQL are starting to move together as far as I can see.
> > MySQL has the _option_ of transactional database formats (you can use
both
> > normal MyISAM tables and transactional tables).  MySQL 4.0 has all those
> > various features you speak of.
>
> Not it doesn't.
>
> It supports the UNION statement (thank god!)
>
> And this is for 4.1:
>
> -------
> MySQL 4.1, the following development release
>
> Internally, through a new .frm file format for table definitions, MySQL
4.0
> lays the foundation for the new features of MySQL 4.1, such as nested
> subqueries, stored procedures, and foreign key integrity rules, which form
> the top of the wish list for many of our customers. Along with those, we
> will also include simpler additions, such as multi-table UPDATE
statements.
>
> After those additions, critics of MySQL have to be more imaginative than
> ever in pointing out deficiencies in the MySQL Database Management System.
> For long already known for its stability, speed, and ease of use, MySQL
will
> then match the requirement checklist of very demanding buyers.
> --------
>
> I don't get how you can have different tables being transactional in your
> database??
>
> ie. What on earth does this do? (pseudo)
>
> create table blah not_transactional;
> create table hum not_transactional;
>
> begin;
> insert into blah values (1);
> insert into hum values (2);
> rollback;
>
> ?????
>
>
> On all too many applications, MySQL kicks
> > ass.  Admitedly, if you do massive complex database applications,
> > PostgreSQL
> > can smoke it when done right, but MySQL works great for most tasks.
It's
> > not even a matter of which is better or how to compare them.  It is a
> > question of 'what is your purpose for the database' and then
> > deciding based
> > on the intended purpose.
> >     I did mention that it would be running BOTH MySQL and PostgreSQL,
and
> > not just one.  I use them both for various purposes, depending on the
need
> > and am trying to move it to a seperate server to increase the speed of
> > queries on BOTH database systems.  It's not a question of which is
better,
> > but a question of what will maximize output for cost.
> >     I think you may have misinterpreted the question
> > --
> > Mike
> >
> > ----- Original Message -----
> > From: "Jean-Michel POURE" <jm.poure@freesurf.fr>
> > To: <pgsql-hackers@postgresql.org>
> > Cc: "Mike Rogers" <temp6453@hotmail.com>
> > Sent: Sunday, October 28, 2001 3:18 PM
> > Subject: Re: [HACKERS] Ultimate DB Server
> >
> >
> > > At 13:07 28/10/01 -0400, you wrote:
> > > >I'm questioning whether anyone has done benchmarks on various hardwar
e
> > for
> > > >PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual AMD-1200's,
> > etc.
> > > >I'm looking for benchmarks of large queries on striped -vs-
non-striped
> > > >volumes, different processor speeds, etc.
> > >
> > > Hello Mike,
> > >
> > > IMHO, you should consider *simple* software optimization first.
> > >
> > > Hardware can bring a 2x gain whereas software optimization can boost
an
> > > application by 10x. Until now, I never heard or read about a real
> > *software
> > > optimization* benchmark between MySQL and PostgreSQL.
> > >
> > > Software optimization includes the use of views, triggers,
> > rules, PL/pgSQL
> > > server side programming. By definition, it is hard to compare MySQL
with
> > > PostgreSQL because MySQL *does not include* these important
> > features (and
> > > probably will never do).
> > >
> > > I see at least two easy cases where PostgreSQL beats MySQL:
> > > 1) Create a simple relational DB with triggers storing values instead
of
> > > performing LEFT JOINS. Increase the number of simultaneous
> > queries. MySQL
> > > will die at x queries and PostgreSQL will still be working at
> > 5x queries.
> > > 2) Use PL/pgSQL to perform complex jobs normally devoted to an
> > application
> > > server (Java, PHP) on a separate platform. In some case (recursive
loops
> > > for example), network traffic can be divided by 100. As a result,
> > > PostgreSQL can be 10x faster because everything is performed
> > server-side.
> > >
> > > This is to say that, in some circomstances, PostgreSQL running
> > on an i586
> > > with IDE drive beats MySQL on a double Pentium. In real life,
> > applications
> > > are always optimized at software level first before hardware level.
This
> > is
> > > why PostsgreSQL is *by nature* better than MySQL.
> > >
> > > Unless MySQL gets better, there is no real challenge in comparing both
> > systems.
> > >
> > > Cheers,
> > > Jean-Michel POURE
> > >
> > >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 4: Don't 'kill -9' the postmaster
> >
>
>


Re: Ultimate DB Server

From
Hannu Krosing
Date:
Mike Rogers wrote:
> 
> What that does is very simple: it rolls back the one that is keeping track
> of it's transactions.  Think of the overhead if someone doesn't have
> transactional statements.  The idea is, in PGSQL, all inserts and updates
> are essentially logged so that they can be rolled back.  Here is the MySQL
> concept:
>     Have a log table that logs all transactions (lets say, failed or not)
>         1.    begin transaction
>         2.    insert into non-transactional table 'user did this,
>               status - unprocessed'
>         3.    insert into payment table
>         4.    insert into product table
>         5.    update to processed
>         6.    insert into shipping
>         7.    update to 'pending shipping'
>   Perfectly common transaction that happens.  Now!  What if you want the
> entry inserted and dealt with as a status and what happens, but you don't
> want all the evidence of that to disappear when you hit rollback. 
> It means you can have some things roll back and others don't.  In PGSQL,
> that would have to be begin/rollback for only transactional entries.

Or you would run two parallel transactions (currently you need two
connections 
for this) - one for logging and one for work.

I agree that having non_transactional (i.e. logging) tables may be
sometimes 
desirable. I've been told that some of Oracles debugging/logging
facilities 
are almost useless due-to the fact that they disappear at rollback.

------------------
Hannu


Re: Ultimate DB Server

From
Jean-Michel POURE
Date:
>This crap shouldn't be on the hackers list,  please take it else where.
>The hackers lists is for people developing postgresql,  not for people
>auguing about the merits of postgresql vs mysql.

Agreed. Should be on pgsql-general@postgresql.org.




Re: Ultimate DB Server

From
"Todd Williamsen"
Date:
Ram plays a big factor in queries, most queries are stored in ram.  Also
depends on which platform as well.

Thank you,

Todd Williamsen, MCSE
home: 847.265.4692
Cell: 847.867.9427


-----Original Message-----
From: Mike Rogers [mailto:temp6453@hotmail.com]
Sent: Sunday, October 28, 2001 11:08 AM
To: mysql@lists.mysql.com; pgsql-hackers@postgresql.org;
pgsql-admin@postgresql.org
Subject: Ultimate DB Server


I'm questioning whether anyone has done benchmarks on various hardware
for PGSQL and MySQL.  I'm either thinking dual P3-866's, Dual
AMD-1200's, etc. I'm looking for benchmarks of large queries on striped
-vs- non-striped volumes, different processor speeds, etc.

Any thoughts people?


---------------------------------------------------------------------
Before posting, please check:
   http://www.mysql.com/manual.php   (the manual)
   http://lists.mysql.com/           (the list archive)

To request this thread, e-mail <mysql-thread89232@lists.mysql.com>
To unsubscribe, e-mail
<mysql-unsubscribe-todd=williamsen.net@lists.mysql.com>
Trouble unsubscribing? Try: http://lists.mysql.com/php/unsubscribe.php


Re: Ultimate DB Server

From
mlw
Date:
Mike Mascari wrote:
> 
> mlw wrote:
> >
> ..
> > It is funny, I know guys that love MySQL. Even when I show them the cool things
> > they can do with Postgres, they just don't seem to get it. It is sort of like
> > talking to an Amiga user.
> 
> Hey. As someone who learned 68000 assembly on the Amiga back in '86,
> I take that personally. There's nothing like writing a pixel editor
> in 4096-color HAM mode in 68000 off of a floppy-based Commodore
> Macro Assembler. Sadly, however, I don't yet see the Amiga 1000 on
> the PostgreSQL ports list. ;-)

Sorry, I like to needle Amiga users. One of my closest friends is, what can
only be described as, a complete Amiga zealot. Most of the time it is pretty
fun to get him going, I hope you know it is all good natured fun. 

I built my first robot using an RCA 1802 back in the late '70s. I still think
the P.C. was the worst computer design. Going from any platform to the
8080~8088 was such a let down. If someone had ported CP/M to the RCA 1802 back
in the '70s, computers may have been different today.


Re: Ultimate DB Server

From
Mike Mascari
Date:
mlw wrote:
> 
..
> It is funny, I know guys that love MySQL. Even when I show them the cool things
> they can do with Postgres, they just don't seem to get it. It is sort of like
> talking to an Amiga user.

Hey. As someone who learned 68000 assembly on the Amiga back in '86,
I take that personally. There's nothing like writing a pixel editor
in 4096-color HAM mode in 68000 off of a floppy-based Commodore
Macro Assembler. Sadly, however, I don't yet see the Amiga 1000 on
the PostgreSQL ports list. ;-)

Mike Mascari
mascarm@mascari.com