Thread: How to insert non-english characters to the db?

How to insert non-english characters to the db?

From
Andreas Fromm
Date:
Hi,

I need to store strings with non-English characters in my DB. I created
my DB with Unicode encoding, but I am not able to insert data into it from
psql or pgaccess, which seem not to support non-English characters at
all. I tried the sql-postgresql Emacs mode, which displays the
characters but doesn't insert them.

In the long term I need to insert the values from a Perl CGI script, but
so far I haven't been able to insert any special character at all.

Any suggestions will be appreciated, regards

Andreas Fromm




Re: How to insert non-english characters to the db?

From
"M. Bastin"
Date:
Hi Andreas,

In the short run you might use pgSQL4RB
<http://aliacta.com/pgsql4rb.htm> which fully supports unicode.

It allows you to make postgresql client applications for Mac, OSX,
Windows, and soon Linux, all from the same code base.  You might use
it in demo mode just to insert your foreign-language characters while
you start up your project.

Marc


At 11:45 AM +0200 9/8/03, Andreas Fromm wrote:
>Hi,
>
>I need to store strings with non-English characters in my DB. I
>created my DB with Unicode encoding, but I am not able to insert data
>into it from psql or pgaccess, which seem not to support
>non-English characters at all. I tried the sql-postgresql Emacs mode,
>which displays the characters but doesn't insert them.
>
>In the long term I need to insert the values from a Perl CGI script,
>but so far I haven't been able to insert any special character at all.
>
>Any suggestions will be appreciated, regards
>
>Andreas Fromm


Re: How to insert non-english characters to the db?

From
Oliver Fromme
Date:
Andreas Fromm wrote:
 > I need to store strings with non-English characters in my DB. I created
 > my DB with Unicode encoding, but I am not able to insert data into it from
 > psql or pgaccess, which seem not to support non-English characters at
 > all.

template1=# create table test ( testcol varchar ) ;
CREATE TABLE
template1=# insert into test values ( 'föö bär' );
INSERT 154068 1
template1=# select * from test;
 testcol
---------
 föö bär
 (1 row)

template1=# \! env | egrep LC\|LANG
LC_CTYPE=de_DE.ISO8859-1

As you can see, psql supports non-English characters fine.
However, psql respects your locale settings.  So, if your
locale settings are wrong, it will default to the standard
C / POSIX locale, i.e. US-ASCII.

The correct setting depends on your operating system.  In
my case (FreeBSD), I set the environment variable LC_CTYPE
to "de_DE.ISO8859-1".  On Solaris you can try "iso_8859_1".
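
To see which locale name a program would pick up for character handling, the usual POSIX lookup order can be sketched in Python: LC_ALL overrides LC_CTYPE, which overrides LANG, and if none of them is set the result is the C locale. The function name effective_ctype is made up for this illustration. It only inspects the variables; it does not check whether the operating system actually provides that locale.

```python
import os


def effective_ctype(env=None):
    """Return the locale name that LC_CTYPE resolves to for an environment.

    POSIX order: LC_ALL, then LC_CTYPE, then LANG. Empty values count as
    unset. Falls back to "C" (plain US-ASCII) when nothing is set.
    """
    if env is None:
        env = os.environ
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


if __name__ == "__main__":
    # Report the setting for the current shell environment.
    print(effective_ctype())
```

With only LC_CTYPE=de_DE.ISO8859-1 set, as in the session above, the function returns that value; with an empty environment it returns "C".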

Regards
   Oliver

--
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author
and may not necessarily reflect the opinions of secnetix in any way.

"If you think C++ is not overly complicated, just what is a protected
abstract virtual base pure virtual private destructor, and when was the
last time you needed one?"
        -- Tom Cargill, C++ Journal

Re: How to insert non-english characters to the db?

From
Andreas Fromm
Date:
Aarni Ruuhimäki wrote:

>Hi,
>
>I got my sorting and non-english characters working ok with
>
>./configure --enable-locale
>
>and with
>
>initdb --locale=fi_FI -E LATIN1
>
>and just to make sure all databases created with ENCODING = 'LATIN1'
>
>You probably need to replace locale=fi_FI with locale=de_DE or something
>similar.
>
>BR,
>
>Aarni
>
>
Thanks for all the replies, but I'm still having problems. I recompiled
postgres with

./configure --enable-locale --enable-nls='en de'

and ran

initdb --locale=de_DE -E UNICODE (tried LATIN1 also)

but when starting the postmaster I get the following message:

FATAL:  invalid value for "lc_messages": "de_DE"
postmaster successfully started

I'm on a Debian/Linux box that has many apps with different
localisations available. Does the newest version, PgSQL 7.4beta, not have
localisation support yet?

Regards

Andreas


Re: How to insert non-english characters to the db?

From
Tom Lane
Date:
Andreas Fromm <Andreas.Fromm@physik.uni-erlangen.de> writes:
> initdb --locale=de_DE -E UNICODE (tried LATIN1 also)
> FATAL:  invalid value for "lc_messages": "de_DE"

> I'm on a Debian/Linux box that has many apps with different
> localisations available. Does the newest version, PgSQL 7.4beta, not have
> localisation support yet?

No, your problem is that your OS doesn't have complete support for de_DE
locale.  You will need to pick a supported locale for LC_MESSAGES.  I'd
recommend something along the lines of

    export LC_ALL=de_DE
    export LC_MESSAGES=C        # or something else that works
    initdb -E UNICODE

You could try just editing the lc_messages setting in postgresql.conf,
but I am not sure whether that will be sufficient; the original initdb
may have suffered internal failures due to the broken locale setting.
Safest to redo it.

            regards, tom lane

Re: How to insert non-english characters to the db?

From
Tomka Gergely
Date:
On the bright day of 2003-09-09, Andreas Fromm wrote:

> FATAL:  invalid value for "lc_messages": "de_DE"
> postmaster successfully started
>
> I'm on a Debian/Linux box that has many apps with different
> localisations available. Does the newest version, PgSQL 7.4beta, not have
> localisation support yet?

This may be a bit trivial, but is locale-gen set up correctly on your system?

--
Tomka Gergely
"And now, what will become of us without barbarians?
They were, after all, a kind of solution..."


Re: How to insert non-english characters to the db?

From
Andreas Fromm
Date:


Tom Lane wrote:
> Andreas Fromm <Andreas.Fromm@physik.uni-erlangen.de> writes:
>> initdb --locale=de_DE -E UNICODE (tried LATIN1 also)
>> FATAL:  invalid value for "lc_messages": "de_DE"
>>
>> I'm on a Debian/Linux box that has many apps with different
>> localisations available. Does the newest version, PgSQL 7.4beta, not have
>> localisation support yet?
>
> No, your problem is that your OS doesn't have complete support for de_DE
> locale.  You will need to pick a supported locale for LC_MESSAGES.  I'd
> recommend something along the lines of
>
>     export LC_ALL=de_DE
>     export LC_MESSAGES=C        # or something else that works
>     initdb -E UNICODE
>
> You could try just editing the lc_messages setting in postgresql.conf,
> but I am not sure whether that will be sufficient; the original initdb
> may have suffered internal failures due to the broken locale setting.
> Safest to redo it.
>
>             regards, tom lane

Tried it out, but initdb still fails:

$ initdb -E UNICODE --lc-messages=C
The files belonging to this database system will be owned by user "afromm".
This user must also own the server process.

The database cluster will be initialized with locales:
    COLLATE:  de_DE     CTYPE:   de_DE  MESSAGES: C
    MONETARY: de_DE     NUMERIC: de_DE  TIME:     de_DE

fixing permissions on existing directory /home/afromm/devel/pg/data... ok
creating directory /home/afromm/devel/pg/data/base... ok
creating directory /home/afromm/devel/pg/data/global... ok
creating directory /home/afromm/devel/pg/data/pg_xlog... ok
creating directory /home/afromm/devel/pg/data/pg_clog... ok
selecting default shared_buffers... 50
selecting default max_connections... 10
creating configuration files... ok
creating template1 database in /home/afromm/devel/pg/data/base/1... ok
initializing pg_shadow... FATAL:  XX000: failed to initialize lc_messages to ""
LOCATION:  InitializeGUCOptions, guc.c:1871

initdb: failed


Without the --lc-messages switch I get the same error. Why doesn't initdb initialize lc_messages to C, as told, rather than to ""?

In /etc/locale.gen there is nothing set, but shouldn't I be able to create a UTF-8-aware DB on a machine that doesn't have localisation support at all? The problem is that I don't have root access to this machine.
-- 
Andreas Fromm

-----------------------------
Drink wet cement...          ... and get stoned

Re: How to insert non-english characters to the db?

From
Tomka Gergely
Date:
On the bright day of 2003-09-09, Andreas Fromm wrote:

> In /etc/locale.gen there is nothing set, but shouldn't I be able to
> create a UTF-8-aware DB on a machine that doesn't have localisation
> support at all? The problem is that I don't have root access to this
> machine.

If there is no data in the /etc/locale.gen file, maybe the sysop never
initialized Debian's locale "system". This may be the problem, or part
of it. Maybe. I have never tried to use Debian without localization...
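
Checking this does not need root access, since /etc/locale.gen is normally world-readable. On Debian the file lists one locale per line in the form "name charset", and lines starting with # are disabled. A minimal Python sketch that lists the enabled entries follows; the function name enabled_locales is made up for this example, and an entry in the file only says what locale-gen would build, not that it has actually been run.

```python
def enabled_locales(locale_gen_text):
    """Return the locale names found on uncommented lines of locale.gen."""
    names = []
    for line in locale_gen_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line:
            names.append(line.split()[0])     # first field is the locale name
    return names


if __name__ == "__main__":
    try:
        with open("/etc/locale.gen", encoding="utf-8", errors="replace") as f:
            print(enabled_locales(f.read()) or "no locales enabled")
    except OSError as exc:
        print("cannot read /etc/locale.gen:", exc)
```

An empty result would match the situation described above, where nothing is set in the file.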

--
Tomka Gergely
"And now, what will become of us without barbarians?
They were, after all, a kind of solution..."


Re: How to insert non-english characters to the db?

From
Tom Lane
Date:
Andreas Fromm <Andreas.Fromm@physik.uni-erlangen.de> writes:
> Tried it out, but initdb still fails:

> The database cluster will be initialized with locales:
>     COLLATE:  de_DE     CTYPE:   de_DE  MESSAGES: C
>     MONETARY: de_DE     NUMERIC: de_DE  TIME:     de_DE

> initializing pg_shadow... FATAL:  XX000: failed to initialize
> lc_messages to ""
> LOCATION:  InitializeGUCOptions, guc.c:1871

Hm.  After looking at the code a little bit, I think I was mistaken to
suppose there was something wrong with LC_MESSAGES in particular.
It looks like this is simply the first place that tries to set a locale
setting and checks the return value from setlocale().  So the most
likely theory is that *all* setlocale calls are failing, ie, there's
something broken about your system's locale configuration.
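
One way to test that theory without touching Postgres is to call setlocale() for each category and see which calls are rejected. The sketch below does this through Python's locale module, which wraps the C library call and raises locale.Error when it fails. The helper name check_locale and the list of locale names tried are assumptions for this illustration, not anything taken from initdb.

```python
import locale

# The six categories that initdb reports. LC_MESSAGES is not defined on
# every platform, so it is looked up defensively below.
CATEGORIES = ("LC_COLLATE", "LC_CTYPE", "LC_MESSAGES",
              "LC_MONETARY", "LC_NUMERIC", "LC_TIME")


def check_locale(name):
    """Map each category name to True if setlocale() accepts `name` for it."""
    results = {}
    for cat_name in CATEGORIES:
        category = getattr(locale, cat_name, None)
        if category is None:
            continue  # category not available on this platform
        saved = locale.setlocale(category)  # query the current setting
        try:
            locale.setlocale(category, name)
            results[cat_name] = True
        except locale.Error:
            results[cat_name] = False
        finally:
            locale.setlocale(category, saved)  # restore the previous setting
    return results


if __name__ == "__main__":
    for wanted in ("C", "de_DE", "de_DE.UTF-8"):
        print(wanted, check_locale(wanted))
```

If "C" succeeds for every category while "de_DE" fails for all of them, the locale is simply not installed. If even "C" fails, something more basic is wrong with the C library's locale setup.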

I don't know enough about locale stuff to know how to fix that;
you might try Tomka Gergely's advice to start with.

> In /etc/locale.gen there is nothing set, but shouldn't I be able to
> create a UTF-8-aware DB on a machine that doesn't have localisation
> support at all?

This is unrelated to your character-encoding desires.  You have a broken
system library that Postgres depends on.  It would still depend on it no
matter what -E value you ask for.
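
The distinction can be illustrated without a database. An encoding such as UNICODE (UTF-8) or LATIN1 only decides which bytes represent a string, whereas the locale is something the C library has to provide. A small Python sketch using the test string from earlier in the thread; the variable names are chosen only for this example:

```python
text = "f\u00f6\u00f6 b\u00e4r"  # the string 'föö bär'

utf8_bytes = text.encode("utf-8")      # two bytes for each of ö, ö, ä
latin1_bytes = text.encode("latin-1")  # one byte per character

print(len(utf8_bytes), utf8_bytes)
print(len(latin1_bytes), latin1_bytes)

# Reading UTF-8 bytes as if they were LATIN1 garbles the accented
# characters, which is the kind of damage an encoding mismatch between
# client and database would be expected to cause.
mismatch = utf8_bytes.decode("latin-1")
print(ascii(mismatch))
```

None of this involves setlocale(), which is why choosing a different -E value cannot work around a broken locale installation.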

            regards, tom lane