Thread: How to insert non-English characters to the db?
Hi,

I need to store strings with non-English characters in my DB. I created my DB with Unicode encoding, but I am not able to insert data into it from either psql or pgaccess, which seems not to support non-English characters at all. I tried the sql-postgresql Emacs mode, which displays the characters but doesn't insert them.

In the long term I need to insert the values from a Perl CGI script, but so far I haven't been able to insert any special character at all.

Any suggestions will be appreciated.

Regards
Andreas Fromm
Hi Andreas,

In the short run you might use pgSQL4RB <http://aliacta.com/pgsql4rb.htm>, which fully supports Unicode. It lets you build PostgreSQL client applications for Mac, OS X, Windows, and soon Linux, all from the same code base. You might use it in demo mode just to insert your foreign-language characters while you start up your project.

Marc

At 11:45 AM +0200 9/8/03, Andreas Fromm wrote:
> I need to store strings with non-English characters in my DB. [...]
Andreas Fromm wrote:
> I need to store strings with non-English characters in my DB. I created
> my DB with Unicode encoding, but am not able to insert data into it from
> psql nor pgaccess, which seems not to support non-English characters at
> all.

    template1=# create table test ( testcol varchar ) ;
    CREATE TABLE
    template1=# insert into test values ( 'föö bär' );
    INSERT 154068 1
    template1=# select * from test;
     testcol
    ---------
     föö bär
    (1 row)

    template1=# \! env | egrep LC\|LANG
    LC_CTYPE=de_DE.ISO8859-1

As you can see, psql handles non-English characters fine. However, psql respects your locale settings, so if they are wrong it falls back to the standard C / POSIX locale, i.e. US-ASCII.

The correct setting depends on your operating system. In my case (FreeBSD), I set the environment variable LC_CTYPE to "de_DE.ISO8859-1". On Solaris you can try "iso_8859_1".

Regards
Oliver

-- 
Oliver Fromme, secnetix GmbH & Co KG, Oettingenstr. 2, 80538 München
Any opinions expressed in this message may be personal to the author and may not necessarily reflect the opinions of secnetix in any way.

"If you think C++ is not overly complicated, just what is a protected abstract virtual base pure virtual private destructor, and when was the last time you needed one?" -- Tom Cargil, C++ Journal
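The reason the client's character set matters can be shown with the test string itself. The following Python sketch (not part of the original exchange; the helper name `is_valid_utf8` is illustrative) compares the bytes an ISO8859-1 terminal would produce for 'föö bär' with the bytes a UTF-8 database expects. It assumes the database's UNICODE encoding means UTF-8.

```python
# The test string from the psql session, written with escapes so the
# example does not depend on the encoding of this source file.
text = "f\u00f6\u00f6 b\u00e4r"  # 'föö bär'

latin1_bytes = text.encode("latin-1")  # bytes typed on an ISO8859-1 terminal
utf8_bytes = text.encode("utf-8")      # bytes a UTF-8 database expects


def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Same characters, different byte sequences and lengths.
print(len(text), len(latin1_bytes), len(utf8_bytes))
# The Latin-1 bytes are not valid UTF-8, so they cannot be stored as-is.
print(is_valid_utf8(latin1_bytes), is_valid_utf8(utf8_bytes))
```

If the client sends Latin-1 bytes, something has to convert them before they reach a UTF-8 database; in PostgreSQL that is the job of the client_encoding setting, so the client's declared encoding has to match what the terminal or script actually sends.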
Aarni Ruuhimäki wrote:
> Hi,
>
> I got my sorting and non-English characters working OK with
>
> ./configure --enable-locale
>
> and with
>
> initdb --locale=fi_FI -E LATIN1
>
> and, just to make sure, all databases created with ENCODING = 'LATIN1'.
>
> You probably need to replace locale=fi_FI with locale=de_DE or something
> similar.
>
> BR,
>
> Aarni

Thanks for all the replies, but I'm still having problems. I recompiled Postgres with

    ./configure --enable-locale --enable-nls='en de'

and ran

    initdb --locale=de_DE -E UNICODE

(I tried LATIN1 as well), but when starting the postmaster I get the following message:

    FATAL: invalid value for "lc_messages": "de_DE"
    postmaster successfully started

I'm on a Debian/Linux box that has many apps with different localisations available. Does the newest version, PgSQL 7.4beta, not have localisation support yet?

Regards
Andreas
Andreas Fromm <Andreas.Fromm@physik.uni-erlangen.de> writes:
> initdb --locale=de_DE -E UNICODE (tried LATIN1 also)
> FATAL: invalid value for "lc_messages": "de_DE"
> I'm on a Debian/Linux box that has many apps with different
> localisations available. Does the newest version PgSQL 7.4beta not have
> the localisation support yet?

No, your problem is that your OS doesn't have complete support for de_DE locale. You will need to pick a supported locale for LC_MESSAGES. I'd recommend something along the lines of

    export LC_ALL=de_DE
    export LC_MESSAGES=C    # or something else that works
    initdb -E UNICODE

You could try just editing the lc_messages setting in postgresql.conf, but I am not sure whether that will be sufficient; the original initdb may have suffered internal failures due to the broken locale setting. Safest to redo it.

regards, tom lane
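How exported locale variables combine is worth checking. Under POSIX rules, when a program asks the C library for the environment's locale, a non-empty LC_ALL wins over every per-category variable, which in turn wins over LANG. The Python sketch below models only that lookup order; the function name `effective_locale` and the final "C" fallback are illustrative assumptions, and the thread does not say how initdb itself reads these variables.

```python
import os


def effective_locale(category, env=None):
    """Model which locale name setlocale(category, "") would pick up.

    `category` is a variable name such as "LC_MESSAGES"; `env` is a
    mapping of environment variables (defaults to the real environment).
    """
    env = os.environ if env is None else env
    # POSIX precedence: LC_ALL, then the category's own variable, then LANG.
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # unset or empty variables are skipped
            return value
    return "C"  # assumed fallback; POSIX leaves the default implementation-defined


# LC_ALL shadows the per-category variable.
print(effective_locale("LC_MESSAGES", {"LC_ALL": "de_DE", "LC_MESSAGES": "C"}))
# LANG only fills in categories that are not set explicitly.
print(effective_locale("LC_MESSAGES", {"LANG": "de_DE", "LC_MESSAGES": "C"}))
print(effective_locale("LC_CTYPE", {"LANG": "de_DE", "LC_MESSAGES": "C"}))
```

Under this lookup order, LC_MESSAGES=C takes effect on its own only if LC_ALL is unset; exporting LANG=de_DE instead of LC_ALL is one way to get that combination.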
On the radiant day of 2003-09-09, Andreas Fromm wrote:
> FATAL: invalid value for "lc_messages": "de_DE"
> postmaster successfully started
>
> I'm on a Debian/Linux box that has many apps with different
> localisations available. Does the newest version PgSQL 7.4beta not have
> the localisation support yet?

Maybe a little bit trivial, but is locale-gen set up correctly on your system?

-- 
Tomka Gergely
"And now, what will become of us without barbarians? They were, after all, a kind of solution..."
Tom Lane wrote:
> No, your problem is that your OS doesn't have complete support for
> de_DE locale. You will need to pick a supported locale for LC_MESSAGES.
> [...]

Tried it out, but the initdb still fails:
$ initdb -E UNICODE --lc-messages=C
The files belonging to this database system will be owned by user "afromm".
This user must also own the server process.
The database cluster will be initialized with locales:
COLLATE: de_DE CTYPE: de_DE MESSAGES: C
MONETARY: de_DE NUMERIC: de_DE TIME: de_DE
fixing permissions on existing directory /home/afromm/devel/pg/data... ok
creating directory /home/afromm/devel/pg/data/base... ok
creating directory /home/afromm/devel/pg/data/global... ok
creating directory /home/afromm/devel/pg/data/pg_xlog... ok
creating directory /home/afromm/devel/pg/data/pg_clog... ok
selecting default shared_buffers... 50
selecting default max_connections... 10
creating configuration files... ok
creating template1 database in /home/afromm/devel/pg/data/base/1... ok
initializing pg_shadow... FATAL: XX000: failed to initialize lc_messages to ""
LOCATION: InitializeGUCOptions, guc.c:1871
initdb: failed
Without the --lc-messages switch I get the same error. Why doesn't initdb initialize lc_messages to C, as told, rather than to ""?
In /etc/locale.gen there is nothing set, but shouldn't I be able to create a UTF-8-aware DB on a machine that doesn't have localisation support at all? The problem is that I don't have root access to this machine.
-- Andreas Fromm ----------------------------- Drink wet cement... ... and get stoned
On the radiant day of 2003-09-09, Andreas Fromm wrote:
> In /etc/locale.gen there is nothing set, but shouldn't I be able to
> create a UTF-8-aware DB on a machine that doesn't have localisation
> support at all? The problem is that I don't have root access to this
> machine.

If there is no data in the /etc/locale.gen file, maybe the sysop never initialized Debian's locale "system". This may be the problem, or part of the problem. Maybe. I have never tried to use Debian without localization...

-- 
Tomka Gergely
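Whether /etc/locale.gen really has "nothing set" can be checked without root access, since the file only needs to be read. On Debian it lists one locale per line as a name followed by a character set, and lines starting with # are disabled. The Python sketch below extracts the enabled entries; the function name `enabled_locales` and the sample contents are illustrative, not taken from Andreas's machine.

```python
def enabled_locales(locale_gen_text):
    """Return (locale, charset) pairs from the uncommented lines of a
    locale.gen-style file."""
    entries = []
    for line in locale_gen_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split()
        if len(parts) >= 2:  # expect "<locale> <charset>"
            entries.append((parts[0], parts[1]))
    return entries


# Hypothetical file in which every locale is still commented out.
sample = """\
# This file lists locales that you wish to have built.
# de_DE ISO-8859-1
# de_DE.UTF-8 UTF-8
"""
print(enabled_locales(sample))  # nothing enabled

# The same file with one locale enabled.
print(enabled_locales(sample + "de_DE.UTF-8 UTF-8\n"))
```

On a real system the text could come from `open("/etc/locale.gen").read()`. An empty result would support the suspicion that no locales were ever generated; running `locale -a` shows which locales are actually installed.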
Andreas Fromm <Andreas.Fromm@physik.uni-erlangen.de> writes:
> Tried it out, but the initdb still fails:
> The database cluster will be initialized with locales:
> COLLATE: de_DE CTYPE: de_DE MESSAGES: C
> MONETARY: de_DE NUMERIC: de_DE TIME: de_DE
> initializing pg_shadow... FATAL: XX000: failed to initialize
> lc_messages to ""
> LOCATION: InitializeGUCOptions, guc.c:1871

Hm. After looking at the code a little bit, I think I was mistaken to suppose there was something wrong with LC_MESSAGES in particular. It looks like this is simply the first place that tries to set a locale setting and checks the return value from setlocale(). So the most likely theory is that *all* setlocale calls are failing, i.e., there's something broken about your system's locale configuration. I don't know enough about locale stuff to know how to fix that; you might try Tomka Gergely's advice to start with.

> In /etc/locale.gen there is nothing set, but shouldn't I be able to
> create a UTF-8-aware DB on a machine that doesn't have localisation
> support at all?

This is unrelated to your character-encoding desires. You have a broken system library that Postgres depends on. It would still depend on it no matter what -E value you ask for.

regards, tom lane
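The theory that setlocale() itself is failing can be tested from an unprivileged account. Python's `locale.setlocale` calls the C library's setlocale() and raises `locale.Error` when the library rejects a name, so a small probe is enough. This is a sketch, not what PostgreSQL does internally; the function name `locale_is_usable` and the choice of LC_CTYPE as the default category are assumptions made for illustration.

```python
import locale


def locale_is_usable(name, category=locale.LC_CTYPE):
    """Return True if the C library accepts `name` for `category`."""
    saved = locale.setlocale(category)  # query the current setting
    try:
        locale.setlocale(category, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(category, saved)  # leave the process as we found it


# Results depend on which locales are installed on the machine.
for candidate in ("C", "de_DE", "de_DE.UTF-8"):
    print(candidate, locale_is_usable(candidate))
```

If only "C" is reported as usable, the de_DE locales have not been generated on that system, which would match the diagnosis above that the failure is independent of the -E value passed to initdb.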