Thread: Foreign character struggles

From: Tim Perdue
I compiled postgres with --enable-multibyte and --enable-recode, and it
doesn't appear to help with my problem.

I have a database which contains "foreign" characters in city names, like "São
Paulo" (Sao Paulo).

If an end-user types plain-english Sao Paulo, I want the database to pull up
"São Paulo", essentially just treating the accented characters as if they were
regular ASCII.

select to_ascii(city) from latlong where ccode='BR';
ERROR:  pg_to_ascii(): unsupported encoding from SQL_ASCII

select convert(city,'UNICODE', 'LATIN1') from latlong where ccode='BR';
ERROR:  Could not convert UTF-8 to ISO8859-1

Also, my "Up Arrow" and "Delete" keys no longer work since I recompiled 7.2.3
on debian.

Thanks for any help,

Tim Perdue
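
A minimal sketch of the folding described above (treating "São Paulo" as
"Sao Paulo"), written in Python so it can be run without a database. It is
not what to_ascii() does inside PostgreSQL; it assumes the text is available
as Unicode and relies on Unicode decomposition. The helper names fold_accents
and matches are made up for illustration.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return text with combining accents removed, e.g. 'São' -> 'Sao'."""
    # NFKD splits each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks and keep the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(user_input: str, stored: str) -> bool:
    """Compare two names while ignoring accents and letter case."""
    return fold_accents(user_input).casefold() == fold_accents(stored).casefold()


if __name__ == "__main__":
    print(fold_accents("São Paulo"))
    print(matches("sao paulo", "São Paulo"))
```

Letters that have no decomposition (for example "ø" or "ß") pass through
unchanged, so this only covers accents that Unicode models as combining marks.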


Re: Foreign character struggles

From: Tom Lane
Tim Perdue <tim@perdue.net> writes:
> I compiled postgres with --enable-multibyte and --enable-recode, and it
> doesn't appear to help with my problem.

I think this is a locale issue, not a character set issue.  You
definitely need --enable-locale, but I doubt you need either of the
above (unless you need to deal with Unicode or Far-Eastern languages).

> If an end-user types plain-english Sao Paulo, I want the database to pull up
> "São Paulo", essentially just treating the accented characters as if they were
> regular ASCII.

I'd suggest matching on "upper(city)" to get rid of accents; given the
right locale setting that should work, and you can add a functional
index to make it fast.

> Also, my "Up Arrow" and "Delete" keys no longer work since I recompiled 7.2.3
> on debian.

You are missing libreadline.
        regards, tom lane
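
One way to picture the "match on a normalized expression, then index it" idea,
sketched in Python with the standard-library sqlite3 module standing in for
PostgreSQL. This is an illustration under stated assumptions, not the
upper(city) functional index itself: the folded key is stored in a
hypothetical city_key column and indexed normally, the fold and find_city
helpers are invented here, and only the latlong, city and ccode names come
from the thread.

```python
import sqlite3
import unicodedata


def fold(text):
    """Strip combining accents and upper-case the result (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.upper()


# In-memory database; city_key is an assumed extra column holding the folded name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE latlong (city TEXT, ccode TEXT, city_key TEXT)")
conn.execute("CREATE INDEX latlong_city_key ON latlong (city_key)")

for city, ccode in [("São Paulo", "BR"), ("Brasília", "BR")]:
    conn.execute(
        "INSERT INTO latlong (city, ccode, city_key) VALUES (?, ?, ?)",
        (city, ccode, fold(city)),
    )


def find_city(conn, user_input):
    """Look up cities by folding the user's input the same way as the stored key."""
    rows = conn.execute(
        "SELECT city FROM latlong WHERE city_key = ?", (fold(user_input),)
    )
    return [row[0] for row in rows.fetchall()]


if __name__ == "__main__":
    print(find_city(conn, "Sao Paulo"))
```

The point of the design is that the same normalization is applied to the
stored value and to the search term, so the comparison can use an ordinary
index lookup.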


Re: Foreign character struggles

From: Roberto Mello
On Fri, Oct 25, 2002 at 10:37:59AM -0400, Tom Lane wrote:
> 
> I think this is a locale issue, not a character set issue.  You
> definitely need --enable-locale, but I doubt you need either of the
> above (unless you need to deal with Unicode or Far-Eastern languages).

Where is the procedure for working with i18n'd characters described in the
documentation? I'm looking for something that mentions the specifics of
locale interaction and all that. 

I ask because the sort of question Tim asked is a recurrent one in a
Portuguese PostgreSQL mailing list I subscribe to.

Thanks,

-Roberto

-- 
+----|        Roberto Mello   -    http://www.brasileiro.net/  |------+
+       Computer Science Graduate Student, Utah State University      +
+       USU Free Software & GNU/Linux Club - http://fslc.usu.edu/     +


Re: Foreign character struggles

From: Tom Lane
Roberto Mello <rmello@cc.usu.edu> writes:
> Where is the procedure for working with i18n'd characters described in the
> documentation? I'm looking for something that mentions the specifics of
> locale interaction and all that. 

Offhand I can't think of any section that addresses that topic
specifically, although there are passing mentions in the installation
docs and other places.  Want to write up a new section?

BTW, as of 7.3 both --enable-locale and --enable-multibyte are standard,
so at least the "did you build with the right options?" FAQ will go
away.  There'll still be "did you initdb with the right locale?" to
trap the unwary, though :-(
        regards, tom lane


Re: Foreign character struggles

From: Tim Perdue
On Fri, Oct 25, 2002 at 10:37:59AM -0400, Tom Lane wrote:
> Tim Perdue <tim@perdue.net> writes:
> > I compiled postgres with --enable-multibyte and --enable-recode, and it
> > doesn't appear to help with my problem.
> 
> I think this is a locale issue, not a character set issue.  You
> definitely need --enable-locale, but I doubt you need either of the
> above (unless you need to deal with Unicode or Far-Eastern languages).

I skipped --enable-locale because I feared I would have to dump/restore
all my databases and require re-testing the application. Is that unfounded?

> > Also, my "Up Arrow" and "Delete" keys no longer work since I recompiled 7.2.3
> > on debian.
> 
> You are missing libreadline.

Thanks. libreadline is there, it just isn't being picked up by psql. Any
suggestions?

Tim Perdue


Re: Foreign character struggles

From: Tom Lane
Tim Perdue <tim@perdue.net> writes:
> I skipped --enable-locale because I feared I would have to dump/restore
> all my databases and require re-testing the application. Is that unfounded?

If you skipped enable-locale then you are outta luck.  The fact that
there is a connection between "a" and "accented a" is purely a locale
issue.

>> You are missing libreadline.

> Thanks. libreadline is there, it just isn't being picked up by psql. Any
> suggestions?

Do you have both libreadline and libreadline headers (libreadline-devel
rpm, usually)?
        regards, tom lane


Re: Foreign character struggles

From: Tim Perdue
On Fri, Oct 25, 2002 at 12:24:43PM -0400, Tom Lane wrote:
> If you skipped enable-locale then you are outta luck.  The fact that
> there is a connection between "a" and "accented a" is purely a locale
> issue.

What I meant was, if I recompile --enable-locale and install over the current
builds, I would have to dump/restore everything and re-test the app. Or so I
wondered.

> >> You are missing libreadline.
> 
> > Thanks. libreadline is there, it just isn't being picked up by psql. Any
> > suggestions?
> 
> Do you have both libreadline and libreadline headers (libreadline-devel
> rpm, usually)?

Nope, the headers weren't installed, but they are now. When I get the
clarification on the above, I'll rebuild everything.

Tim

-- 
Founder - SourceForge.net / PHPBuilder.com / Geocrawler.com
Perdue, Inc.
515-554-9520


Re: Foreign character struggles

From: Tony Grant
On Fri, 2002-10-25 at 15:33, Tim Perdue wrote:
> I compiled postgres with --enable-multibyte and --enable-recode, and it
> doesn't appear to help with my problem.

createdb my_db_name -E LATIN1

Worked just fine for me, but the client wanted to be able to search with
accents, so I turned the to_ascii stuff off. See
www.3continents.com/base_de_donnees.htm and search for "Amnésie"; if you
want the English search to work, you search for "Amnesia"...

The client wants the user to check spelling...

Before, it worked just the way you wanted, _but_ I am using a JDBC request
via JSP.

Cheers

Tony Grant

--
www.tgds.net
Library management software toolkit, redhat linux on Sony Vaio C1XD,
Dreamweaver MX with Tomcat and PostgreSQL