Re: PostgreSQL 8.3.7: soundex function returns UTF-16 characters - Mailing list pgsql-bugs

From Tom Lane
Subject Re: PostgreSQL 8.3.7: soundex function returns UTF-16 characters
Msg-id 802.1239113181@sss.pgh.pa.us
In response to Re: PostgreSQL 8.3.7: soundex function returns UTF-16 characters  (Frans <frans@geodan.nl>)
List pgsql-bugs
Frans <frans@geodan.nl> writes:
> Does it make sense that the locale setting
> influences the workings of the soundex function?

Yeah, it absolutely would, because soundex depends on the C library's
isalpha() and toupper() functions, and those are influenced by locale.

It is clear from looking at the code that soundex isn't expecting
isalpha() to return true for anything except the ASCII letters A-Z,a-z.
That's true in the standard C locale but typically not true in others.
In your example with pi, I think the code would've indexed off the end
of its letter array and gotten unpredictable results.  We could/should
tighten that up, I think, even if we're not willing to rewrite the
code for full multibyte support just yet.

            regards, tom lane
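
As an illustration of the kind of tightening suggested above, here is a minimal sketch in C. It is not the code from the PostgreSQL source tree. The helper names (ascii_isalpha, ascii_toupper, soundex_code_ascii) are hypothetical, and the table is the usual Soundex letter-to-digit mapping for A-Z. The idea is to replace the locale-sensitive isalpha() and toupper() calls with explicit ASCII range checks, so a byte outside A-Z,a-z can never be used to index the 26-entry table.

```c
/* Soundex digit for each letter A..Z; '0' marks letters that are skipped. */
static const char soundex_table[] = "01230120022455012623010202";

/* Locale-independent stand-in for isalpha(): true only for ASCII letters. */
static int ascii_isalpha(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* Locale-independent stand-in for toupper(): folds only ASCII a-z. */
static char ascii_toupper(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? (char) (c - 'a' + 'A') : (char) c;
}

/*
 * Return the Soundex digit for c, or 0 if c is not an ASCII letter.
 * Because of the range check, the table index is always within 0..25.
 */
static char soundex_code_ascii(unsigned char c)
{
    if (!ascii_isalpha(c))
        return 0;
    return soundex_table[ascii_toupper(c) - 'A'];
}
```

With this check, soundex_code_ascii('R') yields '6', while a high-bit byte such as 0xCF (the first byte of pi in UTF-8) yields 0 and is simply ignored, whatever the current locale says about it. This only makes the ASCII-only behavior safe; it does not add multibyte support.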
