On 30/04/2010 5:33 PM, Kenichiro Tanaka wrote:
> Hi
>
> The hyphen written in 'Olympus E-PL1' is different from
> the one written in 'Camera - Black'.
>
> em-dash
> http://www.fileformat.info/info/unicode/char/2014/index.htm
> en-dash
> http://www.fileformat.info/info/unicode/char/2013/index.htm
> figure-dash
> http://www.fileformat.info/info/unicode/char/2012/index.htm
>
> I have no idea how to fix this using PostgreSQL's functions, because the
> characters are not equal.
> I think you have to change the data or change the behavior of your
> application.
The usual solution to this sort of thing is to provide a functional
index on the problem field that computes a "simplified" version of the
text: stripping accents, folding all dashes down to the plain ASCII
hyphen-minus, etc. Queries then compare against the same simplified form.
I'm not aware of any canned tool to do this in PostgreSQL. Everyone's
needs seem to vary, so it'd be hard to provide one.
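
To make the idea of a "simplified" version concrete, here is a rough
sketch in Python of what such a function might compute. The name
simplify() and the particular set of dash characters are my own
assumptions for illustration, not an existing PostgreSQL or library
facility; adjust the list to whatever your data actually contains.

```python
import unicodedata

# Dash-like code points folded to ASCII hyphen-minus.
# Illustrative only: this set is an assumption, not an exhaustive list.
DASHES = {
    0x2010: "-",  # hyphen
    0x2011: "-",  # non-breaking hyphen
    0x2012: "-",  # figure dash
    0x2013: "-",  # en dash
    0x2014: "-",  # em dash
    0x2015: "-",  # horizontal bar
    0x2212: "-",  # minus sign
}


def simplify(text: str) -> str:
    """Hypothetical helper: return a comparison-friendly form of text."""
    # Split accented characters into base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, which strips the accents.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold every dash variant down to the ASCII hyphen-minus.
    return stripped.translate(DASHES)


if __name__ == "__main__":
    # The en dash variant and the plain ASCII variant now compare equal.
    print(simplify("Olympus E\u2013PL1") == simplify("Olympus E-PL1"))
```

On the PostgreSQL side, the equivalent logic would have to live in a
function declared IMMUTABLE so that it can be used in an index
expression, and the query would need to apply the same function to both
the column and the search string for the index to be usable.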
--
Craig Ringer