The following bug has been logged on the website:
Bug reference: 7999
Logged by: david
Email address: somloieater@gmail.com
PostgreSQL version: 9.1.8
Operating system: linux
Description:
\y and \Y do not behave correctly next to
multibyte UTF-8 characters - they seem to invert their senses:
Proper behaviour with ASCII e
'es'~$$\y[eɛ]s$$ => t
Inverted behaviour with epsilon
'ɛs'~$$\y[eɛ]s$$ => f
'ɛs'~$$[eɛ]\ys$$ => t
'ɛs'~$$[eɛ]\Ys$$ => f
This seems to be a case of utf8 characters not being recognised as
word-forming:
'ɛ'~$$\w$$ => f
I've checked with a few other characters which are more than 1 byte in
UTF-8. U+00F0 counts as \w, but nothing I've tried above U+00FF matches.
I wonder if it's something to do with code points above 255?
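
Byte length alone does not seem to separate the two cases: U+00F0 and epsilon (U+025B) are both two bytes in UTF-8, and only the code point differs. A minimal Python sketch of that comparison follows. It is not PostgreSQL code, it says nothing about how the regex engine classifies characters, and the helper name describe is made up for illustration.

```python
# Compare the characters named in the report. U+00F0 and U+025B both need
# two bytes in UTF-8, but only U+025B has a code point above 0xFF.
def describe(ch):
    """Return (code point, UTF-8 byte length, code point above 0xFF?)."""
    cp = ord(ch)
    return cp, len(ch.encode("utf-8")), cp > 0xFF


for ch in ("e", "\u00f0", "\u025b"):
    cp, nbytes, above_ff = describe(ch)
    print(f"U+{cp:04X}  utf8_bytes={nbytes}  above_FF={above_ff}")
```

If the guess about code points is right, the character that fails \w is the one reported as above_FF=True.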
In case anyone else hits this bug, replacing \y with
(^|$|\s|[[:punct:]]) seems to work for me, although it's ugly.