Hello,
unfortunately octal escapes don't seem to work either:
On Tue, Mar 19, 2013 at 7:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Farber <alexander.farber@gmail.com> writes:
>> # select 'АБВГД' ~ '^[\u0410-\u042F]{2,}$';
>> WARNING: nonstandard use of escape in a string literal
>
> I think Unicode escapes were introduced in 9.0. In 8.4 you'd probably
> have to write out the UTF8 equivalent as octal escapes :-(
# select 'АБВГД' ~ '^[\2020-\2057]{2,}$';
WARNING: nonstandard use of escape in a string literal
LINE 1: select 'АБВГД' ~ '^[\2020-\2057]{2,}$';
                         ^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
ERROR: invalid byte sequence for encoding "UTF8": 0x82
HINT: This error can also happen if the byte sequence does not
match the encoding expected by the server, which is controlled by
"client_encoding".
But writing the Cyrillic characters out literally (as UTF8) seems to work
(I am trying to detect uppercase Russian letters, as per
http://www.unicode.org/charts/PDF/U0400.pdf ):
# select 'АБВГД' ~ '^[А-Я]{2,}$';
?column?
----------
t
(1 row)
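A side note on the octal attempt: \2020 and \2057 are the code points
U+0410 and U+042F written in octal, whereas the suggestion was to write
out the UTF8 bytes as octal escapes. Since an octal escape takes at most
three digits, \2020 is probably read as \202 followed by a literal 0,
which would explain the 0x82 in the error. A minimal Python sketch of the
byte-wise escapes (the helper name utf8_octal_escapes is made up for
illustration, and Python only stands in for doing the arithmetic by hand):

```python
def utf8_octal_escapes(ch):
    """Return the UTF-8 bytes of ch as backslash-octal escapes."""
    return "".join("\\%03o" % b for b in ch.encode("utf-8"))

# First and last uppercase letters of the basic Russian alphabet
first = utf8_octal_escapes("\u0410")  # CYRILLIC CAPITAL LETTER A
last = utf8_octal_escapes("\u042f")   # CYRILLIC CAPITAL LETTER YA

# The code point itself in octal, which is what the failing pattern used
codepoint_octal = "%o" % 0x0410

# Value of the three-digit octal escape \202
truncated_byte = int("202", 8)

print(first, last, codepoint_octal, hex(truncated_byte))
```

Under these assumptions the range would be written as
E'^[\320\220-\320\257]{2,}$'; I have not verified this on 8.4.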
Then I try to solve my second problem (detecting the same letter
3 times in a row, which is rare in Russian):
# select 'ОШИБББКА' ~ '(.)\1\1';
WARNING: nonstandard use of escape in a string literal
LINE 1: select 'ОШИБББКА' ~ '(.)\1\1';
                            ^
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
?column?
----------
f
(1 row)
Does anybody know why this fails in 8.4.13?
According to Table 9-18 in
http://www.postgresql.org/docs/8.4/static/functions-matching.html
it should be OK to use \1 to refer back to
a part captured by parentheses.
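One possible explanation, offered as an assumption rather than a
confirmed diagnosis: with standard_conforming_strings off (the default
in 8.4), a plain '...' literal treats \1 as an octal escape, so the
regex engine receives the byte 0x01 instead of a back reference. The
WARNING above points in that direction. The sketch below uses Python's
re module as a stand-in for the PostgreSQL regex engine and only
illustrates the difference between the two patterns:

```python
import re

text = "ОШИБББКА"

# What the regex engine would see if each \1 in the literal were
# consumed as an octal escape, i.e. turned into the byte 0x01
mangled = "(.)\x01\x01"

# What it would see if the backslashes survived string parsing
intended = r"(.)\1\1"

mangled_match = re.search(mangled, text)
intended_match = re.search(intended, text)

print(mangled_match is None)
print(intended_match.group(0))
```

In SQL the backslashes can be kept by doubling them inside an escape
string, as the HINT suggests: E'(.)\\1\\1'.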
Regards
Alex