Thread: Migrating between versions. Problem with regexp

Migrating between versions. Problem with regexp

From
Kaloyan Iliev Iliev
Date:
Dear friends,

I have the following problem.

libvar=# select version();
                            version
----------------------------------------------------------------
 PostgreSQL 7.2.3 on i386-pc-bsdi4.0.1, compiled by GCC 2.7.2.1
(1 row)

libvar=# select '\\a\\а\\а\\.' ~ '\\a\\а\\а\\.';
 ?column?
----------
 f
(1 row)


=======================================

libvar=# select version();

version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.0.0beta1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC)
3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
(1 row)

libvar=# select '\\a\\а\\а\\.' ~ '\\a\\а\\а\\.';
ERROR:  invalid regular expression: invalid escape \ sequence

I am using:
This is perl, v5.6.0 built for i386-bsdos
When I receive text from CGI, I run quotemeta over it to escape any
characters that may have a special meaning in regular expressions.
Then I run DBI->quote over the same string to avoid any SQL injection.
The problem is that I use Cyrillic, and quotemeta puts a \ before every
Cyrillic character.
Then DBI->quote turns each \ into \\.
When I then use this string in a regular expression in Postgres, I
receive an error in the new version of Postgres.
Could anyone suggest a solution to my problem?
Thanks in advance.
   Kaloyan
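
To make the double escaping described above concrete, here is a minimal
sketch in Python (not the poster's Perl code). Both helper functions are
hypothetical stand-ins: perl_like_quotemeta assumes quotemeta puts a
backslash before every character outside [A-Za-z0-9_], which matches the
behaviour reported above for Cyrillic text, and
sql_quote_doubling_backslashes assumes the quoting step only doubles
backslashes and single quotes and wraps the result in quotes.

```python
import re


def perl_like_quotemeta(text: str) -> str:
    """Assumed behaviour: backslash every character outside [A-Za-z0-9_]."""
    return re.sub(r"([^A-Za-z0-9_])", r"\\\1", text)


def sql_quote_doubling_backslashes(text: str) -> str:
    """Assumed behaviour: double backslashes and single quotes, add quotes."""
    return "'" + text.replace("\\", "\\\\").replace("'", "''") + "'"


user_input = "aя."  # Latin a, Cyrillic я, and a dot
escaped = perl_like_quotemeta(user_input)
literal = sql_quote_doubling_backslashes(escaped)

print(escaped)  # a\я\.
print(literal)  # 'a\\я\\.'
```

After the SQL parser collapses each \\ back to \, the regular expression
engine sees \я, which is a backslash followed by a letter. That is the
kind of sequence the newer server rejects.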

Re: Migrating between versions. Problem with regexp

From
Tom Lane
Date:
Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
> libvar=# select '\\a\\а\\а\\.' ~ '\\a\\а\\а\\.';
> ERROR:  invalid regular expression: invalid escape \ sequence

7.4 and later are stricter about the use of \ than prior versions; see
http://www.postgresql.org/docs/7.4/static/functions-matching.html#POSIX-ESCAPE-SEQUENCES
You could go back to (approximately) the old behavior by changing
regex_flavor to "extended".

            regards, tom lane

Re: Migrating between versions. Problem with regexp

From
Kaloyan Iliev Iliev
Date:
Thank you
     Kaloyan




Re: Migrating between versions. Problem with regexp

From
Kaloyan Iliev Iliev
Date:
Dear Tom Lane,
I still haven't read all the documentation, but I found the following.
If I escape Latin letters with \\ it is OK, but if I escape Cyrillic
letters with \\ it is an error.

libvar=# select '\\a\я' ~* '\\a\я';
 ?column?
----------
 f
(1 row)

libvar=# select '\\a\\я' ~* '\\a\\я';
ERROR:  invalid regular expression: invalid escape \ sequence
libvar=# select '\\a\\я' ~* '\\a\я';
 ?column?
----------
 f
(1 row)

libvar=# select '\\a\я' ~* '\\a\\я';
ERROR:  invalid regular expression: invalid escape \ sequence

I understand that you can't adapt it to all languages and alphabets,
but I am reporting this in case it is an error and can be fixed.
Kaloyan


Re: Migrating between versions. Problem with regexp

From
Tom Lane
Date:
Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
> I still haven't read all the documentation, but I found the following.
> If I escape Latin letters with \\ it is OK, but if I escape Cyrillic
> letters with \\ it is an error.

No, you're missing the point: there are certain escapes that mean
something, and it rejects the rest.  Backslash is not a general-purpose
quoting character.

            regards, tom lane
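
One way to act on this advice is to put a backslash only in front of
characters that are actually special in a regular expression and leave
letters, Cyrillic or otherwise, untouched. The following is a minimal
sketch in Python, not code from this thread. The function name
escape_regex_literal and the metacharacter set are assumptions made for
illustration, and Python's re module is used only as a stand-in to check
the result, since its rules are not identical to PostgreSQL's.

```python
import re

# Assumed set of characters treated as special outside bracket expressions.
REGEX_METACHARACTERS = set(".^$*+?()[]{}|\\")


def escape_regex_literal(text: str) -> str:
    """Backslash only regex metacharacters; letters and digits stay as-is."""
    return "".join(
        "\\" + ch if ch in REGEX_METACHARACTERS else ch for ch in text
    )


pattern = escape_regex_literal("aя.")
print(pattern)  # aя\.

# The escaped dot matches only a literal dot; я is matched as itself.
print(re.fullmatch(pattern, "aя.") is not None)  # True
print(re.fullmatch(pattern, "aяx") is not None)  # False
```

The escaped pattern would still need to go through the usual SQL quoting
step (or a bound parameter) before being sent to the server.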