Thread: Migrating between versions. Problem with regexp
Dear friends,

I have the following problem.

libvar=# select version();
 version
----------------------------------------------------------------
 PostgreSQL 7.2.3 on i386-pc-bsdi4.0.1, compiled by GCC 2.7.2.1
(1 row)

libvar=# select '\\a\\à\\à\\.' ~ '\\a\\à\\à\\.';
 ?column?
----------
 f
(1 row)

=======================================

libvar=# select version();
 version
--------------------------------------------------------------------------------------------------------------
 PostgreSQL 8.0.0beta1 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.2 (Mandrake Linux 10.0 3.3.2-6mdk)
(1 row)

libvar=# select '\\a\\à\\à\\.' ~ '\\a\\à\\à\\.';
ERROR: invalid regular expression: invalid escape \ sequence

I am using:
This is perl, v5.6.0 built for i386-bsdos

When I receive text from CGI, I run quotemeta over it to escape any characters that have a special meaning in regular expressions. Then I run DBI->quote over the same string to prevent SQL injection.

The problem is that I use Cyrillic, and quotemeta puts a \ before every Cyrillic character. DBI->quote then turns each \ into \\. When I use the resulting string in a regular expression in Postgres, I receive an error in the new version of Postgres.

Could anyone suggest a solution to my problem?

Thanks in advance.

Kaloyan
Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
> libvar=# select '\\a\\à\\à\\.' ~ '\\a\\à\\à\\.';
> ERROR: invalid regular expression: invalid escape \ sequence

7.4 and later are stricter about the use of \ than prior versions; see
http://www.postgresql.org/docs/7.4/static/functions-matching.html#POSIX-ESCAPE-SEQUENCES
You could go back to (approximately) the old behavior by changing regex_flavor to "extended".

regards, tom lane
Thank you

Kaloyan

Tom Lane wrote:
> Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
>> libvar=# select '\\a\\à\\à\\.' ~ '\\a\\à\\à\\.';
>> ERROR: invalid regular expression: invalid escape \ sequence
>
> 7.4 and later are stricter about the use of \ than prior versions; see
> http://www.postgresql.org/docs/7.4/static/functions-matching.html#POSIX-ESCAPE-SEQUENCES
> You could go back to (approximately) the old behavior by changing
> regex_flavor to "extended".
>
> regards, tom lane
Dear Tom Lane,

I still haven't read all the documentation, but I found the following. If I escape Latin letters with \\ it is OK, but if I escape Cyrillic letters with \\ it is an error.

libvar=# select '\\a\ÿ' ~* '\\a\ÿ';
 ?column?
----------
 f
(1 row)

libvar=# select '\\a\\ÿ' ~* '\\a\\ÿ';
ERROR: invalid regular expression: invalid escape \ sequence

libvar=# select '\\a\\ÿ' ~* '\\a\ÿ';
 ?column?
----------
 f
(1 row)

libvar=# select '\\a\ÿ' ~* '\\a\\ÿ';
ERROR: invalid regular expression: invalid escape \ sequence

I understand that you can't adapt it to all languages and alphabets, but in case this is a bug that can be fixed, I am reporting it.

Kaloyan

Tom Lane wrote:
> Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
>> libvar=# select '\\a\\à\\à\\.' ~ '\\a\\à\\à\\.';
>> ERROR: invalid regular expression: invalid escape \ sequence
>
> 7.4 and later are stricter about the use of \ than prior versions; see
> http://www.postgresql.org/docs/7.4/static/functions-matching.html#POSIX-ESCAPE-SEQUENCES
> You could go back to (approximately) the old behavior by changing
> regex_flavor to "extended".
>
> regards, tom lane
Kaloyan Iliev Iliev <news1@faith.digsys.bg> writes:
> I still haven't read all the documentation but I find the following.
> If I escape with \\ latin letters it OK, but if escape with \\
> cyrillic letters then it is a mistake.

No, you're missing the point: there are certain escapes that mean something, and it rejects the rest. Backslash is not a general-purpose quoting character.

regards, tom lane
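
The reply above suggests that the escaping step should touch only the characters that are actually special in a regular expression and leave letters (Latin or Cyrillic) alone. A minimal sketch of that idea follows, written in Python purely for illustration; the original application is in Perl, where a similar character-class substitution could presumably stand in for quotemeta. The function name escape_regex_literal and the exact set of metacharacters are assumptions, not something taken from the thread, so check the set against the regular-expression documentation for your server version. SQL-level quoting remains a separate step, left to the database driver.

```python
import re

# Characters treated as special in a regular expression.
# ASSUMPTION: this set is illustrative; verify it against the
# regex documentation of the PostgreSQL version in use.
_REGEX_META = re.compile(r'([\\.^$*+?()\[\]{}|])')


def escape_regex_literal(text: str) -> str:
    """Backslash-escape regex metacharacters only.

    Letters and digits (ASCII or not) are returned unchanged, so no
    backslash ever ends up in front of an ordinary letter.
    """
    # r'\\\1' expands to one literal backslash followed by the matched char.
    return _REGEX_META.sub(r'\\\1', text)


if __name__ == "__main__":
    # Only the dot is escaped; the Cyrillic letter is left alone.
    print(escape_regex_literal('a.я'))  # prints a\.я
```

The design choice here is an allow-list of characters to escape rather than "escape everything that is not ASCII alphanumeric", which is what produced the backslash-plus-letter sequences rejected in the thread.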