Thread: Regular expression on a string problem.

Regular expression on a string problem.

From
"Paul Mendoza"
Date:
Here is a problem I'm having with a function I've created. It should be returning a varchar value no matter what the input is but I'm getting a null value back. Then it says in the error that I'm using "nonstandard use of escape in a string literal at character 9". What do you think this should be changed to?

Query OK (0.45 sec)
Return Value: NULL
WARNING:  nonstandard use of escape in a string literal at character 9
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
QUERY:  SELECT  '/(\.PR\.)|(\.PR$)/'
CONTEXT:  PL/pgSQL function "clean_name" line 17 at assignment


Here is the function. p_text is a varchar that might have "PR" in it.

DECLARE
       var_regex_pattern varchar;
     

BEGIN


   
     -- If "PR" is not in the text variable.
     IF position('PR' in p_text) = 0 THEN
        var_regex_pattern = '/\./';
        RETURN substring(p_text from var_regex_pattern);
     END IF;
   
     IF position('PR' in p_text) > 0 THEN
        -- If "PR" is in the text variable.
        var_regex_pattern = '/(\.PR\.)|(\.PR$)/';
        RETURN substring(p_text from var_regex_pattern);
     END IF;
   

END;

Re: Regular expression on a string problem.

From
Richard Huxton
Date:
Paul Mendoza wrote:
> Here is a problem I'm having with a function I've created. It should be
> returning a varchar value no matter what the input is but I'm getting a null
> value back. Then it says in the error that I'm using "nonstandard use of
> escape in a string literal at character 9". What do you think this should be
> changed to?
>
> Query OK (0.45 sec)
> Return Value: NULL
> WARNING:  nonstandard use of escape in a string literal at character 9
> HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
> QUERY:  SELECT  '/(\.PR\.)|(\.PR$)/'

It's just the new (SQL-standard) string formatting rules.

Old-style PG strings allowed C-style backslash escapes (typically for
newlines etc, as the hint says). The SQL standard doesn't, so an
"escape" string format was introduced to allow the old behaviour along
with some config settings (standard_conforming_strings,
escape_string_warning, backslash_quote)

Now, you're not actually using the backslash as a C-escape, just as a
protector to the regexp. So, escape_string_warning is warning you
unnecessarily.

Now, none of that is why you're getting no results.

You're getting no results because you're using Perl-compatible regexps
and PostgreSQL is expecting POSIX-style*. See chapter 9.7.3 for details.

* - I keep doing this myself

--
   Richard Huxton
   Archonet Ltd

Re: Regular expression on a string problem.

From
"Albe Laurenz"
Date:
Paul Mendoza wrote:
> Sent: Saturday, September 08, 2007 12:53 AM
> To: pgsql-general@postgresql.org
> Subject: [GENERAL] Regular expression on a string problem.
>
> Here is a problem I'm having with a function I've created. It
> should be returning a varchar value no matter what the input
> is but I'm getting a null value back. Then it says in the
> error that I'm using "nonstandard use of escape in a string
> literal at character 9". What do you think this should be changed to?
>
> Query OK (0.45 sec)
> Return Value: NULL
> WARNING:  nonstandard use of escape in a string literal at character 9
> HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
> QUERY:  SELECT  '/(\.PR\.)|(\.PR$)/'
> CONTEXT:  PL/pgSQL function "clean_name" line 17 at assignment
>
>
> Here is the function. p_text is a varchar that might have "PR" in it.
>
> DECLARE
>        var_regex_pattern varchar;
>
>
> BEGIN
>
>
>
>      -- If "PR" is not in the text variable.
>      IF position('PR' in p_text) = 0 THEN
>         var_regex_pattern = '/\./';
>         RETURN substring(p_text from var_regex_pattern);
>      END IF;
>
>      IF position('PR' in p_text) > 0 THEN
>         -- If "PR" is in the text variable.
>         var_regex_pattern = '/(\.PR\.)|(\.PR$)/';
>         RETURN substring(p_text from var_regex_pattern);
>      END IF;
>
>
> END;

The warning message and the problem that you encounter are not
really related.

You get the warning because standard_conforming_strings is off
and you use a string literal with a backslash in it without
preceeding it with the string escape syntax token E.

Because standard_conforming_strings is off, the backslash is
interpreted a an escape character and consequently when you
write '/(\.PR\.)|(\.PR$)/', the string actually becomes
'/(.PR.)|(.PR$)/'. You need to have double backslashes to
represent single ones and you need to say
E'/(\\.PR\\.)|(\\.PR$)/' if you want to avoid the warning.

This probably causes the behaviour that surprises you.
To help you further, I'd have to know what you actually
want to achieve, i.e. which output you desire for which
input.

The whole function could be written as a single SQL statement
using the CASE construct.

Yours,
Laurenz Albe