Thread: Validating user-input to be inserted in regular expressions

Validating user-input to be inserted in regular expressions

From
Vincenzo Ciancia
Date:
Hi all,

I would like to take user input, which should be interpreted literally, and
put it inside a regular expression, something like

select * from files where path ~ (USER_INPUT || '.*')

How should I escape the user input? I mean: I know that I must insert double
backslashes before special characters (in the sense of regular expressions)
but how to know what these special characters are? Should I find every
possible character in the documentation for regular expressions? Is there a
safer way?

Thanks

Vincenzo Ciancia

--
Please note that I do not read the e-mail address used in the from field but
I read vincenzo_ml at yahoo dot it
Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo
vincenzo_ml at yahoo dot it

Re: Validating user-input to be inserted in regular expressions

From
Vincenzo Ciancia
Date:
Vincenzo Ciancia wrote:

>  Should I find every
> possible character in the documentation for regular expressions?

Is the answer trivial? I checked the manual and the FAQ, and googled for the
answer, but I didn't find it. Is there a more appropriate place where I can
ask my question?

Thanks

Vincenzo

--
Please note that I do not read the e-mail address used in the from field but
I read vincenzo_ml at yahoo dot it
Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo
vincenzo_ml at yahoo dot it

Re: Validating user-input to be inserted in regular expressions

From
Sean Davis
Date:
On Jan 25, 2005, at 8:48 AM, Vincenzo Ciancia wrote:

> Vincenzo Ciancia wrote:
>
>>  Should I find every
>> possible character in the documentation for regular expressions?
>
> Is the answer trivial? I checked the manual and the FAQ, and googled
> for the
> answer, but I didn't find it. Is there a more appropriate place where
> I can
> ask my question?


Yes, this is a fine place to ask your question.  Sometimes it does take
a day or two to get an answer.  Will quote_literal do what you want?

http://www.postgresql.org/docs/8.0/interactive/functions-string.html

Search on that page for quote_literal.

Sean


Re: Validating user-input to be inserted in regular expressions

From
Vincenzo Ciancia
Date:
Sean Davis wrote:

> Yes, this is a fine place to ask your question.  Sometimes it does take
> a day or two to get an answer.  Will quote_literal do what you want?
>
> http://www.postgresql.org/docs/8.0/interactive/functions-string.html
>
> Search on that page for quote_literal.
>

Thank you for your answer. Unfortunately quote_literal is not what I am
looking for, in fact it quotes special characters in the sense of strings,
not in the sense of regular expressions. Here's some example to explain my
problem a little better:

I would like to select strings that begin with 'a.', so I do NOT want the
following (suppose 'a.' is generic user input)

  relfs=# select true where 'aa' ~ ('a.'||'.*');
   bool
  ------
   t

I could as well use 'a\\.', but what characters should I escape? Surely
$^+.*[] and possibly others. The function quote_literal does:

  relfs=# select true where 'aa' ~ (quote_literal('a.')||'.*');
   bool
  ------
  (0 righe)

but it's mere illusion :) In fact we have:

  relfs=# select true where '\'aa\'' ~ (quote_literal('a.')||'.*');
   bool
  ------
   t

Thanks for any suggestions

Vincenzo

--
Please note that I do not read the e-mail address used in the from field but
I read vincenzo_ml at yahoo dot it
Attenzione: non leggo l'indirizzo di posta usato nel campo from, ma leggo
vincenzo_ml at yahoo dot it

Re: Validating user-input to be inserted in regular expressions

From
Michael Fuhr
Date:
On Tue, Jan 25, 2005 at 04:28:06PM +0100, Vincenzo Ciancia wrote:

> Thank you for your answer. Unfortunately quote_literal is not what I am
> looking for, in fact it quotes special characters in the sense of strings,
> not in the sense of regular expressions.

It sounds like you're looking for the equivalent of Perl's quotemeta:

% perl -le 'print quotemeta "abc.*"'
abc\.\*

I'm not aware of any such function in PostgreSQL, but you could use
a PL/Perl function that simply calls quotemeta:

CREATE FUNCTION quotemeta(text) RETURNS text AS '
return quotemeta $_[0];
' LANGUAGE plperl IMMUTABLE STRICT;

SELECT quotemeta('abc.*');
 quotemeta
-----------
 abc\.\*
(1 row)

There might be differences between PostgreSQL's and Perl's regular
expression engines, but perhaps not enough to matter in this case.

I expect it would be easy to add such a function to PostgreSQL, so
consider suggesting it to the developers or even writing it yourself
and submitting a patch.

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/