Thread: Dollar quoting inside a regex bracket expression

Dollar quoting inside a regex bracket expression

From: David Fetter
Kind people,

I was checking edge cases with dollar quoting, and ran across
something I don't quite understand.  This is in CVS TIP as of this
afternoon.  The two functions below differ only inside the regex
bracket expression: the first uses \\ to indicate a literal \, while
the second attempts to use $qq$\$qq$ for the same thing.  Is
this a bug?  A feature that needs documenting?  Some obvious thing
I've missed?

TIA for help on this.

------------------------------
--                          --
--  This does as expected:  --
--                          --
------------------------------

test=# CREATE OR REPLACE FUNCTION has_bad_chars(text) RETURNS BOOLEAN
AS $function$
test$#     SELECT $1 ~ $q$[\t\r\n\v|\\]$q$;
test$# $function$ LANGUAGE SQL;
CREATE FUNCTION
test=# select has_bad_chars($$\t$$);
 has_bad_chars
---------------
 t
(1 row)

----------------------
--                  --
--  This does not.  --
--                  --
----------------------
CREATE OR REPLACE FUNCTION has_bad_chars(text) RETURNS BOOLEAN
AS $function$
     SELECT $1 ~ $q$[\t\r\n\v|$qq$\$qq$]$q$;
$function$ LANGUAGE SQL;
CREATE FUNCTION
SELECT has_bad_chars($$\t$$);
 has_bad_chars
---------------
 f
(1 row)


--
David Fetter david@fetter.org http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

Re: Dollar quoting inside a regex bracket expression

From: Tom Lane
David Fetter <david@fetter.org> writes:
> CREATE OR REPLACE FUNCTION has_bad_chars(text) RETURNS BOOLEAN
> AS $function$
>      SELECT $1 ~ $q$[\t\r\n\v|$qq$\$qq$]$q$;
> $function$ LANGUAGE SQL;

Why would you expect that to work?  Dollar-quote is not a construct
known to any regex engine that I know about.  What you've got there
is a bracket expression redundantly matching the set of characters
    \t \r \n \v | $ q
(I think that's what it will be read as, anyway, but I'm not a
regexp guru...)

            regards, tom lane
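
A minimal sketch of the point above, using Python's re module as a
stand-in for the PostgreSQL regex engine.  The assumption (not verified
against PostgreSQL here) is that both engines read this particular
bracket expression the same way: \$ is an escaped dollar sign, and
$qq$ contributes only the characters $ and q to the set.  The names
broken, working, probe and hits are made up for illustration.

```python
import re

# Everything between $q$ ... $q$ reaches the regex engine unchanged,
# so these are the two patterns the engine actually compiles.
working = re.compile(r"[\t\r\n\v|\\]")         # set includes a literal backslash
broken = re.compile(r"[\t\r\n\v|$qq$\$qq$]")   # set is: tab CR LF VT | $ q

# $$\t$$ in SQL is a two-character string: a backslash followed by 't'.
probe = "\\t"


def hits(pattern, text):
    """Return True if the pattern matches anywhere in text."""
    return pattern.search(text) is not None


print(hits(working, probe))                  # True: the backslash is in the set
print(hits(broken, probe))                   # False: neither backslash nor 't' is in the set
print(hits(broken, "$"), hits(broken, "q"))  # True True: the "quote tag" is just set members
```

Under that assumption, the second function returns f because the
backslash was consumed as the escape for the following $, leaving no
literal backslash in the set.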

Re: Dollar quoting inside a regex bracket expression

From: David Fetter
On Sun, Sep 26, 2004 at 11:45:42PM -0400, Tom Lane wrote:
> David Fetter <david@fetter.org> writes:
> > CREATE OR REPLACE FUNCTION has_bad_chars(text) RETURNS BOOLEAN
> > AS $function$
> >      SELECT $1 ~ $q$[\t\r\n\v|$qq$\$qq$]$q$;
> > $function$ LANGUAGE SQL;
>
> Why would you expect that to work?

Mis-expectations.  I expected--unreasonably, I see--every part of
8.0beta to do dollar quoting and didn't see how the cases of, say,
pl/python and the regex engine were similar.  Is this worth a mention
as part of the regex docs, or is my expectation universally
unreasonable?

Cheers,
D

Re: Dollar quoting inside a regex bracket expression

From: Tom Lane
David Fetter <david@fetter.org> writes:
> Mis-expectations.  I expected--unreasonably, I see--every part of
> 8.0beta to do dollar quoting and didn't see how the cases of, say,
> pl/python and the regex engine were similar.  Is this worth a mention
> as part of the regex docs, or is my expectation universally
> unreasonable?

Regex patterns don't have the notion of a quoted substring at all, so
it seems moderately unreasonable to me to expect dollar quoting to mean
something to regex (even discounting the fact that it couldn't work
lexically because $ has different lexical properties in a regex
pattern).

The dollar quote stuff is documented as a SQL string constant
representation.  It doesn't seem to me that you'd expect it to work
anywhere except in SQL statements.

            regards, tom lane
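
To illustrate why the inner $qq$ never acts as a quote, here is a toy
reader for a single dollar-quoted constant, written in Python.  It is
only a sketch of the idea that the SQL lexer strips the outermost tag
pair and hands the body on verbatim; read_dollar_quoted and _OPEN_TAG
are hypothetical names, and the real PostgreSQL lexer is not
implemented this way.

```python
import re

# Hypothetical helper, for illustration only: an opening tag is $$ or
# $identifier$, where the identifier contains no dollar sign.
_OPEN_TAG = re.compile(r"\$([A-Za-z_][A-Za-z_0-9]*)?\$")


def read_dollar_quoted(sql):
    """Return the body of the dollar-quoted constant that starts sql."""
    m = _OPEN_TAG.match(sql)
    if m is None:
        raise ValueError("input does not start with a dollar-quote tag")
    delimiter = m.group(0)
    # The body runs up to the next occurrence of the same delimiter.
    # Any other tag inside it, such as $qq$, is ordinary text.
    end = sql.find(delimiter, m.end())
    if end == -1:
        raise ValueError("unterminated dollar-quoted string")
    return sql[m.end():end]


source = r"$q$[\t\r\n\v|$qq$\$qq$]$q$"
body = read_dollar_quoted(source)
print(body)  # [\t\r\n\v|$qq$\$qq$]
```

The printed body is what the ~ operator receives as its pattern: the
$qq$ markers are still in it, and nothing downstream of the SQL lexer
treats them as quotes.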