Thread: BUG #4436: (E'\\' LIKE E'\\') => f

BUG #4436: (E'\\' LIKE E'\\') => f

From: Mathieu Fenniak
The following bug has been logged online:

Bug reference:      4436
Logged by:          Mathieu Fenniak
Email address:      hjoiiv@mathieu.fenniak.net
PostgreSQL version: 8.3.3
Operating system:   Linux x86-64
Description:        (E'\\' LIKE E'\\') => f
Details:

I noticed that (SELECT E'\\' LIKE E'\\') returns false, where I would expect
it to return true.  I asked on the #postgresql/freenode IRC channel, and
nobody had a good explanation for this return value, suggesting it may be a
minor bug.

Re: BUG #4436: (E'\\' LIKE E'\\') => f

From: Bruce Momjian
Mathieu Fenniak wrote:
> I noticed that (SELECT E'\\' LIKE E'\\') returns false, where I would expect
> it to return true.  I asked on the #postgresql/freenode IRC channel, and
> nobody had a good explanation for this return value, suggesting it may be a
> minor bug.

I believe this happens because backslash is the default escape
character for LIKE, so you need:

    test=> SELECT E'\\' LIKE E'\\\\';
     ?column?
    ----------
     t
    (1 row)

or change the escape character:

    test=> SELECT E'\\' LIKE E'\\' escape 'a';
     ?column?
    ----------
     t
    (1 row)
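
The two levels of unescaping involved here can be sketched in Python. This is a minimal model under stated assumptions, not PostgreSQL's implementation: the names like_to_regex and like are hypothetical, and rejecting a pattern that ends in the escape character is the sketch's own choice (8.3.3 returned false for that case, as reported above). Keep in mind that the SQL literal E'\\' is a single backslash, which Python also writes as "\\".

```python
import re


def like_to_regex(pattern, escape="\\"):
    """Translate a LIKE pattern into a compiled regular expression.

    Hypothetical helper: a model of LIKE semantics, not PostgreSQL code.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if escape and ch == escape:
            if i + 1 == len(pattern):
                # Assumption of this sketch: a dangling escape is an error.
                raise ValueError("pattern ends with escape character")
            # The character after the escape is matched literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")   # any sequence of characters
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
        i += 1
    # DOTALL so that % and _ also match newlines.
    return re.compile("".join(parts), re.DOTALL)


def like(value, pattern, escape="\\"):
    """Return True if the whole of `value` matches the LIKE `pattern`."""
    return like_to_regex(pattern, escape).fullmatch(value) is not None
```

With these definitions, like("\\", "\\\\") and like("\\", "\\", escape="a") both return True, mirroring the two queries above, while like("\\", "\\") raises ValueError because the lone backslash has nothing to escape.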

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Re: BUG #4436: (E'\\' LIKE E'\\') => f

From: Tom Lane
Bruce Momjian <bruce@momjian.us> writes:
> Mathieu Fenniak wrote:
>> I noticed that (SELECT E'\\' LIKE E'\\') returns false,

> I believe this happens because backslash is the default escape
> character for LIKE, so you need:
>     test=> SELECT E'\\' LIKE E'\\\\';

Yeah.  The given case is actually an invalid LIKE pattern.  I wonder
whether we should make LIKE throw error for an invalid pattern.
You get an error for the corresponding case in regex:

regression=# select E'\\' ~ E'\\';
ERROR:  invalid regular expression: invalid escape \ sequence

but IIRC the LIKE code just silently ignores a trailing escape
character.

            regards, tom lane

Re: BUG #4436: (E'\\' LIKE E'\\') => f

From: Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Mathieu Fenniak wrote:
> >> I noticed that (SELECT E'\\' LIKE E'\\') returns false,
>
> > I believe this happens because backslash is the default escape
> > character for LIKE, so you need:
> >     test=> SELECT E'\\' LIKE E'\\\\';
>
> Yeah.  The given case is actually an invalid LIKE pattern.  I wonder
> whether we should make LIKE throw error for an invalid pattern.
> You get an error for the corresponding case in regex:
>
> regression=# select E'\\' ~ E'\\';
> ERROR:  invalid regular expression: invalid escape \ sequence
>
> but IIRC the LIKE code just silently ignores a trailing escape
> character.

Yes, I think we should throw an error;  the original query looked odd to
me too.

Re: BUG #4436: (E'\\' LIKE E'\\') => f

From: Tom Lane
Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> Yeah.  The given case is actually an invalid LIKE pattern.  I wonder
>> whether we should make LIKE throw error for an invalid pattern.

> Yes, I think we should throw an error;  the original query looked odd to
> me too.

A quick check in the standard supports the idea of throwing an error.
In fact, SQL92 saith

             ii) If there is not a partitioning of the string P into sub-
                 strings such that each substring has length 1 or 2, no
                 substring of length 1 is the escape character E, and each
                 substring of length 2 is the escape character E followed by
                 either the escape character E, an <underscore> character,
                 or the <percent> character, then an exception condition is
                 raised: data exception-invalid escape sequence.

which not only requires that E not be the last character, but also makes
it an error to escape anything but %, _, or the escape character.  That
last part is too anal for me, but it does seem we're on safe ground to
throw an error for an escape with nothing to escape.  I'll go make it so.
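
The partitioning rule quoted above can be checked with a single left-to-right scan, since an escape character can never stand alone and must therefore begin a two-character substring. The following Python sketch illustrates the strict SQL92 rule only; the function name is hypothetical and it says nothing about what PostgreSQL itself enforces.

```python
def is_valid_sql92_like_pattern(pattern, escape):
    """Return True if `pattern` can be partitioned as the quoted rule requires.

    Hypothetical helper: `escape` is the single escape character E.
    """
    i = 0
    while i < len(pattern):
        if pattern[i] == escape:
            # E must start a length-2 substring whose second character
            # is E itself, an underscore, or a percent sign.
            if i + 1 >= len(pattern):
                return False  # escape with nothing to escape
            if pattern[i + 1] not in (escape, "_", "%"):
                return False  # escaping an ordinary character
            i += 2
        else:
            i += 1  # ordinary length-1 substring
    return True
```

Under this rule the one-backslash pattern from the bug report is invalid, and so is a pattern such as a\b with backslash as the escape character; the latter is the stricter half of the rule.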

            regards, tom lane