Re: [BUGS] BUG #14512: Backslashes in LIKE - Mailing list pgsql-bugs

From David G. Johnston
Subject Re: [BUGS] BUG #14512: Backslashes in LIKE
Date
Msg-id CAKFQuwYO5sJL0B+R=nFMBOa_um1f_S-Y4QtkD+eYd4MKjQJP5A@mail.gmail.com
Whole thread Raw
In response to Re: [BUGS] BUG #14512: Backslashes in LIKE  (Vojtěch Rylko <vojta.rylko@gmail.com>)
List pgsql-bugs
On Wed, Jan 25, 2017 at 2:28 AM, Vojtěch Rylko <vojta.rylko@gmail.com> wrote:
2017-01-24 18:48 GMT+01:00 David G. Johnston <david.g.johnston@gmail.com>:
​Not a hacker but I'd say that the '\' LIKE '\\\' expression is encountering an invalid optimization that determines that the LIKE cannot succeed (due to string length differences, probably) - it too should fail like the other '\\' LIKE '\\\' example.

So, it is a "failure to fail" type of bug.  Confirmed using a 9.3.12 instance.

From user perspective I see this bug quite similar to behaviour of boolean expression evaluation, where it is stated in documentation:

if the result of an expression can be determined by evaluating only some parts of it, then other subexpressions might not be evaluated at all -- 4.2.14. Expression Evaluation Rules

So I expect this:

root=# select 1 where '\\' like '\\\';
ERROR:  LIKE pattern must not end with escape character
root=# select 1 where false and '\\' like '\\\';
 ?column? 
----------
(0 rows)

same as I expect 

root=# select 1 where 1/0 = 0 and false;
ERROR:  division by zero
root=# select 1 where false and 1/0 = 0;
 ?column? 
----------
(0 rows)

(Note that examples above are not deterministic because of unspecified order of subexpressions evaluation in where clause.)

But reported behaviour confuses me as it seems like leaked internals of LIKE implementation.


​I think we all agree that it does.  The opinions we are looking for are whether, given that you've written a correctly formed LIKE pattern, do you want every single instance of testing against that pattern to be preceded by a test that checks whether the given pattern is valid?​  While not measured it is a run-time cost that should return true in nearly all cases expect for development bugs.

I use RegEx a lot - I'm already used to the cost being built-in and, frankly, when doing string comparison work, I suspect that the order of magnitude such a pre-check would add would be nominal.

Given that any supposedly successful match against the pattern would fail in the case of a silly typo of this form I'm leaning more to the fact that having a bad pattern escape detection would be very difficult.  Patterns that check for invalid data are more at risk...

select 1 where 'abc\' like 'abc\'; -- fails, supposed to use \\ on the end of the pattern

Given time to think about it more I'm now leaning toward keeping the present behavior.

David J.


pgsql-bugs by date:

Previous
From: John McKown
Date:
Subject: Re: [BUGS] BUG #14514: Bug in Subquery
Next
From: Tom Lane
Date:
Subject: Re: [BUGS] BUG #14514: Bug in Subquery