Precedence of NOT LIKE, NOT BETWEEN, etc - Mailing list pgsql-hackers

From Tom Lane
Subject Precedence of NOT LIKE, NOT BETWEEN, etc
Date
Msg-id 9543.1424720597@sss.pgh.pa.us
Whole thread Raw
Responses Re: Precedence of NOT LIKE, NOT BETWEEN, etc  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Precedence of NOT LIKE, NOT BETWEEN, etc  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
While fooling around with testing operator precedence warnings,
I discovered that there's some existing precedence behavior that's
not at all what I expected.  Consider these examples (all done in
9.4; this is longstanding behavior):

regression=# SELECT 1 < '(3,4)'::point LIKE 42.0;
ERROR:  operator does not exist: point ~~ numeric
LINE 1: SELECT 1 < '(3,4)'::point LIKE 42.0;                                 ^

regression=# SELECT 1 < '(3,4)'::point NOT LIKE 42.0;
ERROR:  operator does not exist: integer < point
LINE 1: SELECT 1 < '(3,4)'::point NOT LIKE 42.0;                ^

regression=# SELECT 1 LIKE '(3,4)'::point < 42.0;
ERROR:  operator does not exist: integer ~~ point
LINE 1: SELECT 1 LIKE '(3,4)'::point < 42.0;                ^

regression=# SELECT 1 NOT LIKE '(3,4)'::point < 42.0;
ERROR:  operator does not exist: integer !~~ point
LINE 1: SELECT 1 NOT LIKE '(3,4)'::point < 42.0;                ^

These queries are silly of course, the point is just to expose which
operator the parser thinks binds more tightly.  And what we have above is:
LIKE binds more tightly than <, in either combination.  NOT LIKE binds
less tightly than < if < is on the left, but more tightly than < if <
is on the right.

This seems like a bug; it's certainly not what you'd call intuitive.

The cause, if I'm understanding the Bison manual correctly, is that
the NOT token has lower precedence than '<', so at the critical point
where it has scanned 1 < '(3,4)'::point and has to either reduce that
(thus binding < more tightly) or shift, it looks at the lookahead token
which is either LIKE or NOT in the first two examples above.  LIKE has
higher priority than '<' so in the first example it shifts; but NOT
has lower priority than '<' so in the second example it reduces.

The other productions starting with "a_expr NOT ..." also have the
same problem; that includes NOT ILIKE, NOT SIMILAR TO, NOT BETWEEN,
and NOT IN.

I'm not seeing any terribly pleasing ways to fix this.  Aside from
the option of doing nothing, it seems like these are the choices:

1. We could hack base_yylex() to reduce NOT LIKE to a single token
which could be given the same precedence as LIKE.  Ditto for the other
four cases.  This would result in nice behavior for these cases, but as
Robert has correctly pointed out, hacks in base_yylex() risk creating
other poorly-understood behaviors.

2. We could change the precedence levels so that LIKE/ILIKE/SIMILAR,
BETWEEN, and IN all have precedence just above NOT, which would pretty
much mask the problem since there would be no other tokens with in-between
precedence levels.  In the context of the operator precedence changes
I proposed earlier, this would mean inserting the IS tests and comparison
operators between IN_P and POSTFIXOP rather than where the
single-character comparison ops live now.  We would likely also have to
flatten LIKE/BETWEEN/IN into a single %nonassoc precedence level to avoid
having weird interactions among them.  This isn't terribly attractive
because it would risk larger behavioral changes than the previous
proposal.

Thoughts?
        regards, tom lane



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Abbreviated keys for text cost model fix
Next
From: Heikki Linnakangas
Date:
Subject: Re: Reduce pinning in btree indexes