I wrote:
> I'm not seeing any terribly pleasing ways to fix this. Aside from
> the option of doing nothing, it seems like these are the choices:
> 1. We could hack base_yylex() to reduce NOT LIKE to a single token
> which could be given the same precedence as LIKE. Ditto for the other
> four cases. This would result in nice behavior for these cases, but as
> Robert has correctly pointed out, hacks in base_yylex() risk creating
> other poorly-understood behaviors.
Indeed, that would create some misbehaviors. An example is that because
LIKE is a type_func_name_keyword, it's legal to create a function named
like() and then do something like this:
regression=# select not like(42);
but that would fail if base_yylex() reduced the two tokens to one.
However, I've thought of a variant idea that seems pretty promising.
Instead of having base_yylex() reduce the two tokens to one, how
about making it replace the NOT with some new token, say NOT_HP
for "high priority NOT", but keep the lookahead token as a separate
token? Then we would just need to add productions to the grammar
for any case where LIKE etc might validly follow NOT without being
part of the special syntaxes. There's exactly one such case, namely
the "NOT a_expr" production, so this is a pretty cheap change.
I've experimented with the attached crude prototype patch and it seems
to do what's wanted here, while still accepting the "not like(42)"
construction. There is a problem with the patch as it stands, which is
that I used the same NOT_HP token for all of LIKE ILIKE SIMILAR BETWEEN
and IN productions. Because the grammar currently has BETWEEN and IN as
their own precedence levels, we'd need three different NOT-lookahead
tokens if we wanted NOT BETWEEN and NOT IN to share those precedence
levels. I think that's probably overkill, especially if we're changing
nearby precedence levels. What I propose we really do is adopt this
solution, but fold all of LIKE ILIKE SIMILAR BETWEEN and IN into a
single %nonassoc precedence level. This would only change the behavior
for cases where you'd written two of these constructs adjacently without
parens, eg
a LIKE b BETWEEN c AND d
which doesn't seem like something that would occur in working applications
anyway.
Another reason this is a prototype is that it does result in one change in
the regression test results:
*** 63,69 ****
CONTEXT: SQL function "test1"
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
AS 'not even SQL';
! ERROR: syntax error at or near "not"
LINE 2: AS 'not even SQL';
^
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
--- 63,69 ----
CONTEXT: SQL function "test1"
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
AS 'not even SQL';
! ERROR: syntax error at or near "not even"
LINE 2: AS 'not even SQL';
^
CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
The reason that's happening is that we aren't doing quite enough in
base_yylex to back up the lexer state completely after a lookahead.
You can exhibit similar oddities around the existing substitutions
such as "nulls first". I think this can be fixed relatively cheaply
by looking at yyleng, but haven't experimented yet.
Also, it strikes me that we could significantly reduce, maybe even fully
eliminate, the funny behaviors around the existing base_yylex()
substitutions if we made them use the same idea, ie replace the leading
token with something special but keep the second token's separate
identity. Unless somebody sees a hole in this idea, I'll probably go
do that and then come back to the precedence issues.
regards, tom lane
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d91bedd..f60c863 100644
*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
*************** static Node *makeRecursiveViewSelect(cha
*** 637,643 ****
* list and so can never be entered directly. The filter in parser.c
* creates these tokens when required.
*/
! %token NULLS_FIRST NULLS_LAST WITH_ORDINALITY WITH_TIME
/* Precedence: lowest to highest */
--- 637,643 ----
* list and so can never be entered directly. The filter in parser.c
* creates these tokens when required.
*/
! %token NOT_HP NULLS_FIRST NULLS_LAST WITH_ORDINALITY WITH_TIME
/* Precedence: lowest to highest */
*************** static Node *makeRecursiveViewSelect(cha
*** 649,656 ****
%right NOT
%nonassoc IS ISNULL NOTNULL /* IS sets precedence for IS NULL, etc */
%nonassoc '<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
! %nonassoc LIKE ILIKE SIMILAR
! %nonassoc ESCAPE
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
--- 649,656 ----
%right NOT
%nonassoc IS ISNULL NOTNULL /* IS sets precedence for IS NULL, etc */
%nonassoc '<' '>' '=' LESS_EQUALS GREATER_EQUALS NOT_EQUALS
! %nonassoc LIKE ILIKE SIMILAR NOT_HP
! %nonassoc ESCAPE /* must be just above LIKE/ILIKE/SIMILAR */
%nonassoc OVERLAPS
%nonassoc BETWEEN
%nonassoc IN_P
*************** a_expr: c_expr { $$ = $1; }
*** 11223,11228 ****
--- 11223,11230 ----
{ $$ = makeOrExpr($1, $3, @2); }
| NOT a_expr
{ $$ = makeNotExpr($2, @1); }
+ | NOT_HP a_expr %prec NOT
+ { $$ = makeNotExpr($2, @1); }
| a_expr LIKE a_expr
{
*************** a_expr: c_expr { $$ = $1; }
*** 11237,11248 ****
$$ = (Node *) makeSimpleA_Expr(AEXPR_LIKE, "~~",
$1, (Node *) n, @2);
}
! | a_expr NOT LIKE a_expr
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_LIKE, "!~~",
$1, $4, @2);
}
! | a_expr NOT LIKE a_expr ESCAPE a_expr
{
FuncCall *n = makeFuncCall(SystemFuncName("like_escape"),
list_make2($4, $6),
--- 11239,11250 ----
$$ = (Node *) makeSimpleA_Expr(AEXPR_LIKE, "~~",
$1, (Node *) n, @2);
}
! | a_expr NOT_HP LIKE a_expr %prec NOT_HP
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_LIKE, "!~~",
$1, $4, @2);
}
! | a_expr NOT_HP LIKE a_expr ESCAPE a_expr %prec NOT_HP
{
FuncCall *n = makeFuncCall(SystemFuncName("like_escape"),
list_make2($4, $6),
*************** a_expr: c_expr { $$ = $1; }
*** 11263,11274 ****
$$ = (Node *) makeSimpleA_Expr(AEXPR_ILIKE, "~~*",
$1, (Node *) n, @2);
}
! | a_expr NOT ILIKE a_expr
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_ILIKE, "!~~*",
$1, $4, @2);
}
! | a_expr NOT ILIKE a_expr ESCAPE a_expr
{
FuncCall *n = makeFuncCall(SystemFuncName("like_escape"),
list_make2($4, $6),
--- 11265,11276 ----
$$ = (Node *) makeSimpleA_Expr(AEXPR_ILIKE, "~~*",
$1, (Node *) n, @2);
}
! | a_expr NOT_HP ILIKE a_expr %prec NOT_HP
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_ILIKE, "!~~*",
$1, $4, @2);
}
! | a_expr NOT_HP ILIKE a_expr ESCAPE a_expr
{
FuncCall *n = makeFuncCall(SystemFuncName("like_escape"),
list_make2($4, $6),
*************** a_expr: c_expr { $$ = $1; }
*** 11293,11299 ****
$$ = (Node *) makeSimpleA_Expr(AEXPR_SIMILAR, "~",
$1, (Node *) n, @2);
}
! | a_expr NOT SIMILAR TO a_expr %prec SIMILAR
{
FuncCall *n = makeFuncCall(SystemFuncName("similar_escape"),
list_make2($5, makeNullAConst(-1)),
--- 11295,11301 ----
$$ = (Node *) makeSimpleA_Expr(AEXPR_SIMILAR, "~",
$1, (Node *) n, @2);
}
! | a_expr NOT_HP SIMILAR TO a_expr %prec NOT_HP
{
FuncCall *n = makeFuncCall(SystemFuncName("similar_escape"),
list_make2($5, makeNullAConst(-1)),
*************** a_expr: c_expr { $$ = $1; }
*** 11301,11307 ****
$$ = (Node *) makeSimpleA_Expr(AEXPR_SIMILAR, "!~",
$1, (Node *) n, @2);
}
! | a_expr NOT SIMILAR TO a_expr ESCAPE a_expr
{
FuncCall *n = makeFuncCall(SystemFuncName("similar_escape"),
list_make2($5, $7),
--- 11303,11309 ----
$$ = (Node *) makeSimpleA_Expr(AEXPR_SIMILAR, "!~",
$1, (Node *) n, @2);
}
! | a_expr NOT_HP SIMILAR TO a_expr ESCAPE a_expr
{
FuncCall *n = makeFuncCall(SystemFuncName("similar_escape"),
list_make2($5, $7),
*************** a_expr: c_expr { $$ = $1; }
*** 11441,11447 ****
(Node *) list_make2($4, $6),
@2);
}
! | a_expr NOT BETWEEN opt_asymmetric b_expr AND b_expr %prec BETWEEN
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_NOT_BETWEEN,
"NOT BETWEEN",
--- 11443,11449 ----
(Node *) list_make2($4, $6),
@2);
}
! | a_expr NOT_HP BETWEEN opt_asymmetric b_expr AND b_expr %prec NOT_HP
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_NOT_BETWEEN,
"NOT BETWEEN",
*************** a_expr: c_expr { $$ = $1; }
*** 11457,11463 ****
(Node *) list_make2($4, $6),
@2);
}
! | a_expr NOT BETWEEN SYMMETRIC b_expr AND b_expr %prec BETWEEN
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_NOT_BETWEEN_SYM,
"NOT BETWEEN SYMMETRIC",
--- 11459,11465 ----
(Node *) list_make2($4, $6),
@2);
}
! | a_expr NOT_HP BETWEEN SYMMETRIC b_expr AND b_expr %prec NOT_HP
{
$$ = (Node *) makeSimpleA_Expr(AEXPR_NOT_BETWEEN_SYM,
"NOT BETWEEN SYMMETRIC",
*************** a_expr: c_expr { $$ = $1; }
*** 11485,11491 ****
$$ = (Node *) makeSimpleA_Expr(AEXPR_IN, "=", $1, $3, @2);
}
}
! | a_expr NOT IN_P in_expr
{
/* in_expr returns a SubLink or a list of a_exprs */
if (IsA($4, SubLink))
--- 11487,11493 ----
$$ = (Node *) makeSimpleA_Expr(AEXPR_IN, "=", $1, $3, @2);
}
}
! | a_expr NOT_HP IN_P in_expr %prec NOT_HP
{
/* in_expr returns a SubLink or a list of a_exprs */
if (IsA($4, SubLink))
*************** subquery_Op:
*** 12557,12567 ****
{ $$ = $3; }
| LIKE
{ $$ = list_make1(makeString("~~")); }
! | NOT LIKE
{ $$ = list_make1(makeString("!~~")); }
| ILIKE
{ $$ = list_make1(makeString("~~*")); }
! | NOT ILIKE
{ $$ = list_make1(makeString("!~~*")); }
/* cannot put SIMILAR TO here, because SIMILAR TO is a hack.
* the regular expression is preprocessed by a function (similar_escape),
--- 12559,12569 ----
{ $$ = $3; }
| LIKE
{ $$ = list_make1(makeString("~~")); }
! | NOT_HP LIKE
{ $$ = list_make1(makeString("!~~")); }
| ILIKE
{ $$ = list_make1(makeString("~~*")); }
! | NOT_HP ILIKE
{ $$ = list_make1(makeString("!~~*")); }
/* cannot put SIMILAR TO here, because SIMILAR TO is a hack.
* the regular expression is preprocessed by a function (similar_escape),
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index db49275..6170c0c 100644
*** a/src/backend/parser/parser.c
--- b/src/backend/parser/parser.c
*************** base_yylex(YYSTYPE *lvalp, YYLTYPE *lloc
*** 101,106 ****
--- 101,136 ----
/* Do we need to look ahead for a possible multiword token? */
switch (cur_token)
{
+ case NOT:
+
+ /*
+ * If we have NOT LIKE etc, change NOT to NOT_HP
+ */
+ cur_yylval = lvalp->core_yystype;
+ cur_yylloc = *llocp;
+ next_token = core_yylex(&(lvalp->core_yystype), llocp, yyscanner);
+ switch (next_token)
+ {
+ case BETWEEN:
+ case IN_P:
+ case LIKE:
+ case ILIKE:
+ case SIMILAR:
+ cur_token = NOT_HP;
+ break;
+ default:
+ break;
+ }
+ /* save the lookahead token for next time */
+ yyextra->lookahead_token = next_token;
+ yyextra->lookahead_yylval = lvalp->core_yystype;
+ yyextra->lookahead_yylloc = *llocp;
+ yyextra->have_lookahead = true;
+ /* and back up the output info to cur_token */
+ lvalp->core_yystype = cur_yylval;
+ *llocp = cur_yylloc;
+ break;
+
case NULLS_P:
/*