Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> Consider this: SELECT 3+-2; What would you expect from that? I
>> personally would expect the result of 1. But it produces an error,
>> because '+-' is treated as some user-defined operator, which is
>> not true.
> That is part of my concern here. The current behavior is what you say
> you would expect! Your patches change that behavior!!
> postgres=> select 3+-2;
> ?column?
> --------
> 1
> (1 row)
OTOH, with current sources:
regression=> select 3+- 2;
ERROR: Unable to identify an operator '+-' for types 'int4' and 'int4'
	You will have to retype this query using an explicit cast
regression=> select f1+-f1 from int4_tbl;
ERROR: Unable to identify an operator '+-' for types 'int4' and 'int4'
	You will have to retype this query using an explicit cast
To my mind, without spaces this construction *is* ambiguous, and frankly
I'd have expected the second interpretation ('+-' is a single operator
name). Almost every computer language in the world uses "greedy"
tokenization where the next token is the longest series of characters
that can validly be a token. I don't regard the above behavior as
predictable, natural, or obvious. In fact, I'd say it's a bug that
"3+-2" and "3+-x" are not lexed in the same way.
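For illustration, here is a minimal sketch of greedy ("longest match") tokenization in Python. It is not the actual scanner: the token classes and the operator character set in TOKEN_RE are assumptions chosen for the example. Under a purely greedy rule, "3+-2" and "3+-x" both come out with '+-' as a single operator token.

```python
import re

# Hypothetical token classes, for illustration only: integers, identifiers,
# and runs of operator characters. The real scanner's rules may differ.
TOKEN_RE = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/<>=~!@#%^&|?]+)")

def greedy_tokens(text):
    """Split text into tokens, always taking the longest possible match."""
    tokens = []
    pos = 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

print(greedy_tokens("3+-2"))
print(greedy_tokens("3+-x"))
```

With this rule both inputs are lexed the same way, and whether an operator named '+-' exists is left to a later stage.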
However, aside from arguing about whether the current behavior is good
or bad, these examples seem to indicate that it doesn't take an infinite
amount of lookahead to reproduce the behavior. It looks to me like we
could preserve the current behavior by parsing a '-' as a separate token
if it *immediately* precedes a digit, and otherwise allowing it to be
folded into the preceding operator. That could presumably be done
without VLTC.
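A minimal sketch of that rule, written in Python rather than as scanner rules, might look like the following. The function name split_operator and the OP_CHARS set are hypothetical, and the sketch only covers the cases shown above (a digit versus anything else after the '-').

```python
# Hypothetical set of operator characters; the real scanner's set may differ.
OP_CHARS = set("+-*/<>=~!@#%^&|?")

def split_operator(text, pos):
    """Return (operator, next_pos) for the operator run starting at text[pos].

    The run is taken greedily, except that a trailing '-' which immediately
    precedes a digit is left behind so it can be lexed as a separate token.
    """
    end = pos
    while end < len(text) and text[end] in OP_CHARS:
        end += 1
    # One character of lookahead: does a digit directly follow the run?
    if (end - pos > 1 and text[end - 1] == "-"
            and end < len(text) and text[end].isdigit()):
        end -= 1
    return text[pos:end], end

print(split_operator("3+-2", 1))    # '-' directly precedes a digit
print(split_operator("3+- 2", 1))   # a space follows, so '+-' stays whole
print(split_operator("f1+-f1", 2))  # a letter follows, so '+-' stays whole
```

Only the single character after the operator run is inspected, so the amount of lookahead is fixed.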
regards, tom lane