Re: [HACKERS] Postgres' lexer - Mailing list pgsql-hackers

From Leon
Subject Re: [HACKERS] Postgres' lexer
Date
Msg-id 37CE6CA7.B34BD4CE@udmnet.ru
In response to Re: [HACKERS] Postgres' lexer  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:


> To my mind, without spaces this construction *is* ambiguous, and frankly
> I'd have expected the second interpretation ('+-' is a single operator
> name).  Almost every computer language in the world uses "greedy"
> tokenization where the next token is the longest series of characters
> that can validly be a token.  I don't regard the above behavior as
> predictable, natural, nor obvious.  In fact, I'd say it's a bug that
> "3+-2" and "3+-x" are not lexed in the same way.
>

I completely agree. Lexing these two cases differently looks like a bug.

> However, aside from arguing about whether the current behavior is good
> or bad, these examples seem to indicate that it doesn't take an infinite
> amount of lookahead to reproduce the behavior.  It looks to me like we
> could preserve the current behavior by parsing a '-' as a separate token
> if it *immediately* precedes a digit, and otherwise allowing it to be
> folded into the preceding operator.  That could presumably be done
> without VLTC.

OK. If we *have* to preserve the old, weird behavior, here is the patch.
It is to be applied on top of all my other patches. If it were up to me,
though, I would not restore the old behavior, because it is an
inconsistency in the grammar, i.e. a bug.
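
To illustrate the rule Tom describes (this is only a rough sketch, not
the attached patch and not the real flex scanner), here is a toy
tokenizer. The function name, the flag name, and the set of operator
characters are all assumptions made for the example. Operators are
matched greedily; with the flag set, a trailing '-' is split off as its
own token only when the next character is a digit, which needs just one
character of lookahead.

```python
# Illustrative operator character set; an assumption, not Postgres' actual list.
OP_CHARS = set("+-*/<>=~!@#%^&|?")


def tokenize(text, split_minus_before_digit=False):
    """Split text into number, identifier, and operator tokens.

    Operators are matched greedily (longest run of operator characters).
    If split_minus_before_digit is true, a '-' that ends a multi-character
    operator run and is immediately followed by a digit becomes a
    separate token.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch.isdigit():
            j = i
            while j < n and text[j].isdigit():
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch.isalpha() or ch == "_":
            j = i
            while j < n and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(text[i:j])
            i = j
        elif ch in OP_CHARS:
            j = i
            while j < n and text[j] in OP_CHARS:
                j += 1
            # One character of lookahead: does a digit follow the run?
            followed_by_digit = j < n and text[j].isdigit()
            if (split_minus_before_digit and j - i > 1
                    and text[j - 1] == "-" and followed_by_digit):
                tokens.append(text[i:j - 1])  # operator without the '-'
                tokens.append("-")            # '-' as its own token
            else:
                tokens.append(text[i:j])      # plain greedy match
            i = j
        else:
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    return tokens


print(tokenize("3+-2"))                                 # ['3', '+-', '2']
print(tokenize("3+-x"))                                 # ['3', '+-', 'x']
print(tokenize("3+-2", split_minus_before_digit=True))  # ['3', '+', '-', '2']
print(tokenize("3+-x", split_minus_before_digit=True))  # ['3', '+-', 'x']
```

With the flag off, "3+-2" and "3+-x" are lexed the same way, which is
the consistent greedy behavior. With the flag on, they differ, which is
exactly the inconsistency being discussed.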

--
Leon.
Attachment
