Re: [HACKERS] Postgres' lexer - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Postgres' lexer
Msg-id 14978.935166666@sss.pgh.pa.us
In response to Re: [HACKERS] Postgres' lexer  (Brook Milligan <brook@biology.nmsu.edu>)
List pgsql-hackers
Brook Milligan <brook@biology.nmsu.edu> writes:
> I think the question for SQL is, does the language allow an ambiguity
> here?  If not, wouldn't it be much smarter to keep the minus sign as
> its own token and deal with the semantics in the parser?

I don't see a good reason to tokenize the '-' as part of the number
either.  I think that someone may have hacked the lexer to try to
merge unary minus into numeric constants, so that in an expression like
	WHERE field < -2
the -2 would be treated as a constant rather than an expression
involving application of unary minus --- which is important because
the optimizer is too dumb to optimize the query if it looks like an
expression.

However, trying to make that happen at lex time is just silly.
The lexer doesn't have enough context to handle all the cases
anyway.  We've currently got code in the grammar to do the same
reduction.  (IMHO that's still too early, and it ought to be done
post-rewriter as part of a general-purpose constant expression
reducer; will get around to that someday ;-).)
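
To illustrate the kind of reduction described above, here is a minimal
sketch in Python rather than in the actual yacc grammar. The node classes
(Const, Column, UnaryMinus) and the function fold_unary_minus are
hypothetical names made up for this example; they are not Postgres
structures, and the real grammar code may work quite differently.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical parse-tree nodes, invented for this sketch.
@dataclass
class Const:
    value: Union[int, float]


@dataclass
class Column:
    name: str


@dataclass
class UnaryMinus:
    operand: "Expr"


Expr = Union[Const, Column, UnaryMinus]


def fold_unary_minus(node):
    """Collapse UnaryMinus(Const(n)) into Const(-n); leave other nodes alone."""
    if isinstance(node, UnaryMinus):
        inner = fold_unary_minus(node.operand)
        if isinstance(inner, Const):
            # The operand is a plain constant, so the sign can be absorbed.
            return Const(-inner.value)
        # The operand is not constant (e.g. a column), so keep the operator.
        return UnaryMinus(inner)
    return node
```

With this, the tree for "-2" (UnaryMinus over Const(2)) reduces to
Const(-2), while "-field" stays an expression. Because the fold happens
on the parse tree, the lexer never has to decide whether a '-' is unary
or binary.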

So it seems to me that we should just rip *all* this cruft out of the
lexer, and always return '-' as a separate token, never as part of
a number. (*)  Then we wouldn't need this lookahead feature.
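
A toy tokenizer along these lines might look like the sketch below. It
is written in Python with the standard re module purely for
illustration; the token names (NUMBER, IDENT, OP) and the small operator
set are assumptions of the example and are not taken from the Postgres
scanner. The only place a '-' can end up inside a NUMBER is the
exponent of a float.

```python
import re

# One alternative per token class. A sign is only ever consumed as part of
# a float exponent; everywhere else '-' falls through to the OP alternative.
TOKEN_RE = re.compile(r"""
    (?P<NUMBER>\d+(?:\.\d*)?(?:[eE][+-]?\d+)?|\.\d+(?:[eE][+-]?\d+)?)
  | (?P<IDENT>[A-Za-z_][A-Za-z_0-9]*)
  | (?P<OP><=|>=|<>|[-+*/<>=()])
  | (?P<WS>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Here tokenize("WHERE field < -2") yields the '-' and the 2 as two
separate tokens, and tokenize("1.234e-56") yields a single NUMBER. No
rule needs to look at the previous or next token to classify a '-'.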

But it'd be good to get an opinion from the other tgl first ;-).
I'm just a kibitzer when it comes to the lex/yacc stuff.
        regards, tom lane

(*) not counting float exponents, eg "1.234e-56" of course.
