Dear Tom,
> In particular you can't any longer tell the difference between BOOLEAN
> and "boolean" (with quotes), which are not the same thing --- a quoted
> string is never a keyword, per spec. [...]
OK, so you mean that on -boolean- the lexer returns a BOOLEAN_P token, but
on -"boolean"- it returns an Ident with -boolean- as its lval. Indeed, in
that case I cannot simply tell boolean from "boolean" if both come back as
idents that look the same.
As a matter of fact, this can also be fixed with some post-filtering. Say,
all quoted idents could be returned with a leading " to show that they were
double-quoted, and the IDENT rules in the parser could remove it once it is
no longer needed to distinguish the two cases.
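
To make the idea concrete, here is a minimal sketch in Python rather than
flex/bison. It is not the attached patch. The keyword set and the names
lex() and classify() are made up for illustration. The lexer returns only
IDENT tokens, keeps a leading " on quoted ones, and a later step decides
keyword vs identifier and strips the marker.

```python
import re

# Hypothetical keyword set, for illustration only (not PostgreSQL's list).
KEYWORDS = {"boolean", "select", "from"}

# Either a double-quoted identifier ("" escapes a quote) or a bare word.
TOKEN_RE = re.compile(r'\s*(?:"((?:[^"]|"")*)"|([A-Za-z_][A-Za-z0-9_]*))')


def lex(text):
    """Return a list of ("IDENT", value) pairs.

    Quoted identifiers keep a leading '"' as a marker; bare words are
    folded to lower case and never start with '"'.
    """
    text = text.strip()
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("cannot lex input at offset %d" % pos)
        if m.group(1) is not None:
            # Quoted: undo the "" escape and prepend the marker.
            tokens.append(("IDENT", '"' + m.group(1).replace('""', '"')))
        else:
            tokens.append(("IDENT", m.group(2).lower()))
        pos = m.end()
    return tokens


def classify(value):
    """Post-filter: decide keyword vs identifier and drop the marker."""
    if value.startswith('"'):
        # A quoted string is never a keyword.
        return ("IDENT", value[1:])
    if value in KEYWORDS:
        return ("KEYWORD", value)
    return ("IDENT", value)


if __name__ == "__main__":
    for _, value in lex('boolean "boolean" foo "Foo"'):
        print(classify(value))
```

Since a bare word can never begin with ", the marker is unambiguous, even
for a quoted name that itself starts with a double quote.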
Not beautiful, I agree, but my point is that the current number of tokens,
the number of states and the automaton size are not inherent to SQL; they
result from the way lexing/parsing is performed in PostgreSQL.
> The basic point here is that eliminating tokens as you propose will
> result in small changes in behavior, none of which are good or per spec.
> Making the parser automaton smaller would be nice, but not at that
> price.
OK. I don't want to change the spec. I still maintain that it can be done,
although some more tweaking is required. It was just a "proof of concept",
not a patch submission. Well, a "proof of concept" must still be a proof ;-)
I attach a small patch that solves the boolean vs "boolean" issue, still as
a proof of concept that it is 'doable' to preserve the semantics with a
different lexer/parser balance. I don't claim that it should be applied; I
just claim that the automaton could be smaller, especially by shortening
the "unreserved_keyword" list.
> You have not proven that you can have the same result.
Well, it passes the regression tests, but that indeed does not prove
anything, because these issues are not tested at all.
Maybe you could consider adding the "regression" part of the attached
patch, which creates a user-defined "boolean" type.
Anyway, my motivation is about "hints" and "advice", and that does not
help a lot to solve these issues.
--
Fabien.