>> Hmm. Given that we expect the lexer to have downcased any unquoted
>> words, this seems like a workable solution --- where else are we using
>> strcasecmp() unnecessarily?
Wait a minute --- I spoke too quickly. The lexer's behavior is to
downcase unquoted identifiers in a *locale sensitive* fashion --- it
uses isupper() and tolower(). We concluded that that was correct for
identifiers according to SQL99, whereas keyword matching should not be
locale-dependent. See the comments for ScanKeywordLookup.
> I've identified several other such places. However, in reality we have to
> consider every single strcasecmp() call suspicious. In many places an
> ASCII-only alternative is needed or the code needs to be rewritten.
I think our problems are worse than that: once the identifier has been
through a locale-dependent case conversion we really have a problem
matching it to an ASCII string. The only real solution may be to
require *all* keywords to be matched in the lexer, and forbid strcmp()
matching in later phases entirely.
regards, tom lane