Re: compiler warnings on the buildfarm - Mailing list pgsql-hackers

From Stefan Kaltenbrunner
Subject Re: compiler warnings on the buildfarm
Msg-id 4697219C.8060405@kaltenbrunner.cc
In response to Re: compiler warnings on the buildfarm  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: compiler warnings on the buildfarm  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> animal: lionfish            warnings: 16
>> scan.l:180: warning, the character range [<80>-<FF>] is ambiguous in a
>> case-insensitive scanner
>> scan.l:180: warning, the character range [<80>-<FF>] is ambiguous in a
>> case-insensitive scanner
>> scan.l:302: warning, the character range [<80>-<FF>] is ambiguous in a
>> case-insensitive scanner
> 
> This is evidently complaining about plpgsql's scan.l, which specifies
> %option case-insensitive
> and then defines
> ident_start        [A-Za-z\200-\377_]
> which is the way we do it in the main grammar too.  But I've never
> seen this message in any of the flex versions I've used with PG.
> (Which flex version is installed on lionfish anyway?)

$ flex -V
flex 2.5.31
> 
> I find some relevant points in the flex manual:
> http://flex.sourceforge.net/manual/Patterns.html
> 
>   Character classes are expanded immediately when seen in the flex
>   input. This means the character classes are sensitive to the locale in
>   which flex is executed, and the resulting scanner will not be sensitive
>   to the runtime locale. This may or may not be desirable.
>   
>   Character classes with ranges, such as `[a-Z]', should be used with
>   caution in a case-insensitive scanner if the range spans upper or
>   lowercase characters. Flex does not know if you want to fold all upper
>   and lowercase characters together, or if you want the literal numeric
>   range specified (with no case folding). When in doubt, flex will assume
>   that you meant the literal numeric range, and will issue a warning. The
>   exception to this rule is a character range such as `[a-z]' or `[S-W]'
>   where it is obvious that you want case-folding to occur.
> 
> What I suspect is happening is that lionfish is running the buildfarm
> script in a non-C locale, in which flex finds that some high-bit-set
> characters are case-folded by tolower() and accordingly issues this
> complaint.  Now the statements that "it assumes you meant the literal
> numeric range" and that the behavior is fully determined at compile time
> (ie, no run-time invocations of tolower(), as indeed are not to be seen
> in pl_scan.c) seem to mean that we'll get the behavior we want anyway.
> But the warning is a bit nervous-making.

Hmm, note that lionfish is not the only box reporting this kind of
warning. Also affected are:

rosella (which is definitely running in a non-C locale, as all the
errors there are in German)
wildebeest
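
One way to test the locale theory on an affected box is to ask the C
library directly which high-bit-set bytes it case-folds. The sketch below
is only an illustration: count_folded_high_bytes() is a hypothetical
helper, not part of flex or PostgreSQL, and it assumes (as suspected
above, not confirmed) that flex decides the range is "ambiguous" based on
tolower()/toupper() in the locale it is run in.

```c
#include <assert.h>
#include <ctype.h>
#include <locale.h>

/*
 * Hypothetical helper (not from flex or PostgreSQL): count the bytes in
 * 0x80..0xFF that tolower() or toupper() would change in the given
 * locale.  Pass "" to use the locale from the environment (LC_ALL,
 * LC_CTYPE, LANG).  Returns -1 if the locale cannot be selected.
 *
 * Side effect: changes the process-wide LC_CTYPE setting.
 */
int
count_folded_high_bytes(const char *locale_name)
{
    int count = 0;
    int c;

    assert(locale_name != NULL);

    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;

    /* Values 0x80..0xFF are valid unsigned char values for <ctype.h>. */
    for (c = 0x80; c <= 0xFF; c++)
    {
        if (tolower(c) != c || toupper(c) != c)
            count++;
    }
    return count;
}
```

Calling count_folded_high_bytes("C") returns 0, since the C locale only
folds A-Z and a-z. Calling it with "" from the same environment the
buildfarm script uses would show whether that locale folds anything in
[\200-\377]; a nonzero count would be consistent with the explanation
above, though it would not prove that this is what flex checks.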

Stefan

