Thread: Portable check for unportable macro usage

Portable check for unportable macro usage

From
Tom Lane
Date:
A portability hazard that we fight constantly is that the <ctype.h>
macros such as isalpha() can't safely be called on "char" values.
Per POSIX, you have to make sure the input is cast to "unsigned char".
That's far too easy to forget, but on most machines you won't get any
warning about it.

We have a couple of buildfarm machines that will generate "char value
used as array subscript" warnings for this, because their implementations
of these macros are indeed just array lookups.  But I think gaur is
the only one that's still active, and it's not going to last forever.

Ralph Corderoy, one of the nmh hackers, was looking at this problem
and came up with this cute solution [1]:
   #ifndef NDEBUG   #if EOF != -1   #error "Please report this to nmh's authors."   #endif
   extern int ctype_identity[257]; /* [n] = n-1 */   #define isupper(c) ((isupper)((ctype_identity + 1)[c]))   ...
#endif

This essentially converts the macros to always have an array lookup
as the first step, and thus you'll get the char-as-array-subscript
warning if your compiler has one.  (AFAIK, all versions of gcc will
emit that with -Wall.)

You wouldn't want this in production, of course, since it's pure overhead
and the only value is as a compile-time check.  For our purposes, rather
than "#ifndef NDEBUG" we could either use "#ifdef USE_ASSERT_CHECKING"
or invent a separate macro to control this replacement.  I'd be a bit
inclined to invent a separate macro and set up a buildfarm member to
#define that symbol and throw an error on char-as-array-subscript.
Individual developers could #define it for test purposes but I wouldn't
expect most of us to bother, as long as there's a backstop in the build
farm.

Thoughts, objections?
        regards, tom lane

[1] http://lists.nongnu.org/archive/html/nmh-workers/2016-10/msg00313.html



Re: Portable check for unportable macro usage

From
Andres Freund
Date:
On 2016-10-19 12:14:50 -0400, Tom Lane wrote:
> A portability hazard that we fight constantly is that the <ctype.h>
> macros such as isalpha() can't safely be called on "char" values.
> Per POSIX, you have to make sure the input is cast to "unsigned char".
> That's far too easy to forget, but on most machines you won't get any
> warning about it.

> We have a couple of buildfarm machines that will generate "char value
> used as array subscript" warnings for this, because their implementations
> of these macros are indeed just array lookups.  But I think gaur is
> the only one that's still active, and it's not going to last forever.
> 
> Ralph Corderoy, one of the nmh hackers, was looking at this problem
> and came up with this cute solution [1]:
> 
>     #ifndef NDEBUG
>     #if EOF != -1
>     #error "Please report this to nmh's authors."
>     #endif
> 
>     extern int ctype_identity[257]; /* [n] = n-1 */
>     #define isupper(c) ((isupper)((ctype_identity + 1)[c]))
>     ...
>     #endif

Hm. I'd be kind of inclined to instead do something akin to

#include <ctype.h>
#define system_isupper(c) isupper(c)
#undef isupper
#define isupper(c) (AssertVariableIsOfTypeMacro(c, unsigned char), isupper(c))
=>
/home/andres/src/postgresql/src/include/c.h:745:7: error: static assertion failed: "c does not have type unsigned
char"

that would probably result in better error messages,and we can leave it
enabled regardless of debug mode.

That requires one usage adoption in wparser_def.c's p_iswhat()

Greetings,

Andres Freund



Re: Portable check for unportable macro usage

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> Hm. I'd be kind of inclined to instead do something akin to

> #include <ctype.h>
> #define system_isupper(c) isupper(c)
> #undef isupper

Note that that doesn't do what you are probably thinking it does.

What is actually happening there, I believe, is that you're forcing
a fallback to the plain-function definition of isupper().  Ralph's
proposal accomplishes the same thing less messily by parenthesizing
the new macro's reference to isupper.  But in either case, we're
disabling any possible macro optimization in <ctype.h>, which makes
me not want to say it's something we'd enable unconditionally.

> #define isupper(c) (AssertVariableIsOfTypeMacro(c, unsigned char), isupper(c))
> =>
> /home/andres/src/postgresql/src/include/c.h:745:7: error: static assertion failed: "c does not have type unsigned
char"

Not sure we really want that; it breaks the standard-intended usage
of applying these functions to the result of getc().  It might be
all right for the core code, but I could see third-party authors
getting pretty upset with us for unilaterally imposing non-POSIX
semantics on these functions.
        regards, tom lane