I apologize for the long post, here goes...
I recently tried building postgres 8.0.1 on a Solaris 2.9 SPARC
machine, using gcc-4.0.1. CFLAGS was set to pass -O2.
Running 'make check' failed, the initdb phase ended with postgres
dying with SIGBUS. The stack trace showed that the line in question
was a call to the MemSet macro. Changing the optimization level
changed which particular call to MemSet failed, but all the locations
were similar.
In each case, the call to MemSet had compile-time constants for val
and len, with val nonzero (see src/backend/catalog/index.c:411 or
src/backend/commands/trigger.c:301; I saw crashes at both, depending
on optimization flags). Because of the way the MemSet macro is
defined, a compile-time non-zero val should cause the entire
conditional and loop to be optimized away, reducing the whole thing
to a plain call to memset, but with the address argument cast first
to an int32 * and then to a char *. GCC then goes ahead and expands
the call to memset itself. Somehow, the assembly it generates
performs a misaligned integer store, and therefore a SIGBUS on SPARC.
All of this got me wondering about the use of the _start variable in
the definition of MemSet (src/include/c.h:593):
#define MemSet(start, val, len) \
    do \
    { \
        int32  *_start = (int32 *) (start); \
        int     _val = (val); \
        Size    _len = (len); \
\
        if ((((long) _start) & INT_ALIGN_MASK) == 0 && \
            (_len & INT_ALIGN_MASK) == 0 && \
            _val == 0 && \
            _len <= MEMSET_LOOP_LIMIT) \
        { \
            int32 *_stop = (int32 *) ((char *) _start + _len); \
            while (_start < _stop) \
                *_start++ = 0; \
        } \
        else \
            memset((char *) _start, _val, _len); \
    } while (0)
So, here _start is set to the evaluation of start, cast to an int32 *.
If the conditional is either eliminated at compile time or evaluates
to false, then _start is cast to char * and passed to memset. The
misaligned integer store got me thinking about the cast through int32
*, and I tried the following variation on MemSet (and MemSetAligned):
#define MemSet(start, val, len) \
    do \
    { \
        void   *_vstart = (void *) (start); \
        int32  *_start = (int32 *) _vstart; \
        int     _val = (val); \
        Size    _len = (len); \
\
        if ((((long) _start) & INT_ALIGN_MASK) == 0 && \
            (_len & INT_ALIGN_MASK) == 0 && \
            _val == 0 && \
            _len <= MEMSET_LOOP_LIMIT) \
        { \
            int32 *_stop = (int32 *) ((char *) _start + _len); \
            while (_start < _stop) \
                *_start++ = 0; \
        } \
        else \
            memset(_vstart, _val, _len); \
    } while (0)
Here, _vstart is used to hold the evaluation of the start expression
as a void *, and _start is derived from it by a cast to int32 *. Then,
if the call to memset happens, _vstart is passed to memset instead of
_start. In this way, the memset call never sees the result of the
int32 * cast.
After making this change, 'make check' passes all tests, for all
optimization levels.
I am no language lawyer, so I don't know for certain what is going on
here, but it seems that one of the following must be true:
1) There is something illegal/undefined about the address argument to
memset in the original version, due to the int32 * and char * casts.
or
2) The above is not true and the MemSet macro is well defined, but
gcc is somehow confused by the int32 * cast and makes unwarranted
alignment assumptions in its expansion of memset, despite the final
cast to char *.
So, after all that: is the MemSet macro OK and gcc wrong, or vice versa?
Thanks,
Andrew