FAQ
I apologize for the long post, here goes...

I recently tried building postgres 8.0.1 on a Solaris 2.9 SPARC
machine, using gcc-4.0.1. CFLAGS was set to pass -O2.

Running 'make check' failed, the initdb phase ended with postgres
dying with SIGBUS. The stack trace showed that the line in question
was a call to the MemSet macro. Changing the optimization level
changed which particular call to MemSet failed, but all the locations
were similar.

In each case, the call to MemSet had compile time constants for val
and len, where val was nonzero (see src/backend/catalog/index.c:411 or
src/backend/commands/trigger.c:301, I saw crashes at both, depending
on optimization flags). Because of the way the MemSet macro is
defined, the compile time non-zero val argument should cause the
entire conditional and loop to be optimized away, and the call should
reduce to a simple call to memset, but where the address argument has
been cast first to an int32 *, and then to a char *. GCC then goes
ahead and does its own expansion on the call to memset. Somehow, the
assembly it generates causes a misaligned integer store, and therefore
a SIGBUS on SPARC.

All of this got me wondering about the use of the _start variable in
the definition of MemSet (src/include/c.h:593):

#define MemSet(start, val, len) \
do \
{ \
int32 * _start = (int32 *) (start); \
int _val = (val); \
Size _len = (len); \
\
if ((((long) _start) & INT_ALIGN_MASK) == 0 && \
(_len & INT_ALIGN_MASK) == 0 && \
_val == 0 && \
_len <= MEMSET_LOOP_LIMIT) \
{ \
int32 * _stop = (int32 *) ((char *) _start + _len); \
while (_start < _stop) \
*_start++ = 0; \
} \
else \
memset((char *) _start, _val, _len); \
} while (0)

So, here _start is set to the evaluation of start, cast to an int32 *.
If the conditional is either eliminated at compile time or evaluates
to false, then _start is cast to char * and passed to memset. The
misaligned integer store got me thinking about the cast through int32
*, and I tried the following variation on MemSet (and MemSetAligned):

#define MemSet(start, val, len) \
do \
{ void * _vstart = (void *)(start); \
int32 * _start = (int32 *) (_vstart); \
int _val = (val); \
Size _len = (len); \
\
if ((((long) _start) & INT_ALIGN_MASK) == 0 && \
(_len & INT_ALIGN_MASK) == 0 && \
_val == 0 && \
_len <= MEMSET_LOOP_LIMIT) \
{ \
int32 * _stop = (int32 *) ((char *) _start + _len); \
while (_start < _stop) \
*_start++ = 0; \
} \
else \
memset(_vstart, _val, _len); \
} while (0)

Here, _vstart is used to hold the evaluation of the start expression
as a void *, and _start is derived from it by a cast to int32 *. Then,
if the call to memset happens, _vstart is passed to memset instead of
_start. In this way, the memset call never sees the result of the
int32 * cast.

After making this change, 'make check' passes all tests, for all
optimization levels.

I am no language laywer, so I don't know for certain what is going on
here, but it does seem to be that one of the following must be true:

1) There is something illegal/undefined about the address argument to
memset in the original version, due to the int32 * and char * casts.

or

2) The above is not true, and the MemSet macro is well defined, but
gcc is somehow confused by the int32 * cast, and then makes
unwarranted alignment assumptions in its expansion of memset, despire
the final cast to char *.

So, after all that: is the MemSet macro OK and gcc wrong, or vice versa?

Thanks,
Andrew

Search Discussions

  • Tom Lane at Jul 18, 2005 at 2:36 pm

    Andrew Morrow writes:
    I am no language laywer, so I don't know for certain what is going on
    here, but it does seem to be that one of the following must be true:
    1) There is something illegal/undefined about the address argument to
    memset in the original version, due to the int32 * and char * casts.
    I think you are absolutely right --- the cast to int32* allows the
    compiler to assume that the pointer is word-aligned, and even casting
    it back to char * or void * at the memset call wouldn't really undo
    the damage. So really, correct coding of the macro should be along
    the lines of

    { void *_vstart = (void *)(start); \
    int _val = (val); \
    Size _len = (len); \
    \
    if ((((long) _vstart) & INT_ALIGN_MASK) == 0 && \
    (_len & INT_ALIGN_MASK) == 0 && \
    _val == 0 && \
    _len <= MEMSET_LOOP_LIMIT) \
    { \
    int32 * _start = (int32 *) (_vstart); \
    int32 * _stop = (int32 *) ((char *) _start + _len); \
    while (_start < _stop) \
    *_start++ = 0; \
    } \
    else \
    memset(_vstart, _val, _len); \
    } while (0)

    Interesting that we have not seen this before.

    regards, tom lane
  • Tom Lane at Jul 18, 2005 at 4:01 pm

    Andrew Morrow writes:
    I recently tried building postgres 8.0.1 on a Solaris 2.9 SPARC
    machine, using gcc-4.0.1. CFLAGS was set to pass -O2.
    ...
    I am no language laywer, so I don't know for certain what is going on
    here, but it does seem to be that one of the following must be true:
    1) There is something illegal/undefined about the address argument to
    memset in the original version, due to the int32 * and char * casts.
    I've patched MemSet() in all supported branches (back to 7.2) per this
    report.

    Are you interested in adding this machine to the PG buildfarm?
    http://www.pgbuildfarm.org/index.html
    Machines with out-of-the-ordinary compiler properties are especially
    interesting candidates for buildfarm testing ...

    regards, tom lane

Related Discussions

Discussion Navigation
viewthread | post
Discussion Overview
grouppgsql-ports @
categoriespostgresql
postedJul 18, '05 at 1:51p
activeJul 18, '05 at 4:01p
posts3
users2
websitepostgresql.org
irc#postgresql

2 users in discussion

Tom Lane: 2 posts Andrew Morrow: 1 post

People

Translate

site design / logo © 2021 Grokbase