fix for strict-alias warnings

Andrew Dunstan
Oct 11, 2003 at 2:00 pm
This patch will stop gcc from issuing warnings about type-punned objects when -fstrict-aliasing is turned on, as it is in the latest gcc when you use -O2.

enjoy

andrew

45 responses

  • Bruce Momjian at Oct 11, 2003 at 4:31 pm
    Patch applied. Thanks.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Tom Lane at Oct 11, 2003 at 5:03 pm

    Bruce Momjian writes:
    Patch applied. Thanks.
    I hope you applied it with the additional changes you asked for ---
    at the very least, cast to (void*) and then to the destination type.
    As-is, the patch simply suppresses all error detection for these
    conversions, which seems a bad move.

    regards, tom lane
  • Bruce Momjian at Oct 11, 2003 at 5:14 pm

    Tom Lane wrote:
    Bruce Momjian <pgman@candle.pha.pa.us> writes:
    Patch applied. Thanks.
    I hope you applied it with the additional changes you asked for ---
    at the very least, cast to (void*) and then to the destination type.
    As-is, the patch simply suppresses all error detection for these
    conversions, which seems a bad move.
    I don't have a version that does the double-cast, but I still have the
    patch to back out and put in a new one. Andrew's point was that we cast
    to void * in many places, so this case is not unique. Is that wrong?

  • Tom Lane at Oct 11, 2003 at 5:29 pm

    Bruce Momjian writes:
    I don't have a version that does the double-cast, but I still have the
    patch to back out and put in a new one. Andrew's point was that we cast
    to void * in many places, so this case is not unique. Is that wrong?
    I do not like code that uses cast to void* as a substitute for casting
    to the real destination type. I think it's a lazy substitute for
    providing the correct cast, and it renders the code more fragile because
    there is *no* possibility of the compiler detecting a problem should you
    change the source or destination datatype in a way that renders the cast
    wrong.

    I have not gone around and tried to fix all the places that are lazy in
    this way, but I don't want to introduce more, and for sure I don't want
    to set a precedent that we'll weaken our type checking any time gcc
    burps for ill-defined reasons.

    I agree completely with all of the objections you raised in your
    original comment on the patch. In particular, I don't think we
    understand why gcc is complaining about these few places and not any of
    the thousands of other casts in our code. Until we understand that
    difference completely, we are not "fixing a bug" by introducing void*
    casts. I'd have to call it cargo-cult programming instead.

    I am perfectly content to leave the warnings in place until we have a
    satisfactory explanation.

    regards, tom lane
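
To make the type-checking concern above concrete, here is a minimal C sketch. The helper `parse_number` and both callers are hypothetical names, not PostgreSQL code. Passing `&p` directly lets the compiler compare the argument type against the parameter type; routing it through `(void *)` compiles quietly whatever the parameter type is, because `void *` converts implicitly to any object pointer type in C.

```c
#include <stdlib.h>

/* Hypothetical helper: parses a decimal number and advances *cursor. */
static long parse_number(char **cursor)
{
    return strtol(*cursor, cursor, 10);
}

/* The argument has type char **, so the compiler can check it against
 * the parameter type of parse_number(). */
long parse_checked(char *text)
{
    char *p = text;

    return parse_number(&p);
}

/* The void * cast discards the type of &p. This still works today, but
 * the compiler can no longer object if either side changes type. */
long parse_unchecked(char *text)
{
    char *p = text;

    return parse_number((void *) &p);
}
```

If `parse_number` were later changed to take, say, `long **`, the first caller would be expected to draw a diagnostic and the second would not.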
  • Andrew Dunstan at Oct 11, 2003 at 5:48 pm
    Tough words! :-)

    ISTM the best thing would be to back out the patch, add -fno-strict-aliasing
    for gcc, and add a TODO to fix this thoroughly.

    Having -fstrict-aliasing on and ignoring the warnings doesn't seem like a
    sound strategy. I think we should fix it or turn it off. The web is littered
    with projects that got bizarre happenings when they turned it on without any
    accompanying code changes.

    I agree with Tom that my patch isn't ideal (I thought I said as much).
    Fixing it thoroughly will require some significant code changes, though. We
    seem to be far too close to 7.4 release to contemplate that.

    cheers

    andrew


  • Bruce Momjian at Oct 11, 2003 at 6:10 pm

    Andrew Dunstan wrote:
    Tough words! :-)

    ISTM the best thing would be to back out the patch, add -fno-strict-aliasing
    for gcc, and add a TODO to fix this thoroughly.

    Having -fstrict-aliasing on and ignoring the warnings doesn't seem like a
    sound strategy. I think we should fix it or turn it off. The web is littered
    with projects that got bizarre happenings when they turned it on without any
    accompanying code changes.

    I agree with Tom that my patch isn't ideal (I thought I said as much).
    Fixing it thoroughly will require some significant code changes, though. We
    seem to be far too close to 7.4 release to contemplate that.
    I have backed out the patch.

    Looking at the case in tablecmds.c and proc.c, the first was assigning a
    struct with a NodeTag pointer as its first element to another struct
    with NodeTag as its first element. In fact, we do this all over the
    place, having different structure pointers with a start element of
    NodeTag. The proc.c cases were using MemSet, which was checking whether
    the int* is aligned for int* access. In fact, we could change MemSet to
    always take a void *, and do the int* casting when we access it after
    testing for alignment.

    The big question in my mind is whether there is other struct *
    passing that could be masked right now by void* casting, and if so, do
    they have different first elements? This determines whether we turn off
    -fstrict-aliasing for gcc, or fix just these few cases.

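
The "different structs sharing a leading tag" layout described above can be sketched as follows. All names here are invented for illustration; they are not the real NodeTag or Node definitions. The sketch reads the tag through a pointer to the first member, which C permits because a pointer to a struct, suitably converted, points to its initial member.

```c
/* Hypothetical tag values, standing in for a NodeTag-style enum. */
typedef enum { TAG_TRIGGER_DATA = 1, TAG_RETURN_SET_INFO } SketchTag;

/* Two unrelated structs that both start with the tag. */
typedef struct { SketchTag type; int event; }   TriggerSketch;
typedef struct { SketchTag type; int is_done; } ReturnSetSketch;

/* Read only the leading tag; valid for any struct whose first member
 * is a SketchTag. */
SketchTag tag_of(const void *node)
{
    return *(const SketchTag *) node;
}

/* Convert back to the concrete type only after the tag has been checked.
 * Returns -1 when the node is not a TriggerSketch. */
int event_of(const void *node)
{
    if (tag_of(node) != TAG_TRIGGER_DATA)
        return -1;
    return ((const TriggerSketch *) node)->event;
}
```

Whether accessing the whole object through an unrelated struct type is safe under strict aliasing is exactly what the rest of this thread debates; the sketch deliberately touches only the first member until the tag confirms the real type.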
  • Andrew Dunstan at Oct 11, 2003 at 6:42 pm
    ----- Original Message -----
    From: "Bruce Momjian" <pgman@candle.pha.pa.us>
    I have backed out the patch.

    Looking at the case in tablecmds.c and proc.c, the first was assigning a
    struct with a NodeTag pointer as its first element to another struct
    with NodeTag as its first element. In fact, we do this all over the
    place, having different structure pointers with a start element of
    NodeTag.
    Right - and it is what would have to change if you really want to obey the
    ISO C rules, I believe. This is handled easily in other languages using
    variant records, but C is kinda primitive here :-)

    As I understand it, instead of

    struct foo {
        int tag;
        foostuff f;
    };

    struct bar {
        int tag;
        barstuff b;
    };

    you would need to do something like

    struct foo {
        foostuff f;
    };

    struct bar {
        barstuff b;
    };

    struct foobar {
        int tag;
        union {
            struct foo foo;
            struct bar bar;
        } v;
    };


    The proc.c cases were using MemSet, which was checking whether
    the int* is aligned for int* access. In fact, we could change MemSet to
    always take a void *, and do the int* casting when we access it after
    testing for alignment.
    Since MemSet is generic, that is probably a good idea.
    The big question in my mind is whether there is other struct *
    passing that could be masked right now by void* casting, and if so, do
    they have different first elements? This determines whether we turn off
    -fstrict-aliasing for gcc, or fix just these few cases.
    Just analysing this is a non-trivial piece of work, I think.

    cheers

    andrew
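
A compilable version of the variant-record layout suggested above might look like the sketch below. The payload fields (`count`, `ratio`) and the accessor `foobar_value` are assumptions added for illustration; only the overall shape (one tag, one union) comes from the message.

```c
/* Tag that says which union member is currently valid. */
enum foobar_tag { TAG_FOO, TAG_BAR };

struct foo { int count; };      /* hypothetical payload */
struct bar { double ratio; };   /* hypothetical payload */

struct foobar {
    enum foobar_tag tag;
    union {
        struct foo foo;
        struct bar bar;
    } v;
};

/* Dispatch on the tag and read only the member it names, so every
 * access goes through the object's declared type. */
double foobar_value(const struct foobar *n)
{
    switch (n->tag) {
    case TAG_FOO:
        return (double) n->v.foo.count;
    case TAG_BAR:
        return n->v.bar.ratio;
    }
    return 0.0;
}
```

The cost of this design is that every variant occupies the size of the largest member, and all node types must be known in one place.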
  • Bruce Momjian at Oct 11, 2003 at 7:55 pm

    Andrew Dunstan wrote:
    The proc.c cases were using MemSet, which was checking whether
    the int* is aligned for int* access. In fact, we could change MemSet to
    always take a void *, and do the int* casting when we access it after
    testing for alignment.
    Since MemSet is generic, that is probably a good idea.
    I have applied the following patch to add an additional void* cast to
    MemSet, and added a comment that this was safe because alignment is
    checked below --- patch attached.

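
The idea of taking a `void *`, testing alignment, and only then using `int *` can be sketched as a plain function. This is not PostgreSQL's MemSet macro; `fill_bytes` is a hypothetical stand-in, and restricting the fast path to zero fills is a simplification made for this sketch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical MemSet-style helper. */
void fill_bytes(void *start, int val, size_t len)
{
    /* The alignment test is done on the address itself, before any
     * access through an int pointer. */
    if (val == 0 &&
        (uintptr_t) start % sizeof(int) == 0 &&
        len % sizeof(int) == 0)
    {
        int *word = (int *) start;
        int *stop = (int *) ((char *) start + len);

        /* Word-at-a-time zeroing of an aligned region. */
        while (word < stop)
            *word++ = 0;
    }
    else
    {
        /* Anything else falls back to the library routine. */
        memset(start, val, len);
    }
}
```

Note that the check only addresses alignment. Storing through `int *` into an object of some other declared type is still subject to the aliasing rules, which is the separate question the later messages raise.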
  • Bruce Momjian at Oct 12, 2003 at 11:19 pm

    Andrew Dunstan wrote:

    ----- Original Message -----
    From: "Bruce Momjian" <pgman@candle.pha.pa.us>
    I have backed out the patch.

    Looking at the case in tablecmds.c and proc.c, the first was assigning a
    struct with a NodeTag pointer as its first element to another struct
    with NodeTag as its first element. In fact, we do this all over the
    place, having different structure pointers with a start element of
    NodeTag.
    I have attached and applied the following patch to use makeNode for
    structures that will later be cast to Node*, rather than having them be
    allocated as stack variables.

    This leaves the only remaining compiler warning coming from command.c, listed
    below. What is the exact warning generated? This seems like a
    different issue.

    ---------------------------------------------------------------------------

    Index: src/bin/psql/command.c
    ===================================================================
    RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/command.c,v
    retrieving revision 1.103
    diff -c -w -r1.103 command.c
    *** src/bin/psql/command.c 29 Sep 2003 16:39:18 -0000 1.103
    --- src/bin/psql/command.c 11 Oct 2003 13:50:15 -0000
    ***************
    *** 1280,1286 ****
    case '7':
    case '8':
    case '9':
    ! c = parse_char((char **) &p);
    break;

    default:
    --- 1280,1286 ----
    case '7':
    case '8':
    case '9':
    ! c = parse_char((void *) &p);
    break;

    default:

  • Bruce Momjian at Oct 13, 2003 at 12:18 am

    Bruce Momjian wrote:
    Andrew Dunstan wrote:
    ----- Original Message -----
    From: "Bruce Momjian" <pgman@candle.pha.pa.us>
    I have backed out the patch.

    Looking at the case in tablecmds.c and proc.c, the first was assigning a
    struct with a NodeTag pointer as its first element to another struct
    with NodeTag as its first element. In fact, we do this all over the
    place, having different structure pointers with a start element of
    NodeTag.
    I have attached and applied the following patch to use makeNode for
    structures that will later be cast to Node*, rather than having them be
    allocated as stack variables.
    Oh, and thanks to everyone for doing the research on this compiler issue.

  • Andrew Dunstan at Oct 13, 2003 at 12:39 am
    The warning is this:

    command.c: In function `unescape':
    command.c:1283: warning: dereferencing type-punned pointer will break
    strict-aliasing rules

    p is declared thus:

    const unsigned char *p;

    If I change common.c/h so that parse_char() takes an (unsigned char **)
    argument, cast its 2nd argument to the call to strtol to (char **), and
    change the cast in the call to (unsigned char **) those warnings go away,
    but I get one from the other place parse_char() is called, namely prompt.c,
    which has similar code, but this time p is not declared as unsigned, and so
    it goes.

    Grrr. What a pain.

    andrew
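
One way to avoid punning the pointer at the call site, sketched under assumptions: `parse_char_sketch` is a hypothetical stand-in for parse_char() (the real function's body is not shown in this thread, beyond the fact that it calls strtol), and `unescape_number` is an invented wrapper. The cursor's value is copied into a temporary of the type the callee expects, and the advanced position is copied back afterwards.

```c
#include <stdlib.h>

/* Hypothetical stand-in for parse_char(): parses a number (base chosen
 * by strtol from the prefix) and advances *buf past it. */
static long parse_char_sketch(char **buf)
{
    return strtol(*buf, buf, 0);
}

/* Consume a number from an unsigned char cursor. Converting the pointer
 * value is fine; what the warning is about is accessing the pointer
 * object p itself through an lvalue of a different pointer type. */
long unescape_number(const unsigned char **cursor)
{
    char *tmp = (char *) *cursor;   /* drops const, as the original cast did */
    long  value = parse_char_sketch(&tmp);

    *cursor = (const unsigned char *) tmp;
    return value;
}
```

Because `tmp` really is a `char *`, taking its address yields a genuine `char **`, and the same wrapper shape would work for a caller whose cursor is plain `char *`.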


    ----- Original Message -----
    From: "Bruce Momjian" <pgman@candle.pha.pa.us>
    To: "Andrew Dunstan" <andrew@dunslane.net>
    Cc: "PG Patches" <pgsql-patches@postgresql.org>
    Sent: Sunday, October 12, 2003 7:18 PM
    Subject: Re: [PATCHES] fix for strict-alias warnings



    --------------------------------------------------------------------------------

    Index: src/backend/commands/tablecmds.c
    ===================================================================
    RCS file: /cvsroot/pgsql-server/src/backend/commands/tablecmds.c,v
    retrieving revision 1.88
    diff -c -c -r1.88 tablecmds.c
    *** src/backend/commands/tablecmds.c 11 Oct 2003 18:04:25 -0000 1.88
    --- src/backend/commands/tablecmds.c 12 Oct 2003 23:10:21 -0000
    ***************
    *** 3449,3454 ****
    --- 3449,3455 ----
    Relation pkrel)
    {
    HeapScanDesc scan;
    + TriggerData *trigdata = makeNode(TriggerData); /* must be Node aligned */
    HeapTuple tuple;
    Trigger trig;
    List *list;
    ***************
    *** 3506,3512 ****
    while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
    {
    FunctionCallInfoData fcinfo;
    - TriggerData trigdata;

    /*
    * Make a call to the trigger function
    --- 3507,3512 ----
    ***************
    *** 3518,3537 ****
    /*
    * We assume RI_FKey_check_ins won't look at flinfo...
    */
    ! trigdata.type = T_TriggerData;
    ! trigdata.tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
    ! trigdata.tg_relation = rel;
    ! trigdata.tg_trigtuple = tuple;
    ! trigdata.tg_newtuple = NULL;
    ! trigdata.tg_trigger = &trig;

    ! fcinfo.context = (Node *) &trigdata;

    RI_FKey_check_ins(&fcinfo);
    }

    heap_endscan(scan);

    pfree(trig.tgargs);
    }

    --- 3518,3538 ----
    /*
    * We assume RI_FKey_check_ins won't look at flinfo...
    */
    ! trigdata->type = T_TriggerData;
    ! trigdata->tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
    ! trigdata->tg_relation = rel;
    ! trigdata->tg_trigtuple = tuple;
    ! trigdata->tg_newtuple = NULL;
    ! trigdata->tg_trigger = &trig;

    ! fcinfo.context = (Node *) trigdata;

    RI_FKey_check_ins(&fcinfo);
    }

    heap_endscan(scan);

    + pfree(trigdata);
    pfree(trig.tgargs);
    }

    Index: src/backend/executor/execQual.c
    ===================================================================
    RCS file: /cvsroot/pgsql-server/src/backend/executor/execQual.c,v
    retrieving revision 1.148
    diff -c -c -r1.148 execQual.c
    *** src/backend/executor/execQual.c 11 Oct 2003 18:04:25 -0000 1.148
    --- src/backend/executor/execQual.c 12 Oct 2003 23:10:23 -0000
    ***************
    *** 699,705 ****
    List *arguments = fcache->args;
    Datum result;
    FunctionCallInfoData fcinfo;
    ! ReturnSetInfo rsinfo; /* for functions returning sets */
    ExprDoneCond argDone;
    bool hasSetArg;
    int i;
    --- 699,706 ----
    List *arguments = fcache->args;
    Datum result;
    FunctionCallInfoData fcinfo;
    ! /* for functions returning sets, must be aligned as Node, so use makeNode */
    ! ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo);
    ExprDoneCond argDone;
    bool hasSetArg;
    int i;
    ***************
    *** 746,760 ****
    */
    if (fcache->func.fn_retset)
    {
    ! fcinfo.resultinfo = (Node *) &rsinfo;
    ! rsinfo.type = T_ReturnSetInfo;
    ! rsinfo.econtext = econtext;
    ! rsinfo.expectedDesc = NULL;
    ! rsinfo.allowedModes = (int) SFRM_ValuePerCall;
    ! rsinfo.returnMode = SFRM_ValuePerCall;
    /* isDone is filled below */
    ! rsinfo.setResult = NULL;
    ! rsinfo.setDesc = NULL;
    }

    /*
    --- 747,761 ----
    */
    if (fcache->func.fn_retset)
    {
    ! fcinfo.resultinfo = (Node *) rsinfo;
    ! rsinfo->type = T_ReturnSetInfo;
    ! rsinfo->econtext = econtext;
    ! rsinfo->expectedDesc = NULL;
    ! rsinfo->allowedModes = (int) SFRM_ValuePerCall;
    ! rsinfo->returnMode = SFRM_ValuePerCall;
    /* isDone is filled below */
    ! rsinfo->setResult = NULL;
    ! rsinfo->setDesc = NULL;
    }

    /*
    ***************
    *** 803,812 ****
    if (callit)
    {
    fcinfo.isnull = false;
    ! rsinfo.isDone = ExprSingleResult;
    result = FunctionCallInvoke(&fcinfo);
    *isNull = fcinfo.isnull;
    ! *isDone = rsinfo.isDone;
    }
    else
    {
    --- 804,813 ----
    if (callit)
    {
    fcinfo.isnull = false;
    ! rsinfo->isDone = ExprSingleResult;
    result = FunctionCallInvoke(&fcinfo);
    *isNull = fcinfo.isnull;
    ! *isDone = rsinfo->isDone;
    }
    else
    {
    ***************
    *** 903,909 ****
    TupleDesc tupdesc = NULL;
    Oid funcrettype;
    FunctionCallInfoData fcinfo;
    ! ReturnSetInfo rsinfo;
    MemoryContext callerContext;
    MemoryContext oldcontext;
    TupleTableSlot *slot;
    --- 904,910 ----
    TupleDesc tupdesc = NULL;
    Oid funcrettype;
    FunctionCallInfoData fcinfo;
    ! ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo); /* must be Node aligned */
    MemoryContext callerContext;
    MemoryContext oldcontext;
    TupleTableSlot *slot;
    ***************
    *** 992,1006 ****
    * doesn't actually get to see the resultinfo, but set it up anyway
    * because we use some of the fields as our own state variables.
    */
    ! fcinfo.resultinfo = (Node *) &rsinfo;
    ! rsinfo.type = T_ReturnSetInfo;
    ! rsinfo.econtext = econtext;
    ! rsinfo.expectedDesc = expectedDesc;
    ! rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
    ! rsinfo.returnMode = SFRM_ValuePerCall;
    /* isDone is filled below */
    ! rsinfo.setResult = NULL;
    ! rsinfo.setDesc = NULL;

    /*
    * Switch to short-lived context for calling the function or
    --- 993,1007 ----
    * doesn't actually get to see the resultinfo, but set it up anyway
    * because we use some of the fields as our own state variables.
    */
    ! fcinfo.resultinfo = (Node *) rsinfo;
    ! rsinfo->type = T_ReturnSetInfo;
    ! rsinfo->econtext = econtext;
    ! rsinfo->expectedDesc = expectedDesc;
    ! rsinfo->allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
    ! rsinfo->returnMode = SFRM_ValuePerCall;
    /* isDone is filled below */
    ! rsinfo->setResult = NULL;
    ! rsinfo->setDesc = NULL;

    /*
    * Switch to short-lived context for calling the function or
    ***************
    *** 1028,1044 ****
    if (direct_function_call)
    {
    fcinfo.isnull = false;
    ! rsinfo.isDone = ExprSingleResult;
    result = FunctionCallInvoke(&fcinfo);
    }
    else
    {
    result = ExecEvalExpr(funcexpr, econtext,
    ! &fcinfo.isnull, &rsinfo.isDone);
    }

    /* Which protocol does function want to use? */
    ! if (rsinfo.returnMode == SFRM_ValuePerCall)
    {
    /*
    * Check for end of result set.
    --- 1029,1045 ----
    if (direct_function_call)
    {
    fcinfo.isnull = false;
    ! rsinfo->isDone = ExprSingleResult;
    result = FunctionCallInvoke(&fcinfo);
    }
    else
    {
    result = ExecEvalExpr(funcexpr, econtext,
    ! &fcinfo.isnull, &rsinfo->isDone);
    }

    /* Which protocol does function want to use? */
    ! if (rsinfo->returnMode == SFRM_ValuePerCall)
    {
    /*
    * Check for end of result set.
    ***************
    *** 1047,1053 ****
    * tupdesc or tuplestore (since we can't get a tupdesc in the
    * function-returning-tuple case)
    */
    ! if (rsinfo.isDone == ExprEndResult)
    break;

    /*
    --- 1048,1054 ----
    * tupdesc or tuplestore (since we can't get a tupdesc in the
    * function-returning-tuple case)
    */
    ! if (rsinfo->isDone == ExprEndResult)
    break;

    /*
    ***************
    *** 1093,1100 ****
    }
    tupstore = tuplestore_begin_heap(true, false, SortMem);
    MemoryContextSwitchTo(oldcontext);
    ! rsinfo.setResult = tupstore;
    ! rsinfo.setDesc = tupdesc;
    }

    /*
    --- 1094,1101 ----
    }
    tupstore = tuplestore_begin_heap(true, false, SortMem);
    MemoryContextSwitchTo(oldcontext);
    ! rsinfo->setResult = tupstore;
    ! rsinfo->setDesc = tupdesc;
    }

    /*
    ***************
    *** 1127,1139 ****
    /*
    * Are we done?
    */
    ! if (rsinfo.isDone != ExprMultipleResult)
    break;
    }
    ! else if (rsinfo.returnMode == SFRM_Materialize)
    {
    /* check we're on the same page as the function author */
    ! if (!first_time || rsinfo.isDone != ExprSingleResult)
    ereport(ERROR,
    (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
    errmsg("table-function protocol for materialize mode was not followed")));
    --- 1128,1140 ----
    /*
    * Are we done?
    */
    ! if (rsinfo->isDone != ExprMultipleResult)
    break;
    }
    ! else if (rsinfo->returnMode == SFRM_Materialize)
    {
    /* check we're on the same page as the function author */
    ! if (!first_time || rsinfo->isDone != ExprSingleResult)
    ereport(ERROR,
    (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
    errmsg("table-function protocol for materialize mode was not followed")));
    ***************
    *** 1144,1150 ****
    ereport(ERROR,
    (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
    errmsg("unrecognized table-function returnMode: %d",
    ! (int) rsinfo.returnMode)));

    first_time = false;
    }
    --- 1145,1151 ----
    ereport(ERROR,
    (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
    errmsg("unrecognized table-function returnMode: %d",
    ! (int) rsinfo->returnMode)));

    first_time = false;
    }
    ***************
    *** 1152,1159 ****
    MemoryContextSwitchTo(callerContext);

    /* The returned pointers are those in rsinfo */
    ! *returnDesc = rsinfo.setDesc;
    ! return rsinfo.setResult;
    }


    --- 1153,1160 ----
    MemoryContextSwitchTo(callerContext);

    /* The returned pointers are those in rsinfo */
    ! *returnDesc = rsinfo->setDesc;
    ! return rsinfo->setResult;
    }


    Index: src/backend/port/sysv_shmem.c
    ===================================================================
    RCS file: /cvsroot/pgsql-server/src/backend/port/sysv_shmem.c,v
    retrieving revision 1.19
    diff -c -c -r1.19 sysv_shmem.c
    *** src/backend/port/sysv_shmem.c 11 Oct 2003 18:04:25 -0000 1.19
    --- src/backend/port/sysv_shmem.c 12 Oct 2003 23:10:24 -0000
    ***************
    *** 365,371 ****

    if (hdr->magic != PGShmemMagic)
    {
    ! shmdt(hdr);
    return NULL; /* segment belongs to a non-Postgres app */
    }

    --- 365,371 ----

    if (hdr->magic != PGShmemMagic)
    {
    ! shmdt((void *)hdr);
    return NULL; /* segment belongs to a non-Postgres app */
    }

    --------------------------------------------------------------------------------

  • Tom Lane at Oct 13, 2003 at 4:44 am

    Bruce Momjian writes:
    I have attached and applied the following patch to use makeNode for
    structures that will later be cast to Node*, rather than having them be
    allocated as stack variables.
    AFAICT, this adds unnecessary palloc overhead without actually reducing
    the risk of optimization problems.

    regards, tom lane
  • Andrew Dunstan at Oct 13, 2003 at 5:35 am
    ----- Original Message -----
    From: "Tom Lane" <tgl@sss.pgh.pa.us>
    Bruce Momjian <pgman@candle.pha.pa.us> writes:
    I have attached and applied the following patch to use makeNode for
    structures that will later be cast to Node*, rather than having them be
    allocated as stack variables.
    AFAICT, this adds unnecessary palloc overhead without actually reducing
    the risk of optimization problems.
    Even without the extra overhead, the danger of strict-aliasing is not just
    related to alignment. As I understand it, given strict-aliasing assumptions
    the compiler is free to reorder some operations on things it thinks can't be
    the same thing, or even optimise them away because they can have no effect.
    I'm not 100% sure we have avoided that danger.

    Quote from nice page on this subject: "ISO C is not your grandfather's C".
    :-)

    cheers

    andrew
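
The reordering risk mentioned above is usually illustrated with something like the following sketch. Both function names are made up for this example. The first shows what the compiler is entitled to assume; the second shows a byte-copy alternative that does not depend on that assumption.

```c
#include <stdint.h>
#include <string.h>

/* With strict aliasing in effect, the compiler may assume that an int *
 * and a float * never refer to the same object, so it is allowed to
 * return 1 here without re-reading *i after the store to *f. */
int store_then_read(int *i, float *f)
{
    *i = 1;
    *f = 2.0f;
    return *i;
}

/* Inspect a float's representation by copying its bytes instead of
 * reading it through a uint32_t pointer.
 * Assumes float and uint32_t have the same size. */
uint32_t float_bits(float value)
{
    uint32_t bits;

    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Calling `store_then_read` with pointers to two distinct objects is well defined; the trouble only starts when a caller uses casts to make both parameters point at the same storage.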
  • Tom Lane at Oct 13, 2003 at 6:01 am

    "Andrew Dunstan" <andrew@dunslane.net> writes:
    Even without the extra overhead, the danger of strict-aliasing is not just
    related to alignment.
    If I understand the issue at all, it has *nothing* to do with alignment.
    As I understand it, given strict-aliasing assumptions
    the compiler is free to reorder some operations on things it thinks can't be
    the same thing, or even optimise them away because they can have no effect.
    Yah...
    I'm not 100% sure we have avoided that danger.
    I don't think we understand the dangers quite yet, and I think the
    patches applied to date constitute useless thrashing rather than fixes.
    I'd like to see less quick-hack patching and more discussion.

    In particular, given that there is as yet no demonstrated effect other
    than mere warnings issued by a possibly-buggy gcc release, I think it's
    premature to be hacking our sources at all.

    regards, tom lane
  • Bruce Momjian at Oct 14, 2003 at 12:32 am

    Tom Lane wrote:
    "Andrew Dunstan" <andrew@dunslane.net> writes:
    Even without the extra overhead, the danger of strict-aliasing is not just
    related to alignment.
    If I understand the issue at all, it has *nothing* to do with alignment.
    As I understand it, given strict-aliasing assumptions
    the compiler is free to reorder some operations on things it thinks can't be
    the same thing, or even optimise them away because they can have no effect.
    Yah...
    I'm not 100% sure we have avoided that danger.
    I don't think we understand the dangers quite yet, and I think the
    patches applied to date constitute useless thrashing rather than fixes.
    I'd like to see less quick-hack patching and more discussion.

    In particular, given that there is as yet no demonstrated effect other
    than mere warnings issued by a possibly-buggy gcc release, I think it's
    premature to be hacking our sources at all.
    OK, patch removed. When no one commented after 24 hours on my
    makeNode() idea, I thought I was on to something. :-(

    In reading http://www.gnu.org/software/gcc/bugs.html#nonbugs_c and the
    link it references,
    http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html, they seem
    to be talking about any pointer casting.

    It also has this gem:

    I have seen some commits that "fix" gcc 3.3 alias warnings, that does
    not give me warm fuzzy feelings (the commits that is), and I have also
    seen a lot of confusion about aliasing (and ISO C in general) on
    different mailing lists, so I have tried to explain some of the issues
    that I know have/will bite us.

    indicating they might remove these warnings soon anyway.

    I am not even going to put this gcc issue on the 7.4 open items list.

  • Tom Lane at Oct 14, 2003 at 4:38 am

    Bruce Momjian writes:
    Tom Lane wrote:
    I don't think we understand the dangers quite yet, and I think the
    patches applied to date constitute useless thrashing rather than fixes.
    In reading http://www.gnu.org/software/gcc/bugs.html#nonbugs_c and the
    link it references,
    http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html, they seem
    to be talking about any pointer casting.
    The latter reference makes it absolutely crystal-clear that inserting
    void* casts does *not* fix the issue. Also observe the statement
    gcc may warn for some constructs that break the aliasing rules, but not
    all of them (or not even most of them!), so a warning-free source code
    does not give you any guarantee.
    I have to agree now with Andrew's last mail that -fno-strict-aliasing is
    the only safe solution. Since gcc isn't even pretending that it can
    warn in all cases where the optimization might break things, I'm not
    sure we could ever responsibly enable this optimization. I do not feel
    this is our problem; it is the compiler hackers' fault if they need to
    make unsupportable, untestable assumptions about application code.

    Basically, ISO broke the language here, and I say it's a screwup up with
    which we need not put.

    regards, tom lane
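
    A minimal sketch of why the (void *) cast is beside the point: the aliasing rule is about the type of the lvalue used for the access, not about the casts a pointer passed through on the way. Copying bytes with memcpy() is one well-defined alternative. The helper name float_bits is hypothetical (it is not PostgreSQL code), and the expected bit patterns assume IEEE 754 single-precision floats.

```c
#include <stdint.h>
#include <string.h>

/*
 * Return the object representation of a float as a 32-bit integer.
 * memcpy() copies bytes, so the float is never read through an
 * incompatible lvalue type and the aliasing rules are respected.
 * Assumes sizeof(float) == sizeof(uint32_t).
 */
uint32_t float_bits(float f)
{
    uint32_t bits;

    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/*
 * For contrast (deliberately left as a comment): the access below reads
 * a float object through a uint32_t lvalue. Routing the pointer through
 * (void *) changes nothing about that access, so it is still undefined
 * under the aliasing rules:
 *
 *     uint32_t bits = *(uint32_t *) (void *) &f;
 */
```

    Compilers typically turn a fixed-size memcpy() like this into a plain register move, so the defined version need not cost anything.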
  • Andrew Dunstan at Oct 14, 2003 at 11:37 am
    ----- Original Message -----
    From: "Tom Lane" <tgl@sss.pgh.pa.us>
    I have to agree now with Andrew's last mail that -fno-strict-aliasing is
    the only safe solution. Since gcc isn't even pretending that it can
    warn in all cases where the optimization might break things, I'm not
    sure we could ever responsibly enable this optimization. I do not feel
    this is our problem; it is the compiler hackers' fault if they need to
    make unsupportable, untestable assumptions about application code.

    Basically, ISO broke the language here, and I say it's a screwup up with
    which we need not put.
    You and Linus Torvalds ;-) I recall seeing almost this exact discussion on
    the kernel hackers list a few years ago.

    Of course, the linux kernel is aimed at a limited set of compilers - as I
    understand it basically gcc although it has been made to build with Intel
    compilers - which makes things somewhat easier for them. What is our target
    set of compilers? What is our target version of C? (being unsure on these
    issues I gave my initdb.c a tour through "gcc -ansi -pedantic" at one
    stage).

    Also note that uninhibited casting between types can still cause alignment
    problems, quite apart from the strict aliasing issue (That was what confused
    Bruce, I think - the email on the netbsd list referred to both strict
    aliasing issues and misalignment issues). Still, that apparently hasn't been
    a problem up to now so we are probably OK on this one.

    cheers

    andrew
  • Tom Lane at Oct 14, 2003 at 2:00 pm

    "Andrew Dunstan" <andrew@dunslane.net> writes:
    Of course, the linux kernel is aimed at a limited set of compilers - as I
    understand it basically gcc although it has been made to build with Intel
    compilers - which makes things somewhat easier for them. What is our target
    set of compilers? What is our target version of C?
    "Pretty much anything that speaks ANSI C" is my usual feeling about
    that. As yet we have not heard of any non-gcc compilers in which this
    is a problem, although you have a point that some compiler somewhere may
    do this and not have a way to turn it off :-(
    Also note that uninhibited casting between types can still cause alignment
    problems,
    We understand that issue, we solved it years ago.

    BTW, I haven't looked at the problem spots in detail. How many of them
    are due to the use of MemSet in conjunction with other access to a chunk
    of memory? ISTM that we need not worry about code motion around a
    MemSet call, since that would require the compiler to prove that the
    memset() path through the macro wouldn't be affected, which I doubt it
    would think.

    regards, tom lane
  • Andrew Dunstan at Oct 14, 2003 at 2:31 pm

    Tom Lane wrote:
    BTW, I haven't looked at the problem spots in detail. How many of them
    are due to the use of MemSet in conjunction with other access to a chunk
    of memory? ISTM that we need not worry about code motion around a
    MemSet call, since that would require the compiler to prove that the
    memset() path through the macro wouldn't be affected, which I doubt it
    would think.
    there were 3 calls to MemSet it complained about - all in
    src/backend/storage/lmgr/proc.c, and all zeroing out the timeval
    structure. (is MemSet actually a gain in this instance?)

    there was the very odd one in src/bin/psql/command.c, which seems to me
    to be bogus

    there were 3 in src/backend/commands/tablecmds.c and
    src/backend/executor/execQual.c complaining about casting things to (Node *)

    finally, there was a warning about incompatible pointer types (i.e. not
    a type-pun warning) in the (uncast) call to shmdt in
    src/backend/port/sysv_shmem.c

    cheers

    andrew
  • Andrew Dunstan at Oct 14, 2003 at 2:56 pm

    Andrew Dunstan wrote:
    there were 3 calls to MemSet it complained about - all in
    src/backend/storage/lmgr/proc.c, and all zeroing out the timeval
    structure. (is MemSet actually a gain in this instance?)

    And looking at it even closer, 2 of the 3 cases of calling MemSet appear
    to be unnecessary, as the zeroed out values are immediately overwritten.

    cheers

    andrew
  • Peter Eisentraut at Oct 14, 2003 at 4:04 pm

    Andrew Dunstan writes:

    And looking at it even closer, 2 of the 3 cases of calling MemSet appear
    to be unnecessary, as the zeroed out values are immediately overwritten.
    We need to zero out the holes in the structures so that hash functions
    work correctly.

    --
    Peter Eisentraut peter_e@gmx.net
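
    To illustrate the general point about holes, here is a sketch with a hypothetical key struct (DemoKey) and a byte-wise FNV-1a hash; neither is taken from the PostgreSQL sources. It assumes, as this practice does, that the compiler leaves padding bytes alone when individual members are assigned. The C standard does not strictly promise that, but mainstream compilers behave this way.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical key; most ABIs insert 3 padding bytes after "tag". */
typedef struct DemoKey
{
    char    tag;
    int32_t id;
} DemoKey;

/* Zero the whole struct first so the padding has a known value. */
void demo_key_init(DemoKey *key, char tag, int32_t id)
{
    memset(key, 0, sizeof(*key));
    key->tag = tag;
    key->id = id;
}

/* 32-bit FNV-1a over raw bytes, padding included. */
uint32_t demo_hash_bytes(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t h = 2166136261u;
    size_t i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}
```

    Without the memset(), two keys with identical members could hash differently because the padding would hold whatever was on the stack.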
  • Andrew Dunstan at Oct 14, 2003 at 5:54 pm

    Peter Eisentraut wrote:
    Andrew Dunstan writes:


    And looking at it even closer, 2 of the 3 cases of calling MemSet appear
    to be unnecessary, as the zeroed out values are immediately overwritten.
    We need to zero out the holes in the structures so that hash functions
    work correctly.

    I suspect we are both wrong :-) These structures aren't used in any
    hashed structure that I can see, but the effect of the MemSet is to zero
    out the it_interval value of the itimer, making sure the timer is a "one
    shot" timer - so they are necessary after all, but for a different
    reason (it's been a while since I used setitimer, and the man page is
    less than a model of clarity).

    cheers

    andrew
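
    A minimal sketch of the one-shot setup described above. The function name make_one_shot is hypothetical, and plain memset() stands in for the MemSet macro; the point is only that a zeroed it_interval means the timer does not re-arm.

```c
#include <string.h>
#include <sys/time.h>

/*
 * Build an itimerval for a timer that fires once.
 * Zeroing the struct leaves it_interval at 0/0, which tells
 * setitimer() not to reload the timer after it expires.
 */
struct itimerval make_one_shot(long seconds, long microseconds)
{
    struct itimerval timer;

    memset(&timer, 0, sizeof(timer));
    timer.it_value.tv_sec = seconds;
    timer.it_value.tv_usec = microseconds;
    return timer;
}
```

    The result would be passed as the second argument of setitimer(ITIMER_REAL, &timer, NULL).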
  • Manfred Spraul at Oct 14, 2003 at 7:01 pm
    I've asked the question on the gcc devel list. The first reply was that
    MemSet violates strict aliasing rules:

    http://gcc.gnu.org/ml/gcc/2003-10/msg00524.html

    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.

    --
    Manfred
  • Andrew Dunstan at Oct 14, 2003 at 7:18 pm

    Manfred Spraul wrote:

    I've asked the question on the gcc devel list. The first reply was
    that MemSet violates strict aliasing rules:

    http://gcc.gnu.org/ml/gcc/2003-10/msg00524.html

    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.
    The consensus appears to be -fno-strict-aliasing

    cheers

    andrew
  • Neil Conway at Oct 14, 2003 at 7:20 pm

    On Tue, 2003-10-14 at 15:00, Manfred Spraul wrote:
    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.
    The last time we did some benchmarking, using the builtin memset()
    imposes a significant performance penalty on plenty of different
    platforms.

    -Neil
  • Peter Eisentraut at Oct 14, 2003 at 8:30 pm

    Neil Conway writes:
    On Tue, 2003-10-14 at 15:00, Manfred Spraul wrote:
    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.
    The last time we did some benchmarking, using the builtin memset()
    imposes a significant performance penalty on plenty of different
    platforms.
    The last time I did some testing, the builtin memset() was significantly
    faster on plenty of different platforms.

    --
    Peter Eisentraut peter_e@gmx.net
  • Neil Conway at Oct 14, 2003 at 8:38 pm

    On Tue, 2003-10-14 at 16:29, Peter Eisentraut wrote:
    The last time I did some testing, the builtin memset() was significantly
    faster on plenty of different platforms.
    Oh? Which platforms are you referring to, and what tests were performed?

    You can find the benchmark results I'm referring to in the archives
    here:

    http://archives.postgresql.org/pgsql-hackers/2002-08/msg02116.php

    (In some hypothetical world in which MemSet() didn't offer a significant
    performance improvement, there is no reason to keep it around.)

    -Neil
  • Peter Eisentraut at Oct 14, 2003 at 8:45 pm

    Neil Conway writes:

    Oh? Which platforms are you referring to, and what tests were performed?
    http://archives.postgresql.org/pgsql-patches/2002-10/msg00085.php

    --
    Peter Eisentraut peter_e@gmx.net
  • Tom Lane at Oct 14, 2003 at 7:32 pm

    Manfred Spraul writes:
    I've asked the question on the gcc devel list. The first reply was that
    MemSet violates strict aliasing rules:
    No doubt it does, but that is not really the issue here; the issue IMHO
    is whether there is any real risk involved. Remember that the macro is
    really of the form

    if (blah blah)
    {
    // unsafe code is here
    }
    else
    {
    memset(...);
    }

    Given that gcc is smart enough not to move any code across the memset()
    call, I doubt that it would be moving anything across the whole if()
    construct. Now if the if-condition were such that the memset code path
    could be optimized away, then we'd have a problem, but in practice I do
    not believe gcc is smart enough to realize that the alignment check is
    always true.

    We do have to be wary of MemSetAligned and MemSetLoop, but these are
    only used in constrained places (routines that do nothing else with
    the memory in question) so I think they are not a problem.
    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.
    We will not be doing the latter, for certain.

    regards, tom lane
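
    The two-branch shape described above can be sketched as follows. This is a simplified illustration written as a function, not the actual MemSet macro: the name sketch_memset and the constant SKETCH_LOOP_LIMIT are hypothetical, and the real macro's exact conditions are not shown in this thread.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_LOOP_LIMIT 64    /* hypothetical cutoff for the word loop */

void sketch_memset(void *start, int val, size_t len)
{
    if (((uintptr_t) start & (sizeof(int32_t) - 1)) == 0 &&
        (len & (sizeof(int32_t) - 1)) == 0 &&
        val == 0 &&
        len <= SKETCH_LOOP_LIMIT)
    {
        /*
         * The "unsafe" branch: stores through int32_t lvalues. If the
         * target object has another type (a double, a struct timeval),
         * these stores break the aliasing rules.
         */
        int32_t *p = (int32_t *) start;
        int32_t *stop = (int32_t *) ((char *) start + len);

        while (p < stop)
            *p++ = 0;
    }
    else
        memset(start, val, len);    /* always well-defined */
}
```

    When the compiler can prove the condition at compile time, for instance for a local variable of known size and alignment, only the first branch survives, which is the situation the argument above hopes does not arise.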
  • Bruce Momjian at Oct 14, 2003 at 7:38 pm

    Tom Lane wrote:
    I think we must either add -fno-strict-aliasing, or switch to the c
    compiler memset functions for gcc.
    We will not be doing the latter, for certain.
    OK, what gcc versions support -fno-strict-aliasing? Do we need a
    configure test for it? Would someone profile PostgreSQL with
    -fno-strict-aliasing and see if you can see a larger performance hit?

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Tom Lane at Oct 14, 2003 at 7:53 pm

    Bruce Momjian writes:
    OK, what gcc versions support -fno-strict-aliasing? Do we need a
    configure test for it?
    Perhaps ... although it is recognized in 2.95.3 and probably for a good
    ways before that.

    It looks to me like what has changed in gcc 3.3 is not the existence
    of the flag, but the fact that -O2 now turns it on where it did not
    before.

    regards, tom lane
  • Bruce Momjian at Oct 14, 2003 at 8:02 pm

    Tom Lane wrote:
    Bruce Momjian <pgman@candle.pha.pa.us> writes:
    OK, what gcc versions support -fno-strict-aliasing? Do we need a
    configure test for it?
    Perhaps ... although it is recognized in 2.95.3 and probably for a good
    ways before that.

    It looks to me like what has changed in gcc 3.3 is not the existence
    of the flag, but the fact that -O2 now turns it on where it did not
    before.
    Right. I am just not sure how old our gcc compilers are in the field.
    We can do it unconditionally and wait for a failure report.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
  • Andrew Dunstan at Oct 14, 2003 at 8:24 pm

    Tom Lane wrote:
    Bruce Momjian <pgman@candle.pha.pa.us> writes:

    OK, what gcc versions support -fno-strict-aliasing? Do we need a
    configure test for it?
    Perhaps ... although it is recognized in 2.95.3 and probably for a good
    ways before that.

    It looks to me like what has changed in gcc 3.3 is not the existence
    of the flag, but the fact that -O2 now turns it on where it did not
    before.

    Yes, exactly - I have just confirmed this on the archived manuals on the
    gcc site.

    But turning it off for earlier releases does no harm.

    It is known to break 2.7.1 and I believe it will work with 2.95 and later.

    cheers

    andrew
  • Peter Eisentraut at Oct 14, 2003 at 8:19 pm

    Tom Lane writes:

    Given that gcc is smart enough not to move any code across the memset()
    call,
    Is it? If you violate the aliasing rules, all bets are off.

    --
    Peter Eisentraut peter_e@gmx.net
  • Tom Lane at Oct 14, 2003 at 8:23 pm

    Peter Eisentraut writes:
    Tom Lane writes:
    Given that gcc is smart enough not to move any code across the memset()
    call,
    Is it?
    It had better be.

    regards, tom lane
  • Manfred Spraul at Oct 14, 2003 at 8:33 pm

    Tom Lane wrote:
    Given that gcc is smart enough not to move any code across the memset()
    call, I doubt that it would be moving anything across the whole if()
    construct. Now if the if-condition were such that the memset code path
    could be optimized away, then we'd have a problem, but in practice I do
    not believe gcc is smart enough to realize that the alignment check is
    always true.
    gcc-3.2.2 optimizes the memset away - that's a simple exercise for gcc.

    gcc-3.2.2 isn't smart enough to replace everything - it didn't like the
    pointer arithmetic.
    After some massaging, I've succeeded in generating bad code using a
    slightly modified MemSetAligned macro (parameters -O2
    -fstrict-aliasing): gcc pipelined the x*x around the memset.

    Annotated asm output with gcc -O99 -fomit-frame-pointer -fstrict-aliasing:
    08048328 <test2>:
    8048328: 83 ec 18 sub $0x18,%esp
    stack setup for automatic variables.
    804832b: c7 44 24 0c 00 00 00 movl $0x0,0xc(%esp,1)
    8048332: 00
    8048333: c7 44 24 10 00 00 00 movl $0x40000000,0x10(%esp,1)
    804833a: 40
    x = 2.0;
    804833b: dd 44 24 0c fldl 0xc(%esp,1)
    804833f: d8 c8 fmul %st(0),%st
    x = x*x;
    8048341: c7 44 24 0c 00 00 00 movl $0x0,0xc(%esp,1)
    8048348: 00
    8048349: c7 44 24 10 00 00 00 movl $0x0,0x10(%esp,1)
    8048350: 00
    MemSetAligned(): optimized to storing two ints.
    8048351: dd 54 24 0c fstl 0xc(%esp,1)
    write back the result of x*x to the stack
    8048355: dd 1c 24 fstpl (%esp,1)
    push x*x for printf call
    8048358: 68 54 84 04 08 push $0x8048454
    push pointer to "square is %f.\n"
    804835d: e8 06 ff ff ff call 8048268 <_init+0x38>
    call printf
    8048362: 83 c4 1c add $0x1c,%esp
    8048365: c3 ret
    and exit.
    8048366: 89 f6 mov %esi,%esi

    To paraphrase the ISO C line: gcc is not your grandfather's gcc. It's
    within 10% of the best compilers for SpecInt - the propagation and
    analysis of constants is quite good, and several bugs were fixed since
    3.2.2.

    What is the oldest gcc version still supported by postgres? It seems
    that the strict alias analysis is from the egcs tree. Probably first
    supported by egcs-1.1.2 - is that gcc-2.91?

    http://groups.google.de/groups?q=g:thl2087564510d&dq=&hl=de&lr=&ie=UTF-8&oe=UTF-8&selm=fa.fjlldvv.l7m7hk%40ifi.uio.no

    --
    Manfred
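
    For comparison, a sketch of a clear-then-square sequence that the optimizer is not allowed to reorder. The test program itself is not shown in the thread, so the shape here is a guess, and both function names are hypothetical. The zeroing goes through unsigned char, which may alias any object type, so the compiler has to assume the stores hit x. The expected result assumes IEEE 754 doubles, where all-zero bytes mean 0.0.

```c
#include <stddef.h>

/* Zero len bytes using character-type stores, which may alias anything. */
void zero_bytes(void *start, size_t len)
{
    unsigned char *p = start;

    while (len-- > 0)
        *p++ = 0;
}

/* Clear x, then square it; the load of x cannot move above the stores. */
double square_after_clear(double x)
{
    zero_bytes(&x, sizeof x);
    return x * x;
}
```

    With int-sized stores in place of the unsigned char stores, the compiler may assume the writes cannot touch a double, which is how the multiplication ends up using the stale value.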
  • Tom Lane at Oct 14, 2003 at 8:43 pm

    Manfred Spraul writes:
    After some massaging, I've succeeded in generating bad code using a
    slightly modified MemSetAligned macro (parameters -O2
    -fstrict-aliasing): gcc pipelined the x*x around the memset.
    As I already explained, we do not care about the MemSetAligned case.
    Is gcc 3.3 smart enough to optimize away the pointer alignment test
    in the full macro?

    (Hm, if so, maybe that explains Bruce's observation that the warning
    only shows up with Node structs that are allocated as local
    variables, rather than palloc'd ... that would be the only case where
    gcc could possibly optimize away the alignment test ...)

    regards, tom lane
  • Manfred Spraul at Oct 14, 2003 at 9:01 pm

    Tom Lane wrote:
    Manfred Spraul <manfred@colorfullife.com> writes:

    After some massaging, I've succeeded in generating bad code using a
    slightly modified MemSetAligned macro (parameters -O2
    -fstrict-aliasing): gcc pipelined the x*x around the memset.
    As I already explained, we do not care about the MemSetAligned case.
    Is gcc 3.3 smart enough to optimize away the pointer alignment test
    in the full macro?
    3.2 optimizes away the pointer alignment test, but then doesn't pipeline
    the "x*x" calculation. It might be due to a known (and now fixed) bug in
    gcc where it lost track of constants, and thus didn't succeed in
    optimizing long calculations.

    I don't have gcc 3.3 installed, but IMHO it would be insane to leave
    strict alias analysis enabled - writing to *(int32*)addr violates the
    alias rules, the bad code generated with MemSetAligned proved that.
    Is someone around with 3.3 who could test MemSet?

    --
    Manfred
  • Andrew Dunstan at Oct 14, 2003 at 9:19 pm

    Manfred Spraul wrote:

    Tom Lane wrote:
    Manfred Spraul <manfred@colorfullife.com> writes:

    After some massaging, I've succeeded in generating bad code using a
    slightly modified MemSetAligned macro (parameters -O2
    -fstrict-aliasing): gcc pipelined the x*x around the memset.

    As I already explained, we do not care about the MemSetAligned case.
    Is gcc 3.3 smart enough to optimize away the pointer alignment test
    in the full macro?
    3.2 optimizes away the pointer alignment test, but then doesn't
    pipeline the "x*x" calculation. It might be due to a known (and now
    fixed) bug in gcc where it lost track of constants, and thus didn't
    succeed in optimizing long calculations.

    I don't have gcc 3.3 installed, but IMHO it would be insane to leave
    strict alias analysis enabled - writing to *(int32*)addr violates the
    alias rules, the bad code generated with MemSetAligned proved that.
    Is someone around with 3.3 who could test MemSet?

    I have it on my cygwin installation - that's how this whole mess arose
    in the first place ;-)

    All this is interesting, but the real problem remains that we don't know
    what else might be affected because gcc apparently doesn't promise to
    tell us. IMO the gcc team made a bad mistake by turning this on by
    default for -O2 without reliable accompanying diagnostics.

    cheers

    andrew
  • Tom Lane at Oct 15, 2003 at 3:25 pm

    Andrew Dunstan writes:
    All this is interesting, but the real problem remains that we don't know
    what else might be affected because gcc apparently doesn't promise to
    tell us. IMO the gcc team made a bad mistake by turning this on by
    default for -O2 without reliable accompanying diagnostics.
    Yeah, this seems to be the killer point. It *might* be safe to leave
    -fstrict-aliasing on; but we can't tell.

    I think we can safely just make the default switches for gcc be
    CFLAGS="-O2 -fno-strict-aliasing". If I understand correctly, this will
    work with all gcc versions back to 2.7.something, which is pretty much
    ancient history now --- and if anyone needs to make it work with such an
    old gcc, all they need do is specify CFLAGS to configure instead of
    letting it default.

    Will commit this change unless I hear objections soon.

    regards, tom lane
  • Tom Lane at Oct 14, 2003 at 9:23 pm

    Manfred Spraul writes:
    Tom Lane wrote:
    Is gcc 3.3 smart enough to optimize away the pointer alignment test
    in the full macro?
    3.2 optimizes away the pointer alignment test, but then doesn't pipeline
    the "x*x" calculation.
    Hm, confirmed here. So indeed it seems that Bruce was on the right
    track --- setting up a Node structure as a local variable may be a
    contributing factor.
    I don't have gcc 3.3 installed, but IMHO it would be insane to leave
    strict alias analysis enabled - writing to *(int32*)addr violates the
    alias rules, the bad code generated with MemSetAligned proved that.
    While I don't really disagree, I am curious as to whether we are
    actually forestalling any bugs; so far I'm not convinced that the
    reported warnings correspond to real risks ...

    regards, tom lane
  • Andrew Dunstan at Oct 15, 2003 at 11:57 am
    gcc 3.3.1/cygwin

    MemSetAligned: prints "square is 4.000000"
    MemSet: prints "square is 0.000000"

    Interestingly, a lot of the comparison and call to memset() still seem to be
    optimised away, but the loop from MemSet is left, so the multiplication is
    also not optimised away.

    here's the assembler for test2/MemSet:

    subl $28, %esp
    leal 16(%esp), %eax
    movl $0, 16(%esp)
    leal 24(%esp), %edx
    cmpl %edx, %eax
    movl $1073741824, 20(%esp)
    jae L21
    .align 16
    L26:
    movl $0, (%eax)
    addl $4, %eax
    cmpl %edx, %eax
    jb L26
    L21:
    fldl 16(%esp)
    movl $LC1, (%esp)
    fmul %st(0), %st
    fstl 16(%esp)
    fstpl 4(%esp)
    call _printf
    addl $28, %esp
    ret

    cheers

    andrew





  • Tom Lane at Oct 15, 2003 at 3:49 pm

    "Andrew Dunstan" <andrew@dunslane.net> writes:
    Interestingly, a lot of the comparison and call to memset() still seem to be
    optimised away, but the loop from MemSet is left, so the multiplication is
    also not optimised away.
    Yeah, I saw the same in gcc 3.2 (on Intel) yesterday. I thought maybe
    3.3 would fix that, since it sure looks like an optimization bug, but
    evidently not.

    regards, tom lane
  • Manfred Spraul at Oct 14, 2003 at 4:27 pm

    Tom Lane wrote:
    "Andrew Dunstan" <andrew@dunslane.net> writes:

    Of course, the linux kernel is aimed at a limited set of compilers - as I
    understand it basically gcc although it has been made to build with Intel
    compilers
    icc once compiled the kernel. But they had to teach it quite a lot of
    gccisms.
    - which makes things somewhat easier for them. What is our target
    set of compilers? What is our target version of C?
    "Pretty much anything that speaks ANSI C" is my usual feeling about
    that. As yet we have not heard of any non-gcc compilers in which this
    is a problem, although you have a point that some compiler somewhere may
    do this and not have a way to turn it off :-(
    Intel's icc compiler supports strict alias analysis, but the default was
    off.
    Also note that uninhibited casting between types can still cause alignment
    problems,
    We understand that issue, we solved it years ago.

    BTW, I haven't looked at the problem spots in detail. How many of them
    are due to the use of MemSet in conjunction with other access to a chunk
    of memory? ISTM that we need not worry about code motion around a
    MemSet call, since that would require the compiler to prove that the
    memset() path through the macro wouldn't be affected, which I doubt it
    would think.
    gcc is quite good at propagating constants around. This is heavily used
    in the linux-kernel: __builtin_constant_p(x), and then large switch
    statements that are completely evaluated at compile time. There is a
    good chance that gcc figures out that MemSet(,0,sizeof(double)) are two
    writes to two integer values, and then decides that they can't alias
    with reads/writes to the double.

    I'll search for a suitable gcc list and post the memset macro - that
    might give a definitive answer.

    --
    Manfred
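
    A rough sketch of the constant-dispatch pattern mentioned above, assuming gcc's __builtin_constant_p. The macro and function names are hypothetical and this is not kernel code. When the size is a compile-time constant, gcc can evaluate the switch during compilation and keep only one case.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__)
#define SKETCH_IS_CONSTANT(x) __builtin_constant_p(x)
#else
#define SKETCH_IS_CONSTANT(x) 0     /* fallback: always take the generic path */
#endif

/* With a constant n, the optimizer can reduce this to a single case. */
static inline void zero_known_size(void *p, size_t n)
{
    switch (n)
    {
        case 4:
            memset(p, 0, 4);
            break;
        case 8:
            memset(p, 0, 8);
            break;
        default:
            memset(p, 0, n);
            break;
    }
}

/* Pick the specialised path only when n is known at compile time. */
#define SKETCH_ZERO(p, n) \
    (SKETCH_IS_CONSTANT(n) ? zero_known_size((p), (n)) \
                           : (void) memset((p), 0, (n)))
```

    Both paths give the same result; only the generated code differs, so the choice is purely an optimization.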
  • Bruce Momjian at Oct 14, 2003 at 7:10 pm

    Tom Lane wrote:
    "Andrew Dunstan" <andrew@dunslane.net> writes:
    Of course, the linux kernel is aimed at a limited set of compilers - as I
    understand it basically gcc although it has been made to build with Intel
    compilers - which makes things somewhat easier for them. What is our target
    set of compilers? What is our target version of C?
    "Pretty much anything that speaks ANSI C" is my usual feeling about
    that. As yet we have not heard of any non-gcc compilers in which this
    is a problem, although you have a point that some compiler somewhere may
    do this and not have a way to turn it off :-(
    I now understand the gcc issues. Sorry for getting confused.

    I have removed the void* cast from MemSet until we understand this issue
    better.

    --
    Bruce Momjian | http://candle.pha.pa.us
    pgman@candle.pha.pa.us | (610) 359-1001
    + If your life is a hard drive, | 13 Roberts Road
    + Christ can be your backup. | Newtown Square, Pennsylvania 19073
