Thread: fix for strict-alias warnings

fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
 
This patch will stop gcc from issuing warnings about type-punned objects when -fstrict-aliasing is turned on, as it is in the latest gcc when you use -O2
 
enjoy
 
andrew
 
 
Attachment

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Patch applied.  Thanks.

---------------------------------------------------------------------------


Andrew Dunstan wrote:
>
> This patch will stop gcc from issuing warnings about type-punned objects when -fstrict-aliasing is turned on, as it
isin the latest gcc when you use -O2 
>
> enjoy
>
> andrew
>

[ Attachment, skipping... ]

>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Patch applied.  Thanks.

I hope you applied it with the additional changes you asked for ---
at the very least, cast to (void*) and then to the destination type.
As-is, the patch simply suppresses all error detection for these
conversions, which seems a bad move.

            regards, tom lane

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Patch applied.  Thanks.
>
> I hope you applied it with the additional changes you asked for ---
> at the very least, cast to (void*) and then to the destination type.
> As-is, the patch simply suppresses all error detection for these
> conversions, which seems a bad move.

I don't have a version that does the double-cast, but I still have the
patch to back out and put in a new one.  Andrew's point was that we cast
to void * in many places, so this case is not unique.  Is that wrong?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I don't have a version that does the double-cast, but I still have the
> patch to back out and put in a new one.  Andrew's point was that we cast
> to void * in many places, so this case is not unique.  Is that wrong?

I do not like code that uses cast to void* as a substitute for casting
to the real destination type.  I think it's a lazy substitute for
providing the correct cast, and it renders the code more fragile because
there is *no* possibility of the compiler detecting a problem should you
change the source or destination datatype in a way that renders the cast
wrong.

I have not gone around and tried to fix all the places that are lazy in
this way, but I don't want to introduce more, and for sure I don't want
to set a precedent that we'll weaken our type checking any time gcc
burps for ill-defined reasons.

I agree completely with all of the objections you raised in your
original comment on the patch.  In particular, I don't think we
understand why gcc is complaining about these few places and not any of
the thousands of other casts in our code.  Until we understand that
difference completely, we are not "fixing a bug" by introducing void*
casts.  I'd have to call it cargo-cult programming instead.

I am perfectly content to leave the warnings in place until we have a
satisfactory explanation.

            regards, tom lane

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
Tough words! :-)

ISTM the best thing would be to back out the patch, add -fno-strict-aliasing
for gcc, and add a TODO to fix this thoroughly.

Having -fstrict-aliasing on and ignoring the warnings doesn't seem like a
sound strategy. I think we should fix it or turn it off. The web is littered
with projects that got bizzare happenings when they turned it on without any
accompanying code changes.

I agree with Tom that my patch isn't ideal  (I thought I said as much).
Fixing it thoroughly will require some significant code changes, though. We
seem to be far too close to 7.4 release to contemplate that.

cheers

andrew


----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
To: "Bruce Momjian" <pgman@candle.pha.pa.us>
Cc: "Andrew Dunstan" <andrew@dunslane.net>; "PG Patches"
<pgsql-patches@postgresql.org>
Sent: Saturday, October 11, 2003 1:29 PM
Subject: Re: [PATCHES] fix for strict-alias warnings


> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I don't have a version that does the double-cast, but I still have the
> > patch to back out and put in a new one.  Andrew's point was that we cast
> > to void * in many places, so this case is not unique.  Is that wrong?
>
> I do not like code that uses cast to void* as a substitute for casting
> to the real destination type.  I think it's a lazy substitute for
> providing the correct cast, and it renders the code more fragile because
> there is *no* possibility of the compiler detecting a problem should you
> change the source or destination datatype in a way that renders the cast
> wrong.
>
> I have not gone around and tried to fix all the places that are lazy in
> this way, but I don't want to introduce more, and for sure I don't want
> to set a precedent that we'll weaken our type checking any time gcc
> burps for ill-defined reasons.
>
> I agree completely with all of the objections you raised in your
> original comment on the patch.  In particular, I don't think we
> understand why gcc is complaining about these few places and not any of
> the thousands of other casts in our code.  Until we understand that
> difference completely, we are not "fixing a bug" by introducing void*
> casts.  I'd have to call it cargo-cult programming instead.
>
> I am perfectly content to leave the warnings in place until we have a
> satisfactory explanation.
>
> regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)


Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> Tough words! :-)
>
> ISTM the best thing would be to back out the patch, add -fno-strict-aliasing
> for gcc, and add a TODO to fix this thoroughly.
>
> Having -fstrict-aliasing on and ignoring the warnings doesn't seem like a
> sound strategy. I think we should fix it or turn it off. The web is littered
> with projects that got bizzare happenings when they turned it on without any
> accompanying code changes.
>
> I agree with Tom that my patch isn't ideal  (I thought I said as much).
> Fixing it thoroughly will require some significant code changes, though. We
> seem to be far too close to 7.4 release to contemplate that.

I have backed out the patch.

Looking at the case in tablecmds.c and proc.c, the first was assigning a
struct with a NodeTag pointer as its first element to another struct
with NodeTag as its first element.  In fact, we do this all over the
place, having different structure pointers with a start element of
NodeTag.   The proc.c cases were using MemSet, which was checking if the
int* as aligned for int* access.  In fact, we could change MemSet to
always take a void *, and do the int* casting when we access it after
testing for alignment.

The big question in my mind is whether there there is other struct *
passing that could be masked right now by void* casting, and if so, do
they have different first elements?  This determined whether we do
-fstrict-aliasing for gcc, or fix just these few cases.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
>
> I have backed out the patch.
>
> Looking at the case in tablecmds.c and proc.c, the first was assigning a
> struct with a NodeTag pointer as its first element to another struct
> with NodeTag as its first element.  In fact, we do this all over the
> place, having different structure pointers with a start element of
> NodeTag.

Right - and it is what would have to change if you really want to obey the
ISO C rules, I believe. This is handled easily in other languages using
variant records, but C is kinda primitive here :-)

As I understand it, instead of

struct foo {
  int tag
  foostuff f;
}

struct bar {
  int tag;
  barstuff b;
}

you would need to do something like

struct foo {
  foostuff f;
};

struct bar {
  barstuff b;
};

struct foobar {
  int tag;
  union {
    struct foo foo;
    struct bar bar;
 } v;
};



>   The proc.c cases were using MemSet, which was checking if the
> int* as aligned for int* access.  In fact, we could change MemSet to
> always take a void *, and do the int* casting when we access it after
> testing for alignment.
>

Since MemSet is generic, that is probably a good idea.

> The big question in my mind is whether there there is other struct *
> passing that could be masked right now by void* casting, and if so, do
> they have different first elements?  This determined whether we do
> -fstrict-aliasing for gcc, or fix just these few cases.

Just analysing this is a non-trivial piece of work, I think.

cheers

andrew


Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> >   The proc.c cases were using MemSet, which was checking if the
> > int* as aligned for int* access.  In fact, we could change MemSet to
> > always take a void *, and do the int* casting when we access it after
> > testing for alignment.
> >
>
> Since MemSet is generic, that is probably a good idea.

I have applied the following patch to add an additional void* cast to
MemSet, and added a comment that this was safe because alignment is
checked below --- patch attached.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/include/c.h
===================================================================
RCS file: /cvsroot/pgsql-server/src/include/c.h,v
retrieving revision 1.153
diff -c -c -r1.153 c.h
*** src/include/c.h    21 Sep 2003 17:57:21 -0000    1.153
--- src/include/c.h    11 Oct 2003 19:51:09 -0000
***************
*** 604,610 ****
  #define MemSet(start, val, len) \
      do \
      { \
!         int32 * _start = (int32 *) (start); \
          int        _val = (val); \
          Size    _len = (len); \
  \
--- 604,611 ----
  #define MemSet(start, val, len) \
      do \
      { \
!         /* (void *) used because we check for alignment below */ \
!         int32 * _start = (int32 *) (void *) (start); \
          int        _val = (val); \
          Size    _len = (len); \
  \

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
>
> ----- Original Message -----
> From: "Bruce Momjian" <pgman@candle.pha.pa.us>
> >
> > I have backed out the patch.
> >
> > Looking at the case in tablecmds.c and proc.c, the first was assigning a
> > struct with a NodeTag pointer as its first element to another struct
> > with NodeTag as its first element.  In fact, we do this all over the
> > place, having different structure pointers with a start element of
> > NodeTag.

I have attached and applied the following patch to use makeNode for
structures that will later be cast to Node*, rather than having them be
allocated as stack variables.

This leaves the only remaning compiler warning coming from common.c listed
below.  What is the exact warning generated --- this seems like a
different issue.

---------------------------------------------------------------------------

Index: src/bin/psql/command.c
===================================================================
RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/command.c,v
retrieving revision 1.103
diff -c -w -r1.103 command.c
*** src/bin/psql/command.c    29 Sep 2003 16:39:18 -0000    1.103
--- src/bin/psql/command.c    11 Oct 2003 13:50:15 -0000
***************
*** 1280,1286 ****
                  case '7':
                  case '8':
                  case '9':
!                     c = parse_char((char **) &p);
                      break;

                  default:
--- 1280,1286 ----
                  case '7':
                  case '8':
                  case '9':
!                     c = parse_char((void *) &p);
                      break;

                  default:

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/backend/commands/tablecmds.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/commands/tablecmds.c,v
retrieving revision 1.88
diff -c -c -r1.88 tablecmds.c
*** src/backend/commands/tablecmds.c    11 Oct 2003 18:04:25 -0000    1.88
--- src/backend/commands/tablecmds.c    12 Oct 2003 23:10:21 -0000
***************
*** 3449,3454 ****
--- 3449,3455 ----
                               Relation pkrel)
  {
      HeapScanDesc scan;
+     TriggerData *trigdata = makeNode(TriggerData); /* must be Node aligned */
      HeapTuple    tuple;
      Trigger        trig;
      List       *list;
***************
*** 3506,3512 ****
      while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
      {
          FunctionCallInfoData fcinfo;
-         TriggerData trigdata;

          /*
           * Make a call to the trigger function
--- 3507,3512 ----
***************
*** 3518,3537 ****
          /*
           * We assume RI_FKey_check_ins won't look at flinfo...
           */
!         trigdata.type = T_TriggerData;
!         trigdata.tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
!         trigdata.tg_relation = rel;
!         trigdata.tg_trigtuple = tuple;
!         trigdata.tg_newtuple = NULL;
!         trigdata.tg_trigger = &trig;

!         fcinfo.context = (Node *) &trigdata;

          RI_FKey_check_ins(&fcinfo);
      }

      heap_endscan(scan);

      pfree(trig.tgargs);
  }

--- 3518,3538 ----
          /*
           * We assume RI_FKey_check_ins won't look at flinfo...
           */
!         trigdata->type = T_TriggerData;
!         trigdata->tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
!         trigdata->tg_relation = rel;
!         trigdata->tg_trigtuple = tuple;
!         trigdata->tg_newtuple = NULL;
!         trigdata->tg_trigger = &trig;

!         fcinfo.context = (Node *) trigdata;

          RI_FKey_check_ins(&fcinfo);
      }

      heap_endscan(scan);

+     pfree(trigdata);
      pfree(trig.tgargs);
  }

Index: src/backend/executor/execQual.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/executor/execQual.c,v
retrieving revision 1.148
diff -c -c -r1.148 execQual.c
*** src/backend/executor/execQual.c    11 Oct 2003 18:04:25 -0000    1.148
--- src/backend/executor/execQual.c    12 Oct 2003 23:10:23 -0000
***************
*** 699,705 ****
      List       *arguments = fcache->args;
      Datum        result;
      FunctionCallInfoData fcinfo;
!     ReturnSetInfo rsinfo;        /* for functions returning sets */
      ExprDoneCond argDone;
      bool        hasSetArg;
      int            i;
--- 699,706 ----
      List       *arguments = fcache->args;
      Datum        result;
      FunctionCallInfoData fcinfo;
!     /* for functions returning sets, must be aligned as Node, so use makeNode */
!     ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo);
      ExprDoneCond argDone;
      bool        hasSetArg;
      int            i;
***************
*** 746,760 ****
       */
      if (fcache->func.fn_retset)
      {
!         fcinfo.resultinfo = (Node *) &rsinfo;
!         rsinfo.type = T_ReturnSetInfo;
!         rsinfo.econtext = econtext;
!         rsinfo.expectedDesc = NULL;
!         rsinfo.allowedModes = (int) SFRM_ValuePerCall;
!         rsinfo.returnMode = SFRM_ValuePerCall;
          /* isDone is filled below */
!         rsinfo.setResult = NULL;
!         rsinfo.setDesc = NULL;
      }

      /*
--- 747,761 ----
       */
      if (fcache->func.fn_retset)
      {
!         fcinfo.resultinfo = (Node *) rsinfo;
!         rsinfo->type = T_ReturnSetInfo;
!         rsinfo->econtext = econtext;
!         rsinfo->expectedDesc = NULL;
!         rsinfo->allowedModes = (int) SFRM_ValuePerCall;
!         rsinfo->returnMode = SFRM_ValuePerCall;
          /* isDone is filled below */
!         rsinfo->setResult = NULL;
!         rsinfo->setDesc = NULL;
      }

      /*
***************
*** 803,812 ****
              if (callit)
              {
                  fcinfo.isnull = false;
!                 rsinfo.isDone = ExprSingleResult;
                  result = FunctionCallInvoke(&fcinfo);
                  *isNull = fcinfo.isnull;
!                 *isDone = rsinfo.isDone;
              }
              else
              {
--- 804,813 ----
              if (callit)
              {
                  fcinfo.isnull = false;
!                 rsinfo->isDone = ExprSingleResult;
                  result = FunctionCallInvoke(&fcinfo);
                  *isNull = fcinfo.isnull;
!                 *isDone = rsinfo->isDone;
              }
              else
              {
***************
*** 903,909 ****
      TupleDesc    tupdesc = NULL;
      Oid            funcrettype;
      FunctionCallInfoData fcinfo;
!     ReturnSetInfo rsinfo;
      MemoryContext callerContext;
      MemoryContext oldcontext;
      TupleTableSlot *slot;
--- 904,910 ----
      TupleDesc    tupdesc = NULL;
      Oid            funcrettype;
      FunctionCallInfoData fcinfo;
!     ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo); /* must be Node aligned */
      MemoryContext callerContext;
      MemoryContext oldcontext;
      TupleTableSlot *slot;
***************
*** 992,1006 ****
       * doesn't actually get to see the resultinfo, but set it up anyway
       * because we use some of the fields as our own state variables.
       */
!     fcinfo.resultinfo = (Node *) &rsinfo;
!     rsinfo.type = T_ReturnSetInfo;
!     rsinfo.econtext = econtext;
!     rsinfo.expectedDesc = expectedDesc;
!     rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
!     rsinfo.returnMode = SFRM_ValuePerCall;
      /* isDone is filled below */
!     rsinfo.setResult = NULL;
!     rsinfo.setDesc = NULL;

      /*
       * Switch to short-lived context for calling the function or
--- 993,1007 ----
       * doesn't actually get to see the resultinfo, but set it up anyway
       * because we use some of the fields as our own state variables.
       */
!     fcinfo.resultinfo = (Node *) rsinfo;
!     rsinfo->type = T_ReturnSetInfo;
!     rsinfo->econtext = econtext;
!     rsinfo->expectedDesc = expectedDesc;
!     rsinfo->allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
!     rsinfo->returnMode = SFRM_ValuePerCall;
      /* isDone is filled below */
!     rsinfo->setResult = NULL;
!     rsinfo->setDesc = NULL;

      /*
       * Switch to short-lived context for calling the function or
***************
*** 1028,1044 ****
          if (direct_function_call)
          {
              fcinfo.isnull = false;
!             rsinfo.isDone = ExprSingleResult;
              result = FunctionCallInvoke(&fcinfo);
          }
          else
          {
              result = ExecEvalExpr(funcexpr, econtext,
!                                   &fcinfo.isnull, &rsinfo.isDone);
          }

          /* Which protocol does function want to use? */
!         if (rsinfo.returnMode == SFRM_ValuePerCall)
          {
              /*
               * Check for end of result set.
--- 1029,1045 ----
          if (direct_function_call)
          {
              fcinfo.isnull = false;
!             rsinfo->isDone = ExprSingleResult;
              result = FunctionCallInvoke(&fcinfo);
          }
          else
          {
              result = ExecEvalExpr(funcexpr, econtext,
!                                   &fcinfo.isnull, &rsinfo->isDone);
          }

          /* Which protocol does function want to use? */
!         if (rsinfo->returnMode == SFRM_ValuePerCall)
          {
              /*
               * Check for end of result set.
***************
*** 1047,1053 ****
               * tupdesc or tuplestore (since we can't get a tupdesc in the
               * function-returning-tuple case)
               */
!             if (rsinfo.isDone == ExprEndResult)
                  break;

              /*
--- 1048,1054 ----
               * tupdesc or tuplestore (since we can't get a tupdesc in the
               * function-returning-tuple case)
               */
!             if (rsinfo->isDone == ExprEndResult)
                  break;

              /*
***************
*** 1093,1100 ****
                  }
                  tupstore = tuplestore_begin_heap(true, false, SortMem);
                  MemoryContextSwitchTo(oldcontext);
!                 rsinfo.setResult = tupstore;
!                 rsinfo.setDesc = tupdesc;
              }

              /*
--- 1094,1101 ----
                  }
                  tupstore = tuplestore_begin_heap(true, false, SortMem);
                  MemoryContextSwitchTo(oldcontext);
!                 rsinfo->setResult = tupstore;
!                 rsinfo->setDesc = tupdesc;
              }

              /*
***************
*** 1127,1139 ****
              /*
               * Are we done?
               */
!             if (rsinfo.isDone != ExprMultipleResult)
                  break;
          }
!         else if (rsinfo.returnMode == SFRM_Materialize)
          {
              /* check we're on the same page as the function author */
!             if (!first_time || rsinfo.isDone != ExprSingleResult)
                  ereport(ERROR,
                          (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
                           errmsg("table-function protocol for materialize mode was not followed")));
--- 1128,1140 ----
              /*
               * Are we done?
               */
!             if (rsinfo->isDone != ExprMultipleResult)
                  break;
          }
!         else if (rsinfo->returnMode == SFRM_Materialize)
          {
              /* check we're on the same page as the function author */
!             if (!first_time || rsinfo->isDone != ExprSingleResult)
                  ereport(ERROR,
                          (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
                           errmsg("table-function protocol for materialize mode was not followed")));
***************
*** 1144,1150 ****
              ereport(ERROR,
                      (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
                       errmsg("unrecognized table-function returnMode: %d",
!                             (int) rsinfo.returnMode)));

          first_time = false;
      }
--- 1145,1151 ----
              ereport(ERROR,
                      (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
                       errmsg("unrecognized table-function returnMode: %d",
!                             (int) rsinfo->returnMode)));

          first_time = false;
      }
***************
*** 1152,1159 ****
      MemoryContextSwitchTo(callerContext);

      /* The returned pointers are those in rsinfo */
!     *returnDesc = rsinfo.setDesc;
!     return rsinfo.setResult;
  }


--- 1153,1160 ----
      MemoryContextSwitchTo(callerContext);

      /* The returned pointers are those in rsinfo */
!     *returnDesc = rsinfo->setDesc;
!     return rsinfo->setResult;
  }


Index: src/backend/port/sysv_shmem.c
===================================================================
RCS file: /cvsroot/pgsql-server/src/backend/port/sysv_shmem.c,v
retrieving revision 1.19
diff -c -c -r1.19 sysv_shmem.c
*** src/backend/port/sysv_shmem.c    11 Oct 2003 18:04:25 -0000    1.19
--- src/backend/port/sysv_shmem.c    12 Oct 2003 23:10:24 -0000
***************
*** 365,371 ****

      if (hdr->magic != PGShmemMagic)
      {
!         shmdt(hdr);
          return NULL;            /* segment belongs to a non-Postgres app */
      }

--- 365,371 ----

      if (hdr->magic != PGShmemMagic)
      {
!         shmdt((void *)hdr);
          return NULL;            /* segment belongs to a non-Postgres app */
      }


Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Bruce Momjian wrote:
> Andrew Dunstan wrote:
> >
> > ----- Original Message -----
> > From: "Bruce Momjian" <pgman@candle.pha.pa.us>
> > >
> > > I have backed out the patch.
> > >
> > > Looking at the case in tablecmds.c and proc.c, the first was assigning a
> > > struct with a NodeTag pointer as its first element to another struct
> > > with NodeTag as its first element.  In fact, we do this all over the
> > > place, having different structure pointers with a start element of
> > > NodeTag.
>
> I have attached and applied the following patch to use makeNode for
> structures that will later be cast to Node*, rather than having them be
> allocated as stack variables.

Oh, and thanks to everyone for doing the research on this compiler issue.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
The warning is this:

command.c: In function `unescape':
command.c:1283: warning: dereferencing type-punned pointer will break
strict-aliasing rules

p is declared thus:

        const unsigned char *p;

If I change common.c/h so that parse_char() takes an (unsigned char **)
argument, cast its 2nd argument to the call to strtol to  (char **), and
change the cast in the call to (unsigned char **) those warnings go away,
but I get one from the other place parse_char() is called, namely prompt.c,
which has similar code but this time p is not declared as unsigned. - and so
it goes.

Grrr. What a pain.

andrew


----- Original Message -----
From: "Bruce Momjian" <pgman@candle.pha.pa.us>
To: "Andrew Dunstan" <andrew@dunslane.net>
Cc: "PG Patches" <pgsql-patches@postgresql.org>
Sent: Sunday, October 12, 2003 7:18 PM
Subject: Re: [PATCHES] fix for strict-alias warnings


> Andrew Dunstan wrote:
> >
> > ----- Original Message -----
> > From: "Bruce Momjian" <pgman@candle.pha.pa.us>
> > >
> > > I have backed out the patch.
> > >
> > > Looking at the case in tablecmds.c and proc.c, the first was assigning
a
> > > struct with a NodeTag pointer as its first element to another struct
> > > with NodeTag as its first element.  In fact, we do this all over the
> > > place, having different structure pointers with a start element of
> > > NodeTag.
>
> I have attached and applied the following patch to use makeNode for
> structures that will later be cast to Node*, rather than having them be
> allocated as stack variables.
>
> This leaves the only remaning compiler warning coming from common.c listed
> below.  What is the exact warning generated --- this seems like a
> different issue.
>
> --------------------------------------------------------------------------
-
>
> Index: src/bin/psql/command.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql-server/src/bin/psql/command.c,v
> retrieving revision 1.103
> diff -c -w -r1.103 command.c
> *** src/bin/psql/command.c 29 Sep 2003 16:39:18 -0000 1.103
> --- src/bin/psql/command.c 11 Oct 2003 13:50:15 -0000
> ***************
> *** 1280,1286 ****
>   case '7':
>   case '8':
>   case '9':
> ! c = parse_char((char **) &p);
>   break;
>
>   default:
> --- 1280,1286 ----
>   case '7':
>   case '8':
>   case '9':
> ! c = parse_char((void *) &p);
>   break;
>
>   default:
>
> --
>   Bruce Momjian                        |  http://candle.pha.pa.us
>   pgman@candle.pha.pa.us               |  (610) 359-1001
>   +  If your life is a hard drive,     |  13 Roberts Road
>   +  Christ can be your backup.        |  Newtown Square, Pennsylvania
19073
>


----------------------------------------------------------------------------
----


> Index: src/backend/commands/tablecmds.c
> ===================================================================
> RCS file: /cvsroot/pgsql-server/src/backend/commands/tablecmds.c,v
> retrieving revision 1.88
> diff -c -c -r1.88 tablecmds.c
> *** src/backend/commands/tablecmds.c 11 Oct 2003 18:04:25 -0000 1.88
> --- src/backend/commands/tablecmds.c 12 Oct 2003 23:10:21 -0000
> ***************
> *** 3449,3454 ****
> --- 3449,3455 ----
>   Relation pkrel)
>   {
>   HeapScanDesc scan;
> + TriggerData *trigdata = makeNode(TriggerData); /* must be Node aligned
*/
>   HeapTuple tuple;
>   Trigger trig;
>   List    *list;
> ***************
> *** 3506,3512 ****
>   while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
>   {
>   FunctionCallInfoData fcinfo;
> - TriggerData trigdata;
>
>   /*
>   * Make a call to the trigger function
> --- 3507,3512 ----
> ***************
> *** 3518,3537 ****
>   /*
>   * We assume RI_FKey_check_ins won't look at flinfo...
>   */
> ! trigdata.type = T_TriggerData;
> ! trigdata.tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
> ! trigdata.tg_relation = rel;
> ! trigdata.tg_trigtuple = tuple;
> ! trigdata.tg_newtuple = NULL;
> ! trigdata.tg_trigger = &trig;
>
> ! fcinfo.context = (Node *) &trigdata;
>
>   RI_FKey_check_ins(&fcinfo);
>   }
>
>   heap_endscan(scan);
>
>   pfree(trig.tgargs);
>   }
>
> --- 3518,3538 ----
>   /*
>   * We assume RI_FKey_check_ins won't look at flinfo...
>   */
> ! trigdata->type = T_TriggerData;
> ! trigdata->tg_event = TRIGGER_EVENT_INSERT | TRIGGER_EVENT_ROW;
> ! trigdata->tg_relation = rel;
> ! trigdata->tg_trigtuple = tuple;
> ! trigdata->tg_newtuple = NULL;
> ! trigdata->tg_trigger = &trig;
>
> ! fcinfo.context = (Node *) trigdata;
>
>   RI_FKey_check_ins(&fcinfo);
>   }
>
>   heap_endscan(scan);
>
> + pfree(trigdata);
>   pfree(trig.tgargs);
>   }
>
> Index: src/backend/executor/execQual.c
> ===================================================================
> RCS file: /cvsroot/pgsql-server/src/backend/executor/execQual.c,v
> retrieving revision 1.148
> diff -c -c -r1.148 execQual.c
> *** src/backend/executor/execQual.c 11 Oct 2003 18:04:25 -0000 1.148
> --- src/backend/executor/execQual.c 12 Oct 2003 23:10:23 -0000
> ***************
> *** 699,705 ****
>   List    *arguments = fcache->args;
>   Datum result;
>   FunctionCallInfoData fcinfo;
> ! ReturnSetInfo rsinfo; /* for functions returning sets */
>   ExprDoneCond argDone;
>   bool hasSetArg;
>   int i;
> --- 699,706 ----
>   List    *arguments = fcache->args;
>   Datum result;
>   FunctionCallInfoData fcinfo;
> ! /* for functions returning sets, must be aligned as Node, so use
makeNode */
> ! ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo);
>   ExprDoneCond argDone;
>   bool hasSetArg;
>   int i;
> ***************
> *** 746,760 ****
>   */
>   if (fcache->func.fn_retset)
>   {
> ! fcinfo.resultinfo = (Node *) &rsinfo;
> ! rsinfo.type = T_ReturnSetInfo;
> ! rsinfo.econtext = econtext;
> ! rsinfo.expectedDesc = NULL;
> ! rsinfo.allowedModes = (int) SFRM_ValuePerCall;
> ! rsinfo.returnMode = SFRM_ValuePerCall;
>   /* isDone is filled below */
> ! rsinfo.setResult = NULL;
> ! rsinfo.setDesc = NULL;
>   }
>
>   /*
> --- 747,761 ----
>   */
>   if (fcache->func.fn_retset)
>   {
> ! fcinfo.resultinfo = (Node *) rsinfo;
> ! rsinfo->type = T_ReturnSetInfo;
> ! rsinfo->econtext = econtext;
> ! rsinfo->expectedDesc = NULL;
> ! rsinfo->allowedModes = (int) SFRM_ValuePerCall;
> ! rsinfo->returnMode = SFRM_ValuePerCall;
>   /* isDone is filled below */
> ! rsinfo->setResult = NULL;
> ! rsinfo->setDesc = NULL;
>   }
>
>   /*
> ***************
> *** 803,812 ****
>   if (callit)
>   {
>   fcinfo.isnull = false;
> ! rsinfo.isDone = ExprSingleResult;
>   result = FunctionCallInvoke(&fcinfo);
>   *isNull = fcinfo.isnull;
> ! *isDone = rsinfo.isDone;
>   }
>   else
>   {
> --- 804,813 ----
>   if (callit)
>   {
>   fcinfo.isnull = false;
> ! rsinfo->isDone = ExprSingleResult;
>   result = FunctionCallInvoke(&fcinfo);
>   *isNull = fcinfo.isnull;
> ! *isDone = rsinfo->isDone;
>   }
>   else
>   {
> ***************
> *** 903,909 ****
>   TupleDesc tupdesc = NULL;
>   Oid funcrettype;
>   FunctionCallInfoData fcinfo;
> ! ReturnSetInfo rsinfo;
>   MemoryContext callerContext;
>   MemoryContext oldcontext;
>   TupleTableSlot *slot;
> --- 904,910 ----
>   TupleDesc tupdesc = NULL;
>   Oid funcrettype;
>   FunctionCallInfoData fcinfo;
> ! ReturnSetInfo *rsinfo = makeNode(ReturnSetInfo); /* must be Node aligned
*/
>   MemoryContext callerContext;
>   MemoryContext oldcontext;
>   TupleTableSlot *slot;
> ***************
> *** 992,1006 ****
>   * doesn't actually get to see the resultinfo, but set it up anyway
>   * because we use some of the fields as our own state variables.
>   */
> ! fcinfo.resultinfo = (Node *) &rsinfo;
> ! rsinfo.type = T_ReturnSetInfo;
> ! rsinfo.econtext = econtext;
> ! rsinfo.expectedDesc = expectedDesc;
> ! rsinfo.allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
> ! rsinfo.returnMode = SFRM_ValuePerCall;
>   /* isDone is filled below */
> ! rsinfo.setResult = NULL;
> ! rsinfo.setDesc = NULL;
>
>   /*
>   * Switch to short-lived context for calling the function or
> --- 993,1007 ----
>   * doesn't actually get to see the resultinfo, but set it up anyway
>   * because we use some of the fields as our own state variables.
>   */
> ! fcinfo.resultinfo = (Node *) rsinfo;
> ! rsinfo->type = T_ReturnSetInfo;
> ! rsinfo->econtext = econtext;
> ! rsinfo->expectedDesc = expectedDesc;
> ! rsinfo->allowedModes = (int) (SFRM_ValuePerCall | SFRM_Materialize);
> ! rsinfo->returnMode = SFRM_ValuePerCall;
>   /* isDone is filled below */
> ! rsinfo->setResult = NULL;
> ! rsinfo->setDesc = NULL;
>
>   /*
>   * Switch to short-lived context for calling the function or
> ***************
> *** 1028,1044 ****
>   if (direct_function_call)
>   {
>   fcinfo.isnull = false;
> ! rsinfo.isDone = ExprSingleResult;
>   result = FunctionCallInvoke(&fcinfo);
>   }
>   else
>   {
>   result = ExecEvalExpr(funcexpr, econtext,
> !   &fcinfo.isnull, &rsinfo.isDone);
>   }
>
>   /* Which protocol does function want to use? */
> ! if (rsinfo.returnMode == SFRM_ValuePerCall)
>   {
>   /*
>   * Check for end of result set.
> --- 1029,1045 ----
>   if (direct_function_call)
>   {
>   fcinfo.isnull = false;
> ! rsinfo->isDone = ExprSingleResult;
>   result = FunctionCallInvoke(&fcinfo);
>   }
>   else
>   {
>   result = ExecEvalExpr(funcexpr, econtext,
> !   &fcinfo.isnull, &rsinfo->isDone);
>   }
>
>   /* Which protocol does function want to use? */
> ! if (rsinfo->returnMode == SFRM_ValuePerCall)
>   {
>   /*
>   * Check for end of result set.
> ***************
> *** 1047,1053 ****
>   * tupdesc or tuplestore (since we can't get a tupdesc in the
>   * function-returning-tuple case)
>   */
> ! if (rsinfo.isDone == ExprEndResult)
>   break;
>
>   /*
> --- 1048,1054 ----
>   * tupdesc or tuplestore (since we can't get a tupdesc in the
>   * function-returning-tuple case)
>   */
> ! if (rsinfo->isDone == ExprEndResult)
>   break;
>
>   /*
> ***************
> *** 1093,1100 ****
>   }
>   tupstore = tuplestore_begin_heap(true, false, SortMem);
>   MemoryContextSwitchTo(oldcontext);
> ! rsinfo.setResult = tupstore;
> ! rsinfo.setDesc = tupdesc;
>   }
>
>   /*
> --- 1094,1101 ----
>   }
>   tupstore = tuplestore_begin_heap(true, false, SortMem);
>   MemoryContextSwitchTo(oldcontext);
> ! rsinfo->setResult = tupstore;
> ! rsinfo->setDesc = tupdesc;
>   }
>
>   /*
> ***************
> *** 1127,1139 ****
>   /*
>   * Are we done?
>   */
> ! if (rsinfo.isDone != ExprMultipleResult)
>   break;
>   }
> ! else if (rsinfo.returnMode == SFRM_Materialize)
>   {
>   /* check we're on the same page as the function author */
> ! if (!first_time || rsinfo.isDone != ExprSingleResult)
>   ereport(ERROR,
>   (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
>   errmsg("table-function protocol for materialize mode was not
followed")));
> --- 1128,1140 ----
>   /*
>   * Are we done?
>   */
> ! if (rsinfo->isDone != ExprMultipleResult)
>   break;
>   }
> ! else if (rsinfo->returnMode == SFRM_Materialize)
>   {
>   /* check we're on the same page as the function author */
> ! if (!first_time || rsinfo->isDone != ExprSingleResult)
>   ereport(ERROR,
>   (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
>   errmsg("table-function protocol for materialize mode was not
followed")));
> ***************
> *** 1144,1150 ****
>   ereport(ERROR,
>   (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
>   errmsg("unrecognized table-function returnMode: %d",
> ! (int) rsinfo.returnMode)));
>
>   first_time = false;
>   }
> --- 1145,1151 ----
>   ereport(ERROR,
>   (errcode(ERRCODE_E_R_I_E_SRF_PROTOCOL_VIOLATED),
>   errmsg("unrecognized table-function returnMode: %d",
> ! (int) rsinfo->returnMode)));
>
>   first_time = false;
>   }
> ***************
> *** 1152,1159 ****
>   MemoryContextSwitchTo(callerContext);
>
>   /* The returned pointers are those in rsinfo */
> ! *returnDesc = rsinfo.setDesc;
> ! return rsinfo.setResult;
>   }
>
>
> --- 1153,1160 ----
>   MemoryContextSwitchTo(callerContext);
>
>   /* The returned pointers are those in rsinfo */
> ! *returnDesc = rsinfo->setDesc;
> ! return rsinfo->setResult;
>   }
>
>
> Index: src/backend/port/sysv_shmem.c
> ===================================================================
> RCS file: /cvsroot/pgsql-server/src/backend/port/sysv_shmem.c,v
> retrieving revision 1.19
> diff -c -c -r1.19 sysv_shmem.c
> *** src/backend/port/sysv_shmem.c 11 Oct 2003 18:04:25 -0000 1.19
> --- src/backend/port/sysv_shmem.c 12 Oct 2003 23:10:24 -0000
> ***************
> *** 365,371 ****
>
>   if (hdr->magic != PGShmemMagic)
>   {
> ! shmdt(hdr);
>   return NULL; /* segment belongs to a non-Postgres app */
>   }
>
> --- 365,371 ----
>
>   if (hdr->magic != PGShmemMagic)
>   {
> ! shmdt((void *)hdr);
>   return NULL; /* segment belongs to a non-Postgres app */
>   }
>
>


----------------------------------------------------------------------------
----


>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: the planner will ignore your desire to choose an index scan if your
>       joining column's datatypes do not match
>


Re: fix for strict-alias warnings

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I have attached and applied the following patch to use makeNode for
> structures that will later be cast to Node*, rather than having them be
> allocated as stack variables.

AFAICT, this adds unnecessary palloc overhead without actually reducing
the risk of optimization problems.

            regards, tom lane

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I have attached and applied the following patch to use makeNode for
> > structures that will later be cast to Node*, rather than having them be
> > allocated as stack variables.
>
> AFAICT, this adds unnecessary palloc overhead without actually reducing
> the risk of optimization problems.
>

Even without the extra overhead, the danger of strict-aliasing is not just
related to alignment. As I understand it, given strict-aliasing assumptions
the compiler is free to reorder some operations on things it thinks can't be
the same thing, or even optimise them away because they can have no effect.
I'm not 100% sure we have avoided that danger.

Quote from nice page on this subject: "ISO C is not your grandfather's C".
:-)

cheers

andrew


Re: fix for strict-alias warnings

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> Even without the extra overhead, the danger of strict-aliasing is not just
> related to alignment.

If I understand the issue at all, it has *nothing* to do with alignment.

> As I understand it, given strict-aliasing assumptions
> the compiler is free to reorder some operations on things it thinks can't be
> the same thing, or even optimise them away because they can have no effect.

Yah...

> I'm not 100% sure we have avoided that danger.

I don't think we understand the dangers quite yet, and I think the
patches applied to date constitute useless thrashing rather than fixes.
I'd like to see less quick-hack patching and more discussion.

In particular, given that there is as yet no demonstrated effect other
than mere warnings issued by a possibly-buggy gcc release, I think it's
premature to be hacking our sources at all.

            regards, tom lane

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
> > Even without the extra overhead, the danger of strict-aliasing is not just
> > related to alignment.
>
> If I understand the issue at all, it has *nothing* to do with alignment.
>
> > As I understand it, given strict-aliasing assumptions
> > the compiler is free to reorder some operations on things it thinks can't be
> > the same thing, or even optimise them away because they can have no effect.
>
> Yah...
>
> > I'm not 100% sure we have avoided that danger.
>
> I don't think we understand the dangers quite yet, and I think the
> patches applied to date constitute useless thrashing rather than fixes.
> I'd like to see less quick-hack patching and more discussion.
>
> In particular, given that there is as yet no demonstrated effect other
> than mere warnings issued by a possibly-buggy gcc release, I think it's
> premature to be hacking our sources at all.

OK, patch removed.  When no one commented after 24 hours on my
makeNode() idea, I though I was on to something.  :-(

In reading http://www.gnu.org/software/gcc/bugs.html#nonbugs_c and the
link it references,
http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html, they seem
to be talking about any pointer casting.

It also has this gem:

    I have seen some commits that "fix" gcc 3.3 alias warnings, that does
    not give me warm fuzzy feelings (the commits that is), and I have alse
    seen a lot of confusion about aliasing (and ISO C in general) on
    different mailing lists, so I have tried to explain some of the issues
    that I know have/will bite us.

indicating they might remove these warnings soon anyway.

I am not even going to point this gcc issue on the 7.4 open items list.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> I don't think we understand the dangers quite yet, and I think the
>> patches applied to date constitute useless thrashing rather than fixes.

> In reading http://www.gnu.org/software/gcc/bugs.html#nonbugs_c and the
> link it references,
> http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html, they seem
> to be talking about any pointer casting.

The latter reference makes it absolutely crystal-clear that inserting
void* casts does *not* fix the issue.  Also observe the statement

> gcc may warn for some constructs that break the aliasing rules, but not
> all of them (or not even most of them!), so a warning-free source code
> does not give you any guarantee.

I have to agree now with Andrew's last mail that -fno-strict-aliasing is
the only safe solution.  Since gcc isn't even pretending that it can
warn in all cases where the optimization might break things, I'm not
sure we could ever responsibly enable this optimization.  I do not feel
this is our problem; it is the compiler hackers' fault if they need to
make unsupportable, untestable assumptions about application code.

Basically, ISO broke the language here, and I say it's a screwup up with
which we need not put.

            regards, tom lane

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
----- Original Message -----
From: "Tom Lane" <tgl@sss.pgh.pa.us>
>
> I have to agree now with Andrew's last mail that -fno-strict-aliasing is
> the only safe solution.  Since gcc isn't even pretending that it can
> warn in all cases where the optimization might break things, I'm not
> sure we could ever responsibly enable this optimization.  I do not feel
> this is our problem; it is the compiler hackers' fault if they need to
> make unsupportable, untestable assumptions about application code.
>
> Basically, ISO broke the language here, and I say it's a screwup up with
> which we need not put.
>

You and Linus Torvalds ;-) I recall seeing almost this exact discussion on
the kernel hackers list a few years ago.

Of course, the linux kernel is aimed at a limited set of compilers - as I
understand it basically gcc although it has been made to build with Intel
compilers - which makes things somewhat easier for them. What is our target
set of compilers? What is our target version of C? (being unsure on these
issues I gave my initdb.c a tour through "gcc -ansi -pedantic" at one
stage).

Also note that uninhibited casting between types can still cause alignment
problems, quite apart from the strict aliasing issue (That was what confused
Bruce, I think - the email on the netbsd list referred to both strict
aliasing issues and misalignment issues). Still, that apparently hasn't been
a problem up to now so we are probably OK on this one.

cheers

andrew


Re: fix for strict-alias warnings

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> Of course, the linux kernel is aimed at a limited set of compilers - as I
> understand it basically gcc although it has been made to build with Intel
> compilers - which makes things somewhat easier for them. What is our target
> set of compilers? What is our target version of C?

"Pretty much anything that speaks ANSI C" is my usual feeling about
that.  As yet we have not heard of any non-gcc compilers in which this
is a problem, although you have a point that some compiler somewhere may
do this and not have a way to turn it off :-(

> Also note that uninhibited casting between types can still cause alignment
> problems,

We understand that issue, we solved it years ago.

BTW, I haven't looked at the problem spots in detail.  How many of them
are due to the use of MemSet in conjunction with other access to a chunk
of memory?  ISTM that we need not worry about code motion around a
MemSet call, since that would require the compiler to prove that the
memset() path through the macro wouldn't be affected, which I doubt it
would think.

            regards, tom lane

Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Tom Lane wrote:

>BTW, I haven't looked at the problem spots in detail.  How many of them
>are due to the use of MemSet in conjunction with other access to a chunk
>of memory?  ISTM that we need not worry about code motion around a
>MemSet call, since that would require the compiler to prove that the
>memset() path through the macro wouldn't be affected, which I doubt it
>would think.
>
>

there were 3 calls to MemSet it complained about - all in
src/backend/storage/lmgr/proc.c, and all zeroing out the timeval
structure. (is MemSet actually a gain in this instance?)

there was the very odd one in src/bin/psql/command.c, which seems to me
to be bogus

there were 3 in src/backend/commands/tablecmds.c and
src/backend/executor/execQual.c complaining about casting things to (Node *)

finally, there was a warning about incompatible pointer types (i.e. not
a type-pun warning) in the (uncast) call to shmdt in
src/backend/port/sysv_shmem.c

cheers

andrew


Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Andrew Dunstan wrote:

>
> there were 3 calls to MemSet it complained about - all in
> src/backend/storage/lmgr/proc.c, and all zeroing out the timeval
> structure. (is MemSet actually a gain in this instance?)


And looking at it even closer, 2 of the 3 cases of calling MemSet appear
to be unnecessary, as the zeroed out values are immediately overwritten.

cheers

andrew




Re: fix for strict-alias warnings

From
Peter Eisentraut
Date:
Andrew Dunstan writes:

> And looking at it even closer, 2 of the 3 cases of calling MemSet appear
> to be unnecessary, as the zeroed out values are immediately overwritten.

We need to zero out the holes in the structures so that hash functions
work correctly.

--
Peter Eisentraut   peter_e@gmx.net


Re: fix for strict-alias warnings

From
Manfred Spraul
Date:
Tom Lane wrote:

>"Andrew Dunstan" <andrew@dunslane.net> writes:
>
>
>>Of course, the linux kernel is aimed at a limited set of compilers - as I
>>understand it basically gcc although it has been made to build with Intel
>>compilers
>>
icc once compiled the kernel. But they had to teach it quite a lots of
gccisms.

>> - which makes things somewhat easier for them. What is our target
>>set of compilers? What is our target version of C?
>>
>>
>
>"Pretty much anything that speaks ANSI C" is my usual feeling about
>that.  As yet we have not heard of any non-gcc compilers in which this
>is a problem, although you have a point that some compiler somewhere may
>do this and not have a way to turn it off :-(
>
>
Intel's icc compiler supports strict alias analysis, but the default was
off.

>>Also note that uninhibited casting between types can still cause alignment
>>problems,
>>
>>
>
>We understand that issue, we solved it years ago.
>
>BTW, I haven't looked at the problem spots in detail.  How many of them
>are due to the use of MemSet in conjunction with other access to a chunk
>of memory?  ISTM that we need not worry about code motion around a
>MemSet call, since that would require the compiler to prove that the
>memset() path through the macro wouldn't be affected, which I doubt it
>would think.
>
>
gcc is quite good at propagating constants around. This is heavily used
in the linux-kernel: __builtin_constant(x), and then large switch
statements that are completely evaluated at compile time. There is a
good chance that gcc figures out that MemSet(,0,sizeof(double)) are two
writes to two integer values, and then decides that they can't alias
with reads/write to the double.

I'll search for a suitable gcc list and post the memset macro - that
might give a definitive answer.

--
    Manfred


Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Peter Eisentraut wrote:

>Andrew Dunstan writes:
>
>
>
>>And looking at it even closer, 2 of the 3 cases of calling MemSet appear
>>to be unnecessary, as the zeroed out values are immediately overwritten.
>>
>>
>
>We need to zero out the holes in the structures so that hash functions
>work correctly.
>
>
>
I suspect we are both wrong :-) These structures aren't used in any
hashed structure that I can see, but the effect of the MemSet is to zero
out the it_interval value of the itimer, making sure the timer is a "one
shot" timer - so they are necessary after all, but for a different
reason (it's been a while since I used setitimer, and the man page is
less than a model of clarity).

cheers

andrew


Re: fix for strict-alias warnings

From
Manfred Spraul
Date:
I've asked the question on the gcc devel list. The first reply was that
MemSet violates strict aliasing rules:

http://gcc.gnu.org/ml/gcc/2003-10/msg00524.html

I think we must either add -fno-strict-aliasing, or switch to the c
compiler memset functions for gcc.

--
    Manfred


Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
> > Of course, the linux kernel is aimed at a limited set of compilers - as I
> > understand it basically gcc although it has been made to build with Intel
> > compilers - which makes things somewhat easier for them. What is our target
> > set of compilers? What is our target version of C?
>
> "Pretty much anything that speaks ANSI C" is my usual feeling about
> that.  As yet we have not heard of any non-gcc compilers in which this
> is a problem, although you have a point that some compiler somewhere may
> do this and not have a way to turn it off :-(

I now understand the gcc issues.  Sorry for getting confused.

I have removed the void* cast from MemSet until we understand this issue
better.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Manfred Spraul wrote:

> I've asked the question on the gcc devel list. The first reply was
> that MemSet violates strict aliasing rules:
>
> http://gcc.gnu.org/ml/gcc/2003-10/msg00524.html
>
> I think we must either add -fno-strict-aliasing, or switch to the c
> compiler memset functions for gcc.
>
The concensus appears to be -fno-strict-aliasing

cheers

andrew


Re: fix for strict-alias warnings

From
Neil Conway
Date:
On Tue, 2003-10-14 at 15:00, Manfred Spraul wrote:
> I think we must either add -fno-strict-aliasing, or switch to the c
> compiler memset functions for gcc.

The last time we did some benchmarking, using the builtin memset()
imposes a significant performance penalty on plenty of different
platforms.

-Neil



Re: fix for strict-alias warnings

From
Tom Lane
Date:
Manfred Spraul <manfred@colorfullife.com> writes:
> I've asked the question on the gcc devel list. The first reply was that
> MemSet violates strict aliasing rules:

No doubt it does, but that is not really the issue here; the issue IMHO
is whether there is any real risk involved.  Remember that the macro is
really of the form

    if (blah blah)
    {
        // unsafe code is here
    }
    else
    {
        memset(...);
    }

Given that gcc is smart enough not to move any code across the memset()
call, I doubt that it would be moving anything across the whole if()
construct.  Now if the if-condition were such that the memset code path
could be optimized away, then we'd have a problem, but in practice I do
not believe gcc is smart enough to realize that the alignment check is
always true.

We do have to be wary of MemSetAligned and MemSetLoop, but these are
only used in constrained places (routines that do nothing else with
the memory in question) so I think they are not a problem.

> I think we must either add -fno-strict-aliasing, or switch to the c
> compiler memset functions for gcc.

We will not be doing the latter, for certain.

            regards, tom lane

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Tom Lane wrote:
> > I think we must either add -fno-strict-aliasing, or switch to the c
> > compiler memset functions for gcc.
>
> We will not be doing the latter, for certain.

OK, what gcc versions support -fno-strict-aliasing?  Do we need a
configure test for it?  Would someone profile PostgreSQL with
-fno-strict-aliasing and see if you can see a larger performance hit?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, what gcc versions support -fno-strict-aliasing?  Do we need a
> configure test for it?

Perhaps ... although it is recognized in 2.95.3 and probably for a good
ways before that.

It looks to me like what has changed in gcc 3.3 is not the existence
of the flag, but the fact that -O2 now turns it on where it did not
before.

            regards, tom lane

Re: fix for strict-alias warnings

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, what gcc versions support -fno-strict-aliasing?  Do we need a
> > configure test for it?
>
> Perhaps ... although it is recognized in 2.95.3 and probably for a good
> ways before that.
>
> It looks to me like what has changed in gcc 3.3 is not the existence
> of the flag, but the fact that -O2 now turns it on where it did not
> before.

Right.  I am just not sure how old our gcc compilers are in the field.
We can do it unconditionally and wait for a failure report.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: fix for strict-alias warnings

From
Peter Eisentraut
Date:
Tom Lane writes:

> Given that gcc is smart enough not to move any code across the memset()
> call,

Is it?  If you violate the aliasing rules, all bets are off.

--
Peter Eisentraut   peter_e@gmx.net


Re: fix for strict-alias warnings

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> Given that gcc is smart enough not to move any code across the memset()
>> call,

> Is it?

It had better be.

            regards, tom lane

Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Tom Lane wrote:

>Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
>
>>OK, what gcc versions support -fno-strict-aliasing?  Do we need a
>>configure test for it?
>>
>>
>
>Perhaps ... although it is recognized in 2.95.3 and probably for a good
>ways before that.
>
>It looks to me like what has changed in gcc 3.3 is not the existence
>of the flag, but the fact that -O2 now turns it on where it did not
>before.
>
>
>

Yes, exactly - I have just confirmed this on the archived manuals on the
gcc site.

But turning it off for earlier releases does no harm.

It is known to break 2.7.1 and I believe it will work with 2.95 and later.

cheers

andrew


Re: fix for strict-alias warnings

From
Peter Eisentraut
Date:
Neil Conway writes:

> On Tue, 2003-10-14 at 15:00, Manfred Spraul wrote:
> > I think we must either add -fno-strict-aliasing, or switch to the c
> > compiler memset functions for gcc.
>
> The last time we did some benchmarking, using the builtin memset()
> imposes a significant performance penalty on plenty of different
> platforms.

The last time I did some testing, the builtin memset() was significantly
faster on plenty of different platforms.

--
Peter Eisentraut   peter_e@gmx.net


Re: fix for strict-alias warnings

From
Manfred Spraul
Date:
Tom Lane wrote:

>Given that gcc is smart enough not to move any code across the memset()
>call, I doubt that it would be moving anything across the whole if()
>construct.  Now if the if-condition were such that the memset code path
>could be optimized away, then we'd have a problem, but in practice I do
>not believe gcc is smart enough to realize that the alignment check is
>always true.
>
gcc-3.2.2 optimizes the memset away - that's a simple exercise for gcc.

gcc-3.2.2 isn't smart enough to replace everything - it didn't like the
pointer arithmetics.
After some massaging, I've succeeded in generating bad code using a
slightly modified MemSetAligned macro (parameters -O2
-fstrict-aliasing): gcc pipelined the x*x around the memset.

Annotated asm output with gcc -O99 -fomit-frame-pointer -fstrict-aliasing:
08048328 <test2>:
 8048328:       83 ec 18                sub    $0x18,%esp
 >>>>> stack setup for automatic variables.
 804832b:       c7 44 24 0c 00 00 00    movl   $0x0,0xc(%esp,1)
 8048332:       00
 8048333:       c7 44 24 10 00 00 00    movl   $0x40000000,0x10(%esp,1)
 804833a:       40
 >>>> x = 1.0;
 804833b:       dd 44 24 0c             fldl   0xc(%esp,1)
 804833f:       d8 c8                   fmul   %st(0),%st
 >>>> x = x*x;
 8048341:       c7 44 24 0c 00 00 00    movl   $0x0,0xc(%esp,1)
 8048348:       00
 8048349:       c7 44 24 10 00 00 00    movl   $0x0,0x10(%esp,1)
 8048350:       00
 >>>>> MemSetAligned(): optimized to storing two ints.
 8048351:       dd 54 24 0c             fstl   0xc(%esp,1)
 >>> write back the result of x*x to the stack
 8048355:       dd 1c 24                fstpl  (%esp,1)
 >>> push x*x for printf call
 8048358:       68 54 84 04 08          push   $0x8048454
 >>> push pointer to "square is %f.\n"
 804835d:       e8 06 ff ff ff          call   8048268 <_init+0x38>
 >>>> call printf
 8048362:       83 c4 1c                add    $0x1c,%esp
 8048365:       c3                      ret
 >>>>> and exit.
 8048366:       89 f6                   mov    %esi,%esi

To paraphrase the ISO C line: gcc is not your grandfather's gcc. It's
within 10% of the best compilers for SpecInt - the propagation and
analysis of constants it quite good, and several bugs were fixed sinced
3.2.2.

What is the oldest gcc versions still supported by postgres? It seems
that the strict alias analysis is from the egcs tree. Probably first
supported by egcs-1.1.2 - is that gcc-2.91?

http://groups.google.de/groups?q=g:thl2087564510d&dq=&hl=de&lr=&ie=UTF-8&oe=UTF-8&selm=fa.fjlldvv.l7m7hk%40ifi.uio.no

--
    Manfred
#include <stdio.h>
#include <stdlib.h>

typedef signed int int32;
typedef int Size;

#define INT_ALIGN_MASK    (sizeof(int)-1)
#define MEMSET_LOOP_LIMIT  1024

#define MemSetAligned(start, val, len) \
    do \
    { \
        int32 * _start = (int32 *) (start); \
        int        _val = (val); \
        Size    _len = (len); \
\
        if ((_len & INT_ALIGN_MASK) == 0 && \
            _val == 0 && \
            _len <= MEMSET_LOOP_LIMIT) \
        { \
            Size    offset; \
            _len = _len/sizeof(int); \
            for (offset=0;offset<_len;offset++) \
                _start[offset] = 0; \
        } \
        else \
            memset_is_this_function_optimized_away_question_mark((char *) _start, _val, _len); \
    } while (0)

void test2(void)
{
    double x;

    x = 2.0;

    MemSetAligned(&x, 0, sizeof(double));

    x = x*x;

    printf("square is %f.\n", x);
}

int main(void)
{
    test2();
    return 1;
}

Re: fix for strict-alias warnings

From
Neil Conway
Date:
On Tue, 2003-10-14 at 16:29, Peter Eisentraut wrote:
> The last time I did some testing, the builtin memset() was significantly
> faster on plenty of different platforms.

Oh? Which platforms are you referring to, and what tests were performed?

You can find the benchmark results I'm referring to in the archives
here:

http://archives.postgresql.org/pgsql-hackers/2002-08/msg02116.php

(In some hypothetical world in which MemSet() didn't offer a significant
performance improvement, there is no reason to keep it around.)

-Neil



Re: fix for strict-alias warnings

From
Tom Lane
Date:
Manfred Spraul <manfred@colorfullife.com> writes:
> After some massaging, I've succeeded in generating bad code using a
> slightly modified MemSetAligned macro (parameters -O2
> -fstrict-aliasing): gcc pipelined the x*x around the memset.

As I already explained, we do not care about the MemSetAligned case.
Is gcc 3.3 smart enough to optimize away the pointer alignment test
in the full macro?

(Hm, if so, maybe that explains Bruce's observation that the warning
only shows up with Node structs that are allocated as local
variables, rather than palloc'd ... that would be the only case where
gcc could possibly optimize away the alignment test ...)

            regards, tom lane

Re: fix for strict-alias warnings

From
Peter Eisentraut
Date:
Neil Conway writes:

> Oh? Which platforms are you referring to, and what tests were performed?

http://archives.postgresql.org/pgsql-patches/2002-10/msg00085.php

--
Peter Eisentraut   peter_e@gmx.net


Re: fix for strict-alias warnings

From
Manfred Spraul
Date:
Tom Lane wrote:

>Manfred Spraul <manfred@colorfullife.com> writes:
>
>
>>After some massaging, I've succeeded in generating bad code using a
>>slightly modified MemSetAligned macro (parameters -O2
>>-fstrict-aliasing): gcc pipelined the x*x around the memset.
>>
>>
>
>As I already explained, we do not care about the MemSetAligned case.
>Is gcc 3.3 smart enough to optimize away the pointer alignment test
>in the full macro?
>
3.2 optimizes away the pointer alignment test, but then doesn't pipeline
the "x*x" calculation. It might be due to a known (and now fixed) bug in
gcc where is lost track of constants, and thus didn't succeed in
optimizing long calculations.

I don't have gcc 3.3 installed, but IMHO it would be insane to leave
strict alias analysis enabled - writing to *(int32*)addr violates the
alias rules, the bad code generated with MemSetAligned proved that.
Is someone around with 3.3 who could test MemSet?

--
    Manfred



Re: fix for strict-alias warnings

From
Andrew Dunstan
Date:
Manfred Spraul wrote:

> Tom Lane wrote:
>
>> Manfred Spraul <manfred@colorfullife.com> writes:
>>
>>
>>> After some massaging, I've succeeded in generating bad code using a
>>> slightly modified MemSetAligned macro (parameters -O2
>>> -fstrict-aliasing): gcc pipelined the x*x around the memset.
>>>
>>
>>
>> As I already explained, we do not care about the MemSetAligned case.
>> Is gcc 3.3 smart enough to optimize away the pointer alignment test
>> in the full macro?
>>
> 3.2 optimizes away the pointer alignment test, but then doesn't
> pipeline the "x*x" calculation. It might be due to a known (and now
> fixed) bug in gcc where is lost track of constants, and thus didn't
> succeed in optimizing long calculations.
>
> I don't have gcc 3.3 installed, but IMHO it would be insane to leave
> strict alias analysis enabled - writing to *(int32*)addr violates the
> alias rules, the bad code generated with MemSetAligned proved that.
> Is someone around with 3.3 who could test MemSet?


I have it on my cygwin installation - that's how this whole mess arose
in the first place ;-)

All this is interesting, but the real problem remains that we don't know
what else might be affected because gcc apparently doesn't promise to
tell us. IMO the gcc team made a bad mistake by turning this on by
default for -O2 without reliable accompanying diagnostics.

cheers

andrew


Re: fix for strict-alias warnings

From
Tom Lane
Date:
Manfred Spraul <manfred@colorfullife.com> writes:
> Tom Lane wrote:
>> Is gcc 3.3 smart enough to optimize away the pointer alignment test
>> in the full macro?
>>
> 3.2 optimizes away the pointer alignment test, but then doesn't pipeline
> the "x*x" calculation.

Hm, confirmed here.  So indeed it seems that Bruce was on the right
track --- setting up a Node structure as a local variable may be a
contributing factor.

> I don't have gcc 3.3 installed, but IMHO it would be insane to leave
> strict alias analysis enabled - writing to *(int32*)addr violates the
> alias rules, the bad code generated with MemSetAligned proved that.

While I don't really disagree, I am curious as to whether we are
actually forestalling any bugs; so far I'm not convinced that the
reported warnings correspond to real risks ...

            regards, tom lane

Re: fix for strict-alias warnings

From
"Andrew Dunstan"
Date:
gcc 3.3.1/cygwin

MemSetAligned: prints "square is 4.000000"
MemSet: prints "square is 0.000000"

Interestingly, a lot of the comparison and call to memset() still seem to be
optimised away, but the loop from MemSet is left, so the multiplication is
also not optimised away.

here's the assembler for test2/MemSet:

        subl    $28, %esp
        leal    16(%esp), %eax
        movl    $0, 16(%esp)
        leal    24(%esp), %edx
        cmpl    %edx, %eax
        movl    $1073741824, 20(%esp)
        jae     L21
        .align 16
L26:
        movl    $0, (%eax)
        addl    $4, %eax
        cmpl    %edx, %eax
        jb      L26
L21:
        fldl    16(%esp)
        movl    $LC1, (%esp)
        fmul    %st(0), %st
        fstl    16(%esp)
        fstpl   4(%esp)
        call    _printf
        addl    $28, %esp
        ret

cheers

andrew





----- Original Message -----
From: "Manfred Spraul" <manfred@colorfullife.com>
To: "Tom Lane" <tgl@sss.pgh.pa.us>
Cc: "Andrew Dunstan" <andrew@dunslane.net>; "Patches (PostgreSQL)"
<pgsql-patches@postgresql.org>
Sent: Tuesday, October 14, 2003 5:01 PM
Subject: Re: [PATCHES] fix for strict-alias warnings


> Tom Lane wrote:
>
> >Manfred Spraul <manfred@colorfullife.com> writes:
> >
> >
> >>After some massaging, I've succeeded in generating bad code using a
> >>slightly modified MemSetAligned macro (parameters -O2
> >>-fstrict-aliasing): gcc pipelined the x*x around the memset.
> >>
> >>
> >
> >As I already explained, we do not care about the MemSetAligned case.
> >Is gcc 3.3 smart enough to optimize away the pointer alignment test
> >in the full macro?
> >
> 3.2 optimizes away the pointer alignment test, but then doesn't pipeline
> the "x*x" calculation. It might be due to a known (and now fixed) bug in
> gcc where is lost track of constants, and thus didn't succeed in
> optimizing long calculations.
>
> I don't have gcc 3.3 installed, but IMHO it would be insane to leave
> strict alias analysis enabled - writing to *(int32*)addr violates the
> alias rules, the bad code generated with MemSetAligned proved that.
> Is someone around with 3.3 who could test MemSet?
>
> --
>     Manfred
>


Re: fix for strict-alias warnings

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> All this is interesting, but the real problem remains that we don't know
> what else might be affected because gcc apparently doesn't promise to
> tell us. IMO the gcc team made a bad mistake by turning this on by
> default for -O2 without reliable accompanying diagnostics.

Yeah, this seems to be the killer point.  It *might* be safe to leave
-fstrict-aliasing on; but we can't tell.

I think we can safely just make the default switches for gcc be
CFLAGS="-O2 -fno-strict-aliasing".  If I understand correctly, this will
work with all gcc versions back to 2.7.something, which is pretty much
ancient history now --- and if anyone needs to make it work with such an
old gcc, all they need do is specify CFLAGS to configure instead of
letting it default.

Will commit this change unless I hear objections soon.

            regards, tom lane

Re: fix for strict-alias warnings

From
Tom Lane
Date:
"Andrew Dunstan" <andrew@dunslane.net> writes:
> Interestingly, a lot of the comparison and call to memset() still seem to be
> optimised away, but the loop from MemSet is left, so the multiplication is
> also not optimised away.

Yeah, I saw the same in gcc 3.2 (on Intel) yesterday.  I thought maybe
3.3 would fix that, since it sure looks like an optimization bug, but
evidently not.

            regards, tom lane