Thread: pgsql: Clarify coding of .exe patch

pgsql: Clarify coding of .exe patch

From
momjian@svr1.postgresql.org (Bruce Momjian)
Date:
Log Message:
-----------
Clarify coding of .exe patch

Modified Files:
--------------
    pgsql/src/port:
        path.c (r1.38 -> r1.39)
        (http://developer.postgresql.org/cvsweb.cgi/pgsql/src/port/path.c.diff?r1=1.38&r2=1.39)

Re: pgsql: Clarify coding of .exe patch

From
Neil Conway
Date:
On Mon, 2004-11-01 at 15:25, Bruce Momjian wrote:
> Clarify coding of .exe patch

This change looks wrong: sizeof(".exe") is 5, not 4 (it includes the NUL
terminator).

-Neil



Re: pgsql: Clarify coding of .exe patch

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Mon, 2004-11-01 at 15:25, Bruce Momjian wrote:
> > Clarify coding of .exe patch
>
> This change looks wrong: sizeof(".exe") is 5, not 4 (it includes the NUL
> terminator).

Oh, I didn't realize they had the dot in there.  Let me add "-1".
Thanks.  Attached.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Index: src/port/path.c
===================================================================
RCS file: /cvsroot/pgsql/src/port/path.c,v
retrieving revision 1.39
diff -c -c -r1.39 path.c
*** src/port/path.c    1 Nov 2004 04:25:18 -0000    1.39
--- src/port/path.c    1 Nov 2004 04:45:55 -0000
***************
*** 195,202 ****

  #if defined(__CYGWIN__) || defined(WIN32)
      /* strip .exe suffix, regardless of case */
!     if (strlen(nodir_name) > sizeof(EXE) &&
!         pg_strcasecmp(nodir_name + strlen(nodir_name) - sizeof(EXE), EXE) == 0)
      {
          char *progname;

--- 195,202 ----

  #if defined(__CYGWIN__) || defined(WIN32)
      /* strip .exe suffix, regardless of case */
!     if (strlen(nodir_name) > sizeof(EXE) - 1 &&
!         pg_strcasecmp(nodir_name + strlen(nodir_name)-sizeof(EXE)-1, EXE) == 0)
      {
          char *progname;

***************
*** 206,212 ****
              fprintf(stderr, "%s: out of memory\n", nodir_name);
              exit(1);
          }
!         progname[strlen(progname) - sizeof(EXE)] = '\0';
          nodir_name = progname;
      }
  #endif
--- 206,212 ----
              fprintf(stderr, "%s: out of memory\n", nodir_name);
              exit(1);
          }
!         progname[strlen(progname) - sizeof(EXE) - 1] = '\0';
          nodir_name = progname;
      }
  #endif

Re: pgsql: Clarify coding of .exe patch

From
Neil Conway
Date:
On Mon, 2004-11-01 at 15:47, Bruce Momjian wrote:
> Oh, I didn't realize they had the dot in there.  Let me add "-1".
> Thanks.  Attached.

I wonder if it wouldn't be cleaner to #define SIZE_OF_EXE or similar
that includes the "-1". Alternatively, we could just use strlen(): I
believe a decent compiler should be able to evaluate strlen("str") at
compile-time.

-Neil



Re: pgsql: Clarify coding of .exe patch

From
Bruce Momjian
Date:
Neil Conway wrote:
> On Mon, 2004-11-01 at 15:47, Bruce Momjian wrote:
> > Oh, I didn't realize they had the dot in there.  Let me add "-1".
> > Thanks.  Attached.
>
> I wonder if it wouldn't be cleaner to #define SIZE_OF_EXE or similar
> that includes the "-1". Alternatively, we could just use strlen(): I
> believe a decent compiler should be able to evaluate strlen("str") at
> compile-time.

I prefer strlen() myself but Tom prefers sizeof().  In the long run I
would like to create a 'sizeof' that doesn't include the null as a
global define that we can use for any hard-coded string.  I have not
thought of a name yet and it would be for 8.1.  I count perhaps 84
places it could be used:

    $ rgrepc 'sizeof.*- \?1'|wc -l
          84

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: pgsql: Clarify coding of .exe patch

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> !     if (strlen(nodir_name) > sizeof(EXE) - 1 &&
> !         pg_strcasecmp(nodir_name + strlen(nodir_name)-sizeof(EXE)-1, EXE) == 0)

This is clearer than "4"?

If there were any remote chance that Microsoft would invent some new,
different-length spelling of the ".exe" suffix in the foreseeable
future, I might agree that this was good code.  As is, it looks like
unhelpful obscurantism, because it requires a very close reading of the
details of the C spec to confirm that it is correct and not wrong.

Code that looks wrong on its face is not an improvement over code that
is readable but doesn't cover changes that won't ever happen.

            regards, tom lane

Re: pgsql: Clarify coding of .exe patch

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > !     if (strlen(nodir_name) > sizeof(EXE) - 1 &&
> > !         pg_strcasecmp(nodir_name + strlen(nodir_name)-sizeof(EXE)-1, EXE) == 0)
>
> This is clearer than "4"?
>
> If there were any remote chance that Microsoft would invent some new,
> different-length spelling of the ".exe" suffix in the foreseeable
> future, I might agree that this was good code.  As is, it looks like
> unhelpful obscurantism, because it requires a very close reading of the
> details of the C spec to confirm that it is correct and not wrong.
>
> Code that looks wrong on its face is not an improvement over code that
> is readable but doesn't cover changes that won't ever happen.

The '4' to me just looks too arbitrary --- the EXE reference makes it
clear to me what it.  I don't like arbitrary constants in the code and
especially not repeated several places.

As I just posted I would like us to create a global define that is
sizeof()-1 so it would be clear, perhaps CC_STRLEN() as a define name.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: pgsql: Clarify coding of .exe patch

From
Neil Conway
Date:
On Mon, 2004-11-01 at 16:04, Bruce Momjian wrote:
> I prefer strlen() myself but Tom prefers sizeof().  In the long run I
> would like to create a 'sizeof' that doesn't include the null as a
> global define that we can use for any hard-coded string.

Tom, can you elaborate on why you prefer sizeof()? ISTM that a decent
compiler will evaluate the strlen() at compile-time if the argument is a
compile-time constant. GCC will do this (even at -O0!), for example.

-Neil



Re: pgsql: Clarify coding of .exe patch

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Tom, can you elaborate on why you prefer sizeof()? ISTM that a decent
> compiler will evaluate the strlen() at compile-time if the argument is a
> compile-time constant. GCC will do this (even at -O0!), for example.

sizeof() is defined as a compile-time constant by the C language
specification.  strlen() is not a compile-time constant, and in my
judgement a compiler that evaluates it as such is exceeding its
authority.  The *linker* could in many cases reduce the call to a
constant with certainty, because it could know whether the function
reference was being resolved to the normal C-library definition or not.
(But it would not have very much scope to do anything with the knowledge
:-(.)  gcc's behavior is completely indefensible; it is quite analogous
to my writing some hexadecimal bit-pattern as a substitute for
upper("foo") because I think I know what upper() will behave as.

This is not an argument against our replacing strlen(".exe") by 4
when we know that we intend to invoke a version of strlen() that will
act that way.  I am just pointing out that the C compiler has no
business doing this for us; we'd be squawking very loudly if it made
entirely-similar assumptions that we chanced not to like as much.

            regards, tom lane

Re: pgsql: Clarify coding of .exe patch

From
Neil Conway
Date:
On Mon, 2004-11-01 at 16:47, Tom Lane wrote:
> sizeof() is defined as a compile-time constant by the C language
> specification.  strlen() is not a compile-time constant, and in my
> judgement a compiler that evaluates it as such is exceeding its
> authority.

Well, I suppose it's a little weird, yes. Per my reading of C99
(5.1.2.3), this optimization is not disallowed by the standard. Also,
GCC is hardly the only compiler that does this: the use of compiler
instrinsics to implement various libc functions (e.g. memset, math
functions, and so on) is quite widespread.

> The *linker* could in many cases reduce the call to a
> constant with certainty, because it could know whether the function
> reference was being resolved to the normal C-library definition or not.

(FWIW GCC does the strlen() optimization before invoking the linker.) In
any case, wouldn't this still be broken in the presence of LD_PRELOAD?

While I still think we're on pretty solid ground assuming this
optimization is going to be made, I'm fine with defining something like
const_strlen() that uses sizeof(). Also, we can guard against programmer
mistakes via __builtin_constant_p() when using GCC.

-Neil



Re: pgsql: Clarify coding of .exe patch

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> While I still think we're on pretty solid ground assuming this
> optimization is going to be made, I'm fine with defining something like
> const_strlen() that uses sizeof(). Also, we can guard against programmer
> mistakes via __builtin_constant_p() when using GCC.

I think both you and Bruce are missing the really fundamental point
here.  You are both optimizing on the grounds that there is no god but
RMS and his prophet is GCC.  I have a somewhat wider view of which
compilers we want to target.

            regards, tom lane

Re: pgsql: Clarify coding of .exe patch

From
Neil Conway
Date:
On Mon, 2004-11-01 at 17:29, Tom Lane wrote:
> I think both you and Bruce are missing the really fundamental point
> here.  You are both optimizing on the grounds that there is no god but
> RMS and his prophet is GCC.  I have a somewhat wider view of which
> compilers we want to target.

:-)

Regarding GCC and its optimization of strlen(), the point I made earlier
is that this is not an idiosyncratic feature of GCC, this is a feature
of essentially *any* modern optimizing compiler.

Regarding __builtin_constant_p(), it is obviously the case that we need
to ensure the code compiles on a wide variety of C compilers. But I
don't see the harm in using GCC-specific features where (a) they can be
easily #ifdef'd away when not using GCC (b) they have a very limited
impact -- i.e. they do not result in using a lot of #ifdefs, or
significantly changing the behavior of the code. Using
__builtin_constant_p() would be a matter of literally 3 lines of code in
a header file somewhere that would not effect correctness, it would
merely serve as a programming tool to catch bugs.

It's one thing to recognize we should keep the code portable across
compilers; it is entirely another to be aware of the fact that the vast
majority of our users and developers use GCC, and to take limited
advantage of GCC features where it is appropriate to do so.

-Neil



Re: pgsql: Clarify coding of .exe patch

From
Bruce Momjian
Date:
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > While I still think we're on pretty solid ground assuming this
> > optimization is going to be made, I'm fine with defining something like
> > const_strlen() that uses sizeof(). Also, we can guard against programmer
> > mistakes via __builtin_constant_p() when using GCC.
>
> I think both you and Bruce are missing the really fundamental point
> here.  You are both optimizing on the grounds that there is no god but
> RMS and his prophet is GCC.  I have a somewhat wider view of which
> compilers we want to target.

Just to clarify, I am not against the use of sizeof() and I understand
Tom's point.  My issue was the use of a constant that doesn't clearly
document.

As I remember the ability of the compiler to evaluate strlen is that if
there is a prototype in scope for strlen() that is somehow different
from the compilers then the compiler can not make the optimization.  I
see my BSD using:

    size_t   strlen __P((const char *));

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: pgsql: Clarify coding of .exe patch

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Regarding __builtin_constant_p(), it is obviously the case that we need
> to ensure the code compiles on a wide variety of C compilers. But I
> don't see the harm in using GCC-specific features where (a) they can be
> easily #ifdef'd away when not using GCC (b) they have a very limited
> impact -- i.e. they do not result in using a lot of #ifdefs, or
> significantly changing the behavior of the code.

No argument on either of those --- just on the conclusion that we can
optimize on the assumption that that optimization will happen.

            regards, tom lane