Thread: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio 11

8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio 11

From
Andreas Lange
Date:
Hi,

I have problems building 8.2beta2 on a Solaris 10 x86-64 machine:

gmake[4]: Entering directory
`/files/dsk1/lsw/src/postgresql/postgresql-8.2beta2/src/backend/utils/adt'
/sw/sun-studio-11/SUNWspro/bin/cc -Xa -fast -fns=no -fsimple=1
-xtarget=opteron -xarch=amd64a -I../../../../src/include   -c -o float.o
float.c
"float.c", line 113: identifier redeclared: cbrt
        current : static function(double) returning double
        previous: function(double) returning double :
"/usr/include/iso/math_c99.h", line 126
cc: acomp failed for float.c
gmake[4]: *** [float.o] Error 2

This is the code in question:

#ifndef HAVE_CBRT
static double cbrt(double x);
#endif   /* HAVE_CBRT */

And here is from configure:

checking whether gettimeofday takes only one argument... no
checking for cbrt... no
checking for dlopen... yes

8.1.5 configured and built (with slock backported) on the same machine
finds cbrt and passes the 'gmake check' just fine.

Is this a known error or due to some intentional change that I've missed?

    Regards,
       Andreas
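
The clash reported above arises because the fallback reuses the name cbrt while the system header already declares it. A minimal sketch of the guard pattern follows. It assumes a hypothetical helper name, fallback_cbrt, which is not the name used in float.c, and it uses a plain Newton iteration so that the example itself needs no libm:

```c
#include <assert.h>
#include <float.h>

/*
 * HAVE_CBRT would normally be defined by configure when the cbrt probe
 * succeeds. It is left undefined here so that the fallback is compiled.
 */
#ifndef HAVE_CBRT
/*
 * fallback_cbrt is a hypothetical name chosen for this sketch. Because it
 * differs from cbrt, a prototype in a system header cannot conflict with it.
 */
static double
fallback_cbrt(double x)
{
    int     isneg = (x < 0.0);
    double  absx = isneg ? -x : x;
    double  guess;
    int     i;

    /* Pass NaN, zero and infinities straight through. */
    if (x != x || absx == 0.0 || absx > DBL_MAX)
        return x;

    /* Start at or above the true root so each Newton step moves downward. */
    guess = (absx > 1.0) ? absx : 1.0;
    for (i = 0; i < 4096; i++)
    {
        double  next = guess - (guess - absx / (guess * guess)) / 3.0;

        if (next >= guess)
            break;              /* no further progress: converged */
        guess = next;
    }
    assert(guess > 0.0);
    return isneg ? -guess : guess;
}
#endif   /* HAVE_CBRT */
```

With HAVE_CBRT undefined, fallback_cbrt(27.0) comes out very close to 3.0 and fallback_cbrt(-8.0) very close to -2.0. The point of the sketch is the naming, not the numerics: a fallback that keeps the name cbrt only compiles when no header has declared it first.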

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio 11

From
Alvaro Herrera
Date:
Andreas Lange wrote:

> And here is from configure:
>
> checking whether gettimeofday takes only one argument... no
> checking for cbrt... no

Undoubtedly this is the problem.  Can you show the relevant config.log
extract?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Andreas Lange
Date:
Alvaro Herrera wrote:

>Andreas Lange wrote:
>
>
>
>>And here is from configure:
>>
>>checking whether gettimeofday takes only one argument... no
>>checking for cbrt... no
>>
>>
>
>Undoubtedly this is the problem.  Can you show the relevant config.log
>extract?
>
>
>
Ok, here we go:

configure:13462: checking for cbrt
configure:13519: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
-fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c -lz
-lrt -lsocket  >&5
"conftest.c", line 104: warning: statement not reached
Undefined                       first referenced
 symbol                             in file
cbrt                                conftest.o
ld: fatal: Symbol referencing errors. No output written to conftest
configure:13525: $? = 1
configure: failed program was:
| /* confdefs.h.  */
|
| #define PACKAGE_NAME "PostgreSQL"
| #define PACKAGE_TARNAME "postgresql"
| #define PACKAGE_VERSION "8.2beta2"
| #define PACKAGE_STRING "PostgreSQL 8.2beta2"
| #define PACKAGE_BUGREPORT "pgsql-bugs@postgresql.org"
| #define PG_VERSION "8.2beta2"
| #define DEF_PGPORT 5432
| #define DEF_PGPORT_STR "5432"
| #define PG_VERSION_STR "PostgreSQL 8.2beta2 on i386-pc-solaris2.10,
compiled by /sw/sun-studio-11/SUNWspro/bin/cc -Xa"
| #define PG_KRB_SRVNAM "postgres"
| #define PG_VERSION_NUM 80200
| #define HAVE_LIBZ 1
| #define HAVE_SPINLOCKS 1
| #define STDC_HEADERS 1
| #define HAVE_SYS_TYPES_H 1
| #define HAVE_SYS_STAT_H 1
| #define HAVE_STDLIB_H 1
| #define HAVE_STRING_H 1
| #define HAVE_MEMORY_H 1
| #define HAVE_STRINGS_H 1
| #define HAVE_INTTYPES_H 1
| #define HAVE_STDINT_H 1
| #define HAVE_UNISTD_H 1
| #define HAVE_CRYPT_H 1
| #define HAVE_GETOPT_H 1
| #define HAVE_IEEEFP_H 1
| #define HAVE_LANGINFO_H 1
| #define HAVE_POLL_H 1
| #define HAVE_PWD_H 1
| #define HAVE_SYS_IPC_H 1
| #define HAVE_SYS_POLL_H 1
| #define HAVE_SYS_RESOURCE_H 1
| #define HAVE_SYS_SELECT_H 1
| #define HAVE_SYS_SEM_H 1
| #define HAVE_SYS_SOCKET_H 1
| #define HAVE_SYS_SHM_H 1
| #define HAVE_SYS_TIME_H 1
| #define HAVE_SYS_UN_H 1
| #define HAVE_TERMIOS_H 1
| #define HAVE_UTIME_H 1
| #define HAVE_WCHAR_H 1
| #define HAVE_WCTYPE_H 1
| #define HAVE_NETINET_IN_H 1
| #define HAVE_NETINET_TCP_H 1
| #define HAVE_STRINGIZE 1
| #define HAVE_FUNCNAME__FUNC 1
| #define HAVE_TZNAME 1
| #define HAVE_STRUCT_SOCKADDR_UN 1
| #define HAVE_UNIX_SOCKETS 1
| #define HAVE_STRUCT_SOCKADDR_STORAGE 1
| #define HAVE_STRUCT_SOCKADDR_STORAGE_SS_FAMILY 1
| #define HAVE_STRUCT_ADDRINFO 1
| #define HAVE_STRUCT_OPTION 1
| #define HAVE_INT_TIMEZONE
| #define ACCEPT_TYPE_RETURN int
| #define ACCEPT_TYPE_ARG1 int
| #define ACCEPT_TYPE_ARG2 struct sockaddr *
| #define ACCEPT_TYPE_ARG3 int
| /* end confdefs.h.  */
| /* Define cbrt to an innocuous variant, in case <limits.h> declares cbrt.
|    For example, HP-UX 11i <limits.h> declares gettimeofday.  */
| #define cbrt innocuous_cbrt
|
| /* System header to define __stub macros and hopefully few prototypes,
|     which can conflict with char cbrt (); below.
|     Prefer <limits.h> to <assert.h> if __STDC__ is defined, since
|     <limits.h> exists even on freestanding compilers.  */
|
| #ifdef __STDC__
| # include <limits.h>
| #else
| # include <assert.h>
| #endif
|
| #undef cbrt
|
| /* Override any gcc2 internal prototype to avoid an error.  */
| #ifdef __cplusplus
| extern "C"
| {
| #endif
| /* We use char because int might match the return type of a gcc2
|    builtin and then its argument prototype would still apply.  */
| char cbrt ();
| /* The GNU C library defines this for functions which it implements
|     to always fail with ENOSYS.  Some functions are actually named
|     something starting with __ and the normal name is an alias.  */
| #if defined (__stub_cbrt) || defined (__stub___cbrt)
| choke me
| #else
| char (*f) () = cbrt;
| #endif
| #ifdef __cplusplus
| }
| #endif
|
| int
| main ()
| {
| return f != cbrt;
|   ;
|   return 0;
| }
configure:13550: result: no

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Alvaro Herrera
Date:
Andreas Lange wrote:

> configure:13462: checking for cbrt
> configure:13519: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
> -fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c -lz
> -lrt -lsocket  >&5
> "conftest.c", line 104: warning: statement not reached
> Undefined                       first referenced
>  symbol                             in file
> cbrt                                conftest.o
> ld: fatal: Symbol referencing errors. No output written to conftest
> configure:13525: $? = 1
> configure: failed program was:

Huh, long shot: maybe cbrt is a macro on that platform?

Can you find where and how cbrt is declared and defined in your system
headers?

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Tom Lane
Date:
Andreas Lange <anlan@ida.liu.se> writes:
> configure:13462: checking for cbrt
> configure:13519: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
> -fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c -lz
> -lrt -lsocket  >&5
> "conftest.c", line 104: warning: statement not reached
> Undefined                       first referenced
>  symbol                             in file
> cbrt                                conftest.o
> ld: fatal: Symbol referencing errors. No output written to conftest

Presumably the problem is that the cc call lacks "-lm".

Checking back against 8.1, I see that 8.1's configure has

    AC_CHECK_LIB(m, main)

where 8.2 tries to do

    AC_SEARCH_LIBS(pow, m)

I suppose there is something funny about pow() on your platform
causing that probe to fail.  What does config.log have at the
"checking for library containing pow" step?

My inclination is to undo this particular change, and thus to
unconditionally include libm whenever it can be found.  I can't imagine
there are any platforms where we don't need libm, given the fairly
extensive demands of utils/adt/float.c; furthermore, given the frequency
with which some of these functions are macro-ized or otherwise diddled
with, relying on an AC_SEARCH_LIBS test seems risky.

            regards, tom lane

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Andreas Lange
Date:
Alvaro Herrera wrote:

>Andreas Lange wrote:
>
>
>
>>configure:13462: checking for cbrt
>>configure:13519: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
>>-fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c -lz
>>-lrt -lsocket  >&5
>>"conftest.c", line 104: warning: statement not reached
>>Undefined                       first referenced
>> symbol                             in file
>>cbrt                                conftest.o
>>ld: fatal: Symbol referencing errors. No output written to conftest
>>configure:13525: $? = 1
>>configure: failed program was:
>>
>>
>
>Huh, long shot: maybe cbrt is a macro on that platform?
>
>Can you find where and how cbrt is declared and defined in your system
>headers?
>
>
>
I don't think that is the issue, since 8.1.5 works with the same
env/configure arguments. I began to suspect that I was chasing the
symptoms and not the cause, which led me to diff the conftest programs
from 8.1.5 and 8.2beta2:

--- conftest.cbrt_8_1.c fre nov  3 16:14:40 2006
+++ conftest.cbrt_8_2.c fre nov  3 16:12:05 2006
@@ -2,20 +2,15 @@

 #define PACKAGE_NAME "PostgreSQL"
 #define PACKAGE_TARNAME "postgresql"
-#define PACKAGE_VERSION "8.1.5"
-#define PACKAGE_STRING "PostgreSQL 8.1.5"
+#define PACKAGE_VERSION "8.2beta2"
+#define PACKAGE_STRING "PostgreSQL 8.2beta2"
 #define PACKAGE_BUGREPORT "pgsql-bugs@postgresql.org"
-#define PG_VERSION "8.1.5"
+#define PG_VERSION "8.2beta2"
 #define DEF_PGPORT 5432
 #define DEF_PGPORT_STR "5432"
-#define PG_VERSION_STR "PostgreSQL 8.1.5 on i386-pc-solaris2.10,
compiled by /sw/sun-studio-11/SUNWspro/bin/cc -Xa"
+#define PG_VERSION_STR "PostgreSQL 8.2beta2 on i386-pc-solaris2.10,
compiled by /sw/sun-studio-11/SUNWspro/bin/cc -Xa"
 #define PG_KRB_SRVNAM "postgres"
-#define HAVE_LIBM 1
-#define HAVE_LIBDL 1
-#define HAVE_LIBNSL 1
-#define HAVE_LIBSOCKET 1
-#define HAVE_LIBGEN 1
-#define HAVE_LIBRESOLV 1
+#define PG_VERSION_NUM 80200
 #define HAVE_LIBZ 1
 #define HAVE_SPINLOCKS 1
 #define STDC_HEADERS 1
@@ -36,6 +31,7 @@
 #define HAVE_PWD_H 1
 #define HAVE_SYS_IPC_H 1
 #define HAVE_SYS_POLL_H 1
+#define HAVE_SYS_RESOURCE_H 1
 #define HAVE_SYS_SELECT_H 1
 #define HAVE_SYS_SEM_H 1
 #define HAVE_SYS_SOCKET_H 1
@@ -57,7 +53,6 @@
 #define HAVE_STRUCT_SOCKADDR_STORAGE_SS_FAMILY 1
 #define HAVE_STRUCT_ADDRINFO 1
 #define HAVE_STRUCT_OPTION 1
-#define HAVE_DECL_F_FULLFSYNC 0
 #define HAVE_INT_TIMEZONE
 #define ACCEPT_TYPE_RETURN int
 #define ACCEPT_TYPE_ARG1 int

Huh? No LIBM?

> cc conftest.cbrt_8_2.c
"conftest.cbrt_8_2.c", line 104: warning: statement not reached
Undefined                       first referenced
 symbol                             in file
cbrt                                conftest.cbrt_8_2.o
ld: fatal: Symbol referencing errors. No output written to a.out
> cc -lm conftest.cbrt_8_2.c
"conftest.cbrt_8_2.c", line 104: warning: statement not reached
>

So, it seems I need '-lm', but that is no longer tested in configure.


   //Andreas

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Andreas Lange
Date:
Tom Lane wrote:

>>configure:13462: checking for cbrt
>>configure:13519: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
>>-fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c -lz
>>-lrt -lsocket  >&5
>>"conftest.c", line 104: warning: statement not reached
>>Undefined                       first referenced
>> symbol                             in file
>>cbrt                                conftest.o
>>ld: fatal: Symbol referencing errors. No output written to conftest
>>
>>
>
>Presumably the problem is that the cc call lacks "-lm".
>
>
Indeed. Just took me a bit longer to get that. :-)

>Checking back against 8.1, I see that 8.1's configure has
>
>    AC_CHECK_LIB(m, main)
>
>where 8.2 tries to do
>
>    AC_SEARCH_LIBS(pow, m)
>
>I suppose there is something funny about pow() on your platform
>causing that probe to fail.  What does config.log have at the
>"checking for library containing pow" step?
>
>
configure:5168: checking for library containing pow
configure:5198: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
-fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c  >&5
configure:5204: $? = 0
configure:5208: test -z
             || test ! -s conftest.err
configure:5211: $? = 0
configure:5214: test -s conftest
configure:5217: $? = 0
configure:5287: result: none required


  regards,
     Andreas Lange

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Tom Lane
Date:
Andreas Lange <anlan@ida.liu.se> writes:
> Tom Lane wrote:
>> I suppose there is something funny about pow() on your platform
>> causing that probe to fail.  What does config.log have at the
>> "checking for library containing pow" step?
>>
> configure:5168: checking for library containing pow
> configure:5198: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
> -fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c  >&5
> configure:5204: $? = 0
> configure:5208: test -z
>              || test ! -s conftest.err
> configure:5211: $? = 0
> configure:5214: test -s conftest
> configure:5217: $? = 0
> configure:5287: result: none required

Interesting.  Could pow() actually be in libc on your machine?
The other possible explanation is that it's a macro, but the
AC_SEARCH_LIBS code seems to go out of its way to fail if that's
the case.

Anyway this illustrates the dilemma we face in trying to do a real probe
for libm: the common functions (pow) are likely to be macro-ized, while
uncommon ones might not be there at all (cbrt).  Anyone have a better
idea than reverting to the unconditional AC_CHECK_LIB(m, main) call?

            regards, tom lane
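
The two probes differ in when they add -lm to LIBS. Below is a minimal model of that decision, written as plain C rather than as Autoconf macros. The struct and function names are invented for illustration, and the two flags simply stand in for whether the probe program links:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical model of a link probe; this is not Autoconf code. */
typedef struct
{
    int     links_without_lib;  /* probe links with no extra library */
    int     links_with_lib;     /* probe links once -lm is added */
} ProbeOutcome;

/* Result string that AC_SEARCH_LIBS(func, m) would report for this outcome. */
static const char *
search_libs_result(ProbeOutcome o)
{
    if (o.links_without_lib)
        return "none required";
    if (o.links_with_lib)
        return "-lm";
    return "no";
}

/* AC_SEARCH_LIBS adds -lm only if the function was not already resolvable. */
static int
search_libs_adds_lm(ProbeOutcome o)
{
    return !o.links_without_lib && o.links_with_lib;
}

/* AC_CHECK_LIB(m, main) adds -lm whenever linking against libm succeeds. */
static int
check_lib_adds_lm(ProbeOutcome o)
{
    return o.links_with_lib;
}
```

In the case from the config.log above, the pow probe links without any library, so the search-style probe reports "none required" and leaves -lm out, while the unconditional check would still have added it.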

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Zdenek Kotala
Date:
Tom Lane wrote:
> Andreas Lange <anlan@ida.liu.se> writes:
>> Tom Lane wrote:
>>> I suppose there is something funny about pow() on your platform
>>> causing that probe to fail.  What does config.log have at the
>>> "checking for library containing pow" step?
>>>
>> configure:5168: checking for library containing pow
>> configure:5198: /sw/sun-studio-11/SUNWspro/bin/cc -Xa -o conftest -fast
>> -fns=no -fsimple=1 -xtarget=opteron -xarch=amd64a     conftest.c  >&5
>> configure:5204: $? = 0
>> configure:5208: test -z
>>              || test ! -s conftest.err
>> configure:5211: $? = 0
>> configure:5214: test -s conftest
>> configure:5217: $? = 0
>> configure:5287: result: none required
>
> Interesting.  Could pow() actually be in libc on your machine?
> The other possible explanation is that it's a macro, but the
> AC_SEARCH_LIBS code seems to go out of its way to fail if that's
> the case.
>
> Anyway this illustrates the dilemma we face in trying to do a real probe
> for libm: the common functions (pow) are likely to be macro-ized, while
> uncommon ones might not be there at all (cbrt).  Anyone have a better
> idea than reverting to the unconditional AC_CHECK_LIB(m, main) call?
>

The main problem is the -fast switch. It modifies the behavior of
floating point operations (which is why it is not a good option for
postgres), uses different floating point libraries, and inlines some
functions. That is why the pow test passed with the -fast switch and
without the -lm switch.

A detailed description of -fast can be found at
http://docs.sun.com/source/819-3688/cc_ops.app.html

        Zdenek

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Andreas Lange
Date:
Zdenek Kotala wrote:

>
> The main problem is the -fast switch. It modifies the behavior of
> floating point operations (which is why it is not a good option for
> postgres), uses different floating point libraries, and inlines some
> functions. That is why the pow test passed with the -fast switch and
> without the -lm switch.
>
> A detailed description of -fast can be found at
> http://docs.sun.com/source/819-3688/cc_ops.app.html
>

I noticed that the Sun FAQ has now changed from hinting that -fast might
be very beneficial to recommending staying away from it.

Using -fast is an old habit; I have been building with it for years. I've
seen that the testsuite breaks (in date/time) with only -fast, but it
seems the only option one has to disable to normalize floating point
enough is -fns. I hope passing the testsuite really means that fp math
behaves correctly. If I'm wrong about that, I'll have to change our
build routine.

Being lazy, it is a good bit easier to go with -fast and turn off the
problematic optimization with:
-fast -fns=no
than expanding the -fast macro and having to add all parameters:
-dalign -nofstore -fsimple=2 -fsingle -xalias_level=basic -native
-xdepend -xlibmil -xlibmopt -xO5 -xregs=frameptr

I do understand the recommendation to avoid -fast; the tweaking is both
compiler version and hardware architecture dependent. Doing a make check
is always advisable.

   regards,
          Andreas
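
One way to see whether a given set of flags has put the FPU into a nonstandard mode is to test for gradual underflow at run time. The sketch below assumes IEEE 754 doubles and assumes, as the Sun documentation describes for -fns, that nonstandard mode flushes subnormal results to zero. The function name is made up for this example:

```c
#include <assert.h>
#include <float.h>

/*
 * Returns 1 if halving the smallest normal double yields a nonzero
 * subnormal (standard IEEE gradual underflow), or 0 if the result was
 * flushed to zero. The volatile qualifiers keep the division at run time
 * instead of letting the compiler fold it.
 */
static int
gradual_underflow_active(void)
{
    volatile double smallest_normal = DBL_MIN;
    volatile double half = smallest_normal / 2.0;

    return half > 0.0;
}
```

A build whose flags leave standard floating point intact should return 1 here. A result of 0 suggests that something, possibly an option pulled in by -fast, has enabled flush-to-zero behavior.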

Re: 8.2bet2 failed build on Solaris 10 / x86-64 / SUN Studio

From
Zdenek Kotala
Date:
Andreas Lange wrote:
> Zdenek Kotala wrote:
>
>> The main problem is the -fast switch. It modifies the behavior of
>> floating point operations (which is why it is not a good option for
>> postgres), uses different floating point libraries, and inlines some
>> functions. That is why the pow test passed with the -fast switch and
>> without the -lm switch.
>>
>> A detailed description of -fast can be found at
>> http://docs.sun.com/source/819-3688/cc_ops.app.html
>>
>
> I noticed that the Sun FAQ has now changed from hinting that -fast might
> be very beneficial to recommending staying away from it.

Yes, because there were some problems with the regression tests.

> Using -fast is an old habit; I have been building with it for years. I've
> seen that the testsuite breaks (in date/time) with only -fast, but it
> seems the only option one has to disable to normalize floating point
> enough is -fns. I hope passing the testsuite really means that fp math
> behaves correctly. If I'm wrong about that, I'll have to change our
> build routine.

I played a little with the compiler switches, and only -xO5 made a
significant difference for postgres. But I only tested it with pgbench.

A very important point is that the backend sends floating point numbers
in binary form. This means that you must also compile the client library
and the client application with the -fast switch. If you don't, the
results may be nonsense.
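
To illustrate what sending a float "in binary form" means, here is a sketch that copies the raw bit pattern of a double into eight bytes in network byte order and back. The function names are hypothetical and this is not the backend's code. It assumes 64-bit IEEE 754 doubles on both ends:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Write the bit pattern of value into out[], most significant byte first. */
static void
encode_float8_be(double value, unsigned char out[8])
{
    uint64_t    bits;
    int         i;

    assert(sizeof(double) == sizeof(uint64_t));
    memcpy(&bits, &value, sizeof(bits));
    for (i = 0; i < 8; i++)
        out[i] = (unsigned char) (bits >> (56 - 8 * i));
}

/* Rebuild a double from eight bytes stored most significant byte first. */
static double
decode_float8_be(const unsigned char in[8])
{
    uint64_t    bits = 0;
    double      value;
    int         i;

    for (i = 0; i < 8; i++)
        bits = (bits << 8) | in[i];
    memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Because the bytes on the wire are the exact bit pattern, nothing is re-parsed or rounded on the receiving side. Both ends therefore have to agree on the representation and on how special values are treated.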

> Being lazy, it is a good bit easier to go with -fast and turn off the
> problematic optimization with:
> -fast -fns=no
> than expanding the -fast macro and having to add all parameters:
> -dalign -nofstore -fsimple=2 -fsingle -xalias_level=basic -native
> -xdepend -xlibmil -xlibmopt -xO5 -xregs=frameptr


The parameters -fsimple=2, -xlibmopt and -xlibmil also break IEEE
floating point arithmetic and break errno behavior (errno is not
reported). If you look at the adt/float.c source code, you can see a
comment from Tom about problems with errno on Linux many years ago. The
same should happen with the -xlibmil and -xlibmopt switches.

My suggestion is not to use -fast at all. Let me know if I'm wrong.

        Zdenek
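
Since errno may not be set reliably by optimized math routines, one defensive approach is to inspect the result itself. The sketch below uses hypothetical helper names and plain multiplication, so that no libm call is involved. It is not the code in adt/float.c, and it assumes IEEE 754 doubles where overflow produces an infinity:

```c
#include <assert.h>
#include <float.h>

/* Returns 1 if val is +infinity or -infinity, tested without libm. */
static int
is_infinite(double val)
{
    return val > DBL_MAX || val < -DBL_MAX;
}

/*
 * Multiplies a by b and detects overflow by looking at the result rather
 * than at errno. Returns 0 and stores the product on success, or -1 if
 * finite inputs produced an infinite result.
 */
static int
checked_mul(double a, double b, double *result)
{
    double  r = a * b;

    if (is_infinite(r) && !is_infinite(a) && !is_infinite(b))
        return -1;
    *result = r;
    return 0;
}
```

The same kind of result check can be applied after a call such as pow(), which keeps overflow detection independent of whether the math library in use reports errors through errno.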