Thread: Sun performance - Major discovery!
Well, as you guys know, I've been tinkering with sun-vs-linux postgres for a while, trying to come up with reasons for the HUGE performance differences. We've all had our anecdotal thoughts (fork sucks, ipc sucks, ufs sucks, etc.) and I've had a breakthrough.

Knowing that GCC only produces good code on x86 (and powerpc with apple's mods, but it is doubtful that is as good as ibm's power compiler), I decided to try out Sunsoft CC. I'd heard from more than one person/place that gcc makes abysmal sparc code. Given that the performance profiles for both the linux and sun boxes showed the same functions taking up most of the time, I thought I'd see what a difference sunsoft could give me.

So - hardware -

Sun E450 4x400mhz ultrasparc IIi, 4GB ram, scsi something disk (not raid), solaris 2.6

Linux - 2xP3 500mhz, 2GB, scsi disk of some flavor (not raid), linux 2.2.17 (old, I know!)

So here's the results using my load tester (single connection per beater, repeats the query 1000 times with different input each time; we'll get ~20k rows back). The query is a common query around here.

I discounted the first run of the test as caches populated.

Linux - 1x - 35 seconds, 20x - 180 seconds
Sun - gcc - 1x 60 seconds, 20x 245 seconds
Sun - sunsoft defaults - 1x 52 seconds, 20x [similar to gcc most likely]
Sun - sunsoft -fast - 1x 28 seconds, 20x 164 seconds

As you math gurus can probably deduce - that is a rather large improvement. And by rather large I mean hugely significant. With results like this, I think it warrants mentioning in the FAQ_Solaris, and probably the performance guide.

Connecting will always be a bit slower. But I think most people realize that connecting to a db is not cheap.

I think update/etc will cause more locking, but I think IO will become the bottleneck much sooner than lock/unlock will.
(This is mostly anecdotal given how fast solaris can lock/unlock a semaphore and how much IO I know I have)

Oh yes, this was with 7.3.4 and sunsoft cc Sun WorkShop 6 update 1 C 5.2 2000/09/11 (which is old, perhaps newer ones make even better code?)

I'm not sure of PG's policy on non-gcc things in configure, but perhaps if we detect sunsoft we toss in the -fast flag and maybe make it the preferred one on sun?

[btw, it compiled with no changes but it did spew out tons of warnings]

comments?

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On Wed, Oct 08, 2003 at 08:36:56AM -0400, Jeff wrote: > > So here's the results using my load tester (single connection per beater, > repeats the query 1000 times with different input each time (we'll get > ~20k rows back), the query is a common query around here. My worry about this test is that it gives us precious little knowledge about concurrent connection slowness, which is where I find the most significant problems. When we tried a Sunsoft cc vs gcc 2.95 on Sol 7 about 1 1/2 years ago, we found more or less no difference once we added more than 5 connections (and we always have more than 5 connections). It might be worth trying again, though, since we moved to Sol 8. Thanks for the result. -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Wed, 2003-10-08 at 08:36, Jeff wrote:
> So here's the results using my load tester (single connection per beater,
> repeats the query 1000 times with different input each time (we'll get
> ~20k rows back), the query is a common query around here.

What is the query?

> Linux - 1x - 35 seconds, 20x - 180 seconds

"20x" means 20 concurrent testing processes, right?

> Sun - gcc - 1x 60 seconds 20x 245 seconds
> Sun - sunsoft defaults - 1x 52 seconds 20x [similar to gcc most likely]
> Sun - sunsoft -fast - 1x 28 seconds 20x 164 seconds

Interesting (and surprising that the performance differential is that large, to me at least). Can you tell if the performance gain comes from an improvement in a particular subsystem? (i.e. could you get a profile of Sun/gcc and compare it with Sun/sunsoft).

-Neil
On Wed, 8 Oct 2003, Andrew Sullivan wrote:
> My worry about this test is that it gives us precious little
> knowledge about concurrent connection slowness, which is where I find
> the most significant problems. When we tried a Sunsoft cc vs gcc 2.95
> on Sol 7 about 1 1/2 years ago, we found more or less no difference
> once we added more than 5 connections (and we always have more than 5
> connections). It might be worth trying again, though, since we moved
> to Sol 8.
>

The 20x column shows the results from when I fired up 20 beaters concurrently.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On Wed, 8 Oct 2003, Neil Conway wrote:
> What is the query?
>

It retrieves an index listing for our boards. The boards are flat (not threaded) and messages are numbered starting at 1 for each board.

If you pass in 0 for the start_from it assumes the latest 60.

And it should be noted - in some cases some boards have nearly 2M posts. Index on board_name, number.

I cannot give out too too much stuff ;)

create or replace function get_index2(integer, varchar, varchar)
returns setof snippet
as '
DECLARE
	p_start alias for $1;
	p_board alias for $2;
	v_start integer;
	v_num integer;
	v_body text;
	v_sender varchar(35);
	v_time timestamptz;
	v_finish integer;
	v_row record;
	v_ret snippet;
BEGIN
	v_start := p_start;

	if v_start = 0 then
		select * into v_start from get_high_msg(p_board);
		v_start := v_start - 59;
	end if;

	v_finish := v_start + 60;

	for v_row in
		select number, substr(body, 0, 50) as snip, member_handle, timestamp
		from posts
		where board_name = p_board
		  and number >= v_start
		  and number < v_finish
		order by number desc
	LOOP
		return next v_row;
	END LOOP;

	return;
END;
' language 'plpgsql';

> Interesting (and surprising that the performance differential is that
> large, to me at least). Can you tell if the performance gain comes from
> an improvement in a particular subsystem? (i.e. could you get a profile
> of Sun/gcc and compare it with Sun/sunsoft).
>

I'll get these later today.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On Wed, 8 Oct 2003, Neil Conway wrote:
> Interesting (and surprising that the performance differential is that
> large, to me at least). Can you tell if the performance gain comes from
> an improvement in a particular subsystem? (i.e. could you get a profile
> of Sun/gcc and compare it with Sun/sunsoft).
>

Yeah - like I expected, it was able to generate much better code for _bt_checkkeys, which was the #1 function in gcc on both sun & linux.

And as you can see, suncc was just able to generate much nicer code. I'd look at the assembler output but that won't be useful since I am very unfamiliar with the [ultra]sparc instruction set..

Here's the prof and gprof output for the latest run:

GCC:
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
31.52     19.44     19.44                              internal_mcount
20.28     31.95     12.51   8199466     0.00     0.00  _bt_checkkeys
 5.61     35.41      3.46   8197422     0.00     0.00  _bt_step
 5.01     38.50      3.09  24738620     0.00     0.00  FunctionCall2
 3.00     40.35      1.85   8194186     0.00     0.00  varchareq
 2.61     41.96      1.61     24309     0.07     0.28  _bt_next
 2.42     43.45      1.49      1003     1.49     1.51  AtEOXact_Buffers
 2.37     44.91      1.46     12642     0.12     0.12  _read
 2.33     46.35      1.44  16517771     0.00     0.00  pg_detoast_datum
 2.08     47.63      1.28   8193186     0.00     0.00  int4lt
 1.35     48.46      0.83   8237204     0.00     0.00  BufferGetBlockNumber
 1.35     49.29      0.83   8193888     0.00     0.00  int4ge
 1.35     50.12      0.83                              _mcount

SunCC -pg -fast.
 %Time Seconds Cumsecs     #Calls  msec/call  Name
  23.2    4.27    4.27  108922056     0.0000  _mcount
  20.7    3.82    8.09    8304052     0.0005  _bt_checkkeys
  13.7    2.53   10.62   25054788     0.0001  FunctionCall2
   5.1    0.94   11.56      24002     0.0392  _bt_next
   4.4    0.81   12.37    8301867     0.0001  _bt_step
   3.4    0.63   13.00    8298219     0.0001  varchareq
   2.7    0.50   13.50   16726855     0.0000  pg_detoast_datum
   2.4    0.45   13.95    8342464     0.0001  BufferGetBlockNumber
   2.4    0.44   14.39    8297941     0.0001  int4ge
   2.2    0.41   14.80       1003     0.409   AtEOXact_Buffers
   2.0    0.37   15.17    4220349     0.0001  lc_collate_is_c
   2.0    0.37   15.54    8297219     0.0000  int4lt
   1.6    0.29   15.83      26537     0.0109  AllocSetContextCreate
   0.9    0.16   15.99       1887     0.085   pglz_decompress
   0.7    0.13   16.12     159966     0.0008  nocachegetattr
   0.7    0.13   16.25    4220349     0.0000  varstr_cmp
   0.6    0.11   16.36     937576     0.0001  MemoryContextAlloc
   0.5    0.09   16.45     150453     0.0006  hash_search

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
On Wed, 2003-10-08 at 10:48, Andrew Sullivan wrote: > My worry about this test is that it gives us precious little > knowledge about concurrent connection slowness, which is where I find > the most significant problems. As Jeff points out, the second set of results is for 20 concurrent connections. Note that the advantage sunsoft cc has over gcc decreases as the number of connections increases (which makes sense, as the 20x workload is likely to be more I/O bound). -Neil
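The narrowing gap can be checked against the timings Jeff posted. A minimal sketch, assuming only those four numbers; `speedup` is a hypothetical helper name, not a tool used by anyone in this thread:

```shell
#!/bin/sh
# speedup OLD NEW: print OLD/NEW to two decimals (hypothetical helper).
speedup() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.2f\n", old / new }'
}

# Timings in seconds from Jeff's first message:
# Sun gcc vs. Sun sunsoft -fast, at 1 and at 20 concurrent beaters.
echo "1x:  $(speedup 60 28)"
echo "20x: $(speedup 245 164)"
```

With these inputs the ratio falls from about 2.1 at one connection to about 1.5 at twenty, which is consistent with the 20x workload being limited by something other than generated code quality.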
On Wed, 2003-10-08 at 11:46, Jeff wrote: > Yeah - like I expected it was able to generate much better code for > _bt_checkkeys which was the #1 function in gcc on both sun & linux. > > and as you can see, suncc was just able to generate much nicer code. What CFLAGS does configure pick for gcc? From src/backend/template/solaris, I'd guess it's not enabling any optimization. Is that the case? If so, some gcc numbers with -O and -O2 would be useful. -Neil
On Wed, 8 Oct 2003, Neil Conway wrote:
>
> What CFLAGS does configure pick for gcc? From
> src/backend/template/solaris, I'd guess it's not enabling any
> optimization. Is that the case? If so, some gcc numbers with -O and -O2
> would be useful.
>

I can't believe I didn't think of this before! heh.
Turns out gcc was getting nothing for flags.

I added -O2 to CFLAGS and my 60 seconds went down to 21. A rather mild improvement, huh?

I did a few more tests and suncc still beats it out - but not by too much now (not enough to justify buying a license just for compiling pg).

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg. See if they pass the test.

If they do, we should consider adding -O2 and -fast to the CFLAGS.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
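For readers who want to repeat the experiment, a minimal sketch of how the two builds could be configured, assuming an autoconf-generated configure that honors CC and CFLAGS from the environment. `configure_cmd` is a hypothetical helper that only prints the command line; the flags are the ones named in this thread:

```shell
#!/bin/sh
# configure_cmd COMPILER: print (do not run) a configure invocation using
# the optimization flag discussed in this thread for that compiler.
# Hypothetical helper for illustration only.
configure_cmd() {
    case "$1" in
        gcc) echo 'CC=gcc CFLAGS="-O2" ./configure' ;;
        cc)  echo 'CC=cc CFLAGS="-fast" ./configure' ;;
        *)   echo "unknown compiler: $1" >&2; return 1 ;;
    esac
}

configure_cmd gcc
configure_cmd cc
```

Printing the command rather than running it keeps the sketch safe to execute outside a source tree.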
Jeff wrote:
> On Wed, 8 Oct 2003, Neil Conway wrote:
>
> >
> > What CFLAGS does configure pick for gcc? From
> > src/backend/template/solaris, I'd guess it's not enabling any
> > optimization. Is that the case? If so, some gcc numbers with -O and -O2
> > would be useful.
> >
>
> I can't believe I didn't think of this before! heh.
> Turns out gcc was getting nothing for flags.
>
> I added -O2 to CFLAGS and my 60 seconds went down to 21. A rather mild
> improvement, huh?
>
> I did a few more tests and suncc still beats it out - but not by too much
> now (not enough to justify buying a license just for compiling pg)
>
> I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
> See if they pass the test.
>
> If they do we should consider adding -O2 and -fast to the CFLAGS.

[ CC added for hackers.]

Well, this is really embarrassing. I can't imagine why we would not set at least -O on all platforms. Looking at the template files, I see these have no optimization set:

	darwin
	dgux
	freebsd (non-alpha)
	irix5
	nextstep
	osf (gcc)
	qnx4
	solaris
	sunos4
	svr4
	ultrix4

I thought we used to have code that did -O for any platforms that set no cflags, but I don't see that around anywhere. I recommend adding -O2, or at least -O, to all these platforms --- we can then use platform testing to make sure they are working.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:
> Well, this is really embarrassing. I can't imagine why we would not set
> at least -O on all platforms.

ISTM the most legitimate reason for not enabling compiler optimizations on a given compiler/OS/architecture combination is that it might cause compiler errors / bad code generation.

Can we get these optimizations enabled in time for the next 7.4 beta? It might also be good to add an item in the release notes about it.

-Neil
On Wed, 8 Oct 2003, Neil Conway wrote:
> ISTM the most legitimate reason for not enabling compiler
> optimizations on a given compiler/OS/architecture combination is that it
> might cause compiler errors / bad code generation.
>
> Can we get these optimizations enabled in time for the next 7.4 beta? It
> might also be good to add an item in the release notes about it.
>
> -Neil
>

I just ran make check for sun with gcc -O2 and suncc -fast and both passed.

We'll need other arguments to suncc to suppress some warnings, etc. (-fast generates a warning for every file compiled telling you it will only run on ultrasparc machines)

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
In message <Pine.BSF.4.44.0310081408370.64781-100000@torgo.978.org>, Jeff writes:

    I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
    See if they pass the test.

My default set of gcc optimization flags is:

-O3 -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -falign-functions -mcpu=i686 -march=i686

Obviously the last two flags produce CPU-specific code, so would have to differ... autoconf is always possible, but so is just lopping them off.

I have found these flags to produce faster code than a simple -O2, but I understand the exact combination which is best for you is code-dependent. Of course, if you are getting really excited, you can use -fbranch-probabilities, but as you will see if you investigate, that requires some profiling information, so it is not very easy to use in practice.

-Seth Robertson
Neil Conway <neilc@samurai.com> writes: > On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote: >> Well, this is really embarassing. I can't imagine why we would not set >> at least -O on all platforms. I believe that autoconf will automatically select -O2 (when CFLAGS isn't already set) *if* it's chosen gcc. It won't select anything for vendor ccs. > Can we get these optimizations enabled in time for the next 7.4 beta? I think it's too late in the beta cycle to add optimization flags except for platforms we can get specific success results for. (Solaris is probably okay for instance.) The risk of breaking things seems too high. regards, tom lane
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:
> >> Well, this is really embarrassing. I can't imagine why we would not set
> >> at least -O on all platforms.
>
> I believe that autoconf will automatically select -O2 (when CFLAGS isn't
> already set) *if* it's chosen gcc. It won't select anything for vendor
> ccs.

I think the problem is that template/solaris overrides that with:

	CFLAGS=

> > Can we get these optimizations enabled in time for the next 7.4 beta?
>
> I think it's too late in the beta cycle to add optimization flags except
> for platforms we can get specific success results for. (Solaris is
> probably okay for instance.) The risk of breaking things seems too
> high.

Agreed. Do we set them all to -O2, then remove it from the ones we don't get successful reports on?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
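The difference between "CFLAGS isn't set" and "CFLAGS=" is easy to miss. Below is a minimal sketch of one way an explicit empty assignment can defeat a default. It is a simplified model, not code from configure: `pick_cflags` is a hypothetical helper, and the thread does not show whether the real script applies the template before or after the compiler check.

```shell
#!/bin/sh
# Simplified model: a default is applied only when CFLAGS is unset, so an
# explicit empty assignment (like "CFLAGS=" in a template) suppresses it.
# pick_cflags is a hypothetical helper, not code from configure.
pick_cflags() {
    if [ "${CFLAGS+set}" = set ]; then
        echo "$CFLAGS"      # set, possibly to the empty string
    else
        echo "-O2"          # unset: fall back to the gcc default
    fi
}

unset CFLAGS
echo "unset: '$(pick_cflags)'"

CFLAGS=
echo "empty: '$(pick_cflags)'"
```

The `${CFLAGS+set}` expansion yields `set` whenever the variable exists, even if it is empty, which is why the second call prints an empty value rather than `-O2`.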
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:
> >> Well, this is really embarrassing. I can't imagine why we would not set
> >> at least -O on all platforms.
>
> I believe that autoconf will automatically select -O2 (when CFLAGS isn't
> already set) *if* it's chosen gcc. It won't select anything for vendor
> ccs.
>
> > Can we get these optimizations enabled in time for the next 7.4 beta?
>
> I think it's too late in the beta cycle to add optimization flags except
> for platforms we can get specific success results for. (Solaris is
> probably okay for instance.) The risk of breaking things seems too
> high.

OK, patch attached and applied. It centralizes the optimization defaults into configure.in, rather than having CFLAGS= in the template files.

It uses -O2 for gcc (generated automatically by autoconf), and -O for non-gcc, unless the template overrides it.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Index: configure
===================================================================
RCS file: /cvsroot/pgsql-server/configure,v
retrieving revision 1.302
diff -c -c -r1.302 configure
*** configure 3 Oct 2003 03:08:14 -0000 1.302
--- configure 9 Oct 2003 03:16:44 -0000
***************
*** 2393,2398 ****
--- 2393,2402 ----
  if test "$ac_env_CFLAGS_set" = set; then
    CFLAGS=$ac_env_CFLAGS_value
  fi
+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then
+   CFLAGS="-O"
+ fi
  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
  fi
Index: configure.in
===================================================================
RCS file: /cvsroot/pgsql-server/configure.in,v
retrieving revision 1.293
diff -c -c -r1.293 configure.in
*** configure.in 3 Oct 2003 03:08:14 -0000 1.293
--- configure.in 9 Oct 2003 03:16:46 -0000
***************
*** 238,243 ****
--- 238,247 ----
  if test "$ac_env_CFLAGS_set" = set; then
    CFLAGS=$ac_env_CFLAGS_value
  fi
+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then
+   CFLAGS="-O"
+ fi
  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
  fi
Index: src/template/beos
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/beos,v
retrieving revision 1.6
diff -c -c -r1.6 beos
*** src/template/beos 21 Oct 2000 22:36:13 -0000 1.6
--- src/template/beos 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS='-O2'
--- 0 ----
Index: src/template/bsdi
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/bsdi,v
retrieving revision 1.16
diff -c -c -r1.16 bsdi
*** src/template/bsdi 27 Sep 2003 16:24:44 -0000 1.16
--- src/template/bsdi 9 Oct 2003 03:16:51 -0000
***************
*** 5,13 ****
  esac
  
  case $host_os in
! bsdi2.0 | bsdi2.1 | bsdi3*)
!     CC=gcc2
!     ;;
  esac
  
  THREAD_SUPPORT=yes
--- 5,11 ----
  esac
  
  case $host_os in
!   bsdi2.0 | bsdi2.1 | bsdi3*) CC=gcc2;;
  esac
  
  THREAD_SUPPORT=yes
Index: src/template/cygwin
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/cygwin,v
retrieving revision 1.2
diff -c -c -r1.2 cygwin
*** src/template/cygwin 9 Oct 2003 02:37:09 -0000 1.2
--- src/template/cygwin 9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
- CFLAGS='-O2'
  SRCH_LIB='/usr/local/lib'
--- 1 ----
Index: src/template/dgux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/dgux,v
retrieving revision 1.10
diff -c -c -r1.10 dgux
*** src/template/dgux 21 Oct 2000 22:36:13 -0000 1.10
--- src/template/dgux 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/freebsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/freebsd,v
retrieving revision 1.23
diff -c -c -r1.23 freebsd
*** src/template/freebsd 27 Sep 2003 16:24:44 -0000 1.23
--- src/template/freebsd 9 Oct 2003 03:16:51 -0000
***************
*** 1,17 ****
- CFLAGS='-pipe'
- 
  case $host_cpu in
!   alpha*) CFLAGS="$CFLAGS -O" ;;
  esac
  
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes
  THREAD_CPPFLAGS="-D_THREAD_SAFE"
  case $host_os in
!   freebsd2*|freebsd3*|freebsd4*)
!     THREAD_LIBS="-pthread"
!     ;;
!   *)
!     THREAD_LIBS="-lc_r"
!     ;;
  esac
--- 1,11 ----
  case $host_cpu in
!   alpha*) CFLAGS="-O";;
  esac
  
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes
  THREAD_CPPFLAGS="-D_THREAD_SAFE"
  case $host_os in
!   freebsd2*|freebsd3*|freebsd4*) THREAD_LIBS="-pthread";;
!   *) THREAD_LIBS="-lc_r";;
  esac
Index: src/template/hpux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/hpux,v
retrieving revision 1.7
diff -c -c -r1.7 hpux
*** src/template/hpux 2 Apr 2003 00:49:28 -0000 1.7
--- src/template/hpux 9 Oct 2003 03:16:51 -0000
***************
*** 1,8 ****
! if test "$GCC" = yes ; then
!   CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
!   CFLAGS="-O2"
! else
    CC="$CC -Ae"
-   CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
    CFLAGS="+O2"
  fi
--- 1,6 ----
! CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
! 
! if test "$GCC" != yes ; then
    CC="$CC -Ae"
    CFLAGS="+O2"
  fi
Index: src/template/irix5
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/irix5,v
retrieving revision 1.9
diff -c -c -r1.9 irix5
*** src/template/irix5 21 Oct 2000 22:36:13 -0000 1.9
--- src/template/irix5 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/linux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/linux,v
retrieving revision 1.18
diff -c -c -r1.18 linux
*** src/template/linux 27 Sep 2003 22:23:35 -0000 1.18
--- src/template/linux 9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
- CFLAGS=-O2
  # Force _GNU_SOURCE on; plperl is broken with Perl 5.8.0 otherwise
  CPPFLAGS="-D_GNU_SOURCE"
  
--- 1,3 ----
***************
*** 6,9 ****
  NEED_REENTRANT_FUNCS=yes # Debian kernel 2.2 2003-09-27
  THREAD_CPPFLAGS="-D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS"
  THREAD_LIBS="-lpthread"
- 
--- 5,7 ----
Index: src/template/netbsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/netbsd,v
retrieving revision 1.13
diff -c -c -r1.13 netbsd
*** src/template/netbsd 27 Sep 2003 16:24:44 -0000 1.13
--- src/template/netbsd 9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
- CFLAGS='-O2 -pipe'
- 
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes # 1.6 2003-09-14
--- 1,2 ----
Index: src/template/nextstep
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/nextstep,v
retrieving revision 1.7
diff -c -c -r1.7 nextstep
*** src/template/nextstep 15 Jul 2000 15:54:52 -0000 1.7
--- src/template/nextstep 9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
  AROPT=rc
- CFLAGS=
  SHARED_LIB=
  DLSUFFIX=.o
--- 1,3 ----
Index: src/template/openbsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/openbsd,v
retrieving revision 1.8
diff -c -c -r1.8 openbsd
*** src/template/openbsd 21 Oct 2000 22:36:14 -0000 1.8
--- src/template/openbsd 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS='-O2 -pipe'
--- 0 ----
Index: src/template/osf
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/osf,v
retrieving revision 1.10
diff -c -c -r1.10 osf
*** src/template/osf 27 Sep 2003 16:24:45 -0000 1.10
--- src/template/osf 9 Oct 2003 03:16:51 -0000
***************
*** 1,6 ****
! if test "$GCC" = yes ; then
!   CFLAGS=
! else
    CC="$CC -std"
    CFLAGS='-O4 -Olimit 2000'
  fi
--- 1,4 ----
! if test "$GCC" != yes ; then
    CC="$CC -std"
    CFLAGS='-O4 -Olimit 2000'
  fi
Index: src/template/qnx4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/qnx4,v
retrieving revision 1.4
diff -c -c -r1.4 qnx4
*** src/template/qnx4 24 May 2001 22:33:18 -0000 1.4
--- src/template/qnx4 9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
! CFLAGS=-I/usr/local/include
! LIBS=-lunix
--- 1,2 ----
! CFLAGS="-O2 -I/usr/local/include"
! LIBS="-lunix"
Index: src/template/sco
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/sco,v
retrieving revision 1.10
diff -c -c -r1.10 sco
*** src/template/sco 11 Dec 2002 22:27:26 -0000 1.10
--- src/template/sco 9 Oct 2003 03:16:51 -0000
***************
*** 1,7 ****
- if test "$GCC" = yes; then
-   CFLAGS=-O2
- else
-   CFLAGS=-O
- fi
  
  CC="$CC -b elf"
--- 1,2 ----
Index: src/template/solaris
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/solaris,v
retrieving revision 1.5
diff -c -c -r1.5 solaris
*** src/template/solaris 27 Sep 2003 16:24:45 -0000 1.5
--- src/template/solaris 9 Oct 2003 03:16:51 -0000
***************
*** 1,8 ****
! if test "$GCC" = yes ; then
!   CFLAGS=
! else
    CC="$CC -Xa" # relaxed ISO C mode
!   CFLAGS=-v # -v is like gcc -Wall
  fi
  
  THREAD_SUPPORT=yes
--- 1,6 ----
! if test "$GCC" != yes ; then
    CC="$CC -Xa" # relaxed ISO C mode
!   CFLAGS="-O -v" # -v is like gcc -Wall
  fi
  
  THREAD_SUPPORT=yes
Index: src/template/sunos4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/sunos4,v
retrieving revision 1.2
diff -c -c -r1.2 sunos4
*** src/template/sunos4 21 Oct 2000 22:36:14 -0000 1.2
--- src/template/sunos4 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/svr4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/svr4,v
retrieving revision 1.10
diff -c -c -r1.10 svr4
*** src/template/svr4 21 Oct 2000 22:36:14 -0000 1.10
--- src/template/svr4 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/ultrix4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/ultrix4,v
retrieving revision 1.10
diff -c -c -r1.10 ultrix4
*** src/template/ultrix4 21 Oct 2000 22:36:14 -0000 1.10
--- src/template/ultrix4 9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/univel
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/univel,v
retrieving revision 1.13
diff -c -c -r1.13 univel
*** src/template/univel 21 Oct 2000 22:36:14 -0000 1.13
--- src/template/univel 9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
  CFLAGS='-v -O -K i486,host,inline,loop_unroll -Dsvr4'
! LIBS=-lc89
--- 1,2 ----
  CFLAGS='-v -O -K i486,host,inline,loop_unroll -Dsvr4'
! LIBS="-lc89"
Index: src/template/unixware
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/unixware,v
retrieving revision 1.24
diff -c -c -r1.24 unixware
*** src/template/unixware 27 Sep 2003 16:24:45 -0000 1.24
--- src/template/unixware 9 Oct 2003 03:16:51 -0000
***************
*** 1,5 ****
  if test "$GCC" = yes; then
-   CFLAGS=-O2
    THREAD_CPPFLAGS="-pthread"
  else
  # the -Kno_host is temporary for a bug in the compiler. See -hackers
--- 1,4 ----
Index: src/template/win
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/win,v
retrieving revision 1.5
diff -c -c -r1.5 win
*** src/template/win 8 Oct 2003 18:23:08 -0000 1.5
--- src/template/win 9 Oct 2003 03:16:51 -0000
***************
*** 1,3 ****
- if test "$GCC" = yes; then
-   CFLAGS="-O2"
- fi
--- 0 ----
Index: src/template/win32
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/win32,v
retrieving revision 1.1
diff -c -c -r1.1 win32
*** src/template/win32 15 May 2003 16:35:30 -0000 1.1
--- src/template/win32 9 Oct 2003 03:16:51 -0000
***************
*** 1,3 ****
- if test "$GCC" = yes; then
-   CFLAGS="-O2"
- fi
--- 0 ----
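The net effect of the configure.in hunk can be summarized as a precedence order. The sketch below is a simplified model of that order, not the patched script itself: `resolve_cflags` and its positional arguments are hypothetical names chosen for illustration, standing in for the values configure would have at that point.

```shell
#!/bin/sh
# Simplified model of the CFLAGS logic after the patch (hypothetical helper;
# the arguments are illustrative, not variables from configure).
#   $1 = CFLAGS after compiler detection and the template (may be empty)
#   $2 = "set" if the user supplied CFLAGS in the environment
#   $3 = the user's CFLAGS value
#   $4 = "yes" if debugging was requested and the compiler accepts -g
resolve_cflags() {
    cflags=$1
    # A user-supplied value always wins.
    if [ "$2" = set ]; then
        cflags=$3
    fi
    # Nothing chosen so far (non-gcc, no template setting): fall back to -O.
    if [ -z "$cflags" ]; then
        cflags="-O"
    fi
    # Debug builds append -g to whatever was chosen.
    if [ "$4" = yes ]; then
        cflags="$cflags -g"
    fi
    echo "$cflags"
}

resolve_cflags "-O2" "" "" no        # gcc, default from autoconf
resolve_cflags "" "" "" no           # vendor cc, no template setting
resolve_cflags "-O -v" "" "" yes     # Solaris cc template, with debugging
resolve_cflags "-O2" set "-O3" no    # user override
```

One consequence visible in the model: a user who passes an explicitly empty CFLAGS still ends up with -O, because the emptiness test runs after the environment value is applied.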
On Wed, 8 Oct 2003, Neil Conway wrote:
> Hey Jeff,
>
> On Wed, 2003-10-08 at 11:46, Jeff wrote:
> > Yeah - like I expected it was able to generate much better code for
> > _bt_checkkeys which was the #1 function in gcc on both sun & linux.
>
> If you get a minute, would it be possible to compare the performance of
> your benchmark under linux/gcc and solaris/gcc when PostgreSQL is
> compiled with "-O3"?
>

Sun:
gcc:
none: 60 seconds
-O: 21 seconds
-O2: 20 seconds
-O3: 19 seconds

suncc:
none: 52 seconds
-fast: 20 secondsish.

-fast is actually a macro that expands to the "best settings" for the platform that is doing the compilation.

Linux:
-O2: 35
-O3: 40
Odd.. I wonder why it took longer. Perhaps gcc built some bad code?
I thought the results were odd there so I ran the test many times.. same results! Swapped the binaries back (so -O2 was running) and boom. back to 35.

Sun gcc -O2 and suncc -fast both pass make check.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
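To put the numbers above on one scale, here is a minimal sketch that expresses each timing as a percentage reduction against a baseline. `reduction` is a hypothetical helper; the inputs are the figures from this message, with 20 seconds for suncc -fast being Jeff's approximate value.

```shell
#!/bin/sh
# reduction BASE NEW: percent less wall-clock time than BASE, rounded
# (hypothetical helper; a negative result means NEW was slower).
reduction() {
    awk -v base="$1" -v new="$2" 'BEGIN { printf "%.0f\n", 100 * (1 - new / base) }'
}

# Sun, relative to the unoptimized build of the same compiler.
echo "Sun gcc -O:       $(reduction 60 21)%"
echo "Sun gcc -O2:      $(reduction 60 20)%"
echo "Sun gcc -O3:      $(reduction 60 19)%"
echo "Sun suncc -fast:  $(reduction 52 20)%"

# Linux, -O3 relative to -O2.
echo "Linux -O3 vs -O2: $(reduction 35 40)%"
```

On these figures nearly all of the gain on the Sun box comes from the first optimization level, and the Linux -O3 build comes out negative, matching the slowdown Jeff observed.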
So you want -fast added as default for non-gcc Solaris? You mentioned there is a warning generated that we have to deal with? --------------------------------------------------------------------------- Jeff wrote: > On Wed, 8 Oct 2003, Neil Conway wrote: > > > Hey Jeff, > > > > On Wed, 2003-10-08 at 11:46, Jeff wrote: > > > Yeah - like I expected it was able to generate much better code for > > > _bt_checkkeys which was the #1 function in gcc on both sun & linux. > > > > If you get a minute, would it be possible to compare the performance of > > your benchmark under linux/gcc and solaris/gcc when PostgreSQL is > > compiled with "-O3"? > > > Sun: > gcc: > none: 60 seconds > -O: 21 seconds > -O2: 20 seconds > -O3: 19 seconds > > suncc: > none: 52 seconds > -fast: 20 secondsish. > > -fast is actually a macro that expands to the "best settings" for the > platform that is doing the compilation. > > > Linux: > -O2: 35 > -O3: 40 > Odd.. I wonder why it took longer. Perhaps gcc built some bad code? > I thought the results were odd there so I ran the test many times.. same > results! Swapped the binaries back (so -O2 was running) and boom. back to > 35. > > Sun gcc -O2 and suncc -fast both pass make check. > > > -- > Jeff Trout <jeff@jefftrout.com> > http://www.jefftrout.com/ > http://www.stuarthamm.net/ > > > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, Oct 08, 2003 at 02:31:29PM -0400, Bruce Momjian wrote: > Well, this is really embarassing. I can't imagine why we would not set > at least -O on all platforms. Looking at the template files, I see > these have no optimization set: I think gcc _used_ to generate bad code on SPARC if you set any optimisation. We tested it on Sol7 with gcc 2.95 more than a year ago, and tried various settings. -O2 worked, but other items were really bad. Some of them would pass regression but cause strange behaviour, random coredumps, &c. A little digging demonstrated that anything beyond -O2 just didn't work for gcc at the time. A -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
On Thu, 9 Oct 2003, Bruce Momjian wrote:
>
> So you want -fast added as default for non-gcc Solaris? You mentioned
> there is a warning generated that we have to deal with?
>

Yeah, suncc generates a warning for _every_ file that says:

Warning: -xarch=native has been explicitly specified, or implicitly specified by a macro option, -xarch=native on this architecture implies -xarch=v8plusa which generates code that does not run on pre-UltraSPARC processors

And then I get various warnings here and there...

Lots of "statement not reached", as in ecpg's type.c module. The offending code is a big switch statement like:

	case ECPGt_bool:
		return ("ECPGt_bool");
		break;

And then any function that uses PG_RETURN_NULL generates "warning: end-of-loop code not reached"

and a bunch of "constant promoted to unsigned long long"

And some places such as in fe-exec.c have code like this:

	buflen = strlen(strtext); /* will shrink, also we discover if

where strtext is an unsigned char *, which generates "warning: argument #1 is incompatible with prototype:"

and then various other type mismatches here and there.

I skimmed through the manpage.. it doesn't look like we can suppress these..

Not sure we want it to look like we have bad code if someone uses cc. Perhaps issue a ./configure notice or something?

gcc compiles things fine.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/
What is the performance win for the -fast flag again?

---------------------------------------------------------------------------

Jeff wrote:
> On Thu, 9 Oct 2003, Bruce Momjian wrote:
>
> > So you want -fast added as default for non-gcc Solaris? You mentioned
> > there is a warning generated that we have to deal with?
>
> Yeah, suncc generates a warning for _every_ file that says:
> Warning: -xarch=native has been explicitly specified, or implicitly
> specified by a macro option, -xarch=native on this architecture implies
> -xarch=v8plusa which generates code that does not run on pre-UltraSPARC
> processors
>
> And then I get various warnings here and there...
>
> lots of "statement not reached" as in ecpg's type.c module
> The offending code is a big switch statement like:
> case ECPGt_bool:
> return ("ECPGt_bool");
> break;
>
> And then any function that uses PG_RETURN_NULL generates " warning:
> end-of-loop code not reached"
>
> and a bunch of "constant promoted to unsigned long long"
>
> And some places such as in fe-exec.c have code like this:
> buflen = strlen(strtext); /* will shrink, also we discover
> if
>
> where strtext is an unsigned char * which generates warning: argument #1
> is incompatible with prototype:
>
> and then various other type mismatches here and there.
>
> I skimmed through the manpage.. it doesn't look like we can suppress
> these..
>
> Not sure we want it to look like we have bad code if someone uses cc.
> perhaps issue a ./configure notice or something?
>
> gcc compiles things fine.
>
> --
> Jeff Trout <jeff@jefftrout.com>
> http://www.jefftrout.com/
> http://www.stuarthamm.net/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>                http://www.postgresql.org/docs/faqs/FAQ.html

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Thu, 9 Oct 2003, Bruce Momjian wrote: > > What is the performance win for the -fast flag again? > > --------------------------------------------------------------------------- > 52 seconds to 19-20 seconds -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Jeff wrote: > On Thu, 9 Oct 2003, Bruce Momjian wrote: > > > > > What is the performance win for the -fast flag again? > > > > --------------------------------------------------------------------------- > > > 52 seconds to 19-20 seconds Wow, that's dramatic. Do you want to propose some flags for non-gcc Solaris? Is -fast the only one? Is there one that suppresses those warnings or are they OK? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Thu, 9 Oct 2003, Bruce Momjian wrote: > > 52 seconds to 19-20 seconds > > Wow, that's dramatic. Do you want to propose some flags for non-gcc > Solaris? Is -fast the only one? Is there one that suppresses those > warnings or are they OK? > Well. As I said, I didn't see an obvious way to hide those warnings. I'd love to make those warnings go away. That is why I suggested perhaps printing a message to ensure the user knows that warnings may be printed when using sunsoft. -fast should be all you need - it picks the "best settings" to use for the platform that is doing the compile. -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
On Thu, 9 Oct 2003, Kenneth Marshall wrote: > Jeff, > > My first concern with the -fast option is that it makes an executable > that is specific for the platform on which the compilation is run > unless other flags are given. My second concern is the effect it has > on IEEE floating point behavior w.r.t. rounding, error handling, .... > And my third concern is that if you use -fast, all other code must > be compiled and linked with the -fast option for correct operation, > this includes any functional languages such as perl, python, R,... > That is a pretty big requirement for a default compilation flag. > > Ken Marshall > So you think we should leave PG alone and let it run horrifically slowly? Do you have a better idea of how to do this? And do you have evidence apps compiled with -fast linked to non -fast (or gcc compiled) have problems? -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Jeff wrote: > On Thu, 9 Oct 2003, Kenneth Marshall wrote: > > > Jeff, > > > > My first concern with the -fast option is that it makes an executable > > that is specific for the platform on which the compilation is run > > unless other flags are given. My second concern is the effect it has > > on IEEE floating point behavior w.r.t. rounding, error handling, .... > > And my third concern is that if you use -fast, all other code must > > be compiled and linked with the -fast option for correct operation, > > this includes any functional languages such as perl, python, R,... > > That is a pretty big requirement for a default compilation flag. > > > > Ken Marshall > > > > So you think we should leave PG alone and let it run horrifically slowly? > Do you have a better idea of how to do this? > > And do you have evidence apps compiled with -fast linked to non -fast > (or gcc compiled) have problems? I have updated the Solaris FAQ: 5) How can I compile for optimum performance? Try using the "-fast" compile flag. The binaries might not be portable to other Solaris systems, and you might need to compile everything that links to PostgreSQL with "-fast", but PostgreSQL will run significantly faster, 50% faster on some tests. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Thu, Oct 09, 2003 at 01:04:23PM -0400, Jeff wrote: > > So you think we should leave PG alone and let it run horrifically slowly? > Do you have a better idea of how to do this? Given the point in the release cycle, mightn't the FAQ_Solaris or some other place be better for this for now? I agree with the concern. I'd rather have slow'n'stable than fast-but-broken. A -- ---- Andrew Sullivan 204-4141 Yonge Street Afilias Canada Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
Andrew Sullivan wrote: > On Thu, Oct 09, 2003 at 01:04:23PM -0400, Jeff wrote: > > > > So you think we should leave PG alone and let it run horrifically slowly? > > Do you have a better idea of how to do this? > > Given the point in the release cycle, mightn't the FAQ_Solaris or > some other place be better for this for now? I agree with the > concern. I'd rather have slow'n'stable than fast-but-broken. FAQ added. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
We're keeping the -O2 for gcc in the template and moving the mention of -fast to the FAQ, correct? -- Jeff Trout <jeff@jefftrout.com> http://www.jefftrout.com/ http://www.stuarthamm.net/
Jeff wrote: > We're keeping the -O2 for gcc in the template and moving the mention of > -fast to the FAQ, correct? gcc gets -O2, non-gcc gets -O, and -fast is in the FAQ, yea. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
pgman@candle.pha.pa.us (Bruce Momjian) writes: > 5) How can I compile for optimum performance? > > Try using the "-fast" compile flag. The binaries might not be portable to > other Solaris systems, and you might need to compile everything that links > to PostgreSQL with "-fast", but PostgreSQL will run significantly faster, > 50% faster on some tests. You might also mention something like the following: If you are compiling using GCC, you will quite likely want to add in the "-O2" compile flag. -- let name="cbbrowne" and tld="libertyrms.info" in String.concat "@" [name;tld];; <http://dev6.int.libertyrms.com/> Christopher Browne (416) 646 3304 x124 (land)
Christopher Browne wrote: > pgman@candle.pha.pa.us (Bruce Momjian) writes: > > 5) How can I compile for optimum performance? > > > > Try using the "-fast" compile flag. The binaries might not be portable to > > other Solaris systems, and you might need to compile everything that links > > to PostgreSQL with "-fast", but PostgreSQL will run significantly faster, > > 50% faster on some tests. > > You might also mention something like the following: > > If you are compiling using GCC, you will quite likely want to add in > the "-O2" compile flag. We already do that by default in current CVS for gcc, and -O for non-gcc. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Wed, 2003-10-08 at 21:44, Bruce Momjian wrote:
> Agreed. Do we set them all to -O2, then remove it from the ones we
> don't get successful reports on?

I took the time to compile CVS tip with a few different machines from
HP's TestDrive program, to see if there were any regressions using the
new optimization flags:

(1) (my usual dev machine)

$ uname -a
Linux tokyo 2.4.19-xfs #1 Mon Jan 20 19:12:29 EST 2003 i686 GNU/Linux
$ gcc --version
gcc (GCC) 3.3.2 20031005 (Debian prerelease)

'make check' passes

(2)

$ uname -a
Linux spe161 2.4.18-smp #1 SMP Sat Apr 6 21:42:22 EST 2002 alpha unknown
$ gcc --version
gcc (GCC) 3.3.1

'make check' passes

(3)

$ uname -a
Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown
$ gcc --version
3.0.4

'make check' passes

BTW, this platform doesn't have any code written for native spinlocks.

(4)

$ uname -a
Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002 ia64 unknown
$ gcc --version

When you compile PostgreSQL without changing the CFLAGS configure picks,
the initdb required for 'make check' fails with:

[...]
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ERROR: could not identify operator 679

I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above
resulted from an optimization-induced compiler error, but I got the
following error:

$ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations -I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
../../../../src/include/storage/s_lock.h: In function `tas':
../../../../src/include/storage/s_lock.h:125: error: inconsistent operand constraints in an `asm'

Whereas this works fine:

$ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations -I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
$

BTW, line 138 of s_lock.h is:

#if defined(__arm__) || defined(__arm__)

That seems a little redundant.

Anyway, I tried running initdb after compiling all of pgsql with "-O0",
except for the files that included s_lock.h, but make check still
failed:

creating information schema... ok
vacuuming database template1...
/house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: line 882: 22035 Segmentation fault (core dumped) "$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF
ANALYZE;
VACUUM FULL FREEZE;
EOF

The core file seems to indicate a stack overflow due to an infinitely
recursive function:

(gdb) bt 25
#0  0x4000000000645dc0 in hash_search ()
#1  0x4000000000616930 in RelationSysNameCacheGetRelation ()
#2  0x4000000000616db0 in RelationSysNameGetRelation ()
#3  0x4000000000082e40 in relation_openr ()
#4  0x4000000000083910 in heap_openr ()
#5  0x400000000060e6b0 in ScanPgRelation ()
#6  0x4000000000611d60 in RelationBuildDesc ()
#7  0x4000000000616e70 in RelationSysNameGetRelation ()
#8  0x4000000000082e40 in relation_openr ()
#9  0x4000000000083910 in heap_openr ()
#10 0x400000000060e6b0 in ScanPgRelation ()
#11 0x4000000000611d60 in RelationBuildDesc ()
#12 0x4000000000616e70 in RelationSysNameGetRelation ()
#13 0x4000000000082e40 in relation_openr ()
#14 0x4000000000083910 in heap_openr ()
#15 0x400000000060e6b0 in ScanPgRelation ()
#16 0x4000000000611d60 in RelationBuildDesc ()
#17 0x4000000000616e70 in RelationSysNameGetRelation ()
#18 0x4000000000082e40 in relation_openr ()
#19 0x4000000000083910 in heap_openr ()
#20 0x400000000060e6b0 in ScanPgRelation ()
#21 0x4000000000611d60 in RelationBuildDesc ()
#22 0x4000000000616e70 in RelationSysNameGetRelation ()
#23 0x4000000000082e40 in relation_openr ()
#24 0x4000000000083910 in heap_openr ()
(More stack frames follow...)

(It also dumps core in the same place during initdb if CFLAGS='-O' is
specified.)

So it looks like the Itanium port is a little broken. Does anyone have
an idea what needs to be done to fix it?

-Neil
Isn't it great how you have the same directory on every host so you can download once and run the same tests easily. Neil Conway wrote: > $ uname -a > Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown > $ gcc --version > 3.0.4 > > 'make check' passes I didn't know there was a pa-risc-64 chip. > BTW, this platform doesn't have any code written for native spinlocks. > > (4) > > $ uname -a > Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002 > ia64 unknown > $ gcc --version > > When you compile PostgreSQL without changing the CFLAGS configure picks, > the initdb required for 'make check' fails with: > > [...] > initializing pg_depend... ok > creating system views... ok > loading pg_description... ok > creating conversions... ERROR: could not identify operator 679 > > I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above > resulted from an optimization-induced compiler error, but I got the > following error: > > $ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations > -I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c > ../../../../src/include/storage/s_lock.h: In function `tas': > ../../../../src/include/storage/s_lock.h:125: error: inconsistent > operand constraints in an `asm' > > Whereas this works fine: > > $ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations > -I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c > $ > > BTW, line 138 of s_lock.h is: > > #if defined(__arm__) || defined(__arm__) Fix just committed. Thanks. > That seems a little redundant. > > Anyway, I tried running initdb after compiling all of pgsql with "-O0', > except for the files that included s_lock.h, but make check still > failed: > > creating information schema... ok > vacuuming database template1... 
> /house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: line 882: 22035 Segmentation fault (core dumped) "$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF > ANALYZE; > VACUUM FULL FREEZE; > EOF > > The core file seems to indicate a stack overflow due to an infinitely > recursive function: > > (gdb) bt 25 > #0 0x4000000000645dc0 in hash_search () > #1 0x4000000000616930 in RelationSysNameCacheGetRelation () > #2 0x4000000000616db0 in RelationSysNameGetRelation () > #3 0x4000000000082e40 in relation_openr () > #4 0x4000000000083910 in heap_openr () > #5 0x400000000060e6b0 in ScanPgRelation () > #6 0x4000000000611d60 in RelationBuildDesc () > #7 0x4000000000616e70 in RelationSysNameGetRelation () > #8 0x4000000000082e40 in relation_openr () > #9 0x4000000000083910 in heap_openr () > #10 0x400000000060e6b0 in ScanPgRelation () > #11 0x4000000000611d60 in RelationBuildDesc () > #12 0x4000000000616e70 in RelationSysNameGetRelation () > #13 0x4000000000082e40 in relation_openr () > #14 0x4000000000083910 in heap_openr () > #15 0x400000000060e6b0 in ScanPgRelation () > #16 0x4000000000611d60 in RelationBuildDesc () > #17 0x4000000000616e70 in RelationSysNameGetRelation () > #18 0x4000000000082e40 in relation_openr () > #19 0x4000000000083910 in heap_openr () > #20 0x400000000060e6b0 in ScanPgRelation () > #21 0x4000000000611d60 in RelationBuildDesc () > #22 0x4000000000616e70 in RelationSysNameGetRelation () > #23 0x4000000000082e40 in relation_openr () > #24 0x4000000000083910 in heap_openr () > (More stack frames follow...) > > (It also dumps core in the same place during initdb if CFLAGS='-O' is > specified.) > > So it looks like the Itanium port is a little broken. Does anyone have > an idea what needs to be done to fix it? My guess is that the compiler itself is broken --- what else could it be? 
-- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, patch attached and applied. It centralizes the optimization
> defaults into configure.in, rather than having CFLAGS= in the template
> files.

I think there's a problem here:

> + # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
> + if test x"$CFLAGS" = x""; then
> +   CFLAGS="-O"
> + fi
>   if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
>     CFLAGS="$CFLAGS -g"
>   fi

since this will cause "configure --enable-debug" to default to selecting
CFLAGS="-O -g" for non-gcc compilers. On a lot of compilers that
combination does not work, and will generate tons of useless warnings.
I think it might be better to do

  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
+ else
+   # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+   if test x"$CFLAGS" = x""; then
+     CFLAGS="-O"
+   fi
  fi

regards, tom lane
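The difference between the two orderings is easier to see as a truth table. The following is only a C model of the shell logic quoted above, written to make the outcomes checkable; the function and parameter names are hypothetical, and the real logic lives in configure.in as shell. The incoming `cflags` argument is whatever CFLAGS holds when this fragment is reached (empty for the non-gcc case discussed in the thread).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Original ordering: default to -O first, then append -g when debugging. */
void demo_cflags_original(const char *cflags, int enable_debug, int cc_has_g,
                          char *out, size_t len)
{
    const char *base = (cflags[0] == '\0') ? "-O" : cflags;

    if (enable_debug && cc_has_g)
        snprintf(out, len, "%s -g", base);
    else
        snprintf(out, len, "%s", base);
}

/* Suggested ordering: -O is the default only when -g was not selected. */
void demo_cflags_suggested(const char *cflags, int enable_debug, int cc_has_g,
                           char *out, size_t len)
{
    if (enable_debug && cc_has_g)
        snprintf(out, len, "%s -g", cflags);   /* mirrors CFLAGS="$CFLAGS -g" */
    else if (cflags[0] == '\0')
        snprintf(out, len, "-O");
    else
        snprintf(out, len, "%s", cflags);
}

/* Self-check of the case the message is concerned with. */
void demo_check_cflags(void)
{
    char out[64];

    demo_cflags_original("", 1, 1, out, sizeof out);
    assert(strcmp(out, "-O -g") == 0);          /* the problematic combination */

    demo_cflags_suggested("", 1, 1, out, sizeof out);
    assert(strcmp(out, " -g") == 0);            /* leading space, as in the shell */
}
```

In this model the only case that changes is an empty CFLAGS combined with --enable-debug: the original ordering yields "-O -g", the suggested one yields just "-g". A pre-set value such as "-O2" passes through both versions unchanged apart from the appended "-g".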
Done as you suggested. --------------------------------------------------------------------------- Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > OK, patch attached and applied. It centralizes the optimization > > defaults into configure.in, rather than having CFLAGS= in the template > > files. > > I think there's a problem here: > > > + # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc > > + if test x"$CFLAGS" = x""; then > > + CFLAGS="-O" > > + fi > > if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then > > CFLAGS="$CFLAGS -g" > > fi > > since this will cause "configure --enable-debug" to default to selecting > CFLAGS="-O -g" for non-gcc compilers. On a lot of compilers that > combination does not work, and will generate tons of useless warnings. > I think it might be better to do > > if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then > CFLAGS="$CFLAGS -g" > + else > + # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc > + if test x"$CFLAGS" = x""; then > + CFLAGS="-O" > + fi > fi > > regards, tom lane > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Jeff, My first concern with the -fast option is that it makes an executable that is specific for the platform on which the compilation is run unless other flags are given. My second concern is the effect it has on IEEE floating point behavior w.r.t. rounding, error handling, .... And my third concern is that if you use -fast, all other code must be compiled and linked with the -fast option for correct operation, this includes any functional languages such as perl, python, R,... That is a pretty big requirement for a default compilation flag. Ken Marshall On Thu, Oct 09, 2003 at 12:07:20PM -0400, Jeff wrote: > On Thu, 9 Oct 2003, Bruce Momjian wrote: > > > > 52 seconds to 19-20 seconds > > > > Wow, that's dramatic. Do you want to propose some flags for non-gcc > > Solaris? Is -fast the only one? Is there one that suppresses those > > warnings or are they OK? > > > > Well. As I said, I didn't see an obvious way to hide those warnings. > I'd love to make those warnings go away. That is why I suggested perhaps > printing a message to ensure the user knows that warnings may be printed > when using sunsoft. > > -fast should be all you need - it picks the "best settings" to use for the > platform that is doing the compile. > > > -- > Jeff Trout <jeff@jefftrout.com> > http://www.jefftrout.com/ > http://www.stuarthamm.net/
On 8.10.2003, at 21:31, Bruce Momjian wrote:
> Well, this is really embarrassing. I can't imagine why we would not set
> at least -O on all platforms. Looking at the template files, I see
> these have no optimization set:
>
> darwin

Regarding Darwin optimizations, Apple has introduced a "-fast" flag in
their GCC 3.3 version that they recommend when compiling code for their
new G5 systems. Because of this, I foresee a lot of people defining
CFLAGS="-fast" on their systems.

This is problematic for PostgreSQL, however, since the -fast flag is
the equivalent of:

-O3 -falign-loops-max-skip=15 -falign-jumps-max-skip=15
-falign-loops=16 -falign-jumps=16 -falign-functions=16 -malign-natural
-ffast-math -fstrict-aliasing -frelax-aliasing -fgcse-mem-alias
-funroll-loops -floop-transpose -floop-to-memset -finline-floor
-mcpu=G5 -mpowerpc64 -mpowerpc-gpopt -mtune=G5 -fsched-interblock
-fload-after-store --param max-gcse-passes=3 -fno-gcse-sm
-fgcse-loop-depth -funit-at-a-time -fcallgraph-inlining
-fdisable-typechecking-for-spec

At least the -ffast-math part causes problems, seeing that PostgreSQL
actually checks for the __FAST_MATH__ macro to make sure that it isn't
turned on. There might be other problems with Apple's flags, but I
think that the __FAST_MATH__ check should be altered.

As you know, setting -ffast-math in GCC is the equivalent of setting
-fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math,
-ffinite-math-only and -fno-signaling-nans. What really should be done,
I think, is adding the opposites of these flags (-fmath-errno,
-fno-unsafe-math-optimizations, -ftrapping-math, -fno-finite-math-only
and -fsignaling-nans) to the command line if __FAST_MATH__ is detected.
This would allow people to use CFLAGS="-fast" on their G5s, beat some
Xeon speed records, and not worry about esoteric IEEE math standards.
What do you guys think?

GCC sets __FAST_MATH__ even if you counter a -ffast-math with the
negating flags above. This means that it is not currently possible to
use the -fast flag when compiling PostgreSQL at all. Instead, you have
to go through all the flags Apple is setting and only pass on those
that don't break pg.

mk
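For readers unfamiliar with the check being discussed, here is a minimal sketch of what a compile-time __FAST_MATH__ guard looks like. It is not copied from the PostgreSQL sources (the thread does not show them); the helper names are hypothetical. GCC predefines __FAST_MATH__ when -ffast-math is in effect, so the `#error` stops the build before any floating-point code is compiled under relaxed rules.

```c
#include <assert.h>
#include <math.h>

/*
 * Refuse to build when the compiler has been told it may ignore IEEE
 * floating-point rules. The message text is illustrative only.
 */
#ifdef __FAST_MATH__
#error "-ffast-math changes floating-point semantics; rebuild without it"
#endif

/* Hypothetical helper: an addition that must propagate NaN inputs. */
double demo_add(double a, double b)
{
    return a + b;
}

/* isnan() is only meaningful if the compiler may not assume "no NaNs". */
int demo_is_nan(double x)
{
    return isnan(x) != 0;
}

/* Self-check: a NaN operand must survive arithmetic and be detectable. */
void demo_check_nan(void)
{
    assert(demo_is_nan(demo_add(NAN, 1.0)));
    assert(!demo_is_nan(demo_add(1.0, 2.0)));
}
```

Because the guard keys off a single macro, it cannot tell which individual sub-flags are active, which is the limitation described in the message: re-enabling the strict behaviours one by one still leaves __FAST_MATH__ defined.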
Bruce Momjian writes: > Well, this is really embarassing. I can't imagine why we would not set > at least -O on all platforms. Looking at the template files, I see > these have no optimization set: > freebsd (non-alpha) I'm wondering what that had in mind: http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/template/freebsd.diff?r1=1.10&r2=1.11 -- Peter Eisentraut peter_e@gmx.net
I would use a simple -xO2 or -xO3 instead as the default with an -fsimple=2. --Ken -x02 -xbuiltin=%all On Thu, Oct 09, 2003 at 01:04:23PM -0400, Jeff wrote: > On Thu, 9 Oct 2003, Kenneth Marshall wrote: > > > Jeff, > > > > My first concern with the -fast option is that it makes an executable > > that is specific for the platform on which the compilation is run > > unless other flags are given. My second concern is the effect it has > > on IEEE floating point behavior w.r.t. rounding, error handling, .... > > And my third concern is that if you use -fast, all other code must > > be compiled and linked with the -fast option for correct operation, > > this includes any functional languages such as perl, python, R,... > > That is a pretty big requirement for a default compilation flag. > > > > Ken Marshall > > > > So you think we should leave PG alone and let it run horrifically slowly? > Do you have a better idea of how to do this? > > And do you have evidence apps compiled with -fast linked to non -fast > (or gcc compiled) have problems? > > > -- > Jeff Trout <jeff@jefftrout.com> > http://www.jefftrout.com/ > http://www.stuarthamm.net/
Marko Karppinen <marko@karppinen.fi> writes:
> At least the -ffast-math part causes problems, seeing that PostgreSQL
> actually checks for the __FAST_MATH__ macro to make sure that it isn't
> turned on. There might be other problems with Apple's flags, but I
> think that the __FAST_MATH__ check should be altered.

Removing the check is not acceptable --- we spent far too much time
fighting bug reports that turned out to trace to -ffast-math. See for
example
http://archives.postgresql.org/pgsql-bugs/2002-09/msg00169.php

> As you know, setting -ffast-math in GCC is the equivalent of setting
> -fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math,
> -ffinite-math-only and -fno-signaling-nans.

I suspect that -funsafe-math-optimizations is the only one of those that
really affects the datetime code, but I would be quite worried about the
side-effects of any of them on the float8 arithmetic routines. Also I
think the behavior of -ffast-math has changed over time; in the gcc
2.95.3 manual I see none of the above and only the description

`-ffast-math'
     This option allows GCC to violate some ANSI or IEEE rules and/or
     specifications in the interest of optimizing code for speed. For
     example, it allows the compiler to assume arguments to the `sqrt'
     function are non-negative numbers and that no floating-point values
     are NaNs.

Since we certainly do use NaNs, it would be very bad to allow -ffast-math
in gcc 2.95. gcc 3.2 has some but not all of the sub-flags you list
above, so apparently the behavior changed again as of gcc 3.3.

This means that relaxing the check would require (a) finding out which
of the sub-flags break our code and which don't; (b) finding out how the
answer to (a) has varied with gcc release; and (c) finding out how we
can test whether a given sub-flag is set --- are there #defines for each
of them in gcc 3?

This does not sound real practical to me...

> This would allow people to use CFLAGS="-fast" on their G5s, beat some
> Xeon speed records, and not worry about esoteric IEEE math standards.

In the words of the sage, "I can make this code *arbitrarily* fast ...
if it doesn't have to give the right answer." Those "esoteric" standards
make the difference between printing 5:00:00 and printing 4:59:60.

regards, tom lane
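The 5:00:00 versus 4:59:60 remark can be made concrete with a small sketch. This does not reproduce the -ffast-math problem itself and is not PostgreSQL's datetime code; the function names and the input value are assumptions chosen for illustration. It only shows how a sub-second floating-point discrepancy, of the kind relaxed math rules can introduce, turns into an impossible seconds field when the value is split into fields before rounding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Naive formatting: split into fields first, let printf round the seconds. */
void demo_format_naive(double secs, char *buf, size_t len)
{
    int    h = (int) (secs / 3600.0);
    int    m = (int) ((secs - h * 3600.0) / 60.0);
    double s = secs - h * 3600.0 - m * 60.0;

    snprintf(buf, len, "%d:%02d:%02.0f", h, m, s);
}

/* Round to whole seconds first, then split using integer arithmetic. */
void demo_format_rounded(double secs, char *buf, size_t len)
{
    long total;

    assert(secs >= 0.0);            /* sketch handles non-negative input only */
    total = (long) (secs + 0.5);
    snprintf(buf, len, "%ld:%02ld:%02ld",
             total / 3600, (total % 3600) / 60, total % 60);
}

/* Self-check with a value a hair under five hours. */
void demo_check_times(void)
{
    char buf[32];

    demo_format_naive(17999.9999, buf, sizeof buf);
    assert(strcmp(buf, "4:59:60") == 0);

    demo_format_rounded(17999.9999, buf, sizeof buf);
    assert(strcmp(buf, "5:00:00") == 0);
}
```

The input 17999.9999 is 0.0001 s short of five hours. The naive version truncates hours and minutes to 4 and 59, then rounds the remaining 59.9999 seconds up to 60; rounding the total before splitting avoids that.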