Thread: Re: [COMMITTERS] pgsql: Improved parallel make support
Andrew Dunstan <andrew@dunslane.net> writes: > On 11/12/2010 03:16 PM, Peter Eisentraut wrote: >> Improved parallel make support > Looks like this patch has pretty comprehensively broken the MSVC build > system. I'll see what I can recover from the wreckage. There are also at least three non-Windows buildfarm members failing like so: gmake -C src all gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src' gmake[1]: *** virtual memory exhausted. Stop. gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src' gmake: *** [all-src-recursive] Error 2 I think we may have pushed too far in terms of what actually works reliably across different make versions. regards, tom lane
On 11/12/2010 11:25 PM, Tom Lane wrote: > Andrew Dunstan<andrew@dunslane.net> writes: >> On 11/12/2010 03:16 PM, Peter Eisentraut wrote: >>> Improved parallel make support >> Looks like this patch has pretty comprehensively broken the MSVC build >> system. I'll see what I can recover from the wreckage. > There are also at least three non-Windows buildfarm members failing like > so: > > gmake -C src all > gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src' > gmake[1]: *** virtual memory exhausted. Stop. > gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src' > gmake: *** [all-src-recursive] Error 2 > > I think we may have pushed too far in terms of what actually works > reliably across different make versions. Yeah, possibly. And now it looks like this has broken the Solaris buildfarm members too. I'm curious to know how much all this buys us. One reason I haven't enabled parallel make in the buildfarm is that it interleaves the output, which can be a pain. And build speed isn't really the buildfarm's foremost concern anyway. I know waiting for a build can be mildly annoying (ccache can be a big help if you're building repeatedly). But I don't feel we need to squeeze every last pip out of the build system. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I'm curious to know how much all this buys us. It *would* be nice if "make -k" worked better. I frequently run into the fact that (with the pre-existing setup) a compile error in the backend prevented make from proceeding with builds of interfaces/, bin/, etc, meaning that that work still remains to be done after I've finished fixing the backend error. But having said that, I won't shed many tears if we have to revert this. It looks like all the unhappy critters are getting the same "virtual memory exhausted" error. I wonder whether they are all using make 3.80 ... regards, tom lane
BTW, there's another problem here: "make -j2" on my Mac blows up with this on stderr: ld: file not found: ../../../../../../src/backend/postgres collect2: ld returned 1 exit status make[3]: *** [ascii_and_mic.so] Error 1 make[2]: *** [all-ascii_and_mic-recursive] Error 2 make[1]: *** [all-backend/utils/mb/conversion_procs-recursive] Error 2 make[1]: *** Waiting for unfinished jobs.... In file included from gram.y:12101: scan.c: In function 'yy_try_NUL_trans': scan.c:16242: warning: unused variable 'yyg' make: *** [all-src-recursive] Error 2 Consulting stdout shows that indeed it's launched this series of jobs: make -C backend/utils/mb/conversion_procs all make -C ascii_and_mic all gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o ascii_and_mic.c gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so ascii_and_mic.o -L../../../../../../src/port -Wl,-d\ ead_strip_dylibs -bundle_loader ../../../../../../src/backend/postgres immediately after completing the src/timezone build, before the backend build is even well begun let alone finished. So the parallel build dependency interlocks are basically not working. This machine has gmake 3.81. regards, tom lane
On 11/13/2010 11:12 AM, Tom Lane wrote: > It looks like all the unhappy critters are getting the same "virtual > memory exhausted" error. I wonder whether they are all using make 3.80 > ... Maybe we need to put back make version logging. Interestingly, narwhal, the mingw machine that has reported, didn't complain. It's running 3.81. cheers andrew
On Sat, Nov 13, 2010 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > It looks like all the unhappy critters are getting the same "virtual > memory exhausted" error. I wonder whether they are all using make 3.80 Both my Sparc and Intel Solaris critters have 3.80. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On lör, 2010-11-13 at 11:06 -0500, Andrew Dunstan wrote: > But I don't feel we need to squeeze every last pip out of > the build system. Probably not on the buildfarm, but when you are developing, saving 20 seconds or 2 minutes per cycle can lead to hours saved.
On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: > It looks like all the unhappy critters are getting the same "virtual > memory exhausted" error. I wonder whether they are all using make > 3.80 ... It turns out that there is an unrelated bug in 3.80 that some Linux distributions have patched around. 3.81 or 3.82 are OK.
On lör, 2010-11-13 at 11:23 -0500, Tom Lane wrote: > Consulting stdout shows that indeed it's launched this series of jobs: > > make -C backend/utils/mb/conversion_procs all > make -C ascii_and_mic all > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing > -fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o > ascii_and_mic.c > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing > -fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so > ascii_and_mic.o -L../../../../../../src/port -Wl,-d\ > ead_strip_dylibs > -bundle_loader ../../../../../../src/backend/postgres > > immediately after completing the src/timezone build, before the > backend build is even well begun let alone finished. So the parallel > build dependency interlocks are basically not working. On some platforms, you need to have backend/postgres built before any dynamically loadable modules. For those platforms, additional dependencies will be necessary, I suppose.
Peter Eisentraut <peter_e@gmx.net> writes: > On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: >> It looks like all the unhappy critters are getting the same "virtual >> memory exhausted" error. I wonder whether they are all using make >> 3.80 ... > It turns out that there is an unrelated bug in 3.80 that some Linux > distributions have patched around. 3.81 or 3.82 are OK. So what do you mean by "unrelated bug"? Can we work around it? regards, tom lane
On Sat, November 13, 2010 18:15, Peter Eisentraut wrote: > On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: >> It looks like all the unhappy critters are getting the same "virtual >> memory exhausted" error. I wonder whether they are all using make >> 3.80 ... > > It turns out that there is an unrelated bug in 3.80 that some Linux > distributions have patched around. 3.81 or 3.82 are OK. > Just to mention another effect of the recent changes: make 3.81, Centos 5.5 On a dual quadcore system where I used to build with -j 16, it now only succeeds with -j 8. (I seem to remember that 16 as opposed to 8 shaved a couple of seconds off, although I'm not quite sure anymore) make -j 16 gives: cc1: error: thread.c: No such file or directory make[4]: *** [thread.o] Error 1 make[3]: *** [submake-libpq] Error 2 make[2]: *** [all-pg_ctl-recursive] Error 2 make[1]: *** [all-bin-recursive] Error 2 make[1]: *** Waiting for unfinished jobs.... Use of assignment to $[ is deprecated at ./parse.pl line 21. In file included from gram.y:12101: scan.c: In function ‘yy_try_NUL_trans’: scan.c:16242: warning: unused variable ‘yyg’ Use of assignment to $[ is deprecated at ./check_rules.pl line 18. make: *** [all-src-recursive] Error 2 ( A similar effect I see on a dual core fedora system (2.6.27.5-117.fc10.i686), where -j 16 always ran, but now it needs -j 4 or less (it also has make 3.81) ) Erik Rijkers
On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote: > Peter Eisentraut <peter_e@gmx.net> writes: > > On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: > >> It looks like all the unhappy critters are getting the same "virtual > >> memory exhausted" error. I wonder whether they are all using make > >> 3.80 ... > > > It turns out that there is an unrelated bug in 3.80 that some Linux > > distributions have patched around. 3.81 or 3.82 are OK. > > So what do you mean by "unrelated bug"? Can we work around it? The information is fuzzy, but the problem has been reported around the internet, and it appears to be related to the foreach function. I think I have an idea how to work around it, but I'll need some time.
On lör, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote: > On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote: > > Peter Eisentraut <peter_e@gmx.net> writes: > > > On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: > > >> It looks like all the unhappy critters are getting the same "virtual > > >> memory exhausted" error. I wonder whether they are all using make > > >> 3.80 ... > > > > > It turns out that there is an unrelated bug in 3.80 that some Linux > > > distributions have patched around. 3.81 or 3.82 are OK. > > > > So what do you mean by "unrelated bug"? Can we work around it? > > The information is fuzzy, but the problem has been reported around the > internet, and it appears to be related to the foreach function. I think > I have an idea how to work around it, but I'll need some time. Well, it looks like $(eval) is pretty broken in 3.80, so either we require 3.81 or we abandon this line of thought.
Peter Eisentraut <peter_e@gmx.net> writes: > Well, it looks like $(eval) is pretty broken in 3.80, so either we > require 3.81 or we abandon this line of thought. [ emerges from some grubbing about in the gmake sources... ] It looks to me like the bug in 3.80 is only triggered when "eval" expands to a long enough string to trigger reallocation of the variable buffer. (Ergo, the reason they didn't find it sooner was they only tested on relatively short strings.) I wonder whether the bug could be worked around if you did the iteration on SUBDIRS in a foreach surrounding the eval call, so that each eval dealt with only one subdir target. This would result in a bit of redundancy in the generated rules, but that seems tolerable. regards, tom lane
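A minimal sketch of the workaround suggested above, in which each $(eval) call handles a single subdirectory so that the string it expands stays short. The SUBDIRS value and the recurse_one template name are assumptions made for illustration; the target names are modeled on the all-*-recurse names that appear in the build logs, and this is not the patch that was actually committed.

```make
# Illustrative sketch only: SUBDIRS and recurse_one are hypothetical names.
SUBDIRS = port timezone backend

.PHONY: all
all:

# Template for one subdirectory; $(1) is the directory name.
# $$(MAKE) is doubled so that it survives the $(call) expansion
# and is only expanded when the recipe runs.
define recurse_one
.PHONY: all-$(1)-recurse
all: all-$(1)-recurse
all-$(1)-recurse:
	$$(MAKE) -C $(1) all
endef

# One short $(eval) per subdirectory, rather than a single $(eval)
# over a string that covers every subdirectory at once.
$(foreach dir,$(SUBDIRS),$(eval $(call recurse_one,$(dir))))
```

The cost is the redundancy mentioned above: the template text is expanded once per directory, so rules shared by all directories are generated repeatedly.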
On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On lör, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote: >> On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote: >> > Peter Eisentraut <peter_e@gmx.net> writes: >> > > On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote: >> > >> It looks like all the unhappy critters are getting the same "virtual >> > >> memory exhausted" error. I wonder whether they are all using make >> > >> 3.80 ... >> > >> > > It turns out that there is an unrelated bug in 3.80 that some Linux >> > > distributions have patched around. 3.81 or 3.82 are OK. >> > >> > So what do you mean by "unrelated bug"? Can we work around it? >> >> The information is fuzzy, but the problem has been reported around the >> internet, and it appears to be related to the foreach function. I think >> I have an idea how to work around it, but I'll need some time. > > Well, it looks like $(eval) is pretty broken in 3.80, so either we > require 3.81 or we abandon this line of thought. 3.81 might be a problem for Solaris - unless I pay for a support contract from Oracle, I'm not going to get any updates from them, which means I'll have to install a custom build. Now that's no biggie for me, but it does seem to raise the bar somewhat for users that might want to build from source. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Dave Page <dpage@pgadmin.org> writes: > On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> Well, it looks like $(eval) is pretty broken in 3.80, so either we >> require 3.81 or we abandon this line of thought. > 3.81 might be a problem for Solaris - unless I pay for a support > contract from Oracle, I'm not going to get any updates from them, > which means I'll have to install a custom build. Now that's no biggie > for me, but it does seem to raise the bar somewhat for users that might > want to build from source. For another data point, I find make 3.80 in OS X 10.4, while 10.5 and 10.6 have 3.81. 10.4 is certainly behind the curve, but Apple still seem to be releasing security updates for it. I was about to draw an analogy to flex -- we are now requiring a version of flex that's roughly contemporaneous with make 3.81. However, we don't require flex to build from a tarball, so on second thought that situation isn't very comparable. Moving the goalposts for make would definitely affect more people. On the third hand, gmake is very very easy to install: if you're capable of building Postgres from source, it's hard to believe that gmake should scare you off. (I've installed multiple versions on my ancient HPUX dinosaur, and it's never been any harder than ./configure, make, make check, make install.) And on the fourth hand, what we're buying here is pretty marginal for developers and of no interest whatever for users. I still think it's worth looking into whether the bug can be dodged by shortening the eval calls. But if not, I think I'd vote for reverting. Maybe we could revisit this in a couple of years. regards, tom lane
On 11/14/2010 10:44 AM, Tom Lane wrote: > And on the fourth hand, what we're buying here is pretty marginal for > developers and of no interest whatever for users. > > I still think it's worth looking into whether the bug can be dodged > by shortening the eval calls. But if not, I think I'd vote for > reverting. Maybe we could revisit this in a couple of years. +1 cheers andrew
On Nov 14, 2010, at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I still think it's worth looking into whether the bug can be dodged > by shortening the eval calls. But if not, I think I'd vote for > reverting. Maybe we could revisit this in a couple of years. +1. The current master branch fails to build on my (rather new) Mac with make -j2. I could upgrade my toolchain but it seems like more trouble than it's worth, not to mention a possible obstacle to new users and developers. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Nov 14, 2010, at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I still think it's worth looking into whether the bug can be dodged >> by shortening the eval calls. But if not, I think I'd vote for >> reverting. Maybe we could revisit this in a couple of years. > +1. The current master branch fails to build on my (rather new) Mac > with make -j2. I complained of the same thing, but AFAICS that's not a make bug; it's a missing build dependency, which could be fixed if we choose to keep this infrastructure. It probably ought to be fixed even if we don't. regards, tom lane
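A minimal sketch of what a missing build dependency of this kind looks like and how an explicit prerequisite closes it. The directory and target names (backend, modules, backend-recurse, modules-recurse) are hypothetical stand-ins, not PostgreSQL's real targets; the only assumption carried over from the thread is that, on some platforms, loadable modules are linked against the finished backend/postgres binary.

```make
# Illustrative sketch only: every name below is a hypothetical stand-in.
.PHONY: all backend-recurse modules-recurse
all: backend-recurse modules-recurse

backend-recurse:
	$(MAKE) -C backend all

# Without the backend-recurse prerequisite, "make -j2" may start both
# sub-makes at the same time, so the module link step can run before
# the backend executable exists.
modules-recurse: backend-recurse
	$(MAKE) -C modules all
```

A serial build hides the problem because the prerequisites of "all" are then built one after another in the order listed; only under -j does the unstated ordering requirement become visible.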
I wrote: > I still think it's worth looking into whether the bug can be dodged > by shortening the eval calls. In fact, that does seem to work; I'll commit a patch after testing a bit more. We still need someone to add the missing build dependencies so that make -j is trustworthy again. regards, tom lane
On Sun, Nov 14, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> I still think it's worth looking into whether the bug can be dodged >> by shortening the eval calls. > > In fact, that does seem to work; I'll commit a patch after testing a > bit more. > > We still need someone to add the missing build dependencies so that > make -j is trustworthy again. Yes, please. This is currently failing for me: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -Werror -bundle -multiply_defined suppress -o ascii_and_mic.so ascii_and_mic.o -L../../../../../../src/port -L/opt/local/lib -Wl,-dead_strip_dylibs -Werror -bundle_loader ../../../../../../src/backend/postgres ld: file not found: ../../../../../../src/backend/postgres collect2: ld returned 1 exit status make[3]: *** [ascii_and_mic.so] Error 1 make[2]: *** [all-ascii_and_mic-recurse] Error 2 make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2 make[1]: *** Waiting for unfinished jobs.... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
--On 14. November 2010 11:08:13 -0500 Robert Haas <robertmhaas@gmail.com> wrote: > +1. The current master branch fails to build on my (rather new) Mac with > make -j2. I could upgrade my toolchain but it seems like more trouble > than it's worth, not to mention a possible obstacle to new users and > developers. The same here, too. And it doesn't matter if you use the shipped make (3.81) or the one from macports (currently 3.82), both are failing with: ld: file not found: ../../../../../../src/backend/postgres collect2: ld returned 1 exit status make[3]: *** [ascii_and_mic.so] Error 1 make[2]: *** [all-ascii_and_mic-recurse] Error 2 make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2 make[1]: *** Waiting for unfinished jobs.... -- Thanks Bernd
On mån, 2010-11-15 at 11:13 +0100, Bernd Helmle wrote: > > --On 14. November 2010 11:08:13 -0500 Robert Haas <robertmhaas@gmail.com> > wrote: > > > +1. The current master branch fails to build on my (rather new) Mac with > > make -j2. I could upgrade my toolchain but it seems like more trouble > > than it's worth, not to mention a possible obstacle to new users and > > developers. > > The same here, too. And it doesn't matter if you use the shipped make > (3.81) or the one from macports (currently 3.82), both are failing with: > > ld: file not found: ../../../../../../src/backend/postgres > collect2: ld returned 1 exit status > make[3]: *** [ascii_and_mic.so] Error 1 > make[2]: *** [all-ascii_and_mic-recurse] Error 2 > make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2 > make[1]: *** Waiting for unfinished jobs.... Untested, but the following should help you, by partially restoring the old build order on platforms that need it.
Attachment
On Mon, Nov 15, 2010 at 4:10 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> ld: file not found: ../../../../../../src/backend/postgres >> collect2: ld returned 1 exit status >> make[3]: *** [ascii_and_mic.so] Error 1 >> make[2]: *** [all-ascii_and_mic-recurse] Error 2 >> make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2 >> make[1]: *** Waiting for unfinished jobs.... > > Untested, but the following should help you, by partially restoring the > old builder order on platforms that need it. Very odd, but this completely blew up the first time I tried it. In file included from path.c:34: pg_config_paths.h:2:11: error: missing terminating " character In file included from path.c:34: pg_config_paths.h:2: error: missing terminating " character path.c:49: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘static’ That file had a line in it that looked like this: postgresql" On a subsequent retry, I got: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -Werror -bundle -multiply_defined suppress -o dict_snowball.so dict_snowball.o api.o utilities.o stem_ISO_8859_1_danish.o stem_ISO_8859_1_dutch.o stem_ISO_8859_1_english.o stem_ISO_8859_1_finnish.o stem_ISO_8859_1_french.o stem_ISO_8859_1_german.o stem_ISO_8859_1_hungarian.o stem_ISO_8859_1_italian.o stem_ISO_8859_1_norwegian.o stem_ISO_8859_1_porter.o stem_ISO_8859_1_portuguese.o stem_ISO_8859_1_spanish.o stem_ISO_8859_1_swedish.o stem_ISO_8859_2_romanian.o stem_KOI8_R_russian.o stem_UTF_8_danish.o stem_UTF_8_dutch.o stem_UTF_8_english.o stem_UTF_8_finnish.o stem_UTF_8_french.o stem_UTF_8_german.o stem_UTF_8_hungarian.o stem_UTF_8_italian.o stem_UTF_8_norwegian.o stem_UTF_8_porter.o stem_UTF_8_portuguese.o stem_UTF_8_romanian.o stem_UTF_8_russian.o stem_UTF_8_spanish.o stem_UTF_8_swedish.o stem_UTF_8_turkish.o -L../../../src/port -L/opt/local/lib -Wl,-dead_strip_dylibs -Werror -bundle_loader 
../../../src/backend/postgres ld: file not found: ../../../src/backend/postgres collect2: ld returned 1 exit status make[2]: *** [dict_snowball.so] Error 1 make[1]: *** [all-backend/snowball-recurse] Error 2 make[1]: *** Waiting for unfinished jobs.... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Very odd, but this completely blew up the first time I tried it. > In file included from path.c:34: > pg_config_paths.h:2:11: error: missing terminating " character FWIW, I didn't replicate that, but I did get this during one attempt with -j4: /usr/bin/ranlib: archive member: libecpg.a(typename.o) size too large (archive \ member extends past the end of the file) ar: internal ranlib command failed make[5]: *** [libecpg.a] Error 1 make[5]: *** Deleting file `libecpg.a' make[4]: *** [submake-ecpglib] Error 2 make[3]: *** [all-compatlib-recurse] Error 2 make[3]: *** Waiting for unfinished jobs.... /usr/bin/ranlib: can't stat file output file: libecpg.a (No such file or direct\ ory) ar: internal ranlib command failed make[4]: *** [libecpg.a] Error 1 make[3]: *** [all-ecpglib-recurse] Error 2 make[2]: *** [all-ecpg-recurse] Error 2 make[1]: *** [all-interfaces-recurse] Error 2 make[1]: *** Waiting for unfinished jobs.... In file included from gram.y:12101: scan.c: In function 'yy_try_NUL_trans': scan.c:16242: warning: unused variable 'yyg' make: *** [all-src-recurse] Error 2 Examination of the stdout trace makes it appear that two independent make runs were trying to build src/interfaces/ecpg/ecpglib/libecpg.a concurrently. I haven't dug into it but I suspect that there are multiple dependency chains leading to ecpg/ecpglib/. I wonder whether what you saw was also the result of multiple recursion paths leading to trying to build the same target at once. If so, that's going to put a rather serious crimp in the idea of constraining build order by adding more dependencies. > On a subsequent retry, I got: > ld: file not found: ../../../src/backend/postgres > collect2: ld returned 1 exit status > make[2]: *** [dict_snowball.so] Error 1 Yeah, I got that too, but adding all-backend/snowball-recurse to the set of dependencies proposed in Peter's patch made it go away. 
A cursory search for other appearances of -bundle_loader in the make output suggests that contrib/ and src/test/regress/ are also at risk. This leads me to the thought that concentrating knowledge of this issue in src/Makefile is not the right way to go at it. And, again, the more paths leading to a make attempt in the same directory, the worse off we are as far as the first problem goes. But surely the "make" guys recognized this risk and have a solution? Otherwise parallel make would be pretty useless. regards, tom lane
I tried another experiment, which was "make -j100 all" on my relatively new Linux box (2 dual-core CPUs). It blew up real good, as per attached stderr output, which shows evidence of more missing dependencies as well as some additional cases of concurrent attempts to build the same target. It's clear to me that we are very far from having a handle on what it'll really take to run parallel builds safely, and I am therefore now of the opinion that we ought to revert the patch. Hypothetical gains in parallelism are useless if we can't actually use parallel building reliably. We are currently worse off than before in terms of time to build the system. regards, tom lane /usr/bin/ld: cannot find -lpgport collect2: ld returned 1 exit status make[3]: *** [refint.so] Error 1 make[2]: *** [../../../contrib/spi/refint.so] Error 2 make[2]: *** Waiting for unfinished jobs.... path.c: In function 'get_html_path': path.c:615: error: 'HTMLDIR' undeclared (first use in this function) path.c:615: error: (Each undeclared identifier is reported only once path.c:615: error: for each function it appears in.) path.c: In function 'get_man_path': path.c:624: error: 'MANDIR' undeclared (first use in this function) make[3]: *** [path.o] Error 1 make[3]: *** Deleting file `path.o' make[3]: *** Waiting for unfinished jobs.... /usr/bin/ld: cannot find -lpgport collect2: ld returned 1 exit status make[3]: *** [autoinc.so] Error 1 make[2]: *** [../../../contrib/spi/autoinc.so] Error 2 make[2]: *** [submake-libpgport] Error 2 make[2]: *** Waiting for unfinished jobs.... ln: creating symbolic link `libpgtypes.so.3': File exists make[4]: *** [libpgtypes.so.3.2] Error 1 make[4]: *** Deleting file `libpgtypes.so.3.2' make[3]: *** [all-pgtypeslib-recurse] Error 2 make[3]: *** Waiting for unfinished jobs.... make[1]: *** [all-test/regress-recurse] Error 2 make[1]: *** Waiting for unfinished jobs.... 
In file included from gram.y:12102: scan.c: In function 'yy_try_NUL_trans': scan.c:16246: warning: unused variable 'yyg' ln: creating symbolic link `libpq.so.5': File exists make[4]: *** [libpq.so.5.4] Error 1 make[4]: *** Deleting file `libpq.so.5.4' make[3]: *** [submake-libpq] Error 2 make[2]: *** [all-pg_dump-recurse] Error 2 make[2]: *** Waiting for unfinished jobs.... ln: creating symbolic link `libpq.so.5': File exists make[6]: *** [libpq.so.5.4] Error 1 make[6]: *** Deleting file `libpq.so.5.4' make[5]: *** [submake-libpq] Error 2 make[4]: *** [submake-ecpglib] Error 2 make[3]: *** [all-compatlib-recurse] Error 2 /usr/bin/ld: cannot open linker script file ../../../src/interfaces/libpq/libpq.so: No such file or directory collect2: ld returned 1 exit status make[3]: *** [psql] Error 1 make[2]: *** [all-psql-recurse] Error 2 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [createdb] Error 1 make[3]: *** Waiting for unfinished jobs.... 
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [createuser] Error 1 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [dropuser] Error 1 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [vacuumdb] Error 1 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [dropdb] Error 1 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': 
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [clusterdb] Error 1 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [reindexdb] Error 1 make[2]: *** [all-scripts-recurse] Error 2 ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask' ../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe': /home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask' collect2: ld returned 1 exit status make[3]: *** [pg_ctl] Error 1 make[2]: *** [all-pg_ctl-recurse] Error 2 make[1]: *** [all-bin-recurse] Error 2 make[2]: *** [all-ecpg-recurse] Error 2 make[1]: *** [all-interfaces-recurse] Error 2 make[1]: *** [all-backend-recurse] Error 2 make: *** [all-src-recurse] Error 2
On mån, 2010-11-15 at 23:34 -0500, Tom Lane wrote: > It's clear to me that we are very far from having a handle on what > it'll really take to run parallel builds safely, and I am therefore > now of the opinion that we ought to revert the patch. Hypothetical > gains in parallelism are useless if we can't actually use parallel > building reliably. We are currently worse off than before in terms of > time to build the system. We don't have to revert it, we just have to insert .NOTPARALLEL targets into some places that are not properly "parallelized", thus effectively restoring the behavior of the old for loop. I have attached a patch that gets make -j 100+ working for me. Other platforms might need more things, perhaps. Btw., my original notes for this development were labeled "make make -k work properly". So I would really like to keep that. It just turned out that parallel make could benefit from the same changes, and it's a better marketing name. ;-)
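A minimal sketch of the .NOTPARALLEL approach described above, assuming a directory makefile whose sub-builds are not safe to run concurrently. The SUBDIRS value is borrowed from the ecpg directory names seen in the logs purely for illustration; which makefiles the attached patch actually touches is not shown in the thread.

```make
# Illustrative sketch only: the SUBDIRS list is an assumption.
SUBDIRS = ecpglib pgtypeslib compatlib

.PHONY: all $(SUBDIRS)
all: $(SUBDIRS)

$(SUBDIRS):
	$(MAKE) -C $@ all

# When this special target is mentioned, this make invocation runs its
# targets serially even if -j was given. Sub-makes started from here can
# still build in parallel unless their own makefiles mention it too.
.NOTPARALLEL:
```

Because the effect is limited to the makefile that mentions it, the serialization can be confined to the directories that lack proper dependencies while the rest of the tree keeps building in parallel.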
Attachment
Peter Eisentraut <peter_e@gmx.net> writes: > On mån, 2010-11-15 at 23:34 -0500, Tom Lane wrote: >> It's clear to me that we are very far from having a handle on what >> it'll really take to run parallel builds safely, and I am therefore >> now of the opinion that we ought to revert the patch. > We don't have to revert it, we just have to insert .NOTPARALLEL targets > into some places that are not properly "parallelized", thus effectively > restoring the behavior of the old for loop. I have attached a patch > that gets make -j 100+ working for me. Other platforms might need more > things, perhaps. If we don't have to revert it entirely, that's of course better. Please apply what you've got. regards, tom lane