Thread: Re: pltlc and pltlcu problems
[2002-01-19 19:40] Murray Prior Hobbs said: | | i have had no trouble loading and using the pgpgsql language - and it | lives in exactly the same place | | i've done as you suggested though - here is the output Indeed. I just got finished installing a chroot image of redhat-7.2 to test this. I am seeing the same Tcl_CreateInterp problem you mentioned earlier. The pltcl language does not work even from the 7.2b3 rpms. Can someone verify that pltcl works on their stock redhat 7.2 system? Are there a known bugs in the stock 7.2 binutils or any other part of the toolchain that might be causing this problem? Most notably is the absence of pltcl.so being linked to libtcl.so. Could this be a problem with redhat's tcl package? Monty, are you by chance running in a chroot? confounded, brent -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
[2002-01-20 23:24] Murray Prior Hobbs said: | Brent Verner wrote: | | >[2002-01-19 19:40] Murray Prior Hobbs said: | >| | >| i have had no trouble loading and using the pgpgsql language - and it | >| lives in exactly the same place | >| | >| i've done as you suggested though - here is the output | > | >Indeed. I just got finished installing a chroot image of | >redhat-7.2 to test this. I am seeing the same Tcl_CreateInterp | >problem you mentioned earlier. The pltcl language does not work | >even from the 7.2b3 rpms. Can someone verify that pltcl works on | >their stock redhat 7.2 system? | > | >Are there a known bugs in the stock 7.2 binutils or any other part | >of the toolchain that might be causing this problem? Most notably | >is the absence of pltcl.so being linked to libtcl.so. Could this | >be a problem with redhat's tcl package? | > | >Monty, are you by chance running in a chroot? | > | if you mean me (Murray) nope - it's a bog standard RedHat 7.2 install sorry! I know a guy named "Monty Hobbs"... 
I'm really too tired ;-)

| but i have installed Tcl from the sources from scratch - 8.3.4

Indeed I've tracked the problem down to the line that links the pltcl.so library:

make[3]: Entering directory `/usr/local/cvs/pgsql/src/pl/tcl'
/bin/sh mkMakefile.tcldefs.sh '/usr/lib/tclConfig.sh' 'Makefile.tcldefs'
make[3]: Leaving directory `/usr/local/cvs/pgsql/src/pl/tcl'
make[3]: Entering directory `/usr/local/cvs/pgsql/src/pl/tcl'
gcc -pipe -O -D__NO_STRING_INLINES -D__NO_MATH_INLINES -fPIC -I../../../src/include -DHAVE_UNISTD_H=1 -DHAVE_LIMITS_H=1 -DHAVE_GETCWD=1 -DHAVE_OPENDIR=1 -DHAVE_STRSTR=1 -DHAVE_STRTOL=1 -DHAVE_TMPNAM=1 -DHAVE_WAITPID=1 -DHAVE_UNISTD_H=1 -DHAVE_SYS_PARAM_H=1 -DUSE_TERMIOS=1 -DHAVE_SYS_TIME_H=1 -DTIME_WITH_SYS_TIME=1 -DHAVE_TM_ZONE=1 -DHAVE_TM_GMTOFF=1 -DHAVE_TIMEZONE_VAR=1 -DHAVE_ST_BLKSIZE=1 -DSTDC_HEADERS=1 -DNEED_MATHERR=1 -DHAVE_SIGNED_CHAR=1 -DHAVE_SYS_IOCTL_H=1 -c -o pltcl.o pltcl.c
gcc -pipe -shared -Wl,-soname,libtcl.so.0 -o pltcl.so pltcl.o -L/usr/lib -ltcl -ldl -lieee -lm -lc
                              ^^^^^^^^^^^

IIRC, this was changed to work around another problem with the tcl client library having name conflicts. This value (TCL_SHLIB_LD) comes directly from the /usr/lib/tclConfig.sh file supplied by the rpm. You can add the following line to src/pl/tcl/Makefile below "-include Makefile.tcldefs"

TCL_SHLIB_LD = gcc -shared

to override the erroneous value supplied by the system's tclConfig.sh.

| but just because i'm ignorant of many things - how would i check if i
| was running in chroot environment?

not sure. I always know when I am, because I set up the chroot. Some web hosts will give you a chroot as well, but if you are developing on your own workstation, there is little chance of you being in a chroot and not knowing it.

hth.
b

-- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
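One way to double-check this diagnosis on a built tree is to look at the dynamic section of the resulting pltcl.so. The following is only a sketch: it assumes GNU binutils' `readelf -d` output layout, the helper name `dyn_entries` is made up for illustration, and the sample line is an illustrative input modelled on the link command above, not captured output.

```shell
#!/bin/sh
# Hypothetical helper: read `readelf -d` output on stdin and print the
# SONAME and NEEDED entries, one per line, as "KIND name".
dyn_entries() {
    sed -n \
        -e 's/.*(SONAME).*\[\(.*\)].*/SONAME \1/p' \
        -e 's/.*(NEEDED).*\[\(.*\)].*/NEEDED \1/p'
}

# Illustrative input only: the kind of line one would expect after
# linking with -Wl,-soname,libtcl.so.0.
sample=' 0x000000000000000e (SONAME)             Library soname: [libtcl.so.0]'
printf '%s\n' "$sample" | dyn_entries
```

On a real build the input would come from something like `readelf -d pltcl.so | dyn_entries`. A pltcl.so that reports its own SONAME as libtcl.so.0 would be consistent with the bogus switch having been applied.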
Brent Verner <brent@rcfile.org> writes: > I am seeing the same Tcl_CreateInterp > problem you mentioned earlier. The pltcl language does not work > even from the 7.2b3 rpms. Can someone verify that pltcl works on > their stock redhat 7.2 system? Hmm, what Tcl version are you using? pltcl does not appear broken on my system, but I think what I've got installed is Tcl 8.0.5. regards, tom lane
Brent Verner <brent@rcfile.org> writes: > Can someone verify that pltcl works on > their stock redhat 7.2 system? Indeed it does not. On a straight-from-the-CD RH 7.2 install and CVS-tip Postgres, I see both of the behaviors Murray complained of. What I think is particularly nasty is that we get an exit(127) when the symbol resolution fails, leading to database restart. This will probably happen on *most* systems not only Linux, because we are specifying RTLD_LAZY in our dlopen() calls, meaning that missing symbols should be flagged when they are referenced at runtime --- and if we call a function that should be there and isn't, there's not much the dynamic loader can do except throw a signal or exit(). What we should be doing is specifying RTLD_NOW to dlopen(), so that any unresolved symbol failure occurs during dlopen(), when we are prepared to deal with it in a clean fashion. I ran into this same behavior years ago on HPUX and fixed it by using what they call BIND_IMMEDIATE mode; but I now see that most of the other ports are specifying RTLD_LAZY, and thus have this problem. Unless I hear a credible counter-argument, I am going to change RTLD_LAZY to RTLD_NOW in src/backend/port/dynloader/linux.h. I have tested that and it produces a clean error with no backend crash. 
What I would *like* to do is make the same change in all the port/dynloader files that reference RTLD_LAZY:

src/backend/port/dynloader/aix.h
src/backend/port/dynloader/bsdi.h
src/backend/port/dynloader/dgux.h
src/backend/port/dynloader/freebsd.h
src/backend/port/dynloader/irix5.h
src/backend/port/dynloader/linux.h
src/backend/port/dynloader/netbsd.h
src/backend/port/dynloader/openbsd.h
src/backend/port/dynloader/osf.h
src/backend/port/dynloader/sco.h
src/backend/port/dynloader/solaris.h
src/backend/port/dynloader/svr4.h
src/backend/port/dynloader/univel.h
src/backend/port/dynloader/unixware.h
src/backend/port/dynloader/win.h

However I'm a bit scared to do that at this late stage of the release cycle, because perhaps some of these platforms don't support the full dlopen() API. Comments? Can anyone test whether RTLD_NOW works on any of the above-mentioned ports?

regards, tom lane
Brent Verner <brent@rcfile.org> writes: > Indeed I've tracked the problem down to the line that links > the pltcl.so library: > gcc -pipe -shared -Wl,-soname,libtcl.so.0 -o pltcl.so pltcl.o -L/usr/lib -ltcl -ldl -lieee -lm -lc > ^^^^^^^^^^^ Yeah, removing the "-Wl,-soname,libtcl.so.0" switch produces a correctly functioning pltcl. > IIRC, this was changed to workaround another problem with the > tcl client library having name conflicts. This value (TCL_SHLIB_LD) > comes directly from the /usr/lib/tclConfig.sh file supplied by the > rpm. I seem to recall that this same problem was being debated a few weeks back, but apparently we didn't actually do anything about it. Looks like we have to. Peter, didn't you have a proposal on the table to fix this? regards, tom lane
> However I'm a bit scared to do that at this late stage of the release
> cycle, because perhaps some of these platforms don't support the full
> dlopen() API. Comments? Can anyone test whether RTLD_NOW works on
> any of the above-mentioned ports?

I can confirm that RTLD_NOW exists on BSD/OS. Can we do:

#ifdef RTLD_NOW
	use RTLD_NOW
#else
	whatever_is_there_now
#endif

in those ports at least for 7.2 so we can be sure we don't get failures.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
Murray Prior Hobbs <murray@efone.com> writes: > Brent sent me a fix > ----start Bent's fix ---- > rpm. You can add the following line to src/pl/tcl/Makefile > below "-include Makefile.tcldefs" > TCL_SHLIB_LD = gcc -shared > ---- end Brent's fix ----- > which i tried but did not work Works for me. Did you remove the pltcl.so file so it would be rebuilt? regards, tom lane
Tom Lane writes: > Peter, didn't you have a proposal on the table to fix this? Yeah, complain loudly to whoever dared to package a broken Tcl like that... Or we'll work with Andreas Zeugwetter's patches and eliminate the use of tclConfig.sh mostly. -- Peter Eisentraut peter_e@gmx.net
Tom Lane writes: > Unless I hear a credible counter-argument, I am going to change > RTLD_LAZY to RTLD_NOW in src/backend/port/dynloader/linux.h. I have > tested that and it produces a clean error with no backend crash. > > What I would *like* to do is make the same change in all the > port/dynloader files that reference RTLD_LAZY: RTLD_LAZY allows you to load shared library modules that contain circular references. I don't know if that's useful or just stupid, but on some systems the shared library models are pretty, um, different so that the need for this might arise from time to time. > However I'm a bit scared to do that at this late stage of the release > cycle, because perhaps some of these platforms don't support the full > dlopen() API. Comments? Can anyone test whether RTLD_NOW works on > any of the above-mentioned ports? I really don't think this is a good change to make now, as we don't know how well all of this is supported, and the failure scenario is annoying but not really that harmful. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: >> Unless I hear a credible counter-argument, I am going to change >> RTLD_LAZY to RTLD_NOW in src/backend/port/dynloader/linux.h. I have >> tested that and it produces a clean error with no backend crash. > RTLD_LAZY allows you to load shared library modules that contain circular > references. Does that not work with RTLD_NOW? I should think it would. In any case, I'm doubtful that we care. > I really don't think this is a good change to make now, as we don't know > how well all of this is supported, and the failure scenario is annoying > but not really that harmful. A database restart is always very bad news in my mind. You might be right that it's too risky to make such a change now for 7.2, but I still absolutely want to do it for 7.3. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: >> Peter, didn't you have a proposal on the table to fix this? > Yeah, complain loudly to whoever dared to package a broken Tcl like > that... Or we'll work with Andreas Zeugwetter's patches and eliminate the > use of tclConfig.sh mostly. Yeah, I was taking a second look at Andreas' patch myself. At the time, the report was that we were only seeing a failure in RPM-packaged Postgres and so I thought that the root problem was somewhere in our RPM script. However, I have now tried it for myself and can confirm that we fail in a Postgres source build too. The bogus soname switch might be blamable on the Tcl RPM package and not on Tcl sources, but that doesn't make a lot of difference to us either way. I'm still quite nervous about making these changes so late in the cycle. OTOH I suspect Andreas was right: we haven't been getting any pltcl portability testing from our beta testers. If it's broken now, we can hardly make it worse. regards, tom lane
Tom Lane writes: > I'm still quite nervous about making these changes so late in the cycle. > OTOH I suspect Andreas was right: we haven't been getting any pltcl > portability testing from our beta testers. This logic can also be reversed: We haven't been getting any beta testing from users of Red Hat 7.1. > If it's broken now, we can hardly make it worse. You can surely make things a lot worse for those that are using other operating systems. I certainly don't agree with making changes just because Red Hat blew it. -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > You can surely make things a lot worse for those that are using other > operating systems. I certainly don't agree with making changes just > because Red Hat blew it. It does appear that the problem can be blamed entirely on the RPM packaging of Tcl. I tried configuring from source on RHL 7.2, and neither tcl 8.3.2 nor 8.3.4 produce a "soname" switch in TCL_SHLIB_LD. In fact, grep can't find any occurrence of "soname" anywhere in the Tcl source distribution. Nonetheless, I'm not sure that "do nothing" is an acceptable response on our part. I tried setting up pltcl's makefile to dike out the offending switch: override TCL_SHLIB_LD := $(patsubst %soname%, , $(TCL_SHLIB_LD)) but could not get it to work --- gmake's pattern matching logic seems to be too brain-dead to cope with more than one % in a pattern. And override TCL_SHLIB_LD := $(patsubst -Wl,-soname%, , $(TCL_SHLIB_LD)) doesn't work either; apparently there's no way to escape the comma. Anyone know a cute hack to get gmake to do this? regards, tom lane
[2002-01-20 17:52] Tom Lane said: | Peter Eisentraut <peter_e@gmx.net> writes: | > You can surely make things a lot worse for those that are using other | > operating systems. I certainly don't agree with making changes just | > because Red Hat blew it. | | It does appear that the problem can be blamed entirely on the RPM | packaging of Tcl. I tried configuring from source on RHL 7.2, and | neither tcl 8.3.2 nor 8.3.4 produce a "soname" switch in TCL_SHLIB_LD. | In fact, grep can't find any occurrence of "soname" anywhere in the | Tcl source distribution. | | Nonetheless, I'm not sure that "do nothing" is an acceptable response | on our part. Agreed. I think working around this borkenness in the Makefile is the best solution; I don't think switching from RTLD_LAZY is good right now. | I tried setting up pltcl's makefile to dike out the offending switch: | | override TCL_SHLIB_LD := $(patsubst %soname%, , $(TCL_SHLIB_LD)) | | but could not get it to work --- gmake's pattern matching logic seems | to be too brain-dead to cope with more than one % in a pattern. And | | override TCL_SHLIB_LD := $(patsubst -Wl,-soname%, , $(TCL_SHLIB_LD)) | | doesn't work either; apparently there's no way to escape the comma. | Anyone know a cute hack to get gmake to do this? It seems that substvar operates on each " " separated token in the string. The following works for me. override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname.*//') cheers. brent -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
Brent Verner <brent@rcfile.org> writes:
> It seems that substvar operates on each " " separated token in the
> string. The following works for me.
>
> override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname.*//')

I suspect that the above works only because -Wl,-soname is the last switch in TCL_SHLIB_LD; any following switches would be removed too. Perhaps better

override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//')

regards, tom lane
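The difference between the two sed patterns is easy to see outside of make. This is a small sketch using a made-up TCL_SHLIB_LD value in which a further switch (`-fPIC`, chosen arbitrarily) follows the soname option; the Red Hat value quoted in this thread has nothing after it, which is why both patterns behave the same there.

```shell
#!/bin/sh
# Hypothetical value: the soname switch is NOT the last one.
value='gcc -pipe -shared -Wl,-soname,libtcl.so.0 -fPIC'

# '.*' is greedy and eats everything up to the end of the line.
greedy=$(echo "$value" | sed 's/-Wl,-soname.*//')

# '[^ ]*' stops at the next space, so later switches survive.
bounded=$(echo "$value" | sed 's/-Wl,-soname[^ ]*//')

echo "greedy:  [$greedy]"
echo "bounded: [$bounded]"
```

The bounded form leaves a doubled space where the switch used to be, which is harmless on a compiler command line.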
[2002-01-20 19:16] Tom Lane said: | Brent Verner <brent@rcfile.org> writes: | > It seems that substvar operates on each " " separated token in the | > string. The following works for me. | | > override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname.*//') | | I suspect that the above works only because -Wl,-soname is the last | switch in TCL_SHLIB_LD; any following switches would be removed too. | Perhaps better | | override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//' Yes, much better. cheers. brent -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
Murray Prior Hobbs <murray@efone.com> writes: > tell me oh mighty guru's > what linux distribution could i use to make me a happy happy man Just apply this patch and RHL should work. *** src/pl/tcl/Makefile.orig Sat Oct 13 00:23:50 2001 --- src/pl/tcl/Makefile Sun Jan 20 21:57:45 2002 *************** *** 49,54 **** --- 49,58 ---- endif endif + # Suppress bogus soname switch that RedHat RPMs put into tclConfig.sh + override TCL_SHLIB_LD := $(shell echo "$(TCL_SHLIB_LD)" | sed 's/-Wl,-soname[^ ]*//') + + %$(TCL_SHLIB_SUFFIX): %.o $(TCL_SHLIB_LD) -o $@ $< $(TCL_LIB_SPEC) $(SHLIB_EXTRA_LIBS) regards, tom lane
Tom Lane writes: > Just apply this patch and RHL should work. I'm OK with this patch. (Although you don't need the override.) We should file a bug report with Red Hat, methinks. > *** src/pl/tcl/Makefile.orig Sat Oct 13 00:23:50 2001 > --- src/pl/tcl/Makefile Sun Jan 20 21:57:45 2002 > *************** > *** 49,54 **** > --- 49,58 ---- > endif > endif > > + # Suppress bogus soname switch that RedHat RPMs put into tclConfig.sh > + override TCL_SHLIB_LD := $(shell echo "$(TCL_SHLIB_LD)" | sed 's/-Wl,-soname[^ ]*//') > + > + > %$(TCL_SHLIB_SUFFIX): %.o > $(TCL_SHLIB_LD) -o $@ $< $(TCL_LIB_SPEC) $(SHLIB_EXTRA_LIBS) -- Peter Eisentraut peter_e@gmx.net
Peter Eisentraut <peter_e@gmx.net> writes: > I'm OK with this patch. (Although you don't need the override.) Okay, committed. (I left in the override; it can't hurt can it?) > We should file a bug report with Red Hat, methinks. Trond, I think this is your turf ... is it a bug? regards, tom lane >> *** src/pl/tcl/Makefile.orig Sat Oct 13 00:23:50 2001 >> --- src/pl/tcl/Makefile Sun Jan 20 21:57:45 2002 >> *************** >> *** 49,54 **** >> --- 49,58 ---- >> endif >> endif >> >> + # Suppress bogus soname switch that RedHat RPMs put into tclConfig.sh >> + override TCL_SHLIB_LD := $(shell echo "$(TCL_SHLIB_LD)" | sed 's/-Wl,-soname[^ ]*//') >> + >> + >> %$(TCL_SHLIB_SUFFIX): %.o >> $(TCL_SHLIB_LD) -o $@ $< $(TCL_LIB_SPEC) $(SHLIB_EXTRA_LIBS) > -- > Peter Eisentraut peter_e@gmx.net
On Sun, Jan 20, 2002 at 01:40:17PM -0500, Tom Lane wrote: > What I would *like* to do is make the same change in all the > port/dynloader files that reference RTLD_LAZY: > src/backend/port/dynloader/openbsd.h I can't speak for other platforms but openbsd only has RTLD_LAZY. -- David Terrell | "... a grandiose, wasteful drug war that will never dbt@meat.net | be won as long as so many Americans need to Nebcorp Prime Minister | anesthetize themselves to get through the day." http://wwn.nebcorp.com/ | -Camille Paglia
Re: RTLD_LAZY considered harmful (Re: pltlc and pltlcu problems)
From: "Christopher Kings-Lynne"
> On Sun, Jan 20, 2002 at 01:40:17PM -0500, Tom Lane wrote:
> > What I would *like* to do is make the same change in all the
> > port/dynloader files that reference RTLD_LAZY:
> > src/backend/port/dynloader/openbsd.h
>
> I can't speak for other platforms but openbsd only has RTLD_LAZY.

FreeBSD supports both:

RTLD_LAZY   Each external function reference is resolved when the function is first called.

RTLD_NOW    All external function references are bound immediately by dlopen().

RTLD_LAZY is normally preferred, for reasons of efficiency. However, RTLD_NOW is useful to ensure that any undefined symbols are discovered

Chris
Christopher Kings-Lynne wrote: > > On Sun, Jan 20, 2002 at 01:40:17PM -0500, Tom Lane wrote: > > > What I would *like* to do is make the same change in all the > > > port/dynloader files that reference RTLD_LAZY: > > > src/backend/port/dynloader/openbsd.h > > > > I can't speak for other platforms but openbsd only has RTLD_LAZY. > > FreeBSD supports both: > > RTLD_LAZY Each external function reference is resolved when the func- > tion is first called. > > RTLD_NOW All external function references are bound immediately by > dlopen(). > > RTLD_LAZY is normally preferred, for reasons of efficiency. However, > RTLD_NOW is useful to ensure that any undefined symbols are discovered > Interesting LAZY has better efficiency. Seems we should just keep LAZY as our default for future releases and tell people if they link to bad object files, they should expect trouble. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: > > > I'm still quite nervous about making these changes so late in the cycle. > > OTOH I suspect Andreas was right: we haven't been getting any pltcl > > portability testing from our beta testers. > > This logic can also be reversed: We haven't been getting any beta testing > from users of Red Hat 7.1. I don't think the tcl there had a proper so-name (from memory, I don't use tcl) -- Trond Eivind Glomsrød Red Hat, Inc.
Tom Lane <tgl@sss.pgh.pa.us> writes: > Peter Eisentraut <peter_e@gmx.net> writes: > > I'm OK with this patch. (Although you don't need the override.) > > Okay, committed. (I left in the override; it can't hurt can it?) > > > We should file a bug report with Red Hat, methinks. > > Trond, I think this is your turf ... is it a bug? (Note, I don't do tcl.) At least part of it is intentional - adding the so name to the tcl library. That said, it looks like a bug to me too... at the minimum, the soname should be removed from the tclConfig.sh script after use in generating the tcl libraries. Adrian? -- Trond Eivind Glomsrød Red Hat, Inc.
On Sun, Jan 20, 2002 at 01:40:17PM -0500, Tom Lane wrote: > cycle, because perhaps some of these platforms don't support the full > dlopen() API. Comments? Can anyone test whether RTLD_NOW works on > any of the above-mentioned ports? Didn't check it *works*, but from $NetBSD: dlfcn.h,v 1.13 2000/06/13 01:21:53 /* Values for dlopen `mode'. */ #define RTLD_LAZY 1 #define RTLD_NOW 2 #define RTLD_GLOBAL 0x100 /* Allow global searches in object */ #define RTLD_LOCAL 0x200 #if !defined(_XOPEN_SOURCE) #define DL_LAZY RTLD_LAZY /* Compat */ #endif Cheers, Patrick
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Interesting LAZY has better efficiency. "Efficiency" by what measure? I would think that batch resolution of symbols would be faster than taking a trap for each one. > Seems we should just keep LAZY > as our default for future releases and tell people if they link to bad > object files, they should expect trouble. (a) How are they going to find out if the object files are bad, other than by crashing their database? I *really* don't like the attitude that a backend crash is okay. Under any circumstances, development or not. (b) Badness may depend on context, eg LD_LIBRARY_PATH. So it's not really safe to assume that if it worked before then you don't have to worry about it crashing you in production. regards, tom lane
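On point (a), one rough way to look for unresolved symbols before the backend ever loads a module is to ask the dynamic loader to do all relocations up front, outside the database. The sketch below assumes glibc's `ldd -r` and its "undefined symbol: NAME (file)" report format; the helper name `undefined_symbols` is hypothetical and the sample line is illustrative input, not captured output. As point (b) says, the answer depends on the environment (for example LD_LIBRARY_PATH), so a clean result on one machine proves little about another.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd -r` output on stdin and print only the
# names of symbols reported as undefined.
undefined_symbols() {
    sed -n 's/^undefined symbol: *\([^[:space:]]*\).*/\1/p'
}

# Illustrative input only, using the symbol named earlier in this thread.
printf 'undefined symbol: Tcl_CreateInterp\t(./pltcl.so)\n' | undefined_symbols
```

Against a real module this would be run as `ldd -r ./pltcl.so 2>&1 | undefined_symbols`; an empty result means the loader could resolve everything in that environment.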
Bruce Momjian writes: > Interesting LAZY has better efficiency. Seems we should just keep LAZY > as our default for future releases and tell people if they link to bad > object files, they should expect trouble. In practice, we load object files only if we call the function, so symbol resolution happens either way briefly after loading. RTLD_NOW includes some overhead because it checks symbols that we might not end up needing, but for the typical PostgreSQL extension module, that should really not matter. -- Peter Eisentraut peter_e@gmx.net
On Sun, Jan 20, 2002 at 07:16:50PM -0500, Tom Lane wrote:
> Brent Verner <brent@rcfile.org> writes:
> > It seems that substvar operates on each " " separated token in the
> > string. The following works for me.
> >
> > override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname.*//')
>
> I suspect that the above works only because -Wl,-soname is the last
> switch in TCL_SHLIB_LD; any following switches would be removed too.
> Perhaps better
>
> override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//')

Sorry, but by this you broke freebsd build which has:

TCL_SHLIB_LD = ld -shared -x -soname $@

and $@ gets substituted too early

can you restrict this hack by putting something like

ifeq ($(PORTNAME), linux)
override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//')
endif

instead?
Red Hat 7.2 Regression failures (Re: pltcl build problem on FreeBSD (was: Re: pltlc and pltlcu problems))
From: Lamar Owen
[I trimmed the CC list. It was getting out of hand. Boy, how I despise having to use reply-all.... :-)] On Wednesday 23 January 2002 04:44 am, Murray Prior Hobbs wrote: > and ran the check (make check) - 79 tests passed > then i ran the make installcheck > and get precisely the same failures as i got on my 686 over the last 3 days > could somebody else confirm these findings please or suggest what's going > on This probably has nothing to do with the TCL issue. It is the locale setting biting you. The first regression run using make check is using the C locale -- which passes all tests. The second run isn't using the C locale, apparently. And those tests fail when the locale is not 'C'. -- Lamar Owen WGCR Internet Radio 1 Peter 4:11
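If locale is indeed the cause, the effect is easy to reproduce with plain `sort`, since string comparison rules are what differ between the C locale and a language locale. This is only an illustration of the general mechanism, not of the regression suite itself; the sample strings are arbitrary.

```shell
#!/bin/sh
# Under the C locale, strings compare byte by byte, so every uppercase
# ASCII letter sorts before every lowercase one.
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort)
echo "$c_sorted"

# Under a language locale such as en_US (if one is installed), the same
# input typically comes out with upper and lower case interleaved, which
# is the kind of difference that changes ordered query output.
```

Running the second case is left out on purpose, because which locales are installed varies from system to system.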
Murray Prior Hobbs <murray@efone.com> writes: > i made and installed > and ran the check (make check) - 79 tests passed > then i ran the make installcheck > and get precisely the same failures as i got on my 686 over the last 3 days PATH pointing at the wrong thing, or other conflicting environment variables (eg PGPORT), I'd guess. regards, tom lane
Vsevolod Lobko <seva@sevasoft.kiev.ua> writes: >> Perhaps better >> >> override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//' > Sorry, but by this you broke freebsd build which has: > TCL_SHLIB_LD = ld -shared -x -soname $@ > and $@ gets substituted too early How annoying. I was debating whether to use single or double quotes around the echo parameter --- looks like I made the wrong choice. Will fix. regards, tom lane
Vsevolod Lobko writes: > > override TCL_SHLIB_LD := $(shell echo $(TCL_SHLIB_LD) | sed 's/-Wl,-soname[^ ]*//' > > Sorry, but by this you broke freebsd build which has: > > TCL_SHLIB_LD = ld -shared -x -soname $@ > > and $@ gets substituted too early I've submitted a bug report to FreeBSD about this. Let's hope they fix it soon. -- Peter Eisentraut peter_e@gmx.net
[2002-01-23 11:54] Tom Lane said: | Peter Eisentraut <peter_e@gmx.net> writes: | > I've submitted a bug report to FreeBSD about this. Let's hope they fix it | > soon. | | Will it not work to do | | $(shell echo '$(TCL_SHLIB_LD)' | sed ... No. I just tested this, and the $@ still got expanded too early. We'll need to use the suggested ifeq($(PORTNAME),linux) test. brent -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
Brent Verner <brent@rcfile.org> writes: > | Will it not work to do > | > | $(shell echo '$(TCL_SHLIB_LD)' | sed ... > No. I just tested this, and the $@ still got expanded too early. [ scratches head ... ] Where is the expansion happening, then? Seems weird. > We'll need to use the suggested ifeq($(PORTNAME),linux) test. I don't much like that since it makes an inappropriate assumption, viz that if you're on Linux you must have a TCL_SHLIB_LD value that hasn't got any $variables in it. I'd prefer to figure out *why* we are getting a premature evaluation. regards, tom lane
Peter Eisentraut <peter_e@gmx.net> writes: > I've submitted a bug report to FreeBSD about this. Let's hope they fix it > soon. Will it not work to do $(shell echo '$(TCL_SHLIB_LD)' | sed ... ? regards, tom lane
Tom Lane wrote: > Brent Verner <brent@rcfile.org> writes: > > | Will it not work to do > > | > > | $(shell echo '$(TCL_SHLIB_LD)' | sed ... > > > No. I just tested this, and the $@ still got expanded too early. > > [ scratches head ... ] Where is the expansion happening, then? Seems > weird. > > > We'll need to use the suggested ifeq($(PORTNAME),linux) test. > > I don't much like that since it makes an inappropriate assumption, > viz that if you're on Linux you must have a TCL_SHLIB_LD value that > hasn't got any $variables in it. I'd prefer to figure out *why* we > are getting a premature evaluation. As a data point, now that FreeBSD is showing problems too, I see this on BSD/OS with TCL 8.3 in tclConfig.sh: TCL_SHLIB_LD='cc -shared' Does this mean I don't have the problem here? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
[2002-01-23 12:06] Tom Lane said: | Brent Verner <brent@rcfile.org> writes: | > | Will it not work to do | > | | > | $(shell echo '$(TCL_SHLIB_LD)' | sed ... | | > No. I just tested this, and the $@ still got expanded too early. | | [ scratches head ... ] Where is the expansion happening, then? Seems | weird. apparently the $(shell ...) construct expands any shell-like vars. from make's info file: The `shell' function performs the same function that backquotes (``') perform in most shells: it does "command expansion". | > We'll need to use the suggested ifeq($(PORTNAME),linux) test. | | I don't much like that since it makes an inappropriate assumption, | viz that if you're on Linux you must have a TCL_SHLIB_LD value that | hasn't got any $variables in it. I'd prefer to figure out *why* we | are getting a premature evaluation. maybe check for a '$' in the TCL_SHLIB_LD ? I've been trying $(findstring ...) but have not gotten that to work right yet. brent -- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
Tom Lane writes: > [ scratches head ... ] Where is the expansion happening, then? Seems > weird. The $@ is expanded as a make variable. Make does care whether you're executing a $(shell) thing around it. However, it seems that $@ should expand to nothing in that assignment, so where's the problem? -- Peter Eisentraut peter_e@gmx.net
Tom Lane <tgl@sss.pgh.pa.us> writes: > Brent Verner <brent@rcfile.org> writes: > > | Will it not work to do > > | > > | $(shell echo '$(TCL_SHLIB_LD)' | sed ... > > > No. I just tested this, and the $@ still got expanded too early. > > [ scratches head ... ] Where is the expansion happening, then? Seems > weird. The expansion is done by make. It does expand variables recursively, even for :=. Using $$@ instead of $@ should help in this particular case, I think. Bernhard -- Intevation GmbH http://intevation.de/ Sketch http://sketch.sourceforge.net/ MapIt! http://mapit.de/
Bernhard Herzog <bh@intevation.de> writes: > Using $$@ instead of $@ should help in this particular > case, I think. But we haven't got control of what the initial value of TCL_SHLIB_LD is. Hmm. Wait a minute; we're going about this all wrong. Instead of hacking the Makefile, let's hack mkMakefile.tcldefs.sh. We can definitely fix the TCL_SHLIB_LD value there, before Make sees it. regards, tom lane
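A rough sketch of what the shell-side approach could look like follows. It is not the actual contents of mkMakefile.tcldefs.sh; the function name `strip_soname` is invented for illustration, and the two input values are the Red Hat and FreeBSD settings quoted earlier in this thread. Because the value is handled as an ordinary shell string, a literal `$@` passes through untouched.

```shell
#!/bin/sh
# Hypothetical helper: remove a "-Wl,-soname..." switch from a
# TCL_SHLIB_LD value while leaving everything else as it is.
strip_soname() {
    # printf '%s\n' prints the argument verbatim; nothing expands "$@"
    # inside the data because it is just text within a quoted variable.
    printf '%s\n' "$1" | sed 's/-Wl,-soname[^ ]*//'
}

redhat='gcc -pipe -shared -Wl,-soname,libtcl.so.0'
freebsd='ld -shared -x -soname $@'

strip_soname "$redhat"     # soname switch removed
strip_soname "$freebsd"    # unchanged: pattern does not match "-soname $@"
```

The single quotes around the FreeBSD value matter here: they are what keep the shell from expanding `$@` when the variable is assigned.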
Peter Eisentraut <peter_e@gmx.net> writes: > Tom Lane writes: >> [ scratches head ... ] Where is the expansion happening, then? Seems >> weird. > The $@ is expanded as a make variable. Make does care whether you're > executing a $(shell) thing around it. However, it seems that $@ should > expand to nothing in that assignment, so where's the problem? I think the complaint is that we need it to still look like $@ when TCL_SHLIB_LD is used in the shlib-building rule. If Make recursively expands the variable before executing the $shell construct then we got trouble. Ugly as it is, the check on portname may be the best solution available. I'm gonna think a little more, but I haven't got a better idea now. regards, tom lane
[2002-01-23 12:29] Brent Verner said:
| [2002-01-23 12:06] Tom Lane said:
| | Brent Verner <brent@rcfile.org> writes:
| | > | Will it not work to do
| | > |
| | > | $(shell echo '$(TCL_SHLIB_LD)' | sed ...
| |
| | > No. I just tested this, and the $@ still got expanded too early.
| |
| | [ scratches head ... ] Where is the expansion happening, then? Seems
| | weird.
|
| apparently the $(shell ...) construct expands any shell-like
| vars.
|
| from make's info file:
|
| The `shell' function performs the same function that backquotes
| (``') perform in most shells: it does "command expansion".
|
| | > We'll need to use the suggested ifeq($(PORTNAME),linux) test.
| |
| | I don't much like that since it makes an inappropriate assumption,
| | viz that if you're on Linux you must have a TCL_SHLIB_LD value that
| | hasn't got any $variables in it. I'd prefer to figure out *why* we
| | are getting a premature evaluation.
|
| maybe check for a '$' in the TCL_SHLIB_LD ? I've been trying
| $(findstring ...) but have not gotten that to work right yet.

The best I can come up with is checking for the offending string 'libtcl.so.0' in the TCL_SHLIB_LD.

ifneq (,$(findstring libtcl.so.0,$(TCL_SHLIB_LD)))
TCL_SHLIB_LD := $(shell echo '$(TCL_SHLIB_LD)' | sed 's/-Wl,-soname[^ ]*//')
endif

any better ideas out there? I've exhausted my PG time for today ;-)

brent

-- "Develop your talent, man, and leave the world something. Records are really gifts from people. To think that an artist would love you enough to share his music with anyone is a beautiful thing." -- Duane Allman
On Wed, Jan 23, 2002 at 12:59:00PM -0500, Peter Eisentraut wrote: > Tom Lane writes: > > > [ scratches head ... ] Where is the expansion happening, then? Seems > > weird. > > The $@ is expanded as a make variable. Make does care whether you're > executing a $(shell) thing around it. However, it seems that $@ should > expand to nothing in that assignment, so where's the problem?

That is exactly the problem: $@ expands to nothing, so make ends up trying to execute

    ld -shared -x -soname -o pltcl.so pltcl.o

with no argument to -soname.
Re: Red Hat 7.2 Regression failures (Re: pltcl build problem on FreeBSD (was: Re: pltlc and pltlcu problems))
From: Murray Prior Hobbs
OK, I dropped the locale support and reconfigured, remade, reinstalled etc. on two machines. Both machines failed, but with different failures. See below Lamar's mail for details of the two sets of failures.

murray

PS: I'll remove configure options until nothing fails.

Lamar Owen wrote: >[I trimmed the CC list. It was getting out of hand. Boy, how I despise having >to use reply-all.... :-)] > >On Wednesday 23 January 2002 04:44 am, Murray Prior Hobbs wrote: > >>and ran the check (make check) - 79 tests passed >> >>then i ran the make installcheck >> >>and get precisely the same failures as i got on my 686 over the last 3 days >> >>could somebody else confirm these findings please or suggest what's going >>on >> > >This probably has nothing to do with the TCL issue. It is the locale setting >biting you. The first regression run using make check is using the C locale >-- which passes all tests. The second run isn't using the C locale, >apparently. And those tests fail when the locale is not 'C'. >

These are the make installcheck failures for the following configuration:

./configure --enable-multibyte=UNICODE --enable-unicode-conversion --enable-locale --with-tcl --without-tk --enable-odbc --with-unixodbc --enable-syslog

/bin/sh ./pg_regress --schedule=./serial_schedule --multibyte=UNICODE (using postmaster on Unix socket, default port) ============== dropping database "regression" ============== ERROR: DROP DATABASE: database "regression" does not exist dropdb: database removal failed ============== creating database "regression" ============== CREATE DATABASE ============== dropping regression test user accounts ============== ============== installing PL/pgSQL ============== ============== running regression test queries ============== test boolean ... ok test char ... FAILED test name ... ok test varchar ... FAILED test text ... ok test int2 ... ok test int4 ... ok test int8 ... FAILED test oid ... ok test float4 ... ok test float8 ... ok test bit ... 
ok test numeric ... FAILED test strings ... ok test numerology ... ok test point ... ok test lseg ... ok test box ... ok test path ... ok test polygon ... ok test circle ... ok test date ... ok test time ... ok test timetz ... ok test timestamp ... ok test timestamptz ... ok test interval ... ok test abstime ... ok test reltime ... ok test tinterval ... ok test inet ... ok test comments ... ok test oidjoins ... ok test type_sanity ... ok test opr_sanity ... ok test geometry ... ok test horology ... ok test create_function_1 ... ok test create_type ... ok test create_table ... ok test create_function_2 ... ok test copy ... FAILED test constraints ... ok test triggers ... ok test create_misc ... ok test create_aggregate ... ok test create_operator ... ok test create_index ... ok test inherit ... ok test create_view ... ok test sanity_check ... ok test errors ... ok test select ... ok test select_into ... ok test select_distinct ... ok test select_distinct_on ... ok test select_implicit ... FAILED test select_having ... FAILED test subselect ... ok test union ... ok test case ... ok test join ... ok test aggregates ... ok test transactions ... ok test random ... ok test portals ... ok test arrays ... ok test btree_index ... ok test hash_index ... ok test privileges ... ok test misc ... FAILED test select_views ... FAILED test alter_table ... ok test portals_p2 ... ok test rules ... ok test foreign_key ... ok test limit ... ok test plpgsql ... ok test temp ... ok
=======================
9 of 79 tests failed.
=======================

These are the make installcheck failures for the following configuration, with the locale option dropped:

./configure --enable-multibyte=UNICODE --enable-unicode-conversion --with-tcl --without-tk --enable-odbc --with-unixodbc --enable-syslog

/bin/sh ./pg_regress --schedule=./serial_schedule --multibyte=UNICODE (using postmaster on Unix socket, default port) ============== dropping database "regression" ============== ERROR: DROP DATABASE: database "regression" does not exist dropdb: database removal failed ============== creating database "regression" ============== CREATE DATABASE ============== dropping regression test user accounts ============== ============== installing PL/pgSQL ============== ============== running regression test queries ============== test boolean ... ok test char ... ok test name ... ok test varchar ... ok test text ... ok test int2 ... ok test int4 ... ok test int8 ... ok test oid ... ok test float4 ... ok test float8 ... ok test bit ... ok test numeric ... ok test strings ... ok test numerology ... ok test point ... ok test lseg ... ok test box ... ok test path ... ok test polygon ... ok test circle ... ok test date ... ok test time ... ok test timetz ... ok test timestamp ... ok test timestamptz ... ok test interval ... ok test abstime ... ok test reltime ... ok test tinterval ... ok test inet ... ok test comments ... ok test oidjoins ... ok test type_sanity ... ok test opr_sanity ... ok test geometry ... ok test horology ... ok test create_function_1 ... ok test create_type ... ok test create_table ... ok test create_function_2 ... ok test copy ... FAILED test constraints ... ok test triggers ... ok test create_misc ... ok test create_aggregate ... ok test create_operator ... ok test create_index ... ok test inherit ... ok test create_view ... ok test sanity_check ... ok test errors ... ok test select ... FAILED test select_into ... ok test select_distinct ... FAILED test select_distinct_on ... 
FAILED test select_implicit ... ok test select_having ... ok test subselect ... ok test union ... ok test case ... ok test join ... ok test aggregates ... FAILED test transactions ... ok test random ... failed (ignored) test portals ... ok test arrays ... ok test btree_index ... ok test hash_index ... ok test privileges ... ok test misc ... FAILED test select_views ... ok test alter_table ... ok test portals_p2 ... FAILED test rules ... ok test foreign_key ... ok test limit ... FAILED test plpgsql ... ok test temp ... ok
====================================================
9 of 79 tests failed, 1 of these failures ignored.
====================================================
Vsevolod Lobko writes: > Sorry, but by this you broke freebsd build which has: > > TCL_SHLIB_LD = ld -shared -x -soname $@ > > and $@ gets substituted too early FreeBSD has fixed this now. -- Peter Eisentraut peter_e@gmx.net
I hate to sound like a broken record, but I want to re-open that discussion about RTLD_LAZY binding that trailed off a week or two ago.

I have just noticed that the 7.0 and 7.1 versions of src/backend/port/dynloader/linux.h have

    #define pg_dlopen(f) dlopen(f, 2)

which in 7.2 has been changed to

    #define pg_dlopen(f) dlopen((f), RTLD_LAZY | RTLD_GLOBAL)

But a quick look in /usr/include/bits/dlfcn.h shows that (at least on RH Linux 7.2), the old coding was equivalent to RTLD_NOW.

I therefore assert that the current coding is effectively untested on Linux, which is probably our most popular platform, and therefore it should *NOT* be accorded the respect normally due to the status quo. Arguably, 7.2 has introduced breakage here.

A grep through the 7.1 versions of src/backend/port/dynloader/*.h shows the following rather motley assortment of dlopen flag choices:

    aix.h:61:#define pg_dlopen(f) dlopen(f, RTLD_LAZY)
    bsdi.h:23:#define pg_dlopen(f) dlopen(f, RTLD_LAZY)
    dgux.h:26:#define pg_dlopen(f) dlopen(f,1)
    freebsd.h:36:#define pg_dlopen(f) BSD44_derived_dlopen(f, 1)
    irix5.h:29:#define pg_dlopen(f) dlopen(f,1)
    linux.h:34:#define pg_dlopen(f) dlopen(f, 2)
    netbsd.h:36:#define pg_dlopen(f) BSD44_derived_dlopen(f, 1)
    openbsd.h:36:#define pg_dlopen(f) BSD44_derived_dlopen(f, 1)
    osf.h:31:#define pg_dlopen(f) dlopen(f, RTLD_LAZY)
    sco.h:29:#define pg_dlopen(f) dlopen(f,1)
    solaris.h:9:#define pg_dlopen(f) dlopen(f,1)
    sunos4.h:29:#define pg_dlopen(f) dlopen(f, 1)
    svr4.h:29:#define pg_dlopen(f) dlopen(f,RTLD_LAZY)
    univel.h:29:#define pg_dlopen(f) dlopen(f,RTLD_LAZY)
    unixware.h:29:#define pg_dlopen(f) dlopen(f,RTLD_LAZY)
    win.h:29:#define pg_dlopen(f) dlopen(f,1)

In 7.2 these have all been changed to "RTLD_LAZY | RTLD_GLOBAL", but I am no longer willing to presume that that's equivalent to the original coding. Could people who have these platforms look to see what the numeric values mentioned above actually equate to on their platforms?

regards, tom lane
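For anyone checking a platform, a small C sketch along the following lines could report what the local <dlfcn.h> actually defines. It is only an illustration: the helper names print_dlopen_flags and mode_is_rtld_now are hypothetical and do not come from the PostgreSQL sources.

```c
#include <dlfcn.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the PostgreSQL tree): print the numeric
 * value of each dlopen() mode flag that this platform's <dlfcn.h> defines.
 * Returns the number of flags that were printed.
 */
int
print_dlopen_flags(FILE *out)
{
    int n = 0;

#ifdef RTLD_LAZY
    fprintf(out, "RTLD_LAZY   = 0x%x\n", (unsigned int) RTLD_LAZY);
    n++;
#endif
#ifdef RTLD_NOW
    fprintf(out, "RTLD_NOW    = 0x%x\n", (unsigned int) RTLD_NOW);
    n++;
#endif
#ifdef RTLD_GLOBAL
    fprintf(out, "RTLD_GLOBAL = 0x%x\n", (unsigned int) RTLD_GLOBAL);
    n++;
#endif
    return n;
}

/*
 * Hypothetical helper: does a hard-coded legacy mode (for example the
 * literal 2 in the 7.1 linux.h) equal this platform's RTLD_NOW?
 * Returns 0 when the platform does not define RTLD_NOW at all.
 */
int
mode_is_rtld_now(int mode)
{
#ifdef RTLD_NOW
    return mode == RTLD_NOW;
#else
    (void) mode;
    return 0;
#endif
}
```

Calling print_dlopen_flags(stdout) from a main() and comparing the result with the literals in the grep output above shows whether the old 1 or 2 meant lazy or immediate binding on that system.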
[2002-02-10 20:00] Tom Lane said: | freebsd.h:36:#define pg_dlopen(f) BSD44_derived_dlopen(f, 1) | netbsd.h:36:#define pg_dlopen(f) BSD44_derived_dlopen(f, 1)

freebsd 4.5
netbsd 1.5.2

    #define RTLD_LAZY 1
    #define RTLD_GLOBAL 0x100

cheers. brent
Brent Verner <brent@rcfile.org> writes: > freebsd 4.5 > netbsd 1.5.2 > #define RTLD_LAZY 1 > #define RTLD_GLOBAL 0x100 Thanks. Is there an RTLD_NOW symbol on those platforms? regards, tom lane
[2002-02-10 21:24] Tom Lane said: | Brent Verner <brent@rcfile.org> writes: | > freebsd 4.5 | > netbsd 1.5.2 | | > #define RTLD_LAZY 1 | > #define RTLD_GLOBAL 0x100 | | Thanks. Is there an RTLD_NOW symbol on those platforms?

Yes. I've attached the dlfcn.h files from each in case there is anything else in there that might be of interest.

cheers. brent
I wrote: > I hate to sound like a broken record, but I want to re-open that > discussion about RTLD_LAZY binding that trailed off a week or two > ago. > ... I therefore assert that the current coding is effectively untested > on Linux, which is probably our most popular platform, and therefore > it should *NOT* be accorded the respect normally due to the status > quo. Arguably, 7.2 has introduced breakage here.

After some further digging around on the net, I believe that coding in the following style is safe and will work on all systems supporting dlopen():

    /*
     * In older systems, like SunOS 4.1.3, the RTLD_NOW flag isn't defined
     * and the mode argument to dlopen must always be 1.  The RTLD_GLOBAL
     * flag is wanted if available, but it doesn't exist everywhere.
     * If it doesn't exist, set it to 0 so it has no effect.
     */
    #ifndef RTLD_NOW
    # define RTLD_NOW 1
    #endif
    #ifndef RTLD_GLOBAL
    # define RTLD_GLOBAL 0
    #endif

    #define pg_dlopen(f) dlopen((f), RTLD_NOW | RTLD_GLOBAL)

I also believe that this will produce more consistent cross-platform behavior: so far as I could learn from googling, systems that do not define RTLD_NOW/RTLD_LAZY all act as though the mode were RTLD_NOW, ie, immediate binding.

Any objections to modifying all the port/dynloader files this way?

regards, tom lane
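As a rough illustration of how the proposed fallback definitions behave in use, here is a minimal C sketch. The macro is the one proposed above; the wrapper try_load and the path in the usage note are hypothetical, introduced only for this example. With immediate binding, an unresolved function reference in the library should surface as a dlopen() failure at load time, not at first call.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Fallbacks as proposed in the message above. */
#ifndef RTLD_NOW
#define RTLD_NOW 1
#endif
#ifndef RTLD_GLOBAL
#define RTLD_GLOBAL 0
#endif

#define pg_dlopen(f) dlopen((f), RTLD_NOW | RTLD_GLOBAL)

/*
 * Hypothetical wrapper (illustration only): try to load a shared object
 * with immediate binding.  Returns 1 on success and 0 on failure; on
 * failure *err, if err is non-NULL, points at the dlerror() message.
 */
int
try_load(const char *path, const char **err)
{
    void *handle = pg_dlopen(path);

    if (handle == NULL)
    {
        if (err != NULL)
            *err = dlerror();
        return 0;
    }

    if (err != NULL)
        *err = NULL;
    dlclose(handle);
    return 1;
}
```

A caller might run try_load("/some/path/pltcl.so", &msg), where the path is a placeholder, and print msg on failure. On older systems the program may need to be linked with -ldl.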
On Mon, Feb 11, 2002 at 07:49:57PM -0500, Tom Lane wrote: > I also believe that this will produce more consistent cross-platform > behavior: so far as I could learn from googling, systems that do not > define RTLD_NOW/RTLD_LAZY all act as though the mode were RTLD_NOW, > ie, immediate binding. > > Any objections to modifying all the port/dynloader files this way?

OpenBSD:

    The dlopen() function takes a name of a shared object as its first argument. The shared object is mapped into the address space, relocated, and its external references are resolved in the same way as is done with the implicitly loaded shared libraries at program startup. The path argument can either be an absolute pathname or it can be of the form ``lib<name>.so[.xx[.yy]]'' in which case the same library search rules apply that are used for ``intrinsic'' shared library searches. The second argument currently has no effect, but should be set to DL_LAZY for future compatibility.

That last sentence being key....

--
David Terrell            | "War is peace,
Prime Minister, Nebcorp  | freedom is slavery,
dbt@meat.net             | ignorance is strength
http://wwn.nebcorp.com/  | Dishes are clean." - Chris Fester