Thread: narwhal and PGDLLIMPORT
I happened to notice today that the owner of buildfarm member narwhal is trying to revive it after a long time offline, but it's failing in the 9.3 branch (and not attempting to build HEAD, yet). The cause appears to be that contrib/postgres_fdw is referencing the DateStyle and IntervalStyle variables, which aren't marked PGDLLIMPORT. Hm, well, that would be an easy change ... but that code was committed last March. How is it that we didn't notice this long ago? What this seems to indicate is that narwhal is the only buildfarm animal that has a need for PGDLLIMPORT marks on global variables that are referenced by extensions. Furthermore, nobody has attempted to build 9.3 on a platform that needs that (or at least they've not reported failure to us). According to the buildfarm database, narwhal is running a gcc build on Windows 2003. That hardly seems like a mainstream use case. I could believe that it might be of interest to developers, but clearly no developers are actually running such a build. I think we should give serious consideration to desupporting this combination so that we can get rid of the plague of PGDLLIMPORT marks. Obviously this would depend on confirming that there are no more-interesting Windows build methods that require it --- but if there are any, I'd sure demand that there be an active buildfarm instance to keep us from breaking the case again in future. Thoughts? regards, tom lane
On 2014-02-01 20:03:58 -0500, Tom Lane wrote: > I think we should give serious consideration to desupporting this > combination so that we can get rid of the plague of PGDLLIMPORT > marks. Obviously this would depend on confirming that there are > no more-interesting Windows build methods that require it --- but > if there are any, I'd sure demand that there be an active buildfarm > instance to keep us from breaking the case again in future. Weren't there more recent cases of needing to add PGDLLIMPORTs? I certainly remember pushing code to Craig's jenkins instance which made me do so for recent msvc builds... E.g. 7d7eee8bb702d7796a0d7c5886c1f4685f2e2806 and 708c529c7fdeba9387825d746752fc6f439d781e just a few days ago seems to be some of those cases. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/01/2014 08:03 PM, Tom Lane wrote: > I happened to notice today that the owner of buildfarm member narwhal > is trying to revive it after a long time offline, but it's failing in > the 9.3 branch (and not attempting to build HEAD, yet). The cause > appears to be that contrib/postgres_fdw is referencing the DateStyle > and IntervalStyle variables, which aren't marked PGDLLIMPORT. > Hm, well, that would be an easy change ... but that code was committed > last March. How is it that we didn't notice this long ago? > > What this seems to indicate is that narwhal is the only buildfarm > animal that has a need for PGDLLIMPORT marks on global variables that > are referenced by extensions. Furthermore, nobody has attempted to > build 9.3 on a platform that needs that (or at least they've not > reported failure to us). > > According to the buildfarm database, narwhal is running a gcc build on > Windows 2003. That hardly seems like a mainstream use case. I could > believe that it might be of interest to developers, but clearly no > developers are actually running such a build. > > I think we should give serious consideration to desupporting this > combination so that we can get rid of the plague of PGDLLIMPORT > marks. Obviously this would depend on confirming that there are > no more-interesting Windows build methods that require it --- but > if there are any, I'd sure demand that there be an active buildfarm > instance to keep us from breaking the case again in future. > > Thoughts? > > I don't think that it's Windows 2003 that matters so much here as the fact that it's a very old version of gcc. For example, my fairly old buildfarm animal frogmouth, which will be decommissioned soon when WindowsNT is totally unsupported, runs gcc 4.5.0, whereas narwhal runs 3.4.2. If anyone is really using such an old version, I think they should stop, immediately. cheers andrew
On 2014-02-02 02:15:39 +0100, Andres Freund wrote: > On 2014-02-01 20:03:58 -0500, Tom Lane wrote: > > I think we should give serious consideration to desupporting this > > combination so that we can get rid of the plague of PGDLLIMPORT > > marks. Obviously this would depend on confirming that there are > > no more-interesting Windows build methods that require it --- but > > if there are any, I'd sure demand that there be an active buildfarm > > instance to keep us from breaking the case again in future. > > Weren't there more recent cases of needing to add PGDLLIMPORTs? I > certainly remember pushing code to Craig's jenkins instance which made > me do so for recent msvc > builds... E.g. 7d7eee8bb702d7796a0d7c5886c1f4685f2e2806 and > 708c529c7fdeba9387825d746752fc6f439d781e just a few days ago seems to be > some of those cases. And indeed, currawong failed with a link time error: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=currawong&dt=2014-01-17%2013%3A30%3A21 I don't know why it didn't fail for postgres_fdw. Hm, maybe it's because Mkvcbuild.pm links postgres_fdw with libpq, but not test_shm_mq? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-01 20:03:58 -0500, Tom Lane wrote: >> I think we should give serious consideration to desupporting this >> combination so that we can get rid of the plague of PGDLLIMPORT >> marks. Obviously this would depend on confirming that there are >> no more-interesting Windows build methods that require it --- but >> if there are any, I'd sure demand that there be an active buildfarm >> instance to keep us from breaking the case again in future. > Weren't there more recent cases of needing to add PGDLLIMPORTs? Yeah, which makes the plot even thicker. If references to those variables failed, how did we not notice postgres_fdw's issue? I don't claim to know exactly what's going on, but I think we need to find out. regards, tom lane
Andres Freund <andres@2ndquadrant.com> writes: > I don't know why it didn't fail for postgres_fdw. Hm, maybe it's because > Mkvcbuild.pm links postgres_fdw with libpq, but not test_shm_mq? Hard to see how libpq as such could have anything to do with it. I wonder though if adding libpq causes some different link-time switch to be used. If so, could we arrange to use that switch all the time, and thus get rid of the need for PGDLLIMPORT marks? regards, tom lane
On 2014-02-01 21:04:44 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > I don't know why it didn't fail for postgres_fdw. Hm, maybe it's because > > Mkvcbuild.pm links postgres_fdw with libpq, but not test_shm_mq? > > Hard to see how libpq as such could have anything to do with it. > I wonder though if adding libpq causes some different link-time > switch to be used. If so, could we arrange to use that switch > all the time, and thus get rid of the need for PGDLLIMPORT marks? Well, not because of libpq itself, but because it causes a def file to be used. I am not sure how they are autogenerated, but it might already cause. I fortunately have forgotten most of that uglyness. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Feb 2, 2014 at 1:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I happened to notice today that the owner of buildfarm member narwhal > is trying to revive it after a long time offline, but it's failing in > the 9.3 branch (and not attempting to build HEAD, yet). The cause > appears to be that contrib/postgres_fdw is referencing the DateStyle > and IntervalStyle variables, which aren't marked PGDLLIMPORT. > Hm, well, that would be an easy change ... but that code was committed > last March. How is it that we didn't notice this long ago? > > What this seems to indicate is that narwhal is the only buildfarm > animal that has a need for PGDLLIMPORT marks on global variables that > are referenced by extensions. Furthermore, nobody has attempted to > build 9.3 on a platform that needs that (or at least they've not > reported failure to us). > > According to the buildfarm database, narwhal is running a gcc build on > Windows 2003. That hardly seems like a mainstream use case. I could > believe that it might be of interest to developers, but clearly no > developers are actually running such a build. > > I think we should give serious consideration to desupporting this > combination so that we can get rid of the plague of PGDLLIMPORT > marks. Obviously this would depend on confirming that there are > no more-interesting Windows build methods that require it --- but > if there are any, I'd sure demand that there be an active buildfarm > instance to keep us from breaking the case again in future. No objection here - though I should point out that it's not been offline for a long time (aside from a couple of weeks in January) - it's been happily building most pre-9.2 branches for ages. 9.1 seems to be stuck, along with HEAD, and I forgot to add 9.3. I'm in the process of cleaning that up as time allows, but am happy to drop it instead if we no longer want to support anything that old. We certainly don't use anything resembling that config for the EDB installer builds. I'm happy to replace it with something newer as time allows - what do we consider to be the biggest gap? I have an MSDN subscription, so can do any versions of Windows or VC++ (and obviously Mingw). -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Dave Page <dpage@pgadmin.org> writes: > On Sun, Feb 2, 2014 at 1:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think we should give serious consideration to desupporting this >> combination so that we can get rid of the plague of PGDLLIMPORT >> marks. > No objection here - though I should point out that it's not been > offline for a long time (aside from a couple of weeks in January) - > it's been happily building most pre-9.2 branches for ages. 9.1 seems > to be stuck, along with HEAD, and I forgot to add 9.3. I'm in the > process of cleaning that up as time allows, but am happy to drop it > instead if we no longer want to support anything that old. We > certainly don't use anything resembling that config for the EDB > installer builds. Further discussion pointed out that currawong, for example, seems to want PGDLLIMPORT markings but is able to get by without them in some cases that narwhal evidently doesn't like. So at this point, desupporting narwhal's configuration is clearly premature --- we should instead be looking into exactly what is causing the different cases to fail or not fail. I still have hopes that we might be able to get rid of PGDLLIMPORT marks, but by actually understanding why they seem to be needed in some cases and not others, not by just arbitrarily dropping support. In the meantime, please do get HEAD running again on that machine. regards, tom lane
On 02/02/2014 09:03 AM, Tom Lane wrote: > According to the buildfarm database, narwhal is running a gcc build on > Windows 2003. That hardly seems like a mainstream use case. I could > believe that it might be of interest to developers, but clearly no > developers are actually running such a build. > > I think we should give serious consideration to desupporting this > combination I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also the only open source compiler currently supported for PostgreSQL on Windows - practically the only one available. I don't know about you, but I'm not too keen on assuming Microsoft will continue to offer free toolchains that're usable for our purposes. They're crippling their free build tools more and more with each release, which isn't a good trend. If you wish to eliminate PGDLLIMPORT from the codebase the correct approach would be building with --export-all-symbols (a MinGW extension flag to gcc). That would make the MinGW builds consistent with the MSVC build, which generates a .def file that exports all symbols. As for why PGDLLIMPORT appears to be required in some places on the MSVC build, so far it's looking like we auto-export functions, but not necessarily variables. I'd need to read the fairly scary MSVC build genreator scripts in detail to confirm that, to see how they produce their DEF files; that'll have to wait until after I've got the row-security work sorted out. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 3, 2014 at 6:52 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
--
--
On 02/02/2014 09:03 AM, Tom Lane wrote:I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also
> According to the buildfarm database, narwhal is running a gcc build on
> Windows 2003. That hardly seems like a mainstream use case. I could
> believe that it might be of interest to developers, but clearly no
> developers are actually running such a build.
>
> I think we should give serious consideration to desupporting this
> combination
the only open source compiler currently supported for PostgreSQL on
Windows - practically the only one available. I don't know about you,
but I'm not too keen on assuming Microsoft will continue to offer free
toolchains that're usable for our purposes. They're crippling their free
build tools more and more with each release, which isn't a good trend.
If you wish to eliminate PGDLLIMPORT from the codebase the correct
approach would be building with --export-all-symbols (a MinGW extension
flag to gcc). That would make the MinGW builds consistent with the MSVC
build, which generates a .def file that exports all symbols.
As for why PGDLLIMPORT appears to be required in some places on the MSVC
build, so far it's looking like we auto-export functions, but not
necessarily variables.
In my experience, that is true.
During some contrib module development, while accessing a postgresql variable
it was crashing on windows, while same was accessible in non-windows platform.
I'd need to read the fairly scary MSVC build
genreator scripts in detail to confirm that, to see how they produce
their DEF files; that'll have to wait until after I've got the
row-security work sorted out.
--
Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
--
Thanks & Regards,
Ashesh Vashi
EnterpriseDB INDIA: Enterprise PostgreSQL Company
On 02/03/2014 11:10 AM, Ashesh Vashi wrote: > On Mon, Feb 3, 2014 at 6:52 AM, Craig Ringer <craig@2ndquadrant.com > <mailto:craig@2ndquadrant.com>> wrote: > > On 02/02/2014 09:03 AM, Tom Lane wrote: > > According to the buildfarm database, narwhal is running a gcc build on > > Windows 2003. That hardly seems like a mainstream use case. I could > > believe that it might be of interest to developers, but clearly no > > developers are actually running such a build. > > > > I think we should give serious consideration to desupporting this > > combination > > I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also > the only open source compiler currently supported for PostgreSQL on > Windows - practically the only one available. I don't know about you, > but I'm not too keen on assuming Microsoft will continue to offer free > toolchains that're usable for our purposes. They're crippling their free > build tools more and more with each release, which isn't a good trend. > > If you wish to eliminate PGDLLIMPORT from the codebase the correct > approach would be building with --export-all-symbols (a MinGW extension > flag to gcc). That would make the MinGW builds consistent with the MSVC > build, which generates a .def file that exports all symbols. > > As for why PGDLLIMPORT appears to be required in some places on the MSVC > build, so far it's looking like we auto-export functions, but not > necessarily variables. > > In my experience, that is true. > During some contrib module development, while accessing a postgresql > variable > it was crashing on windows, while same was accessible in non-windows > platform. I think it's a good thing personally - we shouldn't be exporting every little internal var in the symbol table. If we built with -fvisibility=hidden on 'nix there'd be no need to complain about commits being on on 'nix then breaking on Windows, 'cos the 'nix build would break in the same place. That's all or nothing though, there's no "vars hidden, procs exported" option in gcc. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Feb 3, 2014 at 6:52 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 02/02/2014 09:03 AM, Tom Lane wrote: >> According to the buildfarm database, narwhal is running a gcc build on >> Windows 2003. That hardly seems like a mainstream use case. I could >> believe that it might be of interest to developers, but clearly no >> developers are actually running such a build. >> >> I think we should give serious consideration to desupporting this >> combination > > I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also > the only open source compiler currently supported for PostgreSQL on > Windows - practically the only one available. I don't know about you, > but I'm not too keen on assuming Microsoft will continue to offer free > toolchains that're usable for our purposes. They're crippling their free > build tools more and more with each release, which isn't a good trend. > > If you wish to eliminate PGDLLIMPORT from the codebase the correct > approach would be building with --export-all-symbols (a MinGW extension > flag to gcc). That would make the MinGW builds consistent with the MSVC > build, which generates a .def file that exports all symbols. > > As for why PGDLLIMPORT appears to be required in some places on the MSVC > build, so far it's looking like we auto-export functions, but not > necessarily variables. I could see both the variables (DateStyle and IntervalStyle) currently Tom points out in both .def file and by using dumpbin utility. I think these symbols are exported, the main reason of failure is that they are not imported in postres_fdw as we have not used PGDLLIMPORT. Looking at below code, as per my understanding for contrib modules BUILDING_DLL is not defined and by using PGDLLIMPORT, it can lead to import of the required variables. #ifdef BUILDING_DLL #define PGDLLIMPORT __declspec (dllexport) #else /* not BUILDING_DLL */ #define PGDLLIMPORT __declspec (dllimport) #endif We need this for variables, but they are optional for functions. Please refer below link: http://msdn.microsoft.com/en-us/library/8fskxacy(v=vs.80).aspx With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Sun, Feb 2, 2014 at 6:33 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I happened to notice today that the owner of buildfarm member narwhal > is trying to revive it after a long time offline, but it's failing in > the 9.3 branch (and not attempting to build HEAD, yet). The cause > appears to be that contrib/postgres_fdw is referencing the DateStyle > and IntervalStyle variables, which aren't marked PGDLLIMPORT. > Hm, well, that would be an easy change ... but that code was committed > last March. How is it that we didn't notice this long ago? > > What this seems to indicate is that narwhal is the only buildfarm > animal that has a need for PGDLLIMPORT marks on global variables that > are referenced by extensions. I have tried to debug the part of code where we are using DateStyle and IntervalStyle variables in postgres_fdw on my m/c (Win 7 - 64 bit, MSVC2010) and found that there value is junk, to me the problem looks exactly similar to what we have seen for test_shm_mq modules few days back. I am not sure why it is passing on other buildfarm m/c, but I think we need to put PGDLLIMPORT for global variables we want to use in extensions. Also as per below link, it seems that PGDLLIMPORT is required for exported global variables. http://msdn.microsoft.com/en-us/library/8fskxacy(v=vs.80).aspx With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Amit Kapila <amit.kapila16@gmail.com> writes: > Also as per below link, it seems that PGDLLIMPORT is required for > exported global variables. > http://msdn.microsoft.com/en-us/library/8fskxacy(v=vs.80).aspx That was what we'd always assumed, but the fact that postgres_fdw has been working for the last year (on everything except narwhal) puts the lie to that as a blanket assumption. So what I want to know now is how come it wasn't failing; because then we might be able to exploit that to eliminate the need for the PGDLLIMPORT cruft everywhere. I've never been a fan of the fact that Windows insists on its own private set of global symbol visibility rules. If we can make that build work more like other platforms, it'll be a step forward IMO. regards, tom lane
On Sun, Feb 2, 2014 at 5:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote: > I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also > the only open source compiler currently supported for PostgreSQL on > Windows - practically the only one available. I don't know about you, > but I'm not too keen on assuming Microsoft will continue to offer free > toolchains that're usable for our purposes. They're crippling their free > build tools more and more with each release, which isn't a good trend. I was under the impression that Microsoft had finally come around to implementing a few C99 features in Visual Studio 2013 precisely because they want there to be an open source ecosystem on Windows. -- Peter Geoghegan
On 2014-02-03 12:00:40 +0800, Craig Ringer wrote: > I think it's a good thing personally - we shouldn't be exporting every > little internal var in the symbol table. > > If we built with -fvisibility=hidden on 'nix there'd be no need to > complain about commits being on on 'nix then breaking on Windows, 'cos > the 'nix build would break in the same place. That's all or nothing > though, there's no "vars hidden, procs exported" option in gcc. I think that'd be an exercise in futility. We're not talking about a general purpose library here, where I agree -fvisibility=hidden is a useful thing, but about the backend. We'd break countless extensions people have written. Most of those have been authored on *nix. To make any form of sense we'd need to have a really separate API layer between internal/external stuff. That doesn't seem likely to arrive anytime soon, if ever. I think all that would achieve is that we'd regularly need to backpatch visibility fixes. And have countless pointless flames about which variables to expose. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Sun, Feb 2, 2014 at 3:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Page <dpage@pgadmin.org> writes: >> On Sun, Feb 2, 2014 at 1:03 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> I think we should give serious consideration to desupporting this >>> combination so that we can get rid of the plague of PGDLLIMPORT >>> marks. > >> No objection here - though I should point out that it's not been >> offline for a long time (aside from a couple of weeks in January) - >> it's been happily building most pre-9.2 branches for ages. 9.1 seems >> to be stuck, along with HEAD, and I forgot to add 9.3. I'm in the >> process of cleaning that up as time allows, but am happy to drop it >> instead if we no longer want to support anything that old. We >> certainly don't use anything resembling that config for the EDB >> installer builds. > > Further discussion pointed out that currawong, for example, seems to > want PGDLLIMPORT markings but is able to get by without them in > some cases that narwhal evidently doesn't like. So at this point, > desupporting narwhal's configuration is clearly premature --- we > should instead be looking into exactly what is causing the different > cases to fail or not fail. > > I still have hopes that we might be able to get rid of PGDLLIMPORT > marks, but by actually understanding why they seem to be needed in > some cases and not others, not by just arbitrarily dropping support. > > In the meantime, please do get HEAD running again on that machine. Done: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-03%2009%3A26%3A43 It's not happy though :-( -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Feb 3, 2014 at 5:45 AM, Dave Page <dpage@pgadmin.org> wrote: > Done: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-03%2009%3A26%3A43 > > It's not happy though :-( Specifically, it says: dlltool --export-all --output-def libpg_buffercachedll.def pg_buffercache_pages.o dllwrap -o pg_buffercache.dll --dllname pg_buffercache.dll --def libpg_buffercachedll.def pg_buffercache_pages.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -L../../src/backend -lpostgres Info: resolving _MainLWLockArray by linking to __imp__MainLWLockArray (auto-import) fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' fu000002.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' nmth000000.o(.idata$4+0x0): undefined reference to `_nm__MainLWLockArray' I assume I should go stick a DLLIMPORT on MainLWLockArray, unless somebody has another proposal. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I assume I should go stick a DLLIMPORT on MainLWLockArray, unless > somebody has another proposal. Hold up awhile. The reason we're resurrecting it is to get a crosscheck on what symbols it thinks need DLLIMPORT that no other platform does. regards, tom lane
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-03 12:00:40 +0800, Craig Ringer wrote: >> I think it's a good thing personally - we shouldn't be exporting every >> little internal var in the symbol table. >> >> If we built with -fvisibility=hidden on 'nix there'd be no need to >> complain about commits being on on 'nix then breaking on Windows, 'cos >> the 'nix build would break in the same place. That's all or nothing >> though, there's no "vars hidden, procs exported" option in gcc. > To make any form of sense we'd need to have a really separate API > layer between internal/external stuff. That doesn't seem likely to > arrive anytime soon, if ever. Indeed. Aside from the difficulty of having tighter requirements on just one platform (which, for development purposes, isn't even a mainstream one), it seems completely silly to me that the restriction applies only to variables and not functions. We expose *far* more global functions than global variables; and there's no particular reason to think that calling a random function is any safer than frobbing a random global variable. What we need is to make the Windows platform stop thinking that it is the center of the world, and make it act as much as possible like the Unix-oid platforms. regards, tom lane
On 2014-02-03 09:13:02 -0500, Tom Lane wrote: > What we need is to make the Windows platform stop thinking that it is > the center of the world, and make it act as much as possible like the > Unix-oid platforms. What about the completely crazy thing of simply redefining export to include the declspec on windows for backend build? That will possibly blow up the size of the .def files et al a bit, but I have little mercy with that. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/03/2014 06:37 PM, Andres Freund wrote: > On 2014-02-03 12:00:40 +0800, Craig Ringer wrote: >> I think it's a good thing personally - we shouldn't be exporting every >> little internal var in the symbol table. >> >> If we built with -fvisibility=hidden on 'nix there'd be no need to >> complain about commits being on on 'nix then breaking on Windows, 'cos >> the 'nix build would break in the same place. That's all or nothing >> though, there's no "vars hidden, procs exported" option in gcc. > > I think that'd be an exercise in futility. We're not talking about a > general purpose library here, where I agree -fvisibility=hidden is a > useful thing, but about the backend. We'd break countless extensions > people have written. Most of those have been authored on *nix. > To make any form of sense we'd need to have a really separate API > layer between internal/external stuff. That doesn't seem likely to > arrive anytime soon, if ever. > I think all that would achieve is that we'd regularly need to backpatch > visibility fixes. And have countless pointless flames about which > variables to expose. Fair point. If we're not going to define a proper API, then export control is not useful. And since there isn't a proper API, nor any on the cards, _that_ is a reasonable reason to just export all. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-02-03 22:23:16 +0800, Craig Ringer wrote: > On 02/03/2014 06:37 PM, Andres Freund wrote: > > I think that'd be an exercise in futility. We're not talking about a > > general purpose library here, where I agree -fvisibility=hidden is a > > useful thing, but about the backend. We'd break countless extensions > > people have written. Most of those have been authored on *nix. > > To make any form of sense we'd need to have a really separate API > > layer between internal/external stuff. That doesn't seem likely to > > arrive anytime soon, if ever. > > I think all that would achieve is that we'd regularly need to backpatch > > visibility fixes. And have countless pointless flames about which > > variables to expose. > > Fair point. If we're not going to define a proper API, then export > control is not useful. And since there isn't a proper API, nor any on > the cards, _that_ is a reasonable reason to just export all. We have a (mostly) proper API. Just not an internal/external API split. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > What about the completely crazy thing of simply redefining export to > include the declspec on windows for backend build? That will possibly > blow up the size of the .def files et al a bit, but I have little mercy > with that. You mean "#define extern extern PGDLLIMPORT" ? What will that do to function declarations that have "extern"? Are we sure C compilers are smart enough to not go into an infinite loop on such a macro? Anyway, narwhal's latest results thicken the plot even more. How come pg_buffercache has been building OK on the other Windows critters, when MainLWLockArray lacks PGDLLIMPORT? That lets out the theory about linking to libpq having something to do with it. regards, tom lane
On 2014-02-03 09:25:57 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > What about the completely crazy thing of simply redefining export to > > include the declspec on windows for backend build? That will possibly > > blow up the size of the .def files et al a bit, but I have little mercy > > with that. > > You mean "#define extern extern PGDLLIMPORT" ? What will that do to > function declarations that have "extern"? As far as I understand it's perfectly ok to have that on functions. http://msdn.microsoft.com/en-us/library/aa271769%28v=vs.60%29.aspx even claims it's faster in some scenarios. > Are we sure C compilers > are smart enough to not go into an infinite loop on such a macro? I'd be really surprised if there were any that wouldn't completely choke on pg otherwise. Especially not as we are only talking about windows where only gcc and msvc is supported these days. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/03/2014 01:13 AM, Peter Geoghegan wrote: > On Sun, Feb 2, 2014 at 5:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote: >> I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also >> the only open source compiler currently supported for PostgreSQL on >> Windows - practically the only one available. I don't know about you, >> but I'm not too keen on assuming Microsoft will continue to offer free >> toolchains that're usable for our purposes. They're crippling their free >> build tools more and more with each release, which isn't a good trend. > I was under the impression that Microsoft had finally come around to > implementing a few C99 features in Visual Studio 2013 precisely > because they want there to be an open source ecosystem on Windows. > > It's not so long ago that they were saying they would no longer publish free-as-in-beer command line compilers at all. The outrage made them change their minds, but we really can't rely on only Microsoft compilers for Windows. cheers andrew
On Mon, Feb 3, 2014 at 9:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > It's not so long ago that they were saying they would no longer publish > free-as-in-beer command line compilers at all. The outrage made them change > their minds, but we really can't rely on only Microsoft compilers for > Windows. +1. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Andrew Dunstan <andrew@dunslane.net> writes: > On 02/03/2014 01:13 AM, Peter Geoghegan wrote: >> On Sun, Feb 2, 2014 at 5:22 PM, Craig Ringer <craig@2ndquadrant.com> wrote: >>> I'm not a fan of MinGW (gcc) on Windows, it's a right pain. It's also >>> the only open source compiler currently supported for PostgreSQL on >>> Windows - practically the only one available. I don't know about you, >>> but I'm not too keen on assuming Microsoft will continue to offer free >>> toolchains that're usable for our purposes. They're crippling their free >>> build tools more and more with each release, which isn't a good trend. >> I was under the impression that Microsoft had finally come around to >> implementing a few C99 features in Visual Studio 2013 precisely >> because they want there to be an open source ecosystem on Windows. > It's not so long ago that they were saying they would no longer publish > free-as-in-beer command line compilers at all. The outrage made them > change their minds, but we really can't rely on only Microsoft compilers > for Windows. FWIW, I'd definitely not be in favor of desupporting mingw altogether. However, narwhal appears to be running a pretty ancient version: configure: using compiler=gcc.exe (GCC) 3.4.2 (mingw-special) If it turns out that more recent versions have saner behavior, I'd not have the slightest qualms about saying we don't support such old versions anymore. But this is all speculation at the moment. I'm hoping somebody with a Windows build environment (ie, not me) will poke into the situation and figure out exactly why some of these global variables seem to need PGDLLIMPORT while others don't. regards, tom lane
On Mon, Feb 3, 2014 at 11:36 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: >> Also as per below link, it seems that PGDLLIMPORT is required for >> exported global variables. >> http://msdn.microsoft.com/en-us/library/8fskxacy(v=vs.80).aspx > > That was what we'd always assumed, but the fact that postgres_fdw has been > working for the last year (on everything except narwhal) puts the lie to > that as a blanket assumption. > > So what I want to know now is how come it wasn't failing; because then > we might be able to exploit that to eliminate the need for the PGDLLIMPORT > cruft everywhere. Today, I had tried to debug the same scenario on Win XP 32-bit, but the value for DateStyle is junk on that m/c as well. Is it possible that someway we can check on buildfarm m/c (Windows) where regression test is passing what is the value of DateStyle. We can use something like attached patch to print the value in log. In my test, it prints the values as below: Without using PGDLLIMPORT for DateStyle LOG: Value of DateStyle is: 326903295 Using PGDLLIMPORT for DateStyle LOG: Value of DateStyle is: 1 In the function where it is used, it seems to me that it is setting DateStyle as ISO if it is not ISO and function configure_remote_session() will set it to ISO initially. So basically even if the value of DateStyle is junk, it will just do what we wanted i.e setting it to ISO. Does the failure is due to reason that it is not expecting DateStyle to be ISO and due to junk value of this parameter it sets the same to ISO. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
Amit Kapila <amit.kapila16@gmail.com> writes: > In the function where it is used, it seems to me that it is setting DateStyle > as ISO if it is not ISO and function configure_remote_session() will set it > to ISO initially. So basically even if the value of DateStyle is junk, it will > just do what we wanted i.e setting it to ISO. Does the failure is due to reason > that it is not expecting DateStyle to be ISO and due to junk value of this > parameter it sets the same to ISO. Meh. It might be that the DateStyle usage in postgres_fdw would accidentally fail to malfunction if it saw a bogus value of the variable. But it's hard to believe that this would be true of MainLWLockArray. regards, tom lane
On Tue, Feb 4, 2014 at 12:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: >> In the function where it is used, it seems to me that it is setting DateStyle >> as ISO if it is not ISO and function configure_remote_session() will set it >> to ISO initially. So basically even if the value of DateStyle is junk, it will >> just do what we wanted i.e setting it to ISO. Does the failure is due to reason >> that it is not expecting DateStyle to be ISO and due to junk value of this >> parameter it sets the same to ISO. > > Meh. It might be that the DateStyle usage in postgres_fdw would > accidentally fail to malfunction if it saw a bogus value of the variable. > But it's hard to believe that this would be true of MainLWLockArray. Thats true, but for me its failing as MainLWLockArray contains Junk value. Now the point to look out is why its passing on some of the build farm m/c's. I will study about this more, if you have anything specific in mind then, do let me know. Can I get the details of m/c env on which it is passing like which windows version and msvc version? It might give some clue. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 2014-02-04 02:10:47 -0500, Tom Lane wrote: > Amit Kapila <amit.kapila16@gmail.com> writes: > > In the function where it is used, it seems to me that it is setting DateStyle > > as ISO if it is not ISO and function configure_remote_session() will set it > > to ISO initially. So basically even if the value of DateStyle is junk, it will > > just do what we wanted i.e setting it to ISO. Does the failure is due to reason > > that it is not expecting DateStyle to be ISO and due to junk value of this > > parameter it sets the same to ISO. > > Meh. It might be that the DateStyle usage in postgres_fdw would > accidentally fail to malfunction if it saw a bogus value of the variable. > But it's hard to believe that this would be true of MainLWLockArray. There's not that much lwlock usage in contrib. It's just pg_stat_statements and pg_buffercache. Neither has tests... So it very well could be that breakage simply hasn't been observed. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-04 02:10:47 -0500, Tom Lane wrote: >> Meh. It might be that the DateStyle usage in postgres_fdw would >> accidentally fail to malfunction if it saw a bogus value of the variable. >> But it's hard to believe that this would be true of MainLWLockArray. > There's not that much lwlock usage in contrib. It's just > pg_stat_statements and pg_buffercache. Neither has tests... So it very > well could be that breakage simply hasn't been observed. Hm, you're right --- I'd have thought there were more of those. Ugh. This problem was bad enough when I thought that it would only lead to link-time errors detectable in the buildfarm. If it can lead to errors only observable at runtime --- and maybe not obvious even then --- then I think we *have to* do something about it. By that I mean that we must get rid of the need to manually plaster PGDLLIMPORT on global variables. Anybody with a Windows build environment want to test the "#define extern" trick? regards, tom lane
On 02/04/2014 10:43 AM, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-02-04 02:10:47 -0500, Tom Lane wrote: >>> Meh. It might be that the DateStyle usage in postgres_fdw would >>> accidentally fail to malfunction if it saw a bogus value of the variable. >>> But it's hard to believe that this would be true of MainLWLockArray. >> There's not that much lwlock usage in contrib. It's just >> pg_stat_statements and pg_buffercache. Neither has tests... So it very >> well could be that breakage simply hasn't been observed. > Hm, you're right --- I'd have thought there were more of those. > > Ugh. This problem was bad enough when I thought that it would only lead > to link-time errors detectable in the buildfarm. If it can lead to errors > only observable at runtime --- and maybe not obvious even then --- then > I think we *have to* do something about it. By that I mean that we must > get rid of the need to manually plaster PGDLLIMPORT on global variables. > > Anybody with a Windows build environment want to test the "#define extern" > trick? > > We have details on how to build with Mingw/Msys on Windows on an Amazon VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is either free or very cheap. Do I need to give instructions on how to do this for MSVC builds too? It's really not terribly hard. cheers andrew
On February 4, 2014 5:06:52 PM CET, Andrew Dunstan <andrew@dunslane.net> wrote: > >On 02/04/2014 10:43 AM, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2014-02-04 02:10:47 -0500, Tom Lane wrote: >>>> Meh. It might be that the DateStyle usage in postgres_fdw would >>>> accidentally fail to malfunction if it saw a bogus value of the >variable. >>>> But it's hard to believe that this would be true of >MainLWLockArray. >>> There's not that much lwlock usage in contrib. It's just >>> pg_stat_statements and pg_buffercache. Neither has tests... So it >very >>> well could be that breakage simply hasn't been observed. >> Hm, you're right --- I'd have thought there were more of those. >> >> Ugh. This problem was bad enough when I thought that it would only >lead >> to link-time errors detectable in the buildfarm. If it can lead to >errors >> only observable at runtime --- and maybe not obvious even then --- >then >> I think we *have to* do something about it. By that I mean that we >must >> get rid of the need to manually plaster PGDLLIMPORT on global >variables. >> >> Anybody with a Windows build environment want to test the "#define >extern" >> trick? >> >> > > > >We have details on how to build with Mingw/Msys on Windows on an Amazon > >VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is >either >free or very cheap. Do I need to give instructions on how to do this >for >MSVC builds too? It's really not terribly hard. Err. It might not be very hard but it certainly is time consuming. And that for people not caring about windows. If there were usable, regularly refreshed, instances out there'd it'd be slightly less bad. But this still by far the mostannoying and intrusive platform to care about. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 02/04/2014 11:30 AM, Andres Freund wrote: >> >> >> We have details on how to build with Mingw/Msys on Windows on an Amazon >> >> VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is >> either >> free or very cheap. Do I need to give instructions on how to do this >> for >> MSVC builds too? It's really not terribly hard. > Err. It might not be very hard but it certainly is time consuming. And that for people not caring about windows. > > If there were usable, regularly refreshed, instances out there'd it'd be slightly less bad. But this still by far the mostannoying and intrusive platform to care about. > If someone volunteered to pay for the storage, I'd be prepared to make some time to create an AMI to reduce the startup time dramatically. Basically it would be "boot the AMI and start testing your patches". I'd even make it as friendly as possible for people who don't like to get too far from unix-ish environments. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > If someone volunteered to pay for the storage, I'd be prepared to make > some time to create an AMI to reduce the startup time dramatically. > Basically it would be "boot the AMI and start testing your patches". I'd > even make it as friendly as possible for people who don't like to get > too far from unix-ish environments. My own opinion is that I've already wasted untold man-hours thanks to the random porting problems induced by Windows, a platform that I never have and never will care about personally. I will *not* spend my own time doing tests that someone else could do. If we can't get some effort contributed by someone who does use that platform, I'm personally prepared to declare the entire damn thing no longer supported. regards, tom lane
On 02/04/2014 09:34 AM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> If someone volunteered to pay for the storage, I'd be prepared to make >> some time to create an AMI to reduce the startup time dramatically. >> Basically it would be "boot the AMI and start testing your patches". I'd >> even make it as friendly as possible for people who don't like to get >> too far from unix-ish environments. > > My own opinion is that I've already wasted untold man-hours thanks to > the random porting problems induced by Windows, a platform that I never > have and never will care about personally. I will *not* spend my own > time doing tests that someone else could do. If we can't get some > effort contributed by someone who does use that platform, I'm personally > prepared to declare the entire damn thing no longer supported. Although that is obviously your prerogative it is important to remember that Windows is easily the second most used version of PostgreSQL out there (behind Linux). I know many people that run it on Windows, especially in the medical field. I also know many people that embed it (Apple not withstanding). Yes it is an obnoxious platform but it is THE platform, no matter how much we like to tout our Linux(Hiker) creds. Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On Tue, Feb 4, 2014 at 8:06 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
We have details on how to build with Mingw/Msys on Windows on an Amazon VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is either free or very cheap. Do I need to give instructions on how to do this for MSVC builds too? It's really not terribly hard.
On 02/04/2014 10:43 AM, Tom Lane wrote:
Ugh. This problem was bad enough when I thought that it would only lead
to link-time errors detectable in the buildfarm. If it can lead to errors
only observable at runtime --- and maybe not obvious even then --- then
I think we *have to* do something about it. By that I mean that we must
get rid of the need to manually plaster PGDLLIMPORT on global variables.
Anybody with a Windows build environment want to test the "#define extern"
trick?
If you gave step by step instructions like the ones for MinGW, I would at least give it a try. Last time a looked into it, I gave up after I couldn't figure out which of the umpteen similarly-named products was the one I needed to buy/download-for-free/register-and-download and then install, and I tried a few of them at random without much success.
Cheers,
Jeff
"Joshua D. Drake" <jd@commandprompt.com> writes: > On 02/04/2014 09:34 AM, Tom Lane wrote: >> My own opinion is that I've already wasted untold man-hours thanks to >> the random porting problems induced by Windows, a platform that I never >> have and never will care about personally. I will *not* spend my own >> time doing tests that someone else could do. If we can't get some >> effort contributed by someone who does use that platform, I'm personally >> prepared to declare the entire damn thing no longer supported. > Although that is obviously your prerogative it is important to remember > that Windows is easily the second most used version of PostgreSQL out > there (behind Linux). [ shrug... ] If it's so widely used, why is it so hard to find somebody who's willing to put in some development effort for it? regards, tom lane
On 02/04/2014 01:53 PM, Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> On 02/04/2014 09:34 AM, Tom Lane wrote: >>> My own opinion is that I've already wasted untold man-hours thanks to >>> the random porting problems induced by Windows, a platform that I never >>> have and never will care about personally. I will *not* spend my own >>> time doing tests that someone else could do. If we can't get some >>> effort contributed by someone who does use that platform, I'm personally >>> prepared to declare the entire damn thing no longer supported. >> Although that is obviously your prerogative it is important to remember >> that Windows is easily the second most used version of PostgreSQL out >> there (behind Linux). > [ shrug... ] If it's so widely used, why is it so hard to find somebody > who's willing to put in some development effort for it? I suspect that the open hostility towards it doesn't help much. I do put in quite alot of effort supporting it one way and another, committing patches and running buildfarm members among other things. And we have several others who put in significant amounts of effort. And no, it's not my platform of choice either. Anyway, I will explore my suggestion upthread of a pre-prepared windows development AMI, when I have time. Right now between work for actual money and things like jsonb I am slammed. cheers andrew
On 02/04/2014 10:53 AM, Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> On 02/04/2014 09:34 AM, Tom Lane wrote: >>> My own opinion is that I've already wasted untold man-hours thanks to >>> the random porting problems induced by Windows, a platform that I never >>> have and never will care about personally. I will *not* spend my own >>> time doing tests that someone else could do. If we can't get some >>> effort contributed by someone who does use that platform, I'm personally >>> prepared to declare the entire damn thing no longer supported. > >> Although that is obviously your prerogative it is important to remember >> that Windows is easily the second most used version of PostgreSQL out >> there (behind Linux). > > [ shrug... ] If it's so widely used, why is it so hard to find somebody > who's willing to put in some development effort for it? Well, from what I have seen of the Windows world there is a fairly sharp demarcation between users and developers. So use does not necessarily mean developer knowledge. > > regards, tom lane > > -- Adrian Klaver adrian.klaver@gmail.com
On 02/04/2014 11:17 AM, Andrew Dunstan wrote: >>>> prepared to declare the entire damn thing no longer supported. >>> Although that is obviously your prerogative it is important to remember >>> that Windows is easily the second most used version of PostgreSQL out >>> there (behind Linux). >> [ shrug... ] If it's so widely used, why is it so hard to find somebody >> who's willing to put in some development effort for it? > Because Windows developers get paid and have better things to do than interact with a bunch of Open Source people and their opinions. Let's be honest, all of us here are not the easiest to work with and we are here largely due to an ideology. Why would a Windows developer bother? I know lots of Windows Developers and most of them don't follow anywhere near the ideology we do. I know very few (read 0) Windows developers that do it for fun (I am not saying they don't have fun. I am saying they don't do it for fun). They do it for their career. Heck, look back through the years at our archives. Nobody really thought that we would somehow bring windows developers into the fold if we ported to that platform. It was a market share play and it worked. However, now we are paying the price for it. To be perfectly frank, the first thing I do when a customer calls and says we are on Windows is tell them I will spend our relationship pushing them to move PostgreSQL to Linux. It works over time for the most part but not on all. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On Tue, Feb 4, 2014 at 9:26 AM, Andrew Dunstan <andrew@dunslane.net> wrote:
If someone volunteered to pay for the storage, I'd be prepared to make some time to create an AMI to reduce the startup time dramatically. Basically it would be "boot the AMI and start testing your patches". I'd even make it as friendly as possible for people who don't like to get too far from unix-ish environments.
On 02/04/2014 11:30 AM, Andres Freund wrote:Err. It might not be very hard but it certainly is time consuming. And that for people not caring about windows.
We have details on how to build with Mingw/Msys on Windows on an Amazon
VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is
either
free or very cheap. Do I need to give instructions on how to do this
for
MSVC builds too? It's really not terribly hard.
If there were usable, regularly refreshed, instances out there'd it'd be slightly less bad. But this still by far the most annoying and intrusive platform to care about.
Do you know about what it would cost? Could official community funds be used for it (it seems like something that is cheap, but which you wouldn't want to be forgotten about some month.)
Having an AMI would help, but even with an AMI in place, MinGW is still insanely slow. Running "make" on already made PostgreSQL (so there was nothing to actually do) takes 1.5 minutes. And a make after a "make clean" takes half an hour. This is on an actual desktop, not an AWS micro instance. So doing a git bisect is just painful. Is the MSVC build faster?
Cheers,
Jeff
On Tue, Feb 4, 2014 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> If someone volunteered to pay for the storage, I'd be prepared to make >> some time to create an AMI to reduce the startup time dramatically. >> Basically it would be "boot the AMI and start testing your patches". I'd >> even make it as friendly as possible for people who don't like to get >> too far from unix-ish environments. > > My own opinion is that I've already wasted untold man-hours thanks to > the random porting problems induced by Windows, a platform that I never > have and never will care about personally. I will *not* spend my own > time doing tests that someone else could do. If we can't get some > effort contributed by someone who does use that platform, I'm personally > prepared to declare the entire damn thing no longer supported. You know, I would really prefer to just stick a PGDLLIMPORT on this place and any others that need it, and any others that come up, than turn this into a political football. Having to sprinkle PGDLLIMPORT on the handful of variables that are accessed by contrib modules is, indeed, annoying. But it's not any more annoying than twelve other things that I have to do as a committer, like remembering to bump whichever of catversion, the xlog page magic, and the control data version are relevant to a particular commit, knowing which particular SGML files have to build standalone, and enforcing the project's code formatting, comment, and documentation conventions on everyone who submits a patch. So acting as if this particular annoyance is somehow unique or a good reason to desupport what is despite all of our misgivings one of our most popular platforms doesn't impress me one bit. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/04/2014 03:08 PM, Jeff Janes wrote: > > > Do you know about what it would cost? Could official community funds > be used for it (it seems like something that is cheap, but which you > wouldn't want to be forgotten about some month.) > > Having an AMI would help, but even with an AMI in place, MinGW is > still insanely slow. Running "make" on already made PostgreSQL (so > there was nothing to actually do) takes 1.5 minutes. And a make after > a "make clean" takes half an hour. This is on an actual desktop, not > an AWS micro instance. So doing a git bisect is just painful. Is the > MSVC build faster? > > Would have to check with the same build options (cassert and debug have major timing effects.) I agree it's not lightning fast like "make -j 4" on a decent linux box. cheers andrew
Robert Haas <robertmhaas@gmail.com> writes: > You know, I would really prefer to just stick a PGDLLIMPORT on this > place and any others that need it, and any others that come up, than > turn this into a political football. Having to sprinkle PGDLLIMPORT > on the handful of variables that are accessed by contrib modules is, > indeed, annoying. I'm not actually trying to turn this into a political football. What I want is a solution that we can trust, ie, that will allow us to ship Windows code that's not broken. We have failed to do so for at least the past year, and not even known it. I had been okay with the manual PGDLLIMPORT-sprinkling approach (not happy with it, of course, but prepared to tolerate it) as long as I believed the buildfarm would reliably tell us of the need for it. That assumption has now been conclusively disproven, though. The question therefore becomes, what are we going to do instead? "Keep on doing what we were doing" doesn't strike me as an acceptable answer. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > On 02/04/2014 03:08 PM, Jeff Janes wrote: >> Having an AMI would help, but even with an AMI in place, MinGW is >> still insanely slow. Running "make" on already made PostgreSQL (so >> there was nothing to actually do) takes 1.5 minutes. And a make after >> a "make clean" takes half an hour. This is on an actual desktop, not >> an AWS micro instance. So doing a git bisect is just painful. Is the >> MSVC build faster? > Would have to check with the same build options (cassert and debug have > major timing effects.) I agree it's not lightning fast like "make -j 4" > on a decent linux box. I wonder if ccache exists for Mingw. That thing makes a huge difference in the perceived build speed ... regards, tom lane
On 02/04/2014 05:47 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 02/04/2014 03:08 PM, Jeff Janes wrote: >>> Having an AMI would help, but even with an AMI in place, MinGW is >>> still insanely slow. Running "make" on already made PostgreSQL (so >>> there was nothing to actually do) takes 1.5 minutes. And a make after >>> a "make clean" takes half an hour. This is on an actual desktop, not >>> an AWS micro instance. So doing a git bisect is just painful. Is the >>> MSVC build faster? >> Would have to check with the same build options (cassert and debug have >> major timing effects.) I agree it's not lightning fast like "make -j 4" >> on a decent linux box. > I wonder if ccache exists for Mingw. That thing makes a huge difference > in the perceived build speed ... > > Indeed. But it's not really, AFAIK. Certainly it's not in the list of packages known to mingw-get on the machine i checked on (jacana). cheers andrew
On 02/05/2014 04:08 AM, Jeff Janes wrote: > So doing a git bisect is just painful. Is the MSVC > build faster? Yes, but not on EC2. I've found Windows EC2 instances so impossibly slow I just gave up working with it. It took 1.5 hours to do a build and regression check with msvc on a Medium EC2 instance; the same build takes 10 mins on my tiny Intel i3 based Windows test machine. It's possible that some of the larger instance types may perform better as they use different approaches to virtualization than simple Xen HVM. I haven't tested, as the cost of those instances rapidly becomes problematic. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/05/2014 02:53 AM, Tom Lane wrote: > "Joshua D. Drake" <jd@commandprompt.com> writes: >> On 02/04/2014 09:34 AM, Tom Lane wrote: >>> My own opinion is that I've already wasted untold man-hours thanks to >>> the random porting problems induced by Windows, a platform that I never >>> have and never will care about personally. I will *not* spend my own >>> time doing tests that someone else could do. If we can't get some >>> effort contributed by someone who does use that platform, I'm personally >>> prepared to declare the entire damn thing no longer supported. > >> Although that is obviously your prerogative it is important to remember >> that Windows is easily the second most used version of PostgreSQL out >> there (behind Linux). > > [ shrug... ] If it's so widely used, why is it so hard to find somebody > who's willing to put in some development effort for it? Sadly, I'm now convinced that Windows users are just much less likely to contribute anything constructive to a project - code, documentation, anything. It's a real "gimme" world, and has a really strong ethic that the "vendor" does things with their software, you don't just go and get involved. That said, I think the fuss being made about the intrusiveness of Windows support and its impact is overblown here. These are a few macros that're noops on other platforms anyway, and some build code hidden away in src/tools . It's ugly. It's annoying. It's crap that users don't contribute back. It's also just not that big a deal; there are many other things that are similarly painful or more so. Expecting folks to fire up an AMI and hand-control the build with a GUI over a high latency connection is a waste of time better spent elsewhere, though, and will result in everyone continuing to avoid any sort of testing on Windows. Personally what I think we need is a *public* Jenkins instance, or similar, to which you can push a branch and have it automatically build and "make check" on Windows. I've got that running for internal use, but it's on a host I can't share access to (and an unreliable one, at that). I'd be happy to share the setup for the Jenkins instance and the Windows integration parts, along with the instructions I wrote on how to set up the Windows build test node(s) and the tooling I'm using to automate the Windows build. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/04/2014 10:48 PM, Craig Ringer wrote: > On 02/05/2014 04:08 AM, Jeff Janes wrote: >> So doing a git bisect is just painful. Is the MSVC >> build faster? > Yes, but not on EC2. > > I've found Windows EC2 instances so impossibly slow I just gave up > working with it. It took 1.5 hours to do a build and regression check > with msvc on a Medium EC2 instance; the same build takes 10 mins on my > tiny Intel i3 based Windows test machine. > > It's possible that some of the larger instance types may perform better > as they use different approaches to virtualization than simple Xen HVM. > I haven't tested, as the cost of those instances rapidly becomes > problematic. > I typically use m1.medium. A spot instance for that is currently $0.033 / hour. When I was working on one such the other day it took nothing like 1.5 hours to build and test. I didn't time it so I can't tell you how long it took, but much less than that. Of course YMMV. cheers andrew
On 02/05/2014 06:29 AM, Tom Lane wrote: > I had been okay with the manual PGDLLIMPORT-sprinkling approach > (not happy with it, of course, but prepared to tolerate it) as long > as I believed the buildfarm would reliably tell us of the need for > it. That assumption has now been conclusively disproven, though. > The question therefore becomes, what are we going to do instead? > "Keep on doing what we were doing" doesn't strike me as an acceptable > answer. I'm in complete agreement here. Silent failures we can't test for that might sneak data corruption in are not cool. I'll have a look into ways to making sure that globals with incorrect linkage fail at runtime link time, as is the case for functions. I won't be able to spend much time on it immediately; will take a quick look and if I don't find anything, will follow up post-CF4. I'm kind of horrified that the dynamic linker doesn't throw its toys when it sees this. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/05/2014 12:06 AM, Andrew Dunstan wrote: > > On 02/04/2014 10:43 AM, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2014-02-04 02:10:47 -0500, Tom Lane wrote: >>>> Meh. It might be that the DateStyle usage in postgres_fdw would >>>> accidentally fail to malfunction if it saw a bogus value of the >>>> variable. >>>> But it's hard to believe that this would be true of MainLWLockArray. >>> There's not that much lwlock usage in contrib. It's just >>> pg_stat_statements and pg_buffercache. Neither has tests... So it very >>> well could be that breakage simply hasn't been observed. >> Hm, you're right --- I'd have thought there were more of those. >> >> Ugh. This problem was bad enough when I thought that it would only lead >> to link-time errors detectable in the buildfarm. If it can lead to >> errors >> only observable at runtime --- and maybe not obvious even then --- then >> I think we *have to* do something about it. By that I mean that we must >> get rid of the need to manually plaster PGDLLIMPORT on global variables. >> >> Anybody with a Windows build environment want to test the "#define >> extern" >> trick? >> >> > > > > We have details on how to build with Mingw/Msys on Windows on an Amazon > VM <http://wiki.postgresql.org/wiki/Building_With_MinGW> which is either > free or very cheap. Do I need to give instructions on how to do this for > MSVC builds too? It's really not terribly hard. I've got some guidance on that here: https://github.com/2ndQuadrant/pg_build_win from setting up a clean Windows instance for builds, on to the build process. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > On 02/05/2014 06:29 AM, Tom Lane wrote: >> I had been okay with the manual PGDLLIMPORT-sprinkling approach >> (not happy with it, of course, but prepared to tolerate it) as long >> as I believed the buildfarm would reliably tell us of the need for >> it. That assumption has now been conclusively disproven, though. > I'm kind of horrified that the dynamic linker doesn't throw its toys > when it sees this. Indeed :-(. The truly strange part of this is that it seems that the one Windows buildfarm member that's telling the truth (or most nearly so, anyway) is narwhal, which appears to have the oldest and cruftiest toolchain of the lot. I'd really like to come out the other end of this investigation with a clear understanding of why the newer toolchains are failing to report a link problem, and yet not building working executables. regards, tom lane
On 02/03/2014 07:04 AM, Robert Haas wrote: > > On Mon, Feb 3, 2014 at 9:57 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >> It's not so long ago that they were saying they would no longer publish >> free-as-in-beer command line compilers at all. The outrage made them change >> their minds, but we really can't rely on only Microsoft compilers for >> Windows. > > +1. > It appears that LLVM supports Windows: http://llvm.org/docs/GettingStartedVS.html Sincerely, JD -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On 02/05/2014 01:41 PM, Joshua D. Drake wrote: > > On 02/03/2014 07:04 AM, Robert Haas wrote: >> >> On Mon, Feb 3, 2014 at 9:57 AM, Andrew Dunstan <andrew@dunslane.net> >> wrote: >>> It's not so long ago that they were saying they would no longer publish >>> free-as-in-beer command line compilers at all. The outrage made them >>> change >>> their minds, but we really can't rely on only Microsoft compilers for >>> Windows. >> >> +1. >> > > It appears that LLVM supports Windows: > > http://llvm.org/docs/GettingStartedVS.html > > If someone wants to work on getting a Windows build working with llvm then go for it. But until then the Mingw tools are the only thing we have that we know works other than Microsoft tools. Having a compiler that's alleged to work is only one step of many in getting there. cheers andrew
On 02/05/2014 10:56 AM, Andrew Dunstan wrote: >> >> It appears that LLVM supports Windows: >> >> http://llvm.org/docs/GettingStartedVS.html >> >> > > > If someone wants to work on getting a Windows build working with llvm > then go for it. But until then the Mingw tools are the only thing we > have that we know works other than Microsoft tools. Having a compiler > that's alleged to work is only one step of many in getting there. I was just trying to provide another resource in case people wanted to take a look. JD > > cheers > > andrew > -- Command Prompt, Inc. - http://www.commandprompt.com/ 509-416-6579 PostgreSQL Support, Training, Professional Services and Development High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc For my dreams of your image that blossoms a rose in the deeps of my heart. - W.B. Yeats
On 02/05/2014 01:52 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 02/05/2014 06:29 AM, Tom Lane wrote: >>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>> (not happy with it, of course, but prepared to tolerate it) as long >>> as I believed the buildfarm would reliably tell us of the need for >>> it. That assumption has now been conclusively disproven, though. > >> I'm kind of horrified that the dynamic linker doesn't throw its toys >> when it sees this. > > Indeed :-(. > > The truly strange part of this is that it seems that the one Windows > buildfarm member that's telling the truth (or most nearly so, anyway) > is narwhal, which appears to have the oldest and cruftiest toolchain > of the lot. I'd really like to come out the other end of this > investigation with a clear understanding of why the newer toolchains > are failing to report a link problem, and yet not building working > executables. I've done some digging and asked for some input from people with appropriate knowledge. Getting a hard answer's going to require some quality time with a debugger that I won't have until after the CF. Here's what I've found so far: In a .DEF file, symbol exports default to code exports if unspecified. http://msdn.microsoft.com/en-us/library/hyx1zcd3.aspx (I'm told that .DEF files take precedence over annotations in code, but haven't found a reference to confirm that yet, and I'm not convinced it's true since data annotated correctly with __declspec(dllexport) seems to work. You'd think places like this would mention it, but they don't: http://msdn.microsoft.com/en-us/library/7k30y2k5.aspx) So we're probably generating code thunk functions for our global variables in the import library. There seem to be two possibilities for what's happening when you access an extern variable defined in another module without __declspec(dllimport): 1. Our extern variable points to a thunk function's definition; or 2. Our extern variable points to the indirected pointer to the real variable, which is what you get if you link to a CONSTANT export without using __declspec(dllimport). I'd need to go poking in disassemblies with the debugger to confirm which it is, and don't have time right now. After the CF, maybe. I suspect #1, since we're probably generating thunk functions for our data exports. Either way, the key thing is: http://msdn.microsoft.com/en-us/library/hyx1zcd3.aspx "Note that when you export a variable from a DLL with a .def file, you do not need to specify __declspec(dllexport) on the variable. However, in any file that uses the DLL, you must still use __declspec(dllimport) on the declaration of data." See also "rules and limitations for dllimport/dllexport": http://msdn.microsoft.com/en-us/library/da6zd0a4.aspx This page notes that you should annotate .DEF entries for data with DATA "to reduce the likelihood that incorrect coding will cause a problem". http://msdn.microsoft.com/en-us/library/54xsd65y.aspx That will prevent generation of the linker symbol that's a pointer to the real data, so we should get linker errors when we fail to specify __declspec(dllimport). To do this, src/tools/msvc/gendefs.pl needs to be modified to correctly annotate globals with DATA based on the dumpbin output. I'll have a crack at that after the CF is closed. However, *this modification will not get rid of the need for PGDLLIMPORT macros in headers*, because it's still necessary to explicitly tag them with __declspec(dllimport) for correct results. In fact, I'm now wondering if we're using inefficient thunk based function calls from modules because we're not specifying __declspec(dllimport) on external functions, too. Again, that's a post-CF thing to check out using the MSVC debugger. BTW, there's a general reference for __declspec(dllexport) here: http://msdn.microsoft.com/en-us/library/a90k134d.aspx and higher level concepts here: http://msdn.microsoft.com/en-us/library/z4zxe9k8.aspx http://msdn.microsoft.com/en-us/library/900axts6.aspx Other references: http://msdn.microsoft.com/en-us/library/da6zd0a4.aspx http://msdn.microsoft.com/en-us/library/zw3za17w.aspx http://msdn.microsoft.com/en-us/library/619w14ds.aspx http://msdn.microsoft.com/en-us/library/54xsd65y.aspx http://blogs.msdn.com/b/oldnewthing/archive/2006/07/20/672695.aspx http://blogs.msdn.com/b/oldnewthing/archive/2006/07/18/669668.aspx -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/06/2014 10:14 AM, Craig Ringer wrote: > On 02/05/2014 01:52 PM, Tom Lane wrote: >> Craig Ringer <craig@2ndquadrant.com> writes: >>> On 02/05/2014 06:29 AM, Tom Lane wrote: >>>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>>> (not happy with it, of course, but prepared to tolerate it) as long >>>> as I believed the buildfarm would reliably tell us of the need for >>>> it. That assumption has now been conclusively disproven, though. >> >>> I'm kind of horrified that the dynamic linker doesn't throw its toys >>> when it sees this. >> >> Indeed :-(. >> >> The truly strange part of this is that it seems that the one Windows >> buildfarm member that's telling the truth (or most nearly so, anyway) >> is narwhal, which appears to have the oldest and cruftiest toolchain >> of the lot. I'd really like to come out the other end of this >> investigation with a clear understanding of why the newer toolchains >> are failing to report a link problem, and yet not building working >> executables. > > I've done some digging and asked for some input from people with > appropriate knowledge. Getting a hard answer's going to require some > quality time with a debugger that I won't have until after the CF. ... though this link may shed some light, it turns out: http://msdn.microsoft.com/en-us/library/aa271769(v=vs.60).aspx So, yeah. It looks like we're probably linking to thunk functions as data (ouch). To produce the desired error if __declspec(dllimport) is missing we need to change gendefs.pl to emit DATA for exported globals. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/05 14:52), Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 02/05/2014 06:29 AM, Tom Lane wrote: >>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>> (not happy with it, of course, but prepared to tolerate it) as long >>> as I believed the buildfarm would reliably tell us of the need for >>> it. That assumption has now been conclusively disproven, though. > >> I'm kind of horrified that the dynamic linker doesn't throw its toys >> when it sees this. > > Indeed :-(. > > The truly strange part of this is that it seems that the one Windows > buildfarm member that's telling the truth (or most nearly so, anyway) > is narwhal, which appears to have the oldest and cruftiest toolchain > of the lot. I'd really like to come out the other end of this > investigation with a clear understanding of why the newer toolchains > are failing to report a link problem, and yet not building working > executables. Is it a linkage error? Could you please show me the error message concretely? regards, Hiroshi Inoue
OK I can see the error message at PostgreSQL Build Farm Log. I see a similar problem in a simple test case. gcc -DCALLPG=\"import1\" -c export.c -o export1.o dlltool --export-all --output-def export1.def export1.o dlltool --dllname export1.exe --def export1.def --output-lib export1.a dlltool --dllname export1.exe --output-exp export1.exp --def export1.def gcc -o export1.exe -Wl,--base-file,export1.base export1.exp export1.o dlltool --dllname export1.exe --base-file export1.base --output-exp export1.exp --def export1.def gcc -Wl,--stack=65536 -o export1.exe export1.exp export1.o rm -f export1.exp export1.base gcc -DCALLPG=\"import1\" -c import.c -o import1.o dlltool --export-all --output-def import1.def import1.o dllwrap -o import1.dll --dllname import1.dll import1.o export1.a --def import1.def Info: resolving _intnum by linking to __imp__intnum (auto-import) fu000001.o:(.idata$2+0xc): undefined reference to `export1_a_iname' nmth000000.o:(.idata$4+0x0): undefined reference to `_nm__intnum' collect2: ld returned 1 exit status Adding __declspec(dllimport) fixes the problem. Using gcc only build also fixes the problem.gcc -DCALLPG=\"import2\" -c export.c -o export2.ogcc -Wl,--export-all-symbols-Wl,--out-implib=export2.a -o export2 export2.oCreating library file: export2.agcc -DCALLPG=\"import2\" -c import.c -o import2.ogcc -shared -o import2.dll import2.oexport2.a -Wl,--out-implib=import2.dll.aInfo: resolving _intnum by linking to __imp__intnum (auto-import)Creating library file: import2.dll.a Though I'm not a MINGW expert at all, I know dllwrap is a deprecated tool and dlltool is almost a deprecated tool. Cygwin port is removing the use of dllwrap and dlltool now. Isn't it better for MINGW port to follow it? Changing the machine from the one using gcc 3.4.5 to another one using 4.6.1 also fixes the problem. regards, Hiroshi Inoue (2014/02/07 12:25), Hiroshi Inoue wrote: > (2014/02/05 14:52), Tom Lane wrote: >> Craig Ringer <craig@2ndquadrant.com> writes: >>> On 02/05/2014 06:29 AM, Tom Lane wrote: >>>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>>> (not happy with it, of course, but prepared to tolerate it) as long >>>> as I believed the buildfarm would reliably tell us of the need for >>>> it. That assumption has now been conclusively disproven, though. >> >>> I'm kind of horrified that the dynamic linker doesn't throw its toys >>> when it sees this. >> >> Indeed :-(. >> >> The truly strange part of this is that it seems that the one Windows >> buildfarm member that's telling the truth (or most nearly so, anyway) >> is narwhal, which appears to have the oldest and cruftiest toolchain >> of the lot. I'd really like to come out the other end of this >> investigation with a clear understanding of why the newer toolchains >> are failing to report a link problem, and yet not building working >> executables. > > Is it a linkage error? > Could you please show me the error message concretely? > > regards, > Hiroshi Inoue
Hiroshi Inoue <inoue@tpf.co.jp> writes: > Though I'm not a MINGW expert at all, I know dllwrap is a deprecated > tool and dlltool is almost a deprecated tool. Cygwin port is removing > the use of dllwrap and dlltool now. Isn't it better for MINGW port to > follow it? Only way to make that happen is to prepare and test a patch ... In the meantime, here's a short patch trying the "#define extern" approach. Anyone want to try this on a Windows machine or two? Or we could just commit it in HEAD for long enough to see what the buildfarm says. regards, tom lane *** /home/postgres/pgsql/src/include/c.h Fri Jan 17 15:57:23 2014 --- new/c.h Sat Feb 8 17:19:00 2014 *************** extern int vsnprintf(char *str, size_t c *** 961,973 **** #define memmove(d, s, c) bcopy(s, d, c) #endif ! /* no special DLL markers on most ports */ ! #ifndef PGDLLIMPORT #define PGDLLIMPORT - #endif - #ifndef PGDLLEXPORT #define PGDLLEXPORT - #endif /* * The following is used as the arg list for signal handlers. Any ports --- 961,969 ---- #define memmove(d, s, c) bcopy(s, d, c) #endif ! /* we don't use special DLL markers in declarations anymore */ #define PGDLLIMPORT #define PGDLLEXPORT /* * The following is used as the arg list for signal handlers. Any ports *** /home/postgres/pgsql/src/include/port/cygwin.h Tue Jul 23 13:41:12 2013 --- new/cygwin.h Sat Feb 8 17:22:32 2014 *************** *** 10,18 **** #endif #ifdef BUILDING_DLL ! #define PGDLLIMPORT __declspec (dllexport) #else ! #define PGDLLIMPORT __declspec (dllimport) #endif - - #define PGDLLEXPORT --- 10,16 ---- #endif #ifdef BUILDING_DLL ! #define extern extern __declspec (dllexport) #else ! #define extern extern __declspec (dllimport) #endif *** /home/postgres/pgsql/src/include/port/win32.h Sun Jan 26 22:38:59 2014 --- new/win32.h Sat Feb 8 17:26:53 2014 *************** *** 73,93 **** */ #if defined(WIN32) || defined(__CYGWIN__) - #ifdef BUILDING_DLL ! #define PGDLLIMPORT __declspec (dllexport) #else /* not BUILDING_DLL */ ! #define PGDLLIMPORT __declspec (dllimport) ! #endif ! ! #ifdef _MSC_VER ! #define PGDLLEXPORT __declspec (dllexport) ! #else ! #define PGDLLEXPORT #endif - #else /* not CYGWIN, not MSVC, not MingW */ - #define PGDLLIMPORT - #define PGDLLEXPORT #endif --- 73,83 ---- */ #if defined(WIN32) || defined(__CYGWIN__) #ifdef BUILDING_DLL ! #define extern extern __declspec (dllexport) #else /* not BUILDING_DLL */ ! #define extern extern __declspec (dllimport) #endif #endif *************** typedef int pid_t; *** 347,354 **** /* In backend/port/win32/signal.c */ ! extern PGDLLIMPORT volatile int pg_signal_queue; ! extern PGDLLIMPORT int pg_signal_mask; extern HANDLE pgwin32_signal_event; extern HANDLE pgwin32_initial_signal_pipe; --- 337,344 ---- /* In backend/port/win32/signal.c */ ! extern volatile int pg_signal_queue; ! extern int pg_signal_mask; extern HANDLE pgwin32_signal_event; extern HANDLE pgwin32_initial_signal_pipe;
On 2014-02-08 17:34:32 -0500, Tom Lane wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: > > Though I'm not a MINGW expert at all, I know dllwrap is a deprecated > > tool and dlltool is almost a deprecated tool. Cygwin port is removing > > the use of dllwrap and dlltool now. Isn't it better for MINGW port to > > follow it? > > Only way to make that happen is to prepare and test a patch ... > > In the meantime, here's a short patch trying the "#define extern" approach. > Anyone want to try this on a Windows machine or two? Or we could just > commit it in HEAD for long enough to see what the buildfarm says. I'd be fine with committing it to see what happens. Any chance you (or better somebody else!) could rename BUILDING_DLL while at it? The name is really misleading, since it's set when building the core postgres. I think that there's a small problem with your patch, namely that you made the explicit PGDLLEXPORT definition empty, but that's currently used in fmgr.h for the function and module magic, I think we actually need those to be __declspec(dllexport)ed when building an extension, right? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/08/2014 05:34 PM, Tom Lane wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >> tool and dlltool is almost a deprecated tool. Cygwin port is removing >> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >> follow it? > Only way to make that happen is to prepare and test a patch ... Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We did get rid of dllwrap. But I agree this is worth trying for Mingw. cheers andrew
Andres Freund <andres@2ndquadrant.com> writes: > I think that there's a small problem with your patch, namely that you > made the explicit PGDLLEXPORT definition empty, but that's currently > used in fmgr.h for the function and module magic, I think we actually > need those to be __declspec(dllexport)ed when building an extension, > right? The comment next to that says that we're relying on --export-all. If PGDLLEXPORT were being used for other contrib-module functions, you would have a point, but AFAICS no one ever bothered. (If this patch works, we could go around and clean up stuff like that, but that's a bit premature at this point...) regards, tom lane
On Sun, Feb 9, 2014 at 4:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >> tool and dlltool is almost a deprecated tool. Cygwin port is removing >> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >> follow it? > > Only way to make that happen is to prepare and test a patch ... > > In the meantime, here's a short patch trying the "#define extern" approach. > Anyone want to try this on a Windows machine or two? There are quite a few warnings and errors in build: Most prominent are: 1> reloptions.c 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h(88): warning C4141: 'dllexport' : used more than once 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h(460): warning C4141: 'dllexport' : used more than once 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\float.h(180): warning C4141: 'dllexport' : used more than once 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h(88): warning C4141: 'dllexport' : used more than once 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\math.h(460): warning C4141: 'dllexport' : used more than once 1>c:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\include\float.h(180): warning C4141: 'dllexport' : used more than once There are few others as well: 1>src\backend\utils\misc\guc-file.c(736): error C2375: 'GUC_yylex' : redefinition; different linkage 1> src\backend\utils\misc\guc-file.l(49) : see declaration of 'GUC_yylex' 1>c:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\include\winioctl.h(39): error C2370: 'GUID_DEVINTERFACE_DISK' : redefinition; different storage class 1> c:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\include\winioctl.h(39) : see declaration of 'GUID_DEVINTERFACE_DISK' Attached find the build output of postgres. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Attachment
On 09/02/2014 00:06, Andrew Dunstan wrote: > > On 02/08/2014 05:34 PM, Tom Lane wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >>> tool and dlltool is almost a deprecated tool. Cygwin port is removing >>> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >>> follow it? >> Only way to make that happen is to prepare and test a patch ... > > Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We > did get rid of dllwrap. But I agree this is worth trying for Mingw. > > cheers > > andrew > we should have get rid of dlltool on cygwin. At least it is not used on my build Regards Marco
On 02/09/2014 01:12 AM, Marco Atzeri wrote: > > > On 09/02/2014 00:06, Andrew Dunstan wrote: >> >> On 02/08/2014 05:34 PM, Tom Lane wrote: >>> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>>> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >>>> tool and dlltool is almost a deprecated tool. Cygwin port is removing >>>> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >>>> follow it? >>> Only way to make that happen is to prepare and test a patch ... >> >> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >> did get rid of dllwrap. But I agree this is worth trying for Mingw. >> > > we should have get rid of dlltool on cygwin. > At least it is not used on my build > > Regards > Marco > > > The send in a patch. The patch you sent in previously did not totally remove it IIRC. cheers andrew
On 09/02/2014 14:10, Andrew Dunstan wrote: > > On 02/09/2014 01:12 AM, Marco Atzeri wrote: >> >> >> On 09/02/2014 00:06, Andrew Dunstan wrote: >>> >>> On 02/08/2014 05:34 PM, Tom Lane wrote: >>>> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>>>> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >>>>> tool and dlltool is almost a deprecated tool. Cygwin port is removing >>>>> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >>>>> follow it? >>>> Only way to make that happen is to prepare and test a patch ... >>> >>> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >>> did get rid of dllwrap. But I agree this is worth trying for Mingw. >>> >> >> we should have get rid of dlltool on cygwin. >> At least it is not used on my build >> >> Regards >> Marco >> >> >> > > > The send in a patch. The patch you sent in previously did not totally > remove it IIRC. > > cheers > > andrew > attached patch versus latest git. except prepared_xacts test all others are fine ======================= All 140 tests passed. ======================= Regards Marco
Attachment
Amit Kapila <amit.kapila16@gmail.com> writes: > On Sun, Feb 9, 2014 at 4:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> In the meantime, here's a short patch trying the "#define extern" approach. >> Anyone want to try this on a Windows machine or two? > There are quite a few warnings and errors in build: Hmm. Looks like the #define is messing up the handling of "extern"s appearing in system headers. There's likely no good way around that, at least none with the simplicity that's the only redeeming value of this approach. So this won't work, and we should be looking at whether it will help to not use dlltool anymore (per other emails in this thread). Thanks for doing the testing! regards, tom lane
On 02/05/2014 01:52 PM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 02/05/2014 06:29 AM, Tom Lane wrote: >>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>> (not happy with it, of course, but prepared to tolerate it) as long >>> as I believed the buildfarm would reliably tell us of the need for >>> it. That assumption has now been conclusively disproven, though. > >> I'm kind of horrified that the dynamic linker doesn't throw its toys >> when it sees this. > > Indeed :-(. > > The truly strange part of this is that it seems that the one Windows > buildfarm member that's telling the truth (or most nearly so, anyway) > is narwhal, which appears to have the oldest and cruftiest toolchain > of the lot. I'd really like to come out the other end of this > investigation with a clear understanding of why the newer toolchains > are failing to report a link problem, and yet not building working > executables. For MSVC, here's a patch that makes gendef.pl emit DATA annotations for global var exports. Unfortunately, my Windows test machine has been chewing up its file system with memory errors due to a hardware fault, so compilation attempts (of anything) are currently generating internal compiler errors. The output of the script looks correct, but I can't get a good build with or without the patch. I'll try to get my build box working for testing, but have to get on to other things now, so I won't be able to work further on it today. Also attached is a patch to make vcregress.pl produce a better error message when there's no build output, instead of just reporting that ".. is not a recognized internal or external command, operable program, or batch file" -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Mon, Feb 10, 2014 at 8:51 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 02/05/2014 01:52 PM, Tom Lane wrote: >> Craig Ringer <craig@2ndquadrant.com> writes: >>> On 02/05/2014 06:29 AM, Tom Lane wrote: >>>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>>> (not happy with it, of course, but prepared to tolerate it) as long >>>> as I believed the buildfarm would reliably tell us of the need for >>>> it. That assumption has now been conclusively disproven, though. >> >>> I'm kind of horrified that the dynamic linker doesn't throw its toys >>> when it sees this. >> >> Indeed :-(. >> >> The truly strange part of this is that it seems that the one Windows >> buildfarm member that's telling the truth (or most nearly so, anyway) >> is narwhal, which appears to have the oldest and cruftiest toolchain >> of the lot. I'd really like to come out the other end of this >> investigation with a clear understanding of why the newer toolchains >> are failing to report a link problem, and yet not building working >> executables. > > For MSVC, here's a patch that makes gendef.pl emit DATA annotations for > global var exports. Is this change intended to avoid the usage of __declspec (dllimport) for global variables? Won't it increase the build time for Windows as it seems to me you are traversing each symbol file to find the global vars? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On 02/10/2014 01:59 PM, Amit Kapila wrote: > On Mon, Feb 10, 2014 at 8:51 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> On 02/05/2014 01:52 PM, Tom Lane wrote: >>> Craig Ringer <craig@2ndquadrant.com> writes: >>>> On 02/05/2014 06:29 AM, Tom Lane wrote: >>>>> I had been okay with the manual PGDLLIMPORT-sprinkling approach >>>>> (not happy with it, of course, but prepared to tolerate it) as long >>>>> as I believed the buildfarm would reliably tell us of the need for >>>>> it. That assumption has now been conclusively disproven, though. >>> >>>> I'm kind of horrified that the dynamic linker doesn't throw its toys >>>> when it sees this. >>> >>> Indeed :-(. >>> >>> The truly strange part of this is that it seems that the one Windows >>> buildfarm member that's telling the truth (or most nearly so, anyway) >>> is narwhal, which appears to have the oldest and cruftiest toolchain >>> of the lot. I'd really like to come out the other end of this >>> investigation with a clear understanding of why the newer toolchains >>> are failing to report a link problem, and yet not building working >>> executables. >> >> For MSVC, here's a patch that makes gendef.pl emit DATA annotations for >> global var exports. > > Is this change intended to avoid the usage of __declspec (dllimport) for > global variables? > > Won't it increase the build time for Windows as it seems to me you > are traversing each symbol file to find the global vars? gendefs.pl does that anyway. This change just annotates emitted entries with "DATA" if they're exported vars. It takes a couple of seconds to run gendefs.pl, so I don't think it's really a big concern. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/09 8:06), Andrew Dunstan wrote: > > On 02/08/2014 05:34 PM, Tom Lane wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >>> tool and dlltool is almost a deprecated tool. Cygwin port is removing >>> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >>> follow it? >> Only way to make that happen is to prepare and test a patch ... > > Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We > did get rid of dllwrap. But I agree this is worth trying for Mingw. I tried MINGW port with the attached change and successfully built src and contrib and all pararell regression tests were OK. regards, Hiroshi Inoue
Attachment
(2014/02/10 22:42), Hiroshi Inoue wrote: > (2014/02/09 8:06), Andrew Dunstan wrote: >> >> On 02/08/2014 05:34 PM, Tom Lane wrote: >>> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>>> Though I'm not a MINGW expert at all, I know dllwrap is a deprecated >>>> tool and dlltool is almost a deprecated tool. Cygwin port is removing >>>> the use of dllwrap and dlltool now. Isn't it better for MINGW port to >>>> follow it? >>> Only way to make that happen is to prepare and test a patch ... >> >> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >> did get rid of dllwrap. But I agree this is worth trying for Mingw. > > I tried MINGW port with the attached change and successfully built > src and contrib and all pararell regression tests were OK. I forgot to mention the environment. I tried the change in 2 machines and both worked. 1. 32bit Windows7 GCC 4.6.1 2. 32bit Windows Vista GCC 3.4.5 I didn't test 64bit platform. regards, Hiroshi Inoue -- I am using the free version of SPAMfighter. SPAMfighter has removed 3567 of my spam emails to date. Get the free SPAMfighter here: http://www.spamfighter.com/len Do you have a slow PC? Try a Free scan http://www.spamfighter.com/SLOW-PCfighter?cid=sigen
"Inoue, Hiroshi" <inoue@tpf.co.jp> writes: > (2014/02/10 22:42), Hiroshi Inoue wrote: >> I tried MINGW port with the attached change and successfully built >> src and contrib and all pararell regression tests were OK. > I forgot to mention the environment. I tried the change in 2 machines > and both worked. If there are no objections, I'll push this patch into HEAD tomorrow, along with the upthread patches from Craig Ringer and Marco Atzeri. We might as well see if this stuff is going to work ... regards, tom lane
On 02/11/2014 01:28 PM, Tom Lane wrote: > If there are no objections, I'll push this patch into HEAD tomorrow, > along with the upthread patches from Craig Ringer and Marco Atzeri. > We might as well see if this stuff is going to work ... I'd love to test my patch properly before pushing it, but my dev machine is going to need a total teardown and rebuild, and I'm currently focused on polishing off urgent open work. So let's see what the buildfarm makes of it, I guess. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Feb 11, 2014 at 11:01 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 02/11/2014 01:28 PM, Tom Lane wrote: >> If there are no objections, I'll push this patch into HEAD tomorrow, >> along with the upthread patches from Craig Ringer and Marco Atzeri. >> We might as well see if this stuff is going to work ... > > I'd love to test my patch properly before pushing it, but my dev machine > is going to need a total teardown and rebuild, I can do the test of your patch/idea, please confirm if below steps are sufficient: a. Change manually postgres.def file and add DATA for MainLWLockArray. (Will it be sufficient to change manually or shouldI apply your patch) b. Rebuild pg_buffercache module c. Test pg_buffercache if it can access the variable. d. If above works, then run 'check'. If I understand correctly your patch is intended to resolve PGDLLIMPORT problem, right? With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Marco Atzeri <marco.atzeri@gmail.com> writes: > On 09/02/2014 14:10, Andrew Dunstan wrote: >> On 02/09/2014 01:12 AM, Marco Atzeri wrote: >>> we should have get rid of dlltool on cygwin. >>> At least it is not used on my build >> The send in a patch. The patch you sent in previously did not totally >> remove it IIRC. > attached patch versus latest git. I've committed this with some fixes. However, I omitted the hunks that change the names of generated shared libraries (to add SO_MAJOR_VERSION). I think that requires some policy discussion around whether we want to do it or not, and in any case it's unrelated to the issues being discussed in this thread. If you still want that, please submit it as a separate patch in a new thread, with some discussion as to why it's a good idea. regards, tom lane
Hiroshi Inoue <inoue@tpf.co.jp> writes: > (2014/02/09 8:06), Andrew Dunstan wrote: >> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >> did get rid of dllwrap. But I agree this is worth trying for Mingw. > I tried MINGW port with the attached change and successfully built > src and contrib and all pararell regression tests were OK. I cleaned this up a bit (the if-nesting in Makefile.shlib was making my head hurt, not to mention that it left a bunch of dead code) and committed it. By my count, the only remaining usage of dlltool is in plpython's Makefile. Can we get rid of that? Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile. Do we need that either? regards, tom lane
Craig Ringer <craig@2ndquadrant.com> writes: > For MSVC, here's a patch that makes gendef.pl emit DATA annotations for > global var exports. Committed. > Also attached is a patch to make vcregress.pl produce a better error > message when there's no build output, instead of just reporting that > ".. is not a recognized internal or external command, operable program, > or batch file" I would've committed this, but I'm a bit hesitant about this part: + use 5.10.1; Are we moving the goalposts on the Perl version needed, and if so how far? I didn't question the "use 5.8.0" you added to gendef.pl, but it disturbs me that this isn't the same. regards, tom lane
On 11/02/2014 18:15, Tom Lane wrote: > Marco Atzeri <marco.atzeri@gmail.com> writes: >> On 09/02/2014 14:10, Andrew Dunstan wrote: >>> On 02/09/2014 01:12 AM, Marco Atzeri wrote: >>>> we should have get rid of dlltool on cygwin. >>>> At least it is not used on my build > >>> The send in a patch. The patch you sent in previously did not totally >>> remove it IIRC. > >> attached patch versus latest git. > > I've committed this with some fixes. However, I omitted the hunks that > change the names of generated shared libraries (to add SO_MAJOR_VERSION). > I think that requires some policy discussion around whether we want to > do it or not, and in any case it's unrelated to the issues being discussed > in this thread. If you still want that, please submit it as a separate > patch in a new thread, with some discussion as to why it's a good idea. > > regards, tom lane > Noted. On cygwin the shared libraries are using the SO_MAJOR_VERSION by long time. cd /usr/bin $ ls cyggcc*dll cyggcc_s-1.dll cyggccpp-1.dll $ ls cygfo*dll cygfontconfig-1.dll cygform-10.dll cygform-8.dll cygformw-10.dll cygfontenc-1.dll cygform7.dll cygform-9.dll In this way we allow coexistence of several release, similar to /usr/lib/libpq.so.5 on unix. Regards Marco
Craig Ringer <craig@2ndquadrant.com> writes: > On 02/11/2014 01:28 PM, Tom Lane wrote: >> If there are no objections, I'll push this patch into HEAD tomorrow, >> along with the upthread patches from Craig Ringer and Marco Atzeri. >> We might as well see if this stuff is going to work ... > I'd love to test my patch properly before pushing it, but my dev machine > is going to need a total teardown and rebuild, and I'm currently focused > on polishing off urgent open work. > So let's see what the buildfarm makes of it, I guess. So the early returns from currawong are interesting: "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> (contrib\pg_buffercache target) -> pg_buffercache_pages.obj : error LNK2001: unresolved external symbol _MainLWLockArray.\Release\pg_buffercache\pg_buffercache.dll : fatal error LNK1120: 1 unresolved externals "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> (contrib\pg_stat_statements target) -> pg_stat_statements.obj : error LNK2001: unresolved external symbol _MainLWLockArray.\Release\pg_stat_statements\pg_stat_statements.dll : fatal error LNK1120: 1 unresolved externals "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> (contrib\test_shm_mq target) -> worker.obj : error LNK2001: unresolved external symbol _ProcDiePending worker.obj : errorLNK2001: unresolved external symbol _proc_exit_inprogress .\Release\test_shm_mq\test_shm_mq.dll : fatal error LNK1120:2 unresolved externals I guess this is good news in one sense: now we're getting results from an MSVC machine that are consistent with what narwhal has been telling us. But how is it that your patch caused this to be reported when it wasn't before? And how come we're not getting the results we hoped for, that these symbols would be effectively exported without needing PGDLLIMPORT? regards, tom lane
I wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> I tried MINGW port with the attached change and successfully built >> src and contrib and all pararell regression tests were OK. > I cleaned this up a bit (the if-nesting in Makefile.shlib was making > my head hurt, not to mention that it left a bunch of dead code) and > committed it. Hm ... according to buildfarm member narwhal, this doesn't work so well for plperl: gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a Cannot export .idata$4: symbol not found Cannot export .idata$5: symbol not found Cannot export .idata$6: symbol not found Cannot export .text: symbol not found Cannot export perl58_NULL_THUNK_DATA: symbol not found Creating library file: libplperl.a collect2: ld returned 1 exit status make[3]: *** [plperl.dll] Error 1 Not very clear what's going on there; could this be a problem in narwhal's admittedly-ancient toolchain? BTW, now that I look at this ... why are we bothering to build static libraries (.a files) for DLLs? They have no possible use AFAICS. regards, tom lane
On 02/12/2014 07:22 AM, Tom Lane wrote: > So the early returns from currawong are interesting: > > "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> > (contrib\pg_buffercache target) -> > pg_buffercache_pages.obj : error LNK2001: unresolved external symbol _MainLWLockArray > .\Release\pg_buffercache\pg_buffercache.dll : fatal error LNK1120: 1 unresolved externals > > > "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> > (contrib\pg_stat_statements target) -> > pg_stat_statements.obj : error LNK2001: unresolved external symbol _MainLWLockArray > .\Release\pg_stat_statements\pg_stat_statements.dll : fatal error LNK1120: 1 unresolved externals > > > "d:\bf\root\HEAD\pgsql.920\pgsql.sln" (default target) (1) -> > (contrib\test_shm_mq target) -> > worker.obj : error LNK2001: unresolved external symbol _ProcDiePending > worker.obj : error LNK2001: unresolved external symbol _proc_exit_inprogress > .\Release\test_shm_mq\test_shm_mq.dll : fatal error LNK1120: 2 unresolved externals Great, that's what I was hoping to see - proper errors where we've omitted things, not silent miscompilation. > I guess this is good news in one sense: now we're getting results from an > MSVC machine that are consistent with what narwhal has been telling us. > But how is it that your patch caused this to be reported when it wasn't > before? And how come we're not getting the results we hoped for, that > these symbols would be effectively exported without needing PGDLLIMPORT? Unfortunately I don't think we can get to "totally invisible" with the MS toolchain. We're still going to have to annotate exported symbols, but at least we can tell where code tries to import something that hasn't been annotated. This just fixes the bug in the .def generator that was permitting wrong code to compile silently and incorrectly, rather than fail to link. It makes omitted PGDLLIMPORT's obvious, but it doesn't appear to be possible to tell the toolchain that the *importing side* should auto-detect whether an extern is from another DLL and automatically __declspec("dllimport") it. As I understand it, what's happening here is that the importing side (the contrib/tool) is trying to do old-style DLL linkage, where the export library for the DLL is expected to expose a symbol referring to *a pointer to the real data* that gets statically linked into the importing side. We're trying to link to those pointer symbols and are failing because the `DATA` annotation suppresses their generation. (Remember that in Microsoft-land you don't actually link to a .DLL at compilation time, you statically link to the export library .LIB for the DLL, which contains stubs and fixup information). If we instead emitted "CONST" in the .DEF file, these indirect pointers would be generated by the linker and added to the export library, and we'd link to those instead. Our code doesn't expect the extra level of indirection introduced and would fail (mostly) at runtime, much like what was happening when we were linking to function stubs. The only way I can see to actually get unix-like linkage is to use __declspec("dllimport") on the *importing side*, e.g. when compiling contribs etc. That means we still need the much-hated windows-droppings, but we can at least see when they got left out. (On a side note, I'm starting to suspect - and need to verify - that as a result of not using PGDLLIMPORT on exported functions we're actually doing old-style thunk function calls from extensions into postgres.exe, not direct function calls, for a similar reason.) I'm going to go wash my brain out now, it feels dirty. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/11/2014 11:04 PM, Amit Kapila wrote: > On Tue, Feb 11, 2014 at 11:01 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >> On 02/11/2014 01:28 PM, Tom Lane wrote: >>> If there are no objections, I'll push this patch into HEAD tomorrow, >>> along with the upthread patches from Craig Ringer and Marco Atzeri. >>> We might as well see if this stuff is going to work ... >> >> I'd love to test my patch properly before pushing it, but my dev machine >> is going to need a total teardown and rebuild, > > I can do the test of your patch/idea, please confirm if below steps are > sufficient: > a. Change manually postgres.def file and add DATA for MainLWLockArray. > (Will it be sufficient to change manually or should I apply your patch) No, you must rebuild "postgres", or at least re-generate the .DEF file and re-link postgres.exe to generate a new import library (.lib) for it. It'll be easiest to just recompile. > b. Rebuild pg_buffercache module Yes. > c. Test pg_buffercache if it can access the variable. Yes, though it should fail to link if there's no PGDLLIMPORT. > d. If above works, then run 'check'. Yes, after adding the PGDLLIMPORT and re-generating the postgres .lib by relinking or recompiling. > If I understand correctly your patch is intended to resolve PGDLLIMPORT > problem, right? No, just make it obvious where such imports are missing. I can't see a way to completely make it invisible with the MS toolchain. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/12/2014 07:30 AM, Tom Lane wrote: > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a > Cannot export .idata$4: symbol not found > Cannot export .idata$5: symbol not found > Cannot export .idata$6: symbol not found > Cannot export .text: symbol not found > Cannot export perl58_NULL_THUNK_DATA: symbol not found > Creating library file: libplperl.a > collect2: ld returned 1 exit status > make[3]: *** [plperl.dll] Error 1 > > Not very clear what's going on there; could this be a problem in > narwhal's admittedly-ancient toolchain? Easily. > BTW, now that I look at this ... why are we bothering to build static > libraries (.a files) for DLLs? They have no possible use AFAICS. This is actually compiling a DLL: -o plperl.dll but is also emitting an import library: -Wl,--out-implib=libplperl.a which is required if you wish to link to plperl.dll from any other compilation unit (except for by dynamic linkage). I don't see any use for that with plperl, but it might be a valid thing to be doing for (e.g.) hstore.dll. Though you can't really link to it from another module anyway, you have to go through the fmgr to get access to its symbols at rutime, so we can probably just skip generation of import libraries for contribs and PLs. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > On 02/12/2014 07:30 AM, Tom Lane wrote: >> BTW, now that I look at this ... why are we bothering to build static >> libraries (.a files) for DLLs? They have no possible use AFAICS. > I don't see any use for that with plperl, but it might be a valid thing > to be doing for (e.g.) hstore.dll. Though you can't really link to it > from another module anyway, you have to go through the fmgr to get > access to its symbols at rutime, so we can probably just skip generation > of import libraries for contribs and PLs. On second thought, I believe we need it for, say, libpq. Not sure if it's worth adding the Makefile logic that'd be needed to not build the import library for other cases. regards, tom lane
Craig Ringer <craig@2ndquadrant.com> writes: > On 02/12/2014 07:22 AM, Tom Lane wrote: >> So the early returns from currawong are interesting: > Great, that's what I was hoping to see - proper errors where we've > omitted things, not silent miscompilation. Well, before you get too optimistic about that benighted platform ... mastodon just reported in on this patch, and it's showing a *different* set of unresolved symbols than currawong is. None of them are a surprise exactly, but nonetheless, why didn't currawong find the postgres_fdw issues? It still seems there's something unexplained here. regards, tom lane
On 02/12/2014 08:30 AM, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 02/12/2014 07:22 AM, Tom Lane wrote: >>> So the early returns from currawong are interesting: > >> Great, that's what I was hoping to see - proper errors where we've >> omitted things, not silent miscompilation. > > Well, before you get too optimistic about that benighted platform ... > mastodon just reported in on this patch, and it's showing a *different* > set of unresolved symbols than currawong is. None of them are a > surprise exactly, but nonetheless, why didn't currawong find the > postgres_fdw issues? > > It still seems there's something unexplained here. Looks like currawong doesn't build postgres_fdw. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/11/2014 08:04 PM, Craig Ringer wrote: > On 02/12/2014 08:30 AM, Tom Lane wrote: >> Craig Ringer <craig@2ndquadrant.com> writes: >>> On 02/12/2014 07:22 AM, Tom Lane wrote: >>>> So the early returns from currawong are interesting: >>> Great, that's what I was hoping to see - proper errors where we've >>> omitted things, not silent miscompilation. >> Well, before you get too optimistic about that benighted platform ... >> mastodon just reported in on this patch, and it's showing a *different* >> set of unresolved symbols than currawong is. None of them are a >> surprise exactly, but nonetheless, why didn't currawong find the >> postgres_fdw issues? >> >> It still seems there's something unexplained here. > Looks like currawong doesn't build postgres_fdw. > > It sure used to: <http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=currawong&dt=2014-02-03%2005%3A30%3A00&stg=make> cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 02/11/2014 08:04 PM, Craig Ringer wrote: >> Looks like currawong doesn't build postgres_fdw. > It sure used to: > <http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=currawong&dt=2014-02-03%2005%3A30%3A00&stg=make> Hm, does the MSVC build system do parallel builds? Maybe it abandoned the build after noticing the first failure, and whether you get to see more than one is timing-dependent. regards, tom lane
On 02/12/2014 09:45 AM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 02/11/2014 08:04 PM, Craig Ringer wrote: >>> Looks like currawong doesn't build postgres_fdw. > >> It sure used to: >> <http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=currawong&dt=2014-02-03%2005%3A30%3A00&stg=make> > > Hm, does the MSVC build system do parallel builds? Maybe it > abandoned the build after noticing the first failure, and > whether you get to see more than one is timing-dependent. I'm pretty certain it does, and that's quite a reasonable explanation. That reminds me - in addition to the option to force a serial build, I really need to post a patch for the msvc build that lets you suppress VERBOSE mode. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/12 8:30), Tom Lane wrote: > I wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> I tried MINGW port with the attached change and successfully built >>> src and contrib and all pararell regression tests were OK. > >> I cleaned this up a bit (the if-nesting in Makefile.shlib was making >> my head hurt, not to mention that it left a bunch of dead code) and >> committed it. > > Hm ... according to buildfarm member narwhal, this doesn't work so well > for plperl: > > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a > Cannot export .idata$4: symbol not found > Cannot export .idata$5: symbol not found > Cannot export .idata$6: symbol not found > Cannot export .text: symbol not found > Cannot export perl58_NULL_THUNK_DATA: symbol not found > Creating library file: libplperl.a > collect2: ld returned 1 exit status > make[3]: *** [plperl.dll] Error 1 Oops I forgot to inclule plperl, tcl or python, sorry. I would retry build operations with them. Unfortunately it would take pretty long time because build operations are pretty (or veeeryin an old machine) slow. > Not very clear what's going on there; could this be a problem in > narwhal's admittedly-ancient toolchain? > > BTW, now that I look at this ... why are we bothering to build static > libraries (.a files) for DLLs? They have no possible use AFAICS. They are import libraries though I doubt such plugins need them. regards, Hiroshi Inoue -- I am using the free version of SPAMfighter. SPAMfighter has removed 3592 of my spam emails to date. Get the free SPAMfighter here: http://www.spamfighter.com/len Do you have a slow PC? Try a Free scan http://www.spamfighter.com/SLOW-PCfighter?cid=sigen
On Wed, Feb 12, 2014 at 5:30 AM, Craig Ringer <craig@2ndquadrant.com> wrote: > On 02/11/2014 11:04 PM, Amit Kapila wrote: >> On Tue, Feb 11, 2014 at 11:01 AM, Craig Ringer <craig@2ndquadrant.com> wrote: >>> On 02/11/2014 01:28 PM, Tom Lane wrote: >>>> If there are no objections, I'll push this patch into HEAD tomorrow, >>>> along with the upthread patches from Craig Ringer and Marco Atzeri. >>>> We might as well see if this stuff is going to work ... >>> >>> I'd love to test my patch properly before pushing it, but my dev machine >>> is going to need a total teardown and rebuild, >> >> I can do the test of your patch/idea, please confirm if below steps are >> sufficient: >> a. Change manually postgres.def file and add DATA for MainLWLockArray. >> (Will it be sufficient to change manually or should I apply your patch) > > No, you must rebuild "postgres", or at least re-generate the .DEF file > and re-link postgres.exe to generate a new import library (.lib) for it. Thanks, I think already we have results from build farm, so it might not be of much use to do any additional verification/test. > No, just make it obvious where such imports are missing. I can't see a > way to completely make it invisible with the MS toolchain. Yes, I have also studied and tried a couple of things, but couldn't find a way to make it happen. So I think this might be the way things work for Windows platform and infact I have always used it in same way (__declspec (dllimport)) whenever there is any need. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
<p dir="ltr"><br /> On Feb 12, 2014 4:09 AM, "Craig Ringer" <<a href="mailto:craig@2ndquadrant.com">craig@2ndquadrant.com</a>>wrote:<br /> ><br /> > On 02/12/2014 09:45 AM, TomLane wrote:<br /> > > Andrew Dunstan <<a href="mailto:andrew@dunslane.net">andrew@dunslane.net</a>> writes:<br/> > >> On 02/11/2014 08:04 PM, Craig Ringer wrote:<br /> > >>> Looks like currawong doesn'tbuild postgres_fdw.<br /> > ><br /> > >> It sure used to:<br /> > >> <<a href="http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=currawong&dt=2014-02-03%2005%3A30%3A00&stg=make">http://www.pgbuildfarm.org/cgi-bin/show_stage_log.pl?nm=currawong&dt=2014-02-03%2005%3A30%3A00&stg=make</a>><br />> ><br /> > > Hm, does the MSVC build system do parallel builds? Maybe it<br /> > > abandoned the buildafter noticing the first failure, and<br /> > > whether you get to see more than one is timing-dependent.<br />><br /> > I'm pretty certain it does, and that's quite a reasonable explanation.<p dir="ltr">It definitely does.It does it both between individual c files in a project, and between projects (taking care about dependencies). <p dir="ltr">Andyes, I've seen that cause it to build modules, particularly smaller ones, in different order at different invocationsbased on small differences in timing. And yes, it aborts on first failure. <p dir="ltr">So it would surprise meif that is *not* the problem... <br /><p dir="ltr">> That reminds me - in addition to the option to force a serial build,I<br /> > really need to post a patch for the msvc build that lets you suppress<br /> > VERBOSE mode.<p dir="ltr">Thatprobably wouldn't hurt :-) <p dir="ltr">/Magnus
(2014/02/12 3:03), Tom Lane wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> (2014/02/09 8:06), Andrew Dunstan wrote: >>> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >>> did get rid of dllwrap. But I agree this is worth trying for Mingw. > >> I tried MINGW port with the attached change and successfully built >> src and contrib and all pararell regression tests were OK. > > I cleaned this up a bit (the if-nesting in Makefile.shlib was making > my head hurt, not to mention that it left a bunch of dead code) and > committed it. Thanks. > By my count, the only remaining usage of dlltool is in plpython's > Makefile. Can we get rid of that? Maybe this is one of the few use cases of dlltool. Because python doesn't ship with its MINGW import library, the Makefile uses dlltool to generate an import library from the python DLL. > Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile. > Do we need that either? Maybe this can be removed. I would make a patch later. regards, Hiroshi Inoue -- I am using the free version of SPAMfighter. SPAMfighter has removed 3597 of my spam emails to date. Get the free SPAMfighter here: http://www.spamfighter.com/len Do you have a slow PC? Try a Free scan http://www.spamfighter.com/SLOW-PCfighter?cid=sigen
Craig Ringer <craig@2ndquadrant.com> writes: > Great, that's what I was hoping to see - proper errors where we've > omitted things, not silent miscompilation. And, just to make sure that ain't nobody happy ... brolga, our one operational Cygwin animal, is still building without complaints. We really need to understand what is going on here and why some toolchains don't seem to feel that lack of PGDLLIMPORT is a problem. Nosing around, I happened to notice this in src/template/cygwin: # --enable-auto-import gets rid of a diagnostics linker message LDFLAGS="-Wl,--allow-multiple-definition -Wl,--enable-auto-import" Anybody know *exactly* what --enable-auto-import does? The name is, um, suggestive. regards, tom lane
On 2014-02-12 10:58:20 -0500, Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: > > Great, that's what I was hoping to see - proper errors where we've > > omitted things, not silent miscompilation. > > And, just to make sure that ain't nobody happy ... brolga, our one > operational Cygwin animal, is still building without complaints. Hm, isn't that pretty much expected? Cygwin's infrastructure tries to be unixoid, so it's not surprising that the toolchain doesn't require such strange things? Otherwise even larger amounts of unix software wouldn't run, right? I am pretty sure halfway recent versions of mingw can be made to behave similarly, but I don't really see the advantage in doing so, to the contrary even. > # --enable-auto-import gets rid of a diagnostics linker message > LDFLAGS="-Wl,--allow-multiple-definition -Wl,--enable-auto-import" > > Anybody know *exactly* what --enable-auto-import does? The name > is, um, suggestive. My ld(1)'s manpage has three screen's worth of details... Most of it seems to be on http://www.freebsd.org/cgi/man.cgi?query=ld&sektion=1 It's essentially elf like shared library linking in pe-coff through dirty tricks. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-12 10:58:20 -0500, Tom Lane wrote: >> Anybody know *exactly* what --enable-auto-import does? The name >> is, um, suggestive. > My ld(1)'s manpage has three screen's worth of details... Most of it > seems to be on http://www.freebsd.org/cgi/man.cgi?query=ld&sektion=1 > It's essentially elf like shared library linking in pe-coff through > dirty tricks. Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, we ought to actually remove that, so that the Cygwin build works more like the other Windows builds? regards, tom lane
On 2014-02-12 11:26:56 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-12 10:58:20 -0500, Tom Lane wrote: > >> Anybody know *exactly* what --enable-auto-import does? The name > >> is, um, suggestive. > > > My ld(1)'s manpage has three screen's worth of details... Most of it > > seems to be on http://www.freebsd.org/cgi/man.cgi?query=ld&sektion=1 > > > It's essentially elf like shared library linking in pe-coff through > > dirty tricks. > > Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, > we ought to actually remove that, so that the Cygwin build works more like > the other Windows builds? Hm, I don't see a big advantage in detecting the errors as It won't hugely increase the buildfarm coverage. But --auto-import seems to require marking the .text sections as writable, avoiding that seems to be a good idea if we don't need it anymore. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-12 11:26:56 -0500, Tom Lane wrote: >> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, >> we ought to actually remove that, so that the Cygwin build works more like >> the other Windows builds? > Hm, I don't see a big advantage in detecting the errors as It won't > hugely increase the buildfarm coverage. But --auto-import seems to > require marking the .text sections as writable, avoiding that seems to > be a good idea if we don't need it anymore. Given the rather small number of Windows machines in the buildfarm, I'd really like them all to be on the same page about what's valid and what isn't. Having to wait ~24 hours to find out if they're all happy with something isn't my idea of a good time. In any case, as long as we have discrepancies between them, I'm not going to be entirely convinced that any of them are telling the full truth. I'm going to go remove that switch and see if brolga starts failing. If it does, I'll be satisfied that we understand the issues here. regards, tom lane
On 2014-02-12 11:39:43 -0500, Tom Lane wrote: > I'm going to go remove that switch and see if brolga starts failing. > If it does, I'll be satisfied that we understand the issues here. The manpage also has a --disable-auto-import, but doesn't document wheter enabling or disabling is the default... So maybe try to be explicit? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Feb 12, 2014 at 11:39:43AM -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-12 11:26:56 -0500, Tom Lane wrote: > >> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, > >> we ought to actually remove that, so that the Cygwin build works more like > >> the other Windows builds? > > > Hm, I don't see a big advantage in detecting the errors as It won't > > hugely increase the buildfarm coverage. But --auto-import seems to > > require marking the .text sections as writable, avoiding that seems to > > be a good idea if we don't need it anymore. > > Given the rather small number of Windows machines in the buildfarm, > I'd really like them all to be on the same page about what's valid > and what isn't. Having to wait ~24 hours to find out if they're > all happy with something isn't my idea of a good time. In any case, > as long as we have discrepancies between them, I'm not going to be > entirely convinced that any of them are telling the full truth. > > I'm going to go remove that switch and see if brolga starts failing. > If it does, I'll be satisfied that we understand the issues here. Can document the issue in case it comes up again, please? -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-12 11:39:43 -0500, Tom Lane wrote: >> I'm going to go remove that switch and see if brolga starts failing. >> If it does, I'll be satisfied that we understand the issues here. > The manpage also has a --disable-auto-import, but doesn't document > wheter enabling or disabling is the default... So maybe try to be > explicit? Yeah, I was just wondering about that myself. Presumably --enable is not the default, or it would not have gotten added. However, I notice that it was added a long time ago (2004) ... is it possible the default changed since then? regards, tom lane
On 2014-02-12 11:50:41 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-12 11:39:43 -0500, Tom Lane wrote: > >> I'm going to go remove that switch and see if brolga starts failing. > >> If it does, I'll be satisfied that we understand the issues here. > > > The manpage also has a --disable-auto-import, but doesn't document > > wheter enabling or disabling is the default... So maybe try to be > > explicit? > > Yeah, I was just wondering about that myself. Presumably --enable > is not the default, or it would not have gotten added. However, > I notice that it was added a long time ago (2004) ... is it possible > the default changed since then? Some release notes I just found seem to suggest it is http://sourceware.org/binutils/docs-2.17/ld/WIN32.html : This feature is enabled with the `--enable-auto-import' command-line option, although it is enabled by default on cygwin/mingw. The `--enable-auto-import' option itself now serves mainly to suppress any warnings that are ordinarily emitted when linked objects trigger the feature's use. 2.17 is from 2007, so it's been enabled by default at least since then. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > Some release notes I just found seem to suggest it is > http://sourceware.org/binutils/docs-2.17/ld/WIN32.html : > This feature is enabled with the `--enable-auto-import' command-line > option, although it is enabled by default on cygwin/mingw. Curious ... narwhal is mingw but evidently doesn't have that switch turned on. OTOH we also know it's an old version. I'm inclined to change template/win32 also. regards, tom lane
On 12/02/2014 17:26, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-02-12 10:58:20 -0500, Tom Lane wrote: >>> Anybody know *exactly* what --enable-auto-import does? The name >>> is, um, suggestive. > >> My ld(1)'s manpage has three screen's worth of details... Most of it >> seems to be on http://www.freebsd.org/cgi/man.cgi?query=ld&sektion=1 > >> It's essentially elf like shared library linking in pe-coff through >> dirty tricks. > > Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, > we ought to actually remove that, so that the Cygwin build works more like > the other Windows builds? > > regards, tom lane > If I am not wrong "--enable-auto-import" is already the default on cygwin build chain ( binutils >= 2.19.51 ) so it should make no difference on latest cygwin. Not sure for you 1.7.7 build enviroment. About PGDLLIMPORT , my build log is full of "warning: ‘optarg’ redeclared without dllimport attribute: previous dllimport ignored " I suspect that removing will also make no difference. Marco PS: we aim unix-like builds not windows one....
On 2014-02-12 19:13:07 +0100, Marco Atzeri wrote: > On 12/02/2014 17:26, Tom Lane wrote: > >Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, > >we ought to actually remove that, so that the Cygwin build works more like > >the other Windows builds? > If I am not wrong "--enable-auto-import" is already the > default on cygwin build chain ( binutils >= 2.19.51 ) so it should make no > difference on latest cygwin. Not sure for you 1.7.7 build enviroment. We're *disabling* not *enabling* it. > About PGDLLIMPORT , my build log is full of "warning: ‘optarg’ redeclared > without dllimport attribute: previous dllimport ignored " That should be fixed then. I guess cygwin's getopt.h declares it that way? > I suspect that removing will also make no difference. The committed patch explicitly disables the functionality. > PS: we aim unix-like builds not windows one.... Well, there are a significant number of caveats around the auto import functionality, so there seems little benefit in using it if all the declspec's have to be there anyway. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/02/2014 19:19, Andres Freund wrote: > On 2014-02-12 19:13:07 +0100, Marco Atzeri wrote: >> On 12/02/2014 17:26, Tom Lane wrote: >>> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, >>> we ought to actually remove that, so that the Cygwin build works more like >>> the other Windows builds? > >> If I am not wrong "--enable-auto-import" is already the >> default on cygwin build chain ( binutils >= 2.19.51 ) so it should make no >> difference on latest cygwin. Not sure for you 1.7.7 build enviroment. > > We're *disabling* not *enabling* it. remove is not disable if enable is already the default inside binutils and gcc. Or I am missing something ? >> About PGDLLIMPORT , my build log is full of "warning: ‘optarg’ redeclared >> without dllimport attribute: previous dllimport ignored " > > That should be fixed then. I guess cygwin's getopt.h declares it that way? from /usr/include/getopt.h #ifndef __INSIDE_CYGWIN__ extern int __declspec(dllimport) opterr; /* if error message should be printed */ extern int __declspec(dllimport) optind; /* index into parent argv vector */ extern int __declspec(dllimport) optopt; /* character checked for validity */ extern int __declspec(dllimport) optreset; /* reset getopt */ extern char __declspec(dllimport) *optarg; /* argument associated with option */ #endif > >> I suspect that removing will also make no difference. > > The committed patch explicitly disables the functionality. > >> PS: we aim unix-like builds not windows one.... > > Well, there are a significant number of caveats around the auto import > functionality, so there seems little benefit in using it if all the > declspec's have to be there anyway. I think that I am not currently using anymore the declspec in the build. But I could be wrong, as the the postgresql build way is the most complicated between all the ones I am dealing with. > Greetings, > > Andres Freund > Cheers Marco
Marco Atzeri <marco.atzeri@gmail.com> writes: > On 12/02/2014 19:19, Andres Freund wrote: >> On 2014-02-12 19:13:07 +0100, Marco Atzeri wrote: >>> About PGDLLIMPORT , my build log is full of "warning: ‘optarg’ redeclared >>> without dllimport attribute: previous dllimport ignored " >> That should be fixed then. I guess cygwin's getopt.h declares it that way? > from /usr/include/getopt.h > extern char __declspec(dllimport) *optarg; /* argument associated > with option */ Hm. All of our files that use getopt also do extern char *optarg; extern int optind, opterr; #ifdef HAVE_INT_OPTRESET extern int optreset; /* might not be declared by system headers */ #endif regards, tom lane
On 2/11/14, 7:04 PM, Craig Ringer wrote: > I don't see any use for that with plperl, but it might be a valid thing > to be doing for (e.g.) hstore.dll. Though you can't really link to it > from another module anyway, you have to go through the fmgr to get > access to its symbols at rutime, so we can probably just skip generation > of import libraries for contribs and PLs. There are cases where one module needs symbols from another directly. Would that be affected by this?
On February 12, 2014 10:23:21 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >On 2/11/14, 7:04 PM, Craig Ringer wrote: >> I don't see any use for that with plperl, but it might be a valid >thing >> to be doing for (e.g.) hstore.dll. Though you can't really link to it >> from another module anyway, you have to go through the fmgr to get >> access to its symbols at rutime, so we can probably just skip >generation >> of import libraries for contribs and PLs. > >There are cases where one module needs symbols from another directly. >Would that be affected by this? I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On February 12, 2014 10:23:21 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >> There are cases where one module needs symbols from another directly. >> Would that be affected by this? > I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. No, I've not touched the PGDLLIMPORT macros. I was hoping to, but it looks like we're not getting there :-( regards, tom lane
On 2/12/14, 4:30 PM, Andres Freund wrote: >> There are cases where one module needs symbols from another directly. >> Would that be affected by this? > > I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. It works reasonably well on other platforms. Of course, we can barely build extension modules on Windows, so maybe this is a bit much to ask. But as long as we're dealing only with functions, not variables, it should work without any dllimport dances, right?
Peter Eisentraut <peter_e@gmx.net> writes: >>> There are cases where one module needs symbols from another directly. >>> Would that be affected by this? > It works reasonably well on other platforms. > Of course, we can barely build extension modules on Windows, so maybe > this is a bit much to ask. But as long as we're dealing only with > functions, not variables, it should work without any dllimport dances, > right? AFAIK we've not changed anything that would affect that. regards, tom lane
On February 12, 2014 10:35:06 PM CET, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Andres Freund <andres@2ndquadrant.com> writes: >> On February 12, 2014 10:23:21 PM CET, Peter Eisentraut ><peter_e@gmx.net> wrote: >>> There are cases where one module needs symbols from another >directly. >>> Would that be affected by this? > >> I don't think we have real infrastructure for that yet. Neither from >the POV of loading several .so's, nor from a symbol visibility. Afaics >we'd need a working definition of PGDLLIMPORT which inverts the >declspecs. I think Tom just removed the remnants of that. > >No, I've not touched the PGDLLIMPORT macros. I was hoping to, but it >looks like we're not getting there :-( Right, that was just the test patch... Then the macros we're using in fmgr.h for the magic macros (even if not strictly needed)should work for Peter's case. Andres -- Please excuse brevity and formatting - I am writing this on my mobile phone. Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 02/12/2014 02:31 PM, Inoue, Hiroshi wrote: > Maybe this is one of the few use cases of dlltool. > Because python doesn't ship with its MINGW import library, the > Makefile uses dlltool to generate an import library from the python > DLL. Wow. Has anyone been in touch with the Python package maintainers to see if they can fix that and bundle the .lib ? -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/13/2014 12:39 AM, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-02-12 11:26:56 -0500, Tom Lane wrote: >>> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, >>> we ought to actually remove that, so that the Cygwin build works more like >>> the other Windows builds? > >> Hm, I don't see a big advantage in detecting the errors as It won't >> hugely increase the buildfarm coverage. But --auto-import seems to >> require marking the .text sections as writable, avoiding that seems to >> be a good idea if we don't need it anymore. > > Given the rather small number of Windows machines in the buildfarm, > I'd really like them all to be on the same page about what's valid > and what isn't. Having to wait ~24 hours to find out if they're > all happy with something isn't my idea of a good time. In any case, > as long as we have discrepancies between them, I'm not going to be > entirely convinced that any of them are telling the full truth. > > I'm going to go remove that switch and see if brolga starts failing. > If it does, I'll be satisfied that we understand the issues here. I wouldn't assume that _anything_ Cygwin does makes sense, or is consistent with anything else. Its attempts to produce UNIX-like behaviour on Windows cause some truly bizarre behaviour at times. It is not totally unfair to compare the level of weirdness of Cygwin to that of WINE, though they operate in completely different ways. I wouldn't suggest making any conclusions about the _other_ platforms based on Cygwin. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/13/2014 05:23 AM, Peter Eisentraut wrote: > On 2/11/14, 7:04 PM, Craig Ringer wrote: >> I don't see any use for that with plperl, but it might be a valid thing >> to be doing for (e.g.) hstore.dll. Though you can't really link to it >> from another module anyway, you have to go through the fmgr to get >> access to its symbols at rutime, so we can probably just skip generation >> of import libraries for contribs and PLs. > > There are cases where one module needs symbols from another directly. > Would that be affected by this? Yes, in that you cannot link directly to another DLL on Windows (without hoop jumping and lots of pain), you link to the import library. So if we don't generate an import library then (eg) MyExtension cannot link to hstore.dll . It can still look up function exports via the fmgr. As concluded upthread, it's easier to just generate import libraries for everything since we need it for the client library, so nothing's going to change anyway. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > On 02/12/2014 02:31 PM, Inoue, Hiroshi wrote: >> Maybe this is one of the few use cases of dlltool. >> Because python doesn't ship with its MINGW import library, the >> Makefile uses dlltool to generate an import library from the python >> DLL. > Wow. Has anyone been in touch with the Python package maintainers to see > if they can fix that and bundle the .lib ? Indeed somebody should pester them about that, but it's not clear how much it'd help us even if they fixed it tomorrow. We're going to have to be able to build with existing Python distributions for a long time to come. regards, tom lane
On 02/13/2014 05:35 AM, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On February 12, 2014 10:23:21 PM CET, Peter Eisentraut <peter_e@gmx.net> wrote: >>> There are cases where one module needs symbols from another directly. >>> Would that be affected by this? > >> I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. > > No, I've not touched the PGDLLIMPORT macros. I was hoping to, but it > looks like we're not getting there :-( In theory we could now remove the __declspec(dllexport) case for BUILDING_DLL, as it should now be redundant with the fixed .DEF generator. Should. Unfortunately it looks like there's not going to be any getting around having something that can turn into __declspec(dllimport) in the headers for compiling things that link to postgres.exe, though. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/13/2014 05:35 AM, Peter Eisentraut wrote: > On 2/12/14, 4:30 PM, Andres Freund wrote: >>> There are cases where one module needs symbols from another directly. >>> Would that be affected by this? >> >> I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. > > It works reasonably well on other platforms. > > Of course, we can barely build extension modules on Windows, so maybe > this is a bit much to ask. But as long as we're dealing only with > functions, not variables, it should work without any dllimport dances, > right? Don't think so. If you don't have __declspec(dllexport) or a .DEF file marking something as exported, it's not part of the DLL interface at all, it's like you compiled it with gcc using -fvisibility=hidden and didn't give the symbol __attribute__((visibility ("default")) . If you _do_ have the symbol exported from the DLL, using __declspec(dllimport) or a generated .DEF file that exposes all "extern"s, you can link to the symbol. However, from the reading I've done recently, I'm pretty sure that if you fail to declare __declspec(dllimport) on the importing side, you actually land up statically linking to a thunk function that in turn calls the real function in the DLL. So it works, but at a performance cost. So you should do the dance. Sorry. It gets worse, too. Say you want hstore to export a couple of symbols. Those symbols must be __declspec(dllexport) while everything else in headers must be __declspec(dllimport). This means you can't just use PGDLLIMPORT. You must define a HSTOREDLLIMPORT that's __declspec(dllexport) when compiling hstore and otherwise __declspec(dllimport). Then set a preprocessor macro like -DCOMPILING_HSTORE to trigger it. Even though we're generating .DEF files, I don't think we can avoid that, because we at minimum need to have the hstore exported symbols *not* annotated __declspec(dllimport) when compiling hstore, but have them so annotated when importing the header into anything else. (Come to think of it, how're we dealing with this in libpq? I wonder if postgres is linking to libpq using thunk functions?) -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Craig Ringer <craig@2ndquadrant.com> writes: > However, from the reading I've done recently, I'm pretty sure that if > you fail to declare __declspec(dllimport) on the importing side, you > actually land up statically linking to a thunk function that in turn > calls the real function in the DLL. So it works, but at a performance cost. Meh. I'm not really willing to go through the pushups that would be needed to deal with that, even if it can be shown that the performance cost is noticeable. We've already kluged our source code for Windows far beyond what anybody would tolerate for any other platform, and I find that to be quite enough bending over for Redmond. regards, tom lane
On 2014-02-13 07:58:09 +0800, Craig Ringer wrote: > On 02/13/2014 05:35 AM, Peter Eisentraut wrote: > > On 2/12/14, 4:30 PM, Andres Freund wrote: > >>> There are cases where one module needs symbols from another directly. > >>> Would that be affected by this? > >> > >> I don't think we have real infrastructure for that yet. Neither from the POV of loading several .so's, nor from a symbolvisibility. Afaics we'd need a working definition of PGDLLIMPORT which inverts the declspecs. I think Tom just removedthe remnants of that. > > > > It works reasonably well on other platforms. > > > > Of course, we can barely build extension modules on Windows, so maybe > > this is a bit much to ask. But as long as we're dealing only with > > functions, not variables, it should work without any dllimport dances, > > right? > > Don't think so. > > If you don't have __declspec(dllexport) or a .DEF file marking something > as exported, it's not part of the DLL interface at all, it's like you > compiled it with gcc using -fvisibility=hidden and didn't give the > symbol __attribute__((visibility ("default")) . > > If you _do_ have the symbol exported from the DLL, using > __declspec(dllimport) or a generated .DEF file that exposes all > "extern"s, you can link to the symbol. > > However, from the reading I've done recently, I'm pretty sure that if > you fail to declare __declspec(dllimport) on the importing side, you > actually land up statically linking to a thunk function that in turn > calls the real function in the DLL. So it works, but at a performance cost. > > So you should do the dance. Sorry. I don't think the thunk function will have such a high overhead in this day and age. And it's what we essentially already do for all functions called *from* extensions, no? > It gets worse, too. Say you want hstore to export a couple of symbols. > Those symbols must be __declspec(dllexport) while everything else in > headers must be __declspec(dllimport). This means you can't just use > PGDLLIMPORT. You must define a HSTOREDLLIMPORT that's > __declspec(dllexport) when compiling hstore and otherwise > __declspec(dllimport). Then set a preprocessor macro like > -DCOMPILING_HSTORE to trigger it. We actually have a a macro that should do that, namely PGDLLEXPORT. I am not sure though, why it's not dependent on dependent on BUILDING_DLL (which is absolutely horribly misnamed, being essentially inverted). Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 02/13/2014 08:26 AM, Andres Freund wrote: >> > It gets worse, too. Say you want hstore to export a couple of symbols. >> > Those symbols must be __declspec(dllexport) while everything else in >> > headers must be __declspec(dllimport). This means you can't just use >> > PGDLLIMPORT. You must define a HSTOREDLLIMPORT that's >> > __declspec(dllexport) when compiling hstore and otherwise >> > __declspec(dllimport). Then set a preprocessor macro like >> > -DCOMPILING_HSTORE to trigger it. > We actually have a a macro that should do that, namely PGDLLEXPORT. I am > not sure though, why it's not dependent on dependent on BUILDING_DLL > (which is absolutely horribly misnamed, being essentially inverted). Yes, it does that *for building postgres*. What I'm saying is that you need another one for hstore if you wish to be able to export symbols from hstore and import them into other libs. You can't use PGDLLEXPORT because BUILDING_DLL will be unset, so it'll expand to __declspec(dllimport) while compiling hstore. Which is wrong if you want to be exporting symbols; even if you use a .DEF file, it must expand blank, and if you don't use a .DEF it must expand to __declspec(dllexport) ... but ONLY for those symbols exported by hstore its self. So each lib that wants to export symbols needs its own equivalent of PGDLLIMPORT, and a flag set just when compiling that lib. This is the normal way it's done on Windows builds, not stuff I'm looking into and saying how _I_ think we should do it. AFAIK this is just how it's done on Windows, horrible as that is. I'll see if I can find a couple of relevant links. BTW, BUILDING_DLL is so-named because the macro definition will be taken from examples that talk about compiling a DLL, that's all. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/12 12:28), Inoue, Hiroshi wrote: > (2014/02/12 8:30), Tom Lane wrote: >> I wrote: >>> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>>> I tried MINGW port with the attached change and successfully built >>>> src and contrib and all pararell regression tests were OK. >> >>> I cleaned this up a bit (the if-nesting in Makefile.shlib was making >>> my head hurt, not to mention that it left a bunch of dead code) and >>> committed it. >> >> Hm ... according to buildfarm member narwhal, this doesn't work so well >> for plperl: >> >> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a >> Cannot export .idata$4: symbol not found >> Cannot export .idata$5: symbol not found >> Cannot export .idata$6: symbol not found >> Cannot export .text: symbol not found >> Cannot export perl58_NULL_THUNK_DATA: symbol not found >> Creating library file: libplperl.a >> collect2: ld returned 1 exit status >> make[3]: *** [plperl.dll] Error 1 > > Oops I forgot to inclule plperl, tcl or python, sorry. I would > retry build operations with them. Unfortunately it would take > pretty long time because build operations are pretty (or veeery > in an old machine) slow. > >> Not very clear what's going on there; could this be a problem in >> narwhal's admittedly-ancient toolchain? As for build, plperl and pltcl are OK on both Windows7+gcc4.6.1 machine and Windows Vista+gcc3.4.5 machine. plpython is OK on gcc4.6.1 machine but causes a *initializer element is not constant* error on gcc3.4.5 machine. I've not run regression test yet. regards, Hiroshi Inoue
On 13/02/2014 00:17, Craig Ringer wrote: > On 02/13/2014 12:39 AM, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2014-02-12 11:26:56 -0500, Tom Lane wrote: >>>> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, >>>> we ought to actually remove that, so that the Cygwin build works more like >>>> the other Windows builds? >> >>> Hm, I don't see a big advantage in detecting the errors as It won't >>> hugely increase the buildfarm coverage. But --auto-import seems to >>> require marking the .text sections as writable, avoiding that seems to >>> be a good idea if we don't need it anymore. >> >> Given the rather small number of Windows machines in the buildfarm, >> I'd really like them all to be on the same page about what's valid >> and what isn't. Having to wait ~24 hours to find out if they're >> all happy with something isn't my idea of a good time. In any case, >> as long as we have discrepancies between them, I'm not going to be >> entirely convinced that any of them are telling the full truth. >> >> I'm going to go remove that switch and see if brolga starts failing. >> If it does, I'll be satisfied that we understand the issues here. > > I wouldn't assume that _anything_ Cygwin does makes sense, or is > consistent with anything else. Of course, I disagree ;-) > Its attempts to produce UNIX-like behaviour on Windows cause some truly > bizarre behaviour at times. unfortunately our baseground (Windows) is limiting us. > It is not totally unfair to compare the level of weirdness of Cygwin to > that of WINE, though they operate in completely different ways. I > wouldn't suggest making any conclusions about the _other_ platforms > based on Cygwin. My personal experience is that a UNIX-vanilla source build 99% right on recent cygwin. The problem with postgreql make/build system is that it tries to guess platform behavior and that is not fix in time. Porting a new platform to postgresql from scratch will be a very hard effort Automake/Autoconf makes it much more simple; as its aim is to test features not platforms. Cmake also makes it right, now. We teach it that cygwin is a unix platform not a win32 platform I am biased, of course. Marco
(2014/02/13 8:21), Tom Lane wrote: > Craig Ringer <craig@2ndquadrant.com> writes: >> On 02/12/2014 02:31 PM, Inoue, Hiroshi wrote: >>> Maybe this is one of the few use cases of dlltool. >>> Because python doesn't ship with its MINGW import library, the >>> Makefile uses dlltool to generate an import library from the python >>> DLL. > >> Wow. Has anyone been in touch with the Python package maintainers to see >> if they can fix that and bundle the .lib ? > > Indeed somebody should pester them about that, but it's not clear how > much it'd help us even if they fixed it tomorrow. We're going to have > to be able to build with existing Python distributions for a long time > to come. Though the comment in src/pl/plpython/Makefile refers to it, I see libpython27.a in Python27. I can't find it in Python25. regards, Hiroshi Inoue
On 02/13/2014 02:42 PM, Marco Atzeri wrote: > My personal experience is that a UNIX-vanilla source build 99% right > on recent cygwin. Yeah, I freely admit my experience with _recent_ cygwin is not abundant. Things may well have improved. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/13 9:51), Hiroshi Inoue wrote: > (2014/02/12 12:28), Inoue, Hiroshi wrote: >> (2014/02/12 8:30), Tom Lane wrote: >>> I wrote: >>>> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>>>> I tried MINGW port with the attached change and successfully built >>>>> src and contrib and all pararell regression tests were OK. >>> >>>> I cleaned this up a bit (the if-nesting in Makefile.shlib was making >>>> my head hurt, not to mention that it left a bunch of dead code) and >>>> committed it. >>> >>> Hm ... according to buildfarm member narwhal, this doesn't work so well >>> for plperl: >>> >>> gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a >>> Cannot export .idata$4: symbol not found >>> Cannot export .idata$5: symbol not found >>> Cannot export .idata$6: symbol not found >>> Cannot export .text: symbol not found >>> Cannot export perl58_NULL_THUNK_DATA: symbol not found >>> Creating library file: libplperl.a >>> collect2: ld returned 1 exit status >>> make[3]: *** [plperl.dll] Error 1 >> >> Oops I forgot to inclule plperl, tcl or python, sorry. I would >> retry build operations with them. Unfortunately it would take >> pretty long time because build operations are pretty (or veeery >> in an old machine) slow. >> >>> Not very clear what's going on there; could this be a problem in >>> narwhal's admittedly-ancient toolchain? > > As for build, plperl and pltcl are OK on both Windows7+gcc4.6.1 > machine and Windows Vista+gcc3.4.5 machine. plpython is OK on > gcc4.6.1 machine but causes a *initializer element is not constant* > error on gcc3.4.5 machine. > I've not run regression test yet. Rebuild with --disable-auto-import causes errors in contrib on both machines. Errors occur in pg_buffercache, pg_stat_statements,postgres_fdw and test_shm_mq. --enable-auto-import curesall of them. regards, Hiroshi Inoue
Hiroshi Inoue <inoue@tpf.co.jp> writes: > Rebuild with --disable-auto-import causes errors in contrib on > both machines. Errors occur in pg_buffercache, pg_stat_statements, > postgres_fdw and test_shm_mq. Yeah, that's the idea: we want to get the same failures as on MSVC. I'm going to put in PGDLLIMPORT macros to fix these cases, but I'm waiting for verification that Cygwin also sees the problem before making it go away. brolga unfortunately seems to have been AWOL for the past day and a half. Andrew, could you give it a kick? regards, tom lane
(2014/02/12 8:30), Tom Lane wrote: > I wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> I tried MINGW port with the attached change and successfully built >>> src and contrib and all pararell regression tests were OK. > >> I cleaned this up a bit (the if-nesting in Makefile.shlib was making >> my head hurt, not to mention that it left a bunch of dead code) and >> committed it. > > Hm ... according to buildfarm member narwhal, this doesn't work so well > for plperl: > > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute-Wformat-security -fno-strict-aliasing -fwrapv -g -Wno-comment -shared -o plperl.dll plperl.oSPI.o Util.o -L../../../src/port -L../../../src/common -Wl,--allow-multiple-definition -L/mingw/lib -Wl,--as-needed -LC:/Perl/lib/CORE -lperl58 -L../../../src/backend -lpostgres -lpgcommon -lpgport -lintl -lxslt -lxml2 -lssleay32-leay32 -lz -lm -lws2_32 -lshfolder -Wl,--export-all-symbols -Wl,--out-implib=libplperl.a > Cannot export .idata$4: symbol not found > Cannot export .idata$5: symbol not found > Cannot export .idata$6: symbol not found > Cannot export .text: symbol not found > Cannot export perl58_NULL_THUNK_DATA: symbol not found > Creating library file: libplperl.a > collect2: ld returned 1 exit status > make[3]: *** [plperl.dll] Error 1 > > Not very clear what's going on there; could this be a problem in > narwhal's admittedly-ancient toolchain? The error doesn't occur here (un)fortunately. One thing I'm wondering about is that plperl is linking perlxx.lib not libperlxx.a. I made a patch following plpython and it also works here. Is it worth trying? regards, Hiroshi Inoue
Attachment
Hiroshi Inoue <inoue@tpf.co.jp> writes: > (2014/02/12 8:30), Tom Lane wrote: >> Not very clear what's going on there; could this be a problem in >> narwhal's admittedly-ancient toolchain? > The error doesn't occur here (un)fortunately. > One thing I'm wondering about is that plperl is linking perlxx.lib > not libperlxx.a. I made a patch following plpython and it also > works here. > Is it worth trying? I hadn't noticed that part of plpython's Makefile before. Man, that's an ugly technique :-(. Still, there's little about this platform that isn't ugly. Let's try it and see what happens. regards, tom lane
I wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> One thing I'm wondering about is that plperl is linking perlxx.lib >> not libperlxx.a. I made a patch following plpython and it also >> works here. >> Is it worth trying? > I hadn't noticed that part of plpython's Makefile before. Man, > that's an ugly technique :-(. Still, there's little about this > platform that isn't ugly. Let's try it and see what happens. And what happens is this: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-14%2017%3A00%3A02 namely, it gets through plperl now and then chokes with the same symptoms on pltcl. So I guess we need the same hack in pltcl. The fun never stops ... (BTW, narwhal is evidently not trying to build plpython. I wonder why not?) regards, tom lane
(2014/02/15 2:32), Tom Lane wrote: > I wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> One thing I'm wondering about is that plperl is linking perlxx.lib >>> not libperlxx.a. I made a patch following plpython and it also >>> works here. >>> Is it worth trying? > >> I hadn't noticed that part of plpython's Makefile before. Man, >> that's an ugly technique :-(. Still, there's little about this >> platform that isn't ugly. Let's try it and see what happens. > > And what happens is this: > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-14%2017%3A00%3A02 > namely, it gets through plperl now and then chokes with the same > symptoms on pltcl. So I guess we need the same hack in pltcl. > The fun never stops ... There seems some risk to link msvc import library. Linking perlxx.lib or tcl.lib worls here but linking pythonxx.lib doesn't work even in newer machine. > (BTW, narwhal is evidently not trying to build plpython. I wonder > why not?) Due to *initializer element is not constant* error which also can besee on my old machine. http://www.postgresql.org/message-id/CA+OCxozr=0wkQDF7kfd2n-bJQOwdSUs0Myohg29pA_U5=2pfqA@mail.gmail.com regards, Hiroshi Inoue
Hiroshi Inoue <inoue@tpf.co.jp> writes: > (2014/02/15 2:32), Tom Lane wrote: >> (BTW, narwhal is evidently not trying to build plpython. I wonder >> why not?) > Due to *initializer element is not constant* error which also can be > see on my old machine. Hmm, isn't that one of the symptoms of inadequacies in --enable-auto-import? Wonder if the disable change fixed it. regards, tom lane
(2014/02/15 11:42), Tom Lane wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> (2014/02/15 2:32), Tom Lane wrote: >>> (BTW, narwhal is evidently not trying to build plpython. I wonder >>> why not?) > >> Due to *initializer element is not constant* error which also can be >> see on my old machine. > > Hmm, isn't that one of the symptoms of inadequacies in > --enable-auto-import? Wonder if the disable change fixed it. The error is a compilation error not a linkage one. gcc on narwhal or my old machine is too old unfortunately. regards, Hiroshi Inoue
On 2014-02-12 14:11:10 -0500, Tom Lane wrote: > Marco Atzeri <marco.atzeri@gmail.com> writes: > > On 12/02/2014 19:19, Andres Freund wrote: > >> On 2014-02-12 19:13:07 +0100, Marco Atzeri wrote: > >>> About PGDLLIMPORT , my build log is full of "warning: ‘optarg’ redeclared > >>> without dllimport attribute: previous dllimport ignored " > > >> That should be fixed then. I guess cygwin's getopt.h declares it that way? > > > from /usr/include/getopt.h > > > extern char __declspec(dllimport) *optarg; /* argument associated > > with option */ > > Hm. All of our files that use getopt also do > > extern char *optarg; > extern int optind, > opterr; > > #ifdef HAVE_INT_OPTRESET > extern int optreset; /* might not be declared by system headers */ > #endif At least zic.c doesn't... So, now that brolga is revived, this unsurprisingly causes breakage there after the --disable-auto-export change. ISTM the easiest way to deal with this is to just adorn all those with a PGDLLIMPORT? I don't immediately see how the HAVE_INT_OPTRESET stuff could be reused in any interesting way as it's really independent of optind/optarg whether it exists? It strikes me that we should just move all that to src/include/getopt_long.h and include that from the places that currently include getopt.h directly. It's pretty pointless to have all those externs, HAVE_GETOPT_H, HAVE_INT_OPTRESET in every file. It'd be prettier if it were named getopt.h, but well... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-12 14:11:10 -0500, Tom Lane wrote: >> Marco Atzeri <marco.atzeri@gmail.com> writes: >>> from /usr/include/getopt.h >>> extern char __declspec(dllimport) *optarg; /* argument associated >>> with option */ >> Hm. All of our files that use getopt also do >> extern char *optarg; > So, now that brolga is revived, this unsurprisingly causes breakage > there after the --disable-auto-export change. ISTM the easiest way to > deal with this is to just adorn all those with a PGDLLIMPORT? Uh, no, because that's backwards: we'd need the __declspec(dllimport) in the backend code. The best thing probably is not to have the duplicate declarations on platforms that don't need 'em. Unfortunately, I seem to recall that the current coding was arrived at to forestall link problems on weird platforms that *had* these symbols declared and yet we needed externs anyway. We might have to do something as ugly as "#ifndef CYGWIN". regards, tom lane
On 2014-02-15 10:16:50 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-12 14:11:10 -0500, Tom Lane wrote: > >> Marco Atzeri <marco.atzeri@gmail.com> writes: > >>> from /usr/include/getopt.h > >>> extern char __declspec(dllimport) *optarg; /* argument associated > >>> with option */ > > >> Hm. All of our files that use getopt also do > >> extern char *optarg; > > > So, now that brolga is revived, this unsurprisingly causes breakage > > there after the --disable-auto-export change. ISTM the easiest way to > > deal with this is to just adorn all those with a PGDLLIMPORT? > > Uh, no, because that's backwards: we'd need the __declspec(dllimport) > in the backend code. Yuck, it's even uglier than that. On normal windows it's not actually the platform's getopt() that ends up getting really used, but the one from port/... # mingw has adopted a GNU-centric interpretation of optind/optreset, # so always use our version on Windows. if test "$PORTNAME" = "win32"; then AC_LIBOBJ(getopt) AC_LIBOBJ(getopt_long) fi > The best thing probably is not to have the duplicate declarations on > platforms that don't need 'em. Unfortunately, I seem to recall that > the current coding was arrived at to forestall link problems on weird > platforms that *had* these symbols declared and yet we needed externs > anyway. We might have to do something as ugly as "#ifndef CYGWIN". Hm, according to a quick blame, they are there unconditionally since at least 2000 (c.f. a70e74b06 moving them around). So it very well might be that that reasoning isn't current anymore. One ugly thing to do is to fall back to the port implementation of getopt on cygwin as well... That'd still have the warning parade tho. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-15 10:16:50 -0500, Tom Lane wrote: >> The best thing probably is not to have the duplicate declarations on >> platforms that don't need 'em. Unfortunately, I seem to recall that >> the current coding was arrived at to forestall link problems on weird >> platforms that *had* these symbols declared and yet we needed externs >> anyway. We might have to do something as ugly as "#ifndef CYGWIN". > Hm, according to a quick blame, they are there unconditionally since at > least 2000 (c.f. a70e74b06 moving them around). So it very well might be > that that reasoning isn't current anymore. I don't have time right now to research it (have to go shovel snow), but I think that at least some of the issue was that we needed the externs when we force use of our src/port implementation. > One ugly thing to do is to fall back to the port implementation of > getopt on cygwin as well... That'd still have the warning parade tho. Yeah, that doesn't sound terribly satisfactory. Another idea would be to wrap the externs in "#ifndef HAVE_GETOPT_H". regards, tom lane
On 2014-02-15 10:59:17 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-15 10:16:50 -0500, Tom Lane wrote: > >> The best thing probably is not to have the duplicate declarations on > >> platforms that don't need 'em. Unfortunately, I seem to recall that > >> the current coding was arrived at to forestall link problems on weird > >> platforms that *had* these symbols declared and yet we needed externs > >> anyway. We might have to do something as ugly as "#ifndef CYGWIN". > > > Hm, according to a quick blame, they are there unconditionally since at > > least 2000 (c.f. a70e74b06 moving them around). So it very well might be > > that that reasoning isn't current anymore. > > I don't have time right now to research it (have to go shovel snow), > but I think that at least some of the issue was that we needed the > externs when we force use of our src/port implementation. I think that'd be solvable easy enough if we'd just always included pg's getopt_long.h (or a new getopt.h) which properly deals with defining them when included. That'd centralize all the magic and it'd overall get rid of a ton of ifdefs and externs. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-15 10:59:17 -0500, Tom Lane wrote: >> I don't have time right now to research it (have to go shovel snow), >> but I think that at least some of the issue was that we needed the >> externs when we force use of our src/port implementation. > I think that'd be solvable easy enough if we'd just always included pg's > getopt_long.h (or a new getopt.h) which properly deals with defining > them when included. That'd centralize all the magic and it'd overall get > rid of a ton of ifdefs and externs. Yeah, there are enough copies of that stuff that centralizing them sounds like a great idea. Call it "pg_getopt.h", perhaps? AFAICS, getopt_long.h just defines them unconditionally, which is as wrong as everyplace else. The reasonable alternatives seem to be (1) invent pg_getopt.h, which would #include <getopt.h> if available and then provide properly-ifdef'd externs for optarg and friends; getopt_long.h would #include pg_getopt.h. (2) Just do the above in getopt_long.h, and include it in the places that currently have externs for optarg and friends. Option (2) seems a bit odd in files that aren't actually using getopt_long(), but it avoids making a new header file. Preferences? regards, tom lane
On 2014-02-15 12:16:58 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-15 10:59:17 -0500, Tom Lane wrote: > >> I don't have time right now to research it (have to go shovel snow), > >> but I think that at least some of the issue was that we needed the > >> externs when we force use of our src/port implementation. > > > I think that'd be solvable easy enough if we'd just always included pg's > > getopt_long.h (or a new getopt.h) which properly deals with defining > > them when included. That'd centralize all the magic and it'd overall get > > rid of a ton of ifdefs and externs. > > Yeah, there are enough copies of that stuff that centralizing them > sounds like a great idea. Call it "pg_getopt.h", perhaps? I'm just working on it. pg_getopt.h was exactly what I came up with. > (1) invent pg_getopt.h, which would #include <getopt.h> if available > and then provide properly-ifdef'd externs for optarg and friends; > getopt_long.h would #include pg_getopt.h. That's what I've done. I'll post in a minute, just wanted to give a headsup so we don't duplicate work. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 2014-02-15 18:21:56 +0100, Andres Freund wrote: > On 2014-02-15 12:16:58 -0500, Tom Lane wrote: > > Andres Freund <andres@2ndquadrant.com> writes: > > > On 2014-02-15 10:59:17 -0500, Tom Lane wrote: > > >> I don't have time right now to research it (have to go shovel snow), > > >> but I think that at least some of the issue was that we needed the > > >> externs when we force use of our src/port implementation. > > > > > I think that'd be solvable easy enough if we'd just always included pg's > > > getopt_long.h (or a new getopt.h) which properly deals with defining > > > them when included. That'd centralize all the magic and it'd overall get > > > rid of a ton of ifdefs and externs. > > > > Yeah, there are enough copies of that stuff that centralizing them > > sounds like a great idea. Call it "pg_getopt.h", perhaps? > > I'm just working on it. pg_getopt.h was exactly what I came up with. > > > (1) invent pg_getopt.h, which would #include <getopt.h> if available > > and then provide properly-ifdef'd externs for optarg and friends; > > getopt_long.h would #include pg_getopt.h. Patch attached. I am not sure whether HAVE_GETOPT is the best condition to use, since it's set by configure by a link based check, same goes for HAVE_INT_OPTERR. The other choices would be relying on HAVE_GETOPT_H or a new AC_CHECK_DECL. I haven't touched entab.c because it's not linking with pgport, so there seems little use in changing it. I've also removed some #ifndef WIN32's that didn't seem to make much sense. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-15 18:21:56 +0100, Andres Freund wrote: >> On 2014-02-15 12:16:58 -0500, Tom Lane wrote: >>> Yeah, there are enough copies of that stuff that centralizing them >>> sounds like a great idea. Call it "pg_getopt.h", perhaps? > Patch attached. I am not sure whether HAVE_GETOPT is the best condition > to use, since it's set by configure by a link based check, same goes for > HAVE_INT_OPTERR. The other choices would be relying on HAVE_GETOPT_H or > a new AC_CHECK_DECL. Thanks. I'll look this over and try to get it committed before brolga's next scheduled run. At least this'll ensure we only have one place to tweak if more tweaking is needed. regards, tom lane
Andres Freund <andres@2ndquadrant.com> writes: > Patch attached. Committed with some cosmetic adjustments --- thanks! > I am not sure whether HAVE_GETOPT is the best condition > to use, since it's set by configure by a link based check, same goes for > HAVE_INT_OPTERR. The other choices would be relying on HAVE_GETOPT_H or > a new AC_CHECK_DECL. I'm pretty sure HAVE_GETOPT is *not* what we want to use for the variable declarations. I've tried HAVE_GETOPT_H for the moment, but that may need more tweaking depending on what the buildfarm has to say. > I haven't touched entab.c because it's not linking with pgport, so there > seems little use in changing it. Agreed. We don't have very high standards for its portability anyway. > I've also removed some #ifndef WIN32's that didn't seem to make much sense. They might've been needed at one time, but we'll see. (I also took out a few includes of <unistd.h> that were clearly now redundant, though I didn't try to be thorough about it.) regards, tom lane
On 2014-02-15 14:35:02 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > Patch attached. > > Committed with some cosmetic adjustments --- thanks! > > > I am not sure whether HAVE_GETOPT is the best condition > > to use, since it's set by configure by a link based check, same goes for > > HAVE_INT_OPTERR. The other choices would be relying on HAVE_GETOPT_H or > > a new AC_CHECK_DECL. > > I'm pretty sure HAVE_GETOPT is *not* what we want to use for the variable > declarations. I've tried HAVE_GETOPT_H for the moment, but that may need > more tweaking depending on what the buildfarm has to say. I only used it because it was what previously protected the getopt() declaration in port.h... But that should probably have been depending on !HAVE_GETOPT_H in the first place. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 12/02/2014 17:39, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: >> On 2014-02-12 11:26:56 -0500, Tom Lane wrote: >>> Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, >>> we ought to actually remove that, so that the Cygwin build works more like >>> the other Windows builds? > >> Hm, I don't see a big advantage in detecting the errors as It won't >> hugely increase the buildfarm coverage. But --auto-import seems to >> require marking the .text sections as writable, avoiding that seems to >> be a good idea if we don't need it anymore. > > Given the rather small number of Windows machines in the buildfarm, > I'd really like them all to be on the same page about what's valid > and what isn't. Having to wait ~24 hours to find out if they're > all happy with something isn't my idea of a good time. In any case, > as long as we have discrepancies between them, I'm not going to be > entirely convinced that any of them are telling the full truth. > > I'm going to go remove that switch and see if brolga starts failing. > If it does, I'll be satisfied that we understand the issues here. > > regards, tom lane > > disabling is not working on cygwin gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard zic.o ialloc.o scheck.o localtime.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -Wl,--disable-auto-import -L/usr/local/lib -Wl,--as-needed -lpgcommon -lpgport -lintl -lssl -lcrypto -lz -lreadline -lcrypt -o zic.exe zic.o:zic.c:(.text.startup+0xc9): undefined reference to `optarg' zic.o:zic.c:(.text.startup+0x10d): undefined reference to `optarg' /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: zic.o: bad reloc address 0x10d in section `.text.startup' /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: final link failed: Invalid operation collect2: error: ld returned 1 exit status just removing "-Wl,--disable-auto-import" and everything works $ gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard zic.o ialloc.o scheck.o localtime.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition -L/usr/local/lib -Wl,--as-needed -lpgcommon -lpgport -lintl -lssl -lcrypto -lz -lreadline -lcrypt -o zic.exe Regards Marco
On 2014-02-15 21:49:43 +0100, Marco Atzeri wrote: > On 12/02/2014 17:39, Tom Lane wrote: > >Andres Freund <andres@2ndquadrant.com> writes: > >>On 2014-02-12 11:26:56 -0500, Tom Lane wrote: > >>>Hm. So if we're giving up on the idea of ever getting rid of PGDLLIMPORT, > >>>we ought to actually remove that, so that the Cygwin build works more like > >>>the other Windows builds? > > > >>Hm, I don't see a big advantage in detecting the errors as It won't > >>hugely increase the buildfarm coverage. But --auto-import seems to > >>require marking the .text sections as writable, avoiding that seems to > >>be a good idea if we don't need it anymore. > > > >Given the rather small number of Windows machines in the buildfarm, > >I'd really like them all to be on the same page about what's valid > >and what isn't. Having to wait ~24 hours to find out if they're > >all happy with something isn't my idea of a good time. In any case, > >as long as we have discrepancies between them, I'm not going to be > >entirely convinced that any of them are telling the full truth. > > > >I'm going to go remove that switch and see if brolga starts failing. > >If it does, I'll be satisfied that we understand the issues here. > > > > regards, tom lane > > > > > > disabling is not working on cygwin > > gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard > zic.o ialloc.o scheck.o localtime.o -L../../src/port -L../../src/common > -Wl,--allow-multiple-definition -Wl,--disable-auto-import -L/usr/local/lib > -Wl,--as-needed -lpgcommon -lpgport -lintl -lssl -lcrypto -lz -lreadline > -lcrypt -o zic.exe > zic.o:zic.c:(.text.startup+0xc9): undefined reference to `optarg' > zic.o:zic.c:(.text.startup+0x10d): undefined reference to `optarg' > /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: zic.o: > bad reloc address 0x10d in section `.text.startup' > /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: final > link failed: Invalid operation > collect2: error: ld returned 1 exit status > > just removing "-Wl,--disable-auto-import" and everything works > > $ gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard > zic.o ialloc.o scheck.o localtime.o -L../../src/port -L../../src/common > -Wl,--allow-multiple-definition -L/usr/local/lib -Wl,--as-needed > -lpgcommon -lpgport -lintl -lssl -lcrypto -lz -lreadline -lcrypt -o zic.exe Please pull and retry, that already might fix it. The reason it's probably failing is the warnings about declspec you reported earlier. See 60ff2fdd9970ba29f5267317a5e7354d2658c1e5 Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 15/02/2014 21:54, Andres Freund wrote: > > Please pull and retry, that already might fix it. The reason it's > probably failing is the warnings about declspec you reported earlier. > > See 60ff2fdd9970ba29f5267317a5e7354d2658c1e5 > > Greetings, > > Andres Freund > that is fine. there is a different one, later on [cut] ../../src/timezone/localtime.o ../../src/timezone/strftime.o ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o postgres libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' libpq/auth.o:auth.c:(.text+0x1954): undefined reference to `in6addr_any' libpq/auth.o:auth.c:(.text+0x196d): undefined reference to `in6addr_any' libpq/auth.o:auth.c:(.text+0x1979): undefined reference to `in6addr_any' /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: libpq/auth.o: bad reloc address 0xce4 in section `.rdata' collect2: error: ld returned 1 exit status Makefile:66: recipe for target 'postgres' failed make[2]: *** [postgres] Error 1 make[2]: Leaving directory '/pub/devel/postgresql/git/postgresql_build/src/backend' Regards Marco
On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: > > > On 15/02/2014 21:54, Andres Freund wrote: > > > >Please pull and retry, that already might fix it. The reason it's > >probably failing is the warnings about declspec you reported earlier. > > > >See 60ff2fdd9970ba29f5267317a5e7354d2658c1e5 > > > >Greetings, > > > >Andres Freund > > > > that is fine. > there is a different one, later on > > [cut] > ../../src/timezone/localtime.o ../../src/timezone/strftime.o > ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a > ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o > postgres > libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' Could you try additionally linking with -lwsock32? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: >> ../../src/timezone/localtime.o ../../src/timezone/strftime.o >> ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a >> ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o >> postgres >> libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' > Could you try additionally linking with -lwsock32? The interesting question here is why it used to work. There is no "extern" for in6addr_any in our code, so there must have been a declaration of that constant in some system header. Which one, and what linkage is it defining, and where was the linkage getting resolved before? I notice that brolga is showing both this failure and some libxml issues. regards, tom lane
On 2014-02-15 17:26:30 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: > >> ../../src/timezone/localtime.o ../../src/timezone/strftime.o > >> ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a > >> ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o > >> postgres > >> libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' > > > Could you try additionally linking with -lwsock32? > > The interesting question here is why it used to work. There is no > "extern" for in6addr_any in our code, so there must have been a > declaration of that constant in some system header. Which one, and > what linkage is it defining, and where was the linkage getting > resolved before? mingwcompat.c has the following ugly as heck tidbit: #ifndef WIN32_ONLY_COMPILER /** MingW defines an extern to this struct, but the actual struct isn't present* in any library. It's trivial enough thatwe can safely define it* ourselves.*/ const struct in6_addr in6addr_any = {{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}}}; I think when that was added the problem might just have been misanalyzed, but due to the auto import magic this probably wasn't noticed... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 15/02/2014 23:37, Andres Freund wrote: > On 2014-02-15 17:26:30 -0500, Tom Lane wrote: >> Andres Freund <andres@2ndquadrant.com> writes: >>> On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: >>>> ../../src/timezone/localtime.o ../../src/timezone/strftime.o >>>> ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a >>>> ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o >>>> postgres >>>> libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' >> >>> Could you try additionally linking with -lwsock32? >> >> The interesting question here is why it used to work. There is no >> "extern" for in6addr_any in our code, so there must have been a >> declaration of that constant in some system header. Which one, and >> what linkage is it defining, and where was the linkage getting >> resolved before? > > mingwcompat.c has the following ugly as heck tidbit: > > #ifndef WIN32_ONLY_COMPILER > /* > * MingW defines an extern to this struct, but the actual struct isn't present > * in any library. It's trivial enough that we can safely define it > * ourselves. > */ > const struct in6_addr in6addr_any = {{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}}}; > > I think when that was added the problem might just have been > misanalyzed, but due to the auto import magic this probably wasn't > noticed... > > Greetings, > > Andres Freund > on cygwin header in /usr/include 32 $ grep -rH in6addr_any * cygwin/in6.h:extern const struct in6_addr in6addr_any; cygwin/version.h: in6addr_any, in6addr_loopback.
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-15 17:26:30 -0500, Tom Lane wrote: >> The interesting question here is why it used to work. There is no >> "extern" for in6addr_any in our code, so there must have been a >> declaration of that constant in some system header. Which one, and >> what linkage is it defining, and where was the linkage getting >> resolved before? > mingwcompat.c has the following ugly as heck tidbit: > #ifndef WIN32_ONLY_COMPILER > /* > * MingW defines an extern to this struct, but the actual struct isn't present > * in any library. It's trivial enough that we can safely define it > * ourselves. > */ > const struct in6_addr in6addr_any = {{{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}}}; Yeah, I noticed that. AFAICS, mingwcompat.c isn't built in Cygwin builds, so we'd probably need to put the constant someplace else entirely if we end up defining it ourselves. regards, tom lane
Marco Atzeri <marco.atzeri@gmail.com> writes: > 32 $ grep -rH in6addr_any * > cygwin/in6.h:extern const struct in6_addr in6addr_any; > cygwin/version.h: in6addr_any, in6addr_loopback. So how come there's a declspec on the getopt.h variables, but not this one? regards, tom lane
On 2014-02-15 17:48:00 -0500, Tom Lane wrote: > Marco Atzeri <marco.atzeri@gmail.com> writes: > > 32 $ grep -rH in6addr_any * > > cygwin/in6.h:extern const struct in6_addr in6addr_any; > > cygwin/version.h: in6addr_any, in6addr_loopback. > > So how come there's a declspec on the getopt.h variables, but not this > one? Well, those are real constants, so they probably are fully contained in the static link library, no need to dynamically resolve them. Note that the frontend libpq indeed links to wsock32, just as Mkvcbuild.pm does for both frontend and backend... It might be that ipv6 never worked for mingw and cygwin, and in6addr_any just resolved to some magic thunk --auto-import created... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > It might be that ipv6 never worked for mingw and cygwin, and in6addr_any > just resolved to some magic thunk --auto-import created... Yeah, it would not be surprising if this exercise is exposing bugs we didn't even know we had. regards, tom lane
On 2014-02-15 18:26:46 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > It might be that ipv6 never worked for mingw and cygwin, and in6addr_any > > just resolved to some magic thunk --auto-import created... > > Yeah, it would not be surprising if this exercise is exposing bugs > we didn't even know we had. Agreed, but I think we should start fixing the missing PGDLLIMPORTs soon, at least the ones needed in the backbranches. AFAICT there's live bugs with postgres_fdw there. And possibly it might need more than one full cycle to get right... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/15 2:32), Tom Lane wrote: > I wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> One thing I'm wondering about is that plperl is linking perlxx.lib >>> not libperlxx.a. I made a patch following plpython and it also >>> works here. >>> Is it worth trying? > >> I hadn't noticed that part of plpython's Makefile before. Man, >> that's an ugly technique :-(. Still, there's little about this >> platform that isn't ugly. Let's try it and see what happens. > > And what happens is this: > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-14%2017%3A00%3A02 > namely, it gets through plperl now and then chokes with the same > symptoms on pltcl. So I guess we need the same hack in pltcl. > The fun never stops ... Pltcl still fails. tclxx.dll lives in bin directory not in lib directory. The attached patch would fix the problem. regards, Hiroshi Inoue
Attachment
Marco, Andrew: On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: > ../../src/timezone/localtime.o ../../src/timezone/strftime.o > ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a > ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o > postgres > libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' > libpq/auth.o:auth.c:(.text+0x1954): undefined reference to `in6addr_any' > libpq/auth.o:auth.c:(.text+0x196d): undefined reference to `in6addr_any' > libpq/auth.o:auth.c:(.text+0x1979): undefined reference to `in6addr_any' > /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: > libpq/auth.o: bad reloc address 0xce4 in section `.rdata' > collect2: error: ld returned 1 exit status > Makefile:66: recipe for target 'postgres' failed > make[2]: *** [postgres] Error 1 > make[2]: Leaving directory > '/pub/devel/postgresql/git/postgresql_build/src/backend' Could either of you try whether compiling with the attached hack fixes anything on cygwin? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Hiroshi Inoue <inoue@tpf.co.jp> writes: > (2014/02/15 2:32), Tom Lane wrote: >> And what happens is this: >> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-14%2017%3A00%3A02 >> namely, it gets through plperl now and then chokes with the same >> symptoms on pltcl. So I guess we need the same hack in pltcl. >> The fun never stops ... > Pltcl still fails. > tclxx.dll lives in bin directory not in lib directory. > The attached patch would fix the problem. Pushed, thanks. regards, tom lane
On 16/02/2014 15:43, Andres Freund wrote: > Marco, Andrew: > > On 2014-02-15 22:11:37 +0100, Marco Atzeri wrote: >> ../../src/timezone/localtime.o ../../src/timezone/strftime.o >> ../../src/timezone/pgtz.o ../../src/port/libpgport_srv.a >> ../../src/common/libpgcommon_srv.a -lintl -lssl -lcrypto -lcrypt -lldap -o >> postgres >> libpq/auth.o:auth.c:(.text+0x1940): undefined reference to `in6addr_any' >> libpq/auth.o:auth.c:(.text+0x1954): undefined reference to `in6addr_any' >> libpq/auth.o:auth.c:(.text+0x196d): undefined reference to `in6addr_any' >> libpq/auth.o:auth.c:(.text+0x1979): undefined reference to `in6addr_any' >> /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: >> libpq/auth.o: bad reloc address 0xce4 in section `.rdata' >> collect2: error: ld returned 1 exit status >> Makefile:66: recipe for target 'postgres' failed >> make[2]: *** [postgres] Error 1 >> make[2]: Leaving directory >> '/pub/devel/postgresql/git/postgresql_build/src/backend' > > Could either of you try whether compiling with the attached hack fixes > anything on cygwin? > > Greetings, > > Andres Freund > on cygwin32 bit it works, but it stops later on ------------------------------------------- sl -lcrypto -lz -lreadline -lcrypt -o psql.exe tab-complete.o:tab-complete.c:(.text+0xa98): undefined reference to `rl_line_buffer' tab-complete.o:tab-complete.c:(.text+0xa387): undefined reference to `rl_attempted_completion_function' tab-complete.o:tab-complete.c:(.text+0xa391): undefined reference to `rl_basic_word_break_characters' tab-complete.o:tab-complete.c:(.text+0xa3a4): undefined reference to `rl_readline_name' /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: tab-complete.o: bad reloc address 0x30ec in section `.rdata' /usr/lib/gcc/i686-pc-cygwin/4.8.2/../../../../i686-pc-cygwin/bin/ld: final link failed: Invalid operation collect2: error: ld returned 1 exit status Makefile:33: recipe for target 'psql' failed ----------------------------------------------- on cygwin 64bit, that I was not testing before, something is strange ------------------------------------------------------ -lintl -lssl -lcrypto -lcrypt -lldap -lwsock32 -lws2_32 -o postgres postmaster/postmaster.o:postmaster.c:(.rdata$.refptr.environ[.refptr.environ]+0x0): undefined reference to `environ' collect2: error: ld returned 1 exit status Makefile:68: recipe for target 'postgres' failed make[2]: *** [postgres] Error 1 ------------------------------------------------------- of course 9.3.2 builds fine on cygwin 64bit. Regards Marco
Marco Atzeri <marco.atzeri@gmail.com> writes: > On 16/02/2014 15:43, Andres Freund wrote: >> Could either of you try whether compiling with the attached hack fixes >> anything on cygwin? > on cygwin32 bit it works, but it stops later on > ------------------------------------------- > sl -lcrypto -lz -lreadline -lcrypt -o psql.exe > tab-complete.o:tab-complete.c:(.text+0xa98): undefined reference to > `rl_line_buffer' > on cygwin 64bit, that I was not testing before, > something is strange > ------------------------------------------------------ > -lintl -lssl -lcrypto -lcrypt -lldap -lwsock32 -lws2_32 -o postgres > postmaster/postmaster.o:postmaster.c:(.rdata$.refptr.environ[.refptr.environ]+0x0): > undefined reference to `environ' So what we currently know is that on cygwin, some of the core system include files have been declspec'd, but others haven't; and headers for third-party libraries like libxml and libreadline mostly haven't. I'm starting to get the feeling that we're going to have to admit defeat and not try to use --disable-auto-import on cygwin builds. That platform is evidently not capable of supporting it. We seem to be pretty nearly there on getting the MSVC and Mingw builds to reliably complain about missing PGDLLIMPORTs, so maybe it's good enough if those builds do it. regards, tom lane
On 2014-02-16 12:57:46 -0500, Tom Lane wrote: > Marco Atzeri <marco.atzeri@gmail.com> writes: > > On 16/02/2014 15:43, Andres Freund wrote: > >> Could either of you try whether compiling with the attached hack fixes > >> anything on cygwin? > > > on cygwin32 bit it works, but it stops later on > > ------------------------------------------- > > sl -lcrypto -lz -lreadline -lcrypt -o psql.exe > > tab-complete.o:tab-complete.c:(.text+0xa98): undefined reference to > > `rl_line_buffer' > > > on cygwin 64bit, that I was not testing before, > > something is strange > > ------------------------------------------------------ > > -lintl -lssl -lcrypto -lcrypt -lldap -lwsock32 -lws2_32 -o postgres > > postmaster/postmaster.o:postmaster.c:(.rdata$.refptr.environ[.refptr.environ]+0x0): > > undefined reference to `environ' That's in this case because it's our own extern, that itself would probably be fixable, but: > So what we currently know is that on cygwin, some of the core system > include files have been declspec'd, but others haven't; and headers > for third-party libraries like libxml and libreadline mostly haven't. it's not going to work for the external libraries. > I'm starting to get the feeling that we're going to have to admit > defeat and not try to use --disable-auto-import on cygwin builds. > That platform is evidently not capable of supporting it. Agreed. It's probably doable if somebody actually using cygwin themselves would invest a day or two and work on upstreaming the changes, but it looks painful to do indirectly. > We seem to be pretty nearly there on getting the MSVC and Mingw builds > to reliably complain about missing PGDLLIMPORTs, so maybe it's good > enough if those builds do it. Is there anything missing on that end? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-16 12:57:46 -0500, Tom Lane wrote: >> I'm starting to get the feeling that we're going to have to admit >> defeat and not try to use --disable-auto-import on cygwin builds. >> That platform is evidently not capable of supporting it. > Agreed. It's probably doable if somebody actually using cygwin > themselves would invest a day or two and work on upstreaming the > changes, but it looks painful to do indirectly. Yeah, and I doubt that anybody in the Cygwin project is going to see the point of maintaining such patches anyway. As Marco said, their idea is to provide a Unix-ish platform as best they can, not adopt all the worst features of Windows. I kinda wonder why they have declspec's on the getopt variables at all. >> We seem to be pretty nearly there on getting the MSVC and Mingw builds >> to reliably complain about missing PGDLLIMPORTs, so maybe it's good >> enough if those builds do it. > Is there anything missing on that end? I think it's all fixed, but I want to wait for another buildfarm cycle just to be sure all the relevant critters are failing where we expect them to and not somewhere else (like pltcl). regards, tom lane
On 2014-02-16 13:25:58 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-16 12:57:46 -0500, Tom Lane wrote: > >> I'm starting to get the feeling that we're going to have to admit > >> defeat and not try to use --disable-auto-import on cygwin builds. > >> That platform is evidently not capable of supporting it. > > > Agreed. It's probably doable if somebody actually using cygwin > > themselves would invest a day or two and work on upstreaming the > > changes, but it looks painful to do indirectly. > > Yeah, and I doubt that anybody in the Cygwin project is going to see > the point of maintaining such patches anyway. As Marco said, their > idea is to provide a Unix-ish platform as best they can, not adopt all > the worst features of Windows. I kinda wonder why they have declspec's > on the getopt variables at all. When searching for others having issues, I discovered several other projects also disabling auto-import, so it's not just us. And, afaics, it's mostly upstream bugs, e.g. libxml's header just misses an "extern" in one of the cygwin specific defines. I don't think we've actually found bugs in cygwin yet. We'd overridden the provided declspec for getopt et al, that's why it failed, and we're failing to link to several of the libraries we're using. I think the only reason the latter isn't breaking our neck is that they are indirectly included and the auto-import code is saving our neck. That said, even if the upstream projects were fixed, we probably wouldn't want to enable this unconditionally for a fair while. And I am surely not volunteering to do the necessary work, and I somehow think neither are you... Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Feb 14, 2014 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> One thing I'm wondering about is that plperl is linking perlxx.lib >>> not libperlxx.a. I made a patch following plpython and it also >>> works here. >>> Is it worth trying? > >> I hadn't noticed that part of plpython's Makefile before. Man, >> that's an ugly technique :-(. Still, there's little about this >> platform that isn't ugly. Let's try it and see what happens. > > And what happens is this: > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=narwhal&dt=2014-02-14%2017%3A00%3A02 > namely, it gets through plperl now and then chokes with the same > symptoms on pltcl. So I guess we need the same hack in pltcl. > The fun never stops ... > > (BTW, narwhal is evidently not trying to build plpython. I wonder > why not?) Not sure - it's certainly installed on the box. I've enabled it for now, and will see what happens. -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/16/2014 07:03 AM, Andres Freund wrote: > On 2014-02-15 17:48:00 -0500, Tom Lane wrote: >> Marco Atzeri <marco.atzeri@gmail.com> writes: >>> 32 $ grep -rH in6addr_any * >>> cygwin/in6.h:extern const struct in6_addr in6addr_any; >>> cygwin/version.h: in6addr_any, in6addr_loopback. >> >> So how come there's a declspec on the getopt.h variables, but not this >> one? > > Well, those are real constants, so they probably are fully contained in > the static link library, no need to dynamically resolve them. If they're externs that're then defined in a single module, that's not the case. Only if they're *not* declared extern, and thus added to each compilation unit then de-duplicated during linking, may they be safely referred to without a __declspec(dllimport) because each module gets its own copy. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Hi, I just wanted to mention that it should probably not be too hard to emulate the current windows behaviour in gcc/clang elf targets. Somebody (won't be me) could add a --emulate-windows-linkage configure flag or such. By mapping PGDLLIMPORT to__attribute__((section(...))) it should be relatively straightforward to put all exported variables into a special section and use a linker script to change the visibility of the default data, bss, rodata sections... </crazytalk> Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Dave Page <dpage@pgadmin.org> writes: > On Fri, Feb 14, 2014 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> (BTW, narwhal is evidently not trying to build plpython. I wonder >> why not?) > Not sure - it's certainly installed on the box. I've enabled it for > now, and will see what happens. Sigh ... stop the presses. In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no other Windows critter is unhappy about: dlltool --export-all --output-def worker_spi.def worker_spi.o dllwrap -o worker_spi.dll --def worker_spi.def worker_spi.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition-L/mingw/lib -Wl,--as-needed -L../../src/backend -lpostgres Info: resolving _MyBgworkerEntry by linking to __imp__MyBgworkerEntry (auto-import) fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' fu000002.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' nmth000000.o(.idata$4+0x0): undefined reference to `_nm__MyBgworkerEntry' collect2: ld returned 1 exit status So we are back to square one AFAICS: we still have no idea why narwhal is pickier than everything else. (BTW, to save people the trouble of looking: MyBgworkerEntry is marked PGDLLIMPORT in HEAD but not 9.3.) Also, in HEAD narwhal is building things OK, but then seems to be dumping core in the dblink regression test, leaving one with not a very warm feeling about whether the contrib executables it's building are any good. regards, tom lane
On Mon, Feb 17, 2014 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dave Page <dpage@pgadmin.org> writes: >> On Fri, Feb 14, 2014 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> (BTW, narwhal is evidently not trying to build plpython. I wonder >>> why not?) > >> Not sure - it's certainly installed on the box. I've enabled it for >> now, and will see what happens. > > Sigh ... stop the presses. > > In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no > other Windows critter is unhappy about: > > dlltool --export-all --output-def worker_spi.def worker_spi.o > dllwrap -o worker_spi.dll --def worker_spi.def worker_spi.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition-L/mingw/lib -Wl,--as-needed -L../../src/backend -lpostgres > Info: resolving _MyBgworkerEntry by linking to __imp__MyBgworkerEntry (auto-import) > fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' > fu000002.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' > nmth000000.o(.idata$4+0x0): undefined reference to `_nm__MyBgworkerEntry' > collect2: ld returned 1 exit status > > So we are back to square one AFAICS: we still have no idea why narwhal > is pickier than everything else. (BTW, to save people the trouble of > looking: MyBgworkerEntry is marked PGDLLIMPORT in HEAD but not 9.3.) > > Also, in HEAD narwhal is building things OK, but then seems to be > dumping core in the dblink regression test, leaving one with not a very > warm feeling about whether the contrib executables it's building are > any good. Well, as we know, Narwhal is really quite old now. I think I built it seven+ years ago. Is it really worth banging heads against walls to support something that noone in their right mind should be using for a build these days? -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2014-02-17 15:02:15 +0000, Dave Page wrote: > On Mon, Feb 17, 2014 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Not sure - it's certainly installed on the box. I've enabled it for > >> now, and will see what happens. > > > > Sigh ... stop the presses. > > > > In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no > > other Windows critter is unhappy about: > > > > dlltool --export-all --output-def worker_spi.def worker_spi.o > > dllwrap -o worker_spi.dll --def worker_spi.def worker_spi.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition-L/mingw/lib -Wl,--as-needed -L../../src/backend -lpostgres > > Info: resolving _MyBgworkerEntry by linking to __imp__MyBgworkerEntry (auto-import) > > fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' > > fu000002.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' > > nmth000000.o(.idata$4+0x0): undefined reference to `_nm__MyBgworkerEntry' > > collect2: ld returned 1 exit status > > > > So we are back to square one AFAICS: we still have no idea why narwhal > > is pickier than everything else. (BTW, to save people the trouble of > > looking: MyBgworkerEntry is marked PGDLLIMPORT in HEAD but not 9.3.) > > > > Also, in HEAD narwhal is building things OK, but then seems to be > > dumping core in the dblink regression test, leaving one with not a very > > warm feeling about whether the contrib executables it's building are > > any good. > > Well, as we know, Narwhal is really quite old now. I think I built it > seven+ years ago. Is it really worth banging heads against walls to > support something that noone in their right mind should be using for a > build these days? The problem is that lots of those issues are bugs that actually cause problems for msvc builds. If there were tests in worker_spi it'd quite possibly crash when run in 9.3. The problem is rather that the other animals are *not* erroring. Unless I am missing something. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-17 15:02:15 +0000, Dave Page wrote: >> On Mon, Feb 17, 2014 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no >>> other Windows critter is unhappy about: >> Well, as we know, Narwhal is really quite old now. I think I built it >> seven+ years ago. Is it really worth banging heads against walls to >> support something that noone in their right mind should be using for a >> build these days? > The problem is that lots of those issues are bugs that actually cause > problems for msvc builds. If there were tests in worker_spi it'd quite > possibly crash when run in 9.3. The problem is rather that the other > animals are *not* erroring. Exactly. Although on second thought, the lack of complaints from other Windows animals can probably be blamed on the fact that we didn't back-port any of the recent hacking on the Windows build processes. Maybe we should think about doing so, now that the dust seems to have settled. We still need to know why narwhal is crashing on dblink though. I have a bad feeling that that may indicate still-unresolved linkage problems. regards, tom lane
On 2014-02-17 10:21:12 -0500, Tom Lane wrote: > Andres Freund <andres@2ndquadrant.com> writes: > > On 2014-02-17 15:02:15 +0000, Dave Page wrote: > >> On Mon, Feb 17, 2014 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >>> In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no > >>> other Windows critter is unhappy about: > > >> Well, as we know, Narwhal is really quite old now. I think I built it > >> seven+ years ago. Is it really worth banging heads against walls to > >> support something that noone in their right mind should be using for a > >> build these days? > > > The problem is that lots of those issues are bugs that actually cause > > problems for msvc builds. If there were tests in worker_spi it'd quite > > possibly crash when run in 9.3. The problem is rather that the other > > animals are *not* erroring. > > Exactly. > > Although on second thought, the lack of complaints from other Windows > animals can probably be blamed on the fact that we didn't back-port > any of the recent hacking on the Windows build processes. Maybe we > should think about doing so, now that the dust seems to have settled. Yea, at the very least the gendef.pl thing should be backported, possibly the mingw --disable-auto-import thing as well. But it's probably a gooid idea to wait till the branches are stamped? > We still need to know why narwhal is crashing on dblink though. > I have a bad feeling that that may indicate still-unresolved > linkage problems. It's odd: [53019d05.f58:2] LOG: server process (PID 2428) exited with exit code 128 [53019d05.f58:3] DETAIL: Failed process was running: SELECT *FROM dblink('dbname=contrib_regression','SELECT * FROM foo')AS t(a int, b text, c text[])WHERE t.a > 7; [53019d05.f58:4] LOG: server process (PID 2428) exited with exit code 0 [53019d05.f58:5] LOG: terminating any other active server processes [53019d06.e9c:2] WARNING: terminating connection because of crash of another server process [53019d06.e9c:3] DETAIL: The postmaster has commanded this server process to roll back the curreGreetings, Not sure if that's actually a segfault and not something else. Why is the same death reported twice? With different exit codes? Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes: > On 2014-02-17 10:21:12 -0500, Tom Lane wrote: >> Although on second thought, the lack of complaints from other Windows >> animals can probably be blamed on the fact that we didn't back-port >> any of the recent hacking on the Windows build processes. Maybe we >> should think about doing so, now that the dust seems to have settled. > Yea, at the very least the gendef.pl thing should be backported, > possibly the mingw --disable-auto-import thing as well. But it's > probably a gooid idea to wait till the branches are stamped? Certainly --- I'm not touching that till the releases are out ;-). Anything that's broken now has been broken in past releases too, so it's not worth the risk. regards, tom lane
(2014/02/12 15:31), Inoue, Hiroshi wrote: > (2014/02/12 3:03), Tom Lane wrote: >> Hiroshi Inoue <inoue@tpf.co.jp> writes: >>> (2014/02/09 8:06), Andrew Dunstan wrote: >>>> Yeah. Incidentally, we didn't quite get rid of dlltool for Cygwin. We >>>> did get rid of dllwrap. But I agree this is worth trying for Mingw. >> >>> I tried MINGW port with the attached change and successfully built >>> src and contrib and all pararell regression tests were OK. >> >> I cleaned this up a bit (the if-nesting in Makefile.shlib was making >> my head hurt, not to mention that it left a bunch of dead code) and >> committed it. > > Thanks. > >> By my count, the only remaining usage of dlltool is in plpython's >> Makefile. Can we get rid of that? > > Maybe this is one of the few use cases of dlltool. > Because python doesn't ship with its MINGW import library, the > Makefile uses dlltool to generate an import library from the python > DLL. > >> Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile. >> Do we need that either? > > Maybe this can be removed. > I would make a patch later. Sorry for the late reply. Attached is a patch to remove dllwarp from pgevent/Makefile. regards, Hiroshi Inoue
Attachment
Hiroshi Inoue <inoue@tpf.co.jp> writes: >> (2014/02/12 3:03), Tom Lane wrote: >>> Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile. >>> Do we need that either? > Sorry for the late reply. > Attached is a patch to remove dllwarp from pgevent/Makefile. Pushed, thanks. regards, tom lane
Hiroshi Inoue <inoue@tpf.co.jp> writes: > Attached is a patch to remove dllwarp from pgevent/Makefile. Actually, it looks like this patch doesn't work at all: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53 Did I fat-finger the commit somehow? I made a couple of cosmetic changes, or at least I thought they were cosmetic. regards, tom lane
Tom Lane escribió: > Hiroshi Inoue <inoue@tpf.co.jp> writes: > > Attached is a patch to remove dllwarp from pgevent/Makefile. > > Actually, it looks like this patch doesn't work at all: > > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53 > > Did I fat-finger the commit somehow? I made a couple of cosmetic > changes, or at least I thought they were cosmetic. The format of the file is completely unlike that of the other *dll.def files we use elsewhere. AFAICT each line is <tab>symbolname@<tab>some_number -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/20 10:32), Tom Lane wrote: > Hiroshi Inoue <inoue@tpf.co.jp> writes: >> Attached is a patch to remove dllwarp from pgevent/Makefile. > > Actually, it looks like this patch doesn't work at all: Strangely enough it works here though I see double EXPORTS lines in libpgeventdll.def. > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53 > > Did I fat-finger the commit somehow? I made a couple of cosmetic > changes, or at least I thought they were cosmetic. Seems EXPORTS line in exports.txt should be removed. regards, Hiroshi Inoue
Hiroshi Inoue <inoue@tpf.co.jp> writes: > Seems EXPORTS line in exports.txt should be removed. Done. regards, tom lane
(Top post, on phone) The @number part is optional. It indicates an export ordinal. (You don't want to know, if you do, see MSDN). If you remove them or change them then binaries linked to the older version will fail to link to the newer; it breaks binarycompat. The ordinals are part of the library ABI. On 20 Feb 2014 07:28, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Tom Lane escribió: > > Hiroshi Inoue <inoue@tpf.co.jp> writes: > > > Attached is a patch to remove dllwarp from pgevent/Makefile. > > > > Actually, it looks like this patch doesn't work at all: > > > > http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacana&dt=2014-02-20%2001%3A00%3A53 > > > > Did I fat-finger the commit somehow? I made a couple of cosmetic > > changes, or at least I thought they were cosmetic. > > The format of the file is completely unlike that of the other *dll.def > files we use elsewhere. AFAICT each line is > > <tab>symbolname@<tab>some_number > > -- > Álvaro Herrera http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Training & Services
(2014/02/18 0:02), Dave Page wrote: > On Mon, Feb 17, 2014 at 2:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Dave Page <dpage@pgadmin.org> writes: >>> On Fri, Feb 14, 2014 at 5:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>>> (BTW, narwhal is evidently not trying to build plpython. I wonder >>>> why not?) >> >>> Not sure - it's certainly installed on the box. I've enabled it for >>> now, and will see what happens. >> >> Sigh ... stop the presses. >> >> In 9.3, narwhal is *still* showing a PGDLLIMPORT-type failure that no >> other Windows critter is unhappy about: >> >> dlltool --export-all --output-def worker_spi.def worker_spi.o >> dllwrap -o worker_spi.dll --def worker_spi.def worker_spi.o -L../../src/port -L../../src/common -Wl,--allow-multiple-definition-L/mingw/lib -Wl,--as-needed -L../../src/backend -lpostgres >> Info: resolving _MyBgworkerEntry by linking to __imp__MyBgworkerEntry (auto-import) >> fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' >> fu000002.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname' >> nmth000000.o(.idata$4+0x0): undefined reference to `_nm__MyBgworkerEntry' >> collect2: ld returned 1 exit status >> >> So we are back to square one AFAICS: we still have no idea why narwhal >> is pickier than everything else. (BTW, to save people the trouble of >> looking: MyBgworkerEntry is marked PGDLLIMPORT in HEAD but not 9.3.) >> >> Also, in HEAD narwhal is building things OK, but then seems to be >> dumping core in the dblink regression test, leaving one with not a very >> warm feeling about whether the contrib executables it's building are >> any good. > > Well, as we know, Narwhal is really quite old now. I think I built it > seven+ years ago. There is a difference between narwhal and my pretty old machine (Windows Vista gcc 3.4.5). Linking perl58.lib or tcl84.lib works on my machine but neither works on narwhal. Therefore dlltool still remains in Makefiles of plperl, pltcl and plpython (linking pythonxx.lib doesn't work even on a newer machine). I tried another way which links the dll directly instead of msvc import library and got successful results on both old and newer machines except for plpython on an old machine which fails with a compilation error before linkage phase. I attached a patch which eliminates pexports and dlltool from makefile of plperl. The patch links mingw gcc import library if it exists, otherwise links the dll directly. When it works on narwhal, I can provide similar patches for pltcl and plpython. regards, Hiroshi Inoue > >
Attachment
It seems we left this in broken state. Do we need to do more here to fix narwhal, or do we want to retire narwhal now? Something else? Are we waiting on someone in particular to do something specific? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > It seems we left this in broken state. Do we need to do more here to > fix narwhal, or do we want to retire narwhal now? Something else? Are > we waiting on someone in particular to do something specific? I think we're hoping that somebody will step up and investigate how narwhal's problem might be fixed. However, the machine's owner (Dave) doesn't appear to have the time/interest to do that. That means that our realistic choices are to retire narwhal or revert the linker changes that broke it. Since those linker changes were intended to help expose missing-PGDLLIMPORT bugs, I don't much care for the second alternative. regards, tom lane
On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> It seems we left this in broken state. Do we need to do more here to >> fix narwhal, or do we want to retire narwhal now? Something else? Are >> we waiting on someone in particular to do something specific? > > I think we're hoping that somebody will step up and investigate how > narwhal's problem might be fixed. However, the machine's owner (Dave) > doesn't appear to have the time/interest to do that. That means that > our realistic choices are to retire narwhal or revert the linker changes > that broke it. Since those linker changes were intended to help expose > missing-PGDLLIMPORT bugs, I don't much care for the second alternative. It's a time issue right now I'm afraid (always interested in fixing bugs). However, if "fixing" it comes down to upgrading the seriously old compiler and toolchain on that box (which frankly is so obsolete, I can't see why anyone would want to use anything like it these days), then I think the best option is to retire it, and replace it with Windows 2012R2 and a modern release of MinGW/Msys which is far more likely to be similar to what someone would want to use at present. Does anyone really think there's a good reason to keep maintaining such an obsolete animal? -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 10/14/2014 06:44 PM, Dave Page wrote: > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> It seems we left this in broken state. Do we need to do more here to >>> fix narwhal, or do we want to retire narwhal now? Something else? Are >>> we waiting on someone in particular to do something specific? >> I think we're hoping that somebody will step up and investigate how >> narwhal's problem might be fixed. However, the machine's owner (Dave) >> doesn't appear to have the time/interest to do that. That means that >> our realistic choices are to retire narwhal or revert the linker changes >> that broke it. Since those linker changes were intended to help expose >> missing-PGDLLIMPORT bugs, I don't much care for the second alternative. > It's a time issue right now I'm afraid (always interested in fixing bugs). > > However, if "fixing" it comes down to upgrading the seriously old > compiler and toolchain on that box (which frankly is so obsolete, I > can't see why anyone would want to use anything like it these days), > then I think the best option is to retire it, and replace it with > Windows 2012R2 and a modern release of MinGW/Msys which is far more > likely to be similar to what someone would want to use at present. > > Does anyone really think there's a good reason to keep maintaining > such an obsolete animal? > I do not. I upgraded from this ancient toolset quite a few years ago, and I'm actually thinking of retiring what I replaced it with. cheers andrew
Dave Page <dpage@pgadmin.org> writes: > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think we're hoping that somebody will step up and investigate how >> narwhal's problem might be fixed. However, the machine's owner (Dave) >> doesn't appear to have the time/interest to do that. > It's a time issue right now I'm afraid (always interested in fixing bugs). > However, if "fixing" it comes down to upgrading the seriously old > compiler and toolchain on that box (which frankly is so obsolete, I > can't see why anyone would want to use anything like it these days), > then I think the best option is to retire it, and replace it with > Windows 2012R2 and a modern release of MinGW/Msys which is far more > likely to be similar to what someone would want to use at present. No argument here. I would kind of like to have more than zero understanding of *why* it's failing, just in case there's more to it than "oh, probably a bug in this old toolchain". But finding that out might well take significant time, and in the end not tell us anything very useful. > Does anyone really think there's a good reason to keep maintaining > such an obsolete animal? Is it likely that anyone is still using Windows 2003 in the field? A possible compromise is to update the toolchain but stay on the same OS release, so that we still have testing that's relevant to people using older OS releases. regards, tom lane
On Tue, Oct 14, 2014 at 07:07:17PM -0400, Tom Lane wrote: > Dave Page <dpage@pgadmin.org> writes: > > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I think we're hoping that somebody will step up and investigate how > >> narwhal's problem might be fixed. I have planned to look at reproducing narwhal's problem once the dust settles on orangutan, but I wouldn't mind if narwhal went away instead. > > However, if "fixing" it comes down to upgrading the seriously old > > compiler and toolchain on that box (which frankly is so obsolete, I > > can't see why anyone would want to use anything like it these days), > > then I think the best option is to retire it, and replace it with > > Windows 2012R2 and a modern release of MinGW/Msys which is far more > > likely to be similar to what someone would want to use at present. If you upgrade the toolchain, you really have a new animal. > No argument here. I would kind of like to have more than zero > understanding of *why* it's failing, just in case there's more to it > than "oh, probably a bug in this old toolchain". But finding that out > might well take significant time, and in the end not tell us anything > very useful. Agreed on all those points. > Is it likely that anyone is still using Windows 2003 in the field? > A possible compromise is to update the toolchain but stay on the > same OS release, so that we still have testing that's relevant to > people using older OS releases. Windows Server 2003 isn't even EOL yet. I'd welcome a buildfarm member with that OS and a modern toolchain.
On 10/15/2014 12:53 PM, Noah Misch wrote: > Windows Server 2003 isn't even EOL yet. I'd welcome a buildfarm member with > that OS and a modern toolchain. It's possible to run multiple buildfarm animals on a single Windows instance, each with a different toolchain. There's the chance of interactions that can't occur if each SDK is used in isolation on its own machine, but it should be pretty minimal. I have a machine that currently does this with Jenkins. I'll look at bringing up buildfarm members on it. It's win2k8 though. -- Craig Ringer http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On 10/15/2014 01:53 AM, Craig Ringer wrote: > On 10/15/2014 12:53 PM, Noah Misch wrote: >> Windows Server 2003 isn't even EOL yet. I'd welcome a buildfarm member with >> that OS and a modern toolchain. > It's possible to run multiple buildfarm animals on a single Windows > instance, each with a different toolchain. Indeed, and even using the same buildroot. That's been built into the buildfarm for a long time. For example, nightjar and friarbird do this. > > There's the chance of interactions that can't occur if each SDK is used > in isolation on its own machine, but it should be pretty minimal. Make sure each has a distinct base_port. cheers andrew
On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: > On Tue, Oct 14, 2014 at 07:07:17PM -0400, Tom Lane wrote: > > Dave Page <dpage@pgadmin.org> writes: > > > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > >> I think we're hoping that somebody will step up and investigate how > > >> narwhal's problem might be fixed. > > I have planned to look at reproducing narwhal's problem once the dust settles > on orangutan, but I wouldn't mind if narwhal went away instead. > > No argument here. I would kind of like to have more than zero > > understanding of *why* it's failing, just in case there's more to it > > than "oh, probably a bug in this old toolchain". But finding that out > > might well take significant time, and in the end not tell us anything > > very useful. > > Agreed on all those points. I reproduced narwhal's problem using its toolchain on another 32-bit Windows Server 2003 system. The crash happens at the SHGetFolderPath() call in pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or via shell32.dll; we've used the former method since commit 889f038, for better compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version loads and unloads shell32.dll. In PostgreSQL built using this older compiler, shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! That started with commit 846e91e. I don't expect to understand the mechanism behind it, but I recommend we switch back to linking libpq with shell32.dll. The MSVC build already does that in all supported branches, and it feels right for the MinGW build to follow suit in 9.4+. Windows versions that lack the symbol in shell32.dll are now ancient history. I happened to try the same contrib/dblink test suite on PostgreSQL built with modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave a crash-like symptom starting with commit 846e91e. Specifically, a backend that LOADed any module linked to libpq (libpqwalreceiver, dblink, postgres_fdw) would suffer this after calling exit(0): === 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1port=3936 This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. This application has requested the Runtime to terminate it in an unusual way. Please contact the application's support team for more information. 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3 === The mechanism turned out to be disjoint from the mechanism behind the ancient-compiler crash. Based on the functions called from exit(), my best guess is that exit() encountered recursion and used something like an abort() to escape. (I can send the gdb transcript if anyone is curious to see the gory details.) The proximate cause was commit 846e91e allowing modules to use shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. Passing -static-libgcc to the link restores the libgcc situation as it stood before commit 846e91e. The main beneficiary of shared libgcc is C++/Java exception handling, so PostgreSQL doesn't care. No doubt there's some deeper bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc should not disrupt exit(). I'm content with this workaround. Patches attached. For readers inclined toward animal husbandry, we could benefit from more MinGW buildfarm coverage. There are three compiler strains (classic MinGW, 32-bit MinGW-w64, 64-bit MinGW-w64) and three supported Windows Server versions. [1] http://www.postgresql.org/message-id/flat/6BCB9D8A16AC4241919521715F4D8BCE476665@algol.sollentuna.se
Attachment
Noah Misch <noah@leadboat.com> writes: > I reproduced narwhal's problem using its toolchain on another 32-bit Windows > Server 2003 system. The crash happens at the SHGetFolderPath() call in > pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or > via shell32.dll; we've used the former method since commit 889f038, for better > compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version > loads and unloads shell32.dll. In PostgreSQL built using this older compiler, > shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! > That started with commit 846e91e. Thanks for doing the detective work on this! > I don't expect to understand the mechanism > behind it, but I recommend we switch back to linking libpq with shell32.dll. > The MSVC build already does that in all supported branches, and it feels right > for the MinGW build to follow suit in 9.4+. Windows versions that lack the > symbol in shell32.dll are now ancient history. This is basically reverting 889f038, right? It looks to me like we made that change only to support NT4, which was obsolete even in 2005, so no objection from me. Magnus might have a different idea though. > I happened to try the same contrib/dblink test suite on PostgreSQL built with > modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave > a crash-like symptom starting with commit 846e91e. Ick. > Passing -static-libgcc to the link restores the libgcc situation as it stood > before commit 846e91e. The main beneficiary of shared libgcc is C++/Java > exception handling, so PostgreSQL doesn't care. No doubt there's some deeper > bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc > should not disrupt exit(). I'm content with this workaround. Works for me too, until such time as somebody feels like digging deeper. regards, tom lane
On Mon, Oct 20, 2014 at 11:06:54AM -0400, Tom Lane wrote: > Noah Misch <noah@leadboat.com> writes: > > I don't expect to understand the mechanism > > behind it, but I recommend we switch back to linking libpq with shell32.dll. > > The MSVC build already does that in all supported branches, and it feels right > > for the MinGW build to follow suit in 9.4+. Windows versions that lack the > > symbol in shell32.dll are now ancient history. > > This is basically reverting 889f038, right? Mostly so. Keeping the SHGetSpecialFolderPath() -> SHGetFolderPath() part, but reverting the shell32.dll -> shfolder.dll part.
On 2014-10-20 01:03:31 -0400, Noah Misch wrote: > On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: > > On Tue, Oct 14, 2014 at 07:07:17PM -0400, Tom Lane wrote: > > > Dave Page <dpage@pgadmin.org> writes: > > > > On Tue, Oct 14, 2014 at 11:38 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > >> I think we're hoping that somebody will step up and investigate how > > > >> narwhal's problem might be fixed. > > > > I have planned to look at reproducing narwhal's problem once the dust settles > > on orangutan, but I wouldn't mind if narwhal went away instead. > > > > No argument here. I would kind of like to have more than zero > > > understanding of *why* it's failing, just in case there's more to it > > > than "oh, probably a bug in this old toolchain". But finding that out > > > might well take significant time, and in the end not tell us anything > > > very useful. > > > > Agreed on all those points. > > I reproduced narwhal's problem using its toolchain on another 32-bit Windows > Server 2003 system. The crash happens at the SHGetFolderPath() call in > pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or > via shell32.dll; we've used the former method since commit 889f038, for better > compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version > loads and unloads shell32.dll. In PostgreSQL built using this older compiler, > shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! > That started with commit 846e91e. I don't expect to understand the mechanism > behind it, but I recommend we switch back to linking libpq with shell32.dll. > The MSVC build already does that in all supported branches, and it feels right > for the MinGW build to follow suit in 9.4+. Windows versions that lack the > symbol in shell32.dll are now ancient history. Ick. Nice detective work of a ugly situation. > I happened to try the same contrib/dblink test suite on PostgreSQL built with > modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave > a crash-like symptom starting with commit 846e91e. Specifically, a backend > that LOADed any module linked to libpq (libpqwalreceiver, dblink, > postgres_fdw) would suffer this after calling exit(0): > > === > 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1port=3936 > > This application has requested the Runtime to terminate it in an unusual way. > Please contact the application's support team for more information. > > This application has requested the Runtime to terminate it in an unusual way. > Please contact the application's support team for more information. > 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3 > === > > The mechanism turned out to be disjoint from the mechanism behind the > ancient-compiler crash. Based on the functions called from exit(), my best > guess is that exit() encountered recursion and used something like an abort() > to escape. Hm. > (I can send the gdb transcript if anyone is curious to see the > gory details.) That would be interesting. > The proximate cause was commit 846e91e allowing modules to use > shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. > Passing -static-libgcc to the link restores the libgcc situation as it stood > before commit 846e91e. The main beneficiary of shared libgcc is C++/Java > exception handling, so PostgreSQL doesn't care. No doubt there's some deeper > bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc > should not disrupt exit(). I'm content with this workaround. I'm unconvinced by this reasoning. Popular postgres extensions like postgis do use C++. It's imo not hard to imagine situations where switching to a statically linked libgcc statically could cause problems. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Oct 20, 2014 at 10:24:47PM +0200, Andres Freund wrote: > On 2014-10-20 01:03:31 -0400, Noah Misch wrote: > > On Wed, Oct 15, 2014 at 12:53:03AM -0400, Noah Misch wrote: > > I happened to try the same contrib/dblink test suite on PostgreSQL built with > > modern MinGW-w64 (i686-4.9.1-release-win32-dwarf-rt_v3-rev1). That, too, gave > > a crash-like symptom starting with commit 846e91e. Specifically, a backend > > that LOADed any module linked to libpq (libpqwalreceiver, dblink, > > postgres_fdw) would suffer this after calling exit(0): > > > > === > > 3056 2014-10-20 00:40:15.163 GMT LOG: disconnection: session time: 0:00:00.515 user=cyg_server database=template1 host=127.0.0.1port=3936 > > > > This application has requested the Runtime to terminate it in an unusual way. > > Please contact the application's support team for more information. > > > > This application has requested the Runtime to terminate it in an unusual way. > > Please contact the application's support team for more information. > > 9300 2014-10-20 00:40:15.163 GMT LOG: server process (PID 3056) exited with exit code 3 > > === > > > > The mechanism turned out to be disjoint from the mechanism behind the > > ancient-compiler crash. Based on the functions called from exit(), my best > > guess is that exit() encountered recursion and used something like an abort() > > to escape. > > Hm. > > > (I can send the gdb transcript if anyone is curious to see the > > gory details.) > > That would be interesting. Attached. ("rep 100 s" calls a macro equivalent to issuing "s" 100 times.) > > The proximate cause was commit 846e91e allowing modules to use > > shared libgcc. A 32-bit libpq acquires 64-bit integer division from libgcc. > > Passing -static-libgcc to the link restores the libgcc situation as it stood > > before commit 846e91e. The main beneficiary of shared libgcc is C++/Java > > exception handling, so PostgreSQL doesn't care. No doubt there's some deeper > > bug in libgcc or in PostgreSQL; loading a module that links with shared libgcc > > should not disrupt exit(). I'm content with this workaround. > > I'm unconvinced by this reasoning. Popular postgres extensions like > postgis do use C++. It's imo not hard to imagine situations where > switching to a statically linked libgcc statically could cause problems. True; I was wrong to say that PostgreSQL doesn't care. MinGW builds of every released PostgreSQL version use static libgcc. That changed as an unintended consequence of a patch designed to remove our reliance on dlltool. Given the lack of complaints about our historic use of static libgcc, I think it's fair to revert to 9.3's use thereof. Supporting shared libgcc would be better still, should someone make that effort.
Attachment
On Mon, Oct 20, 2014 at 01:03:31AM -0400, Noah Misch wrote: > I reproduced narwhal's problem using its toolchain on another 32-bit Windows > Server 2003 system. The crash happens at the SHGetFolderPath() call in > pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or > via shell32.dll; we've used the former method since commit 889f038, for better > compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version > loads and unloads shell32.dll. In PostgreSQL built using this older compiler, > shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! > That started with commit 846e91e. I don't expect to understand the mechanism > behind it, but I recommend we switch back to linking libpq with shell32.dll. > The MSVC build already does that in all supported branches, and it feels right > for the MinGW build to follow suit in 9.4+. Windows versions that lack the > symbol in shell32.dll are now ancient history. That allowed narwhal to proceed a bit further than before, but it crashes later in the dblink test suite. Will test again...
On Wed, Oct 22, 2014 at 12:12:36AM -0400, Noah Misch wrote: > On Mon, Oct 20, 2014 at 01:03:31AM -0400, Noah Misch wrote: > > I reproduced narwhal's problem using its toolchain on another 32-bit Windows > > Server 2003 system. The crash happens at the SHGetFolderPath() call in > > pqGetHomeDirectory(). A program can acquire that function via shfolder.dll or > > via shell32.dll; we've used the former method since commit 889f038, for better > > compatibility[1] with Windows NT 4.0. On this system, shfolder.dll's version > > loads and unloads shell32.dll. In PostgreSQL built using this older compiler, > > shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32! > > That started with commit 846e91e. I don't expect to understand the mechanism > > behind it, but I recommend we switch back to linking libpq with shell32.dll. > > The MSVC build already does that in all supported branches, and it feels right > > for the MinGW build to follow suit in 9.4+. Windows versions that lack the > > symbol in shell32.dll are now ancient history. > > That allowed narwhal to proceed a bit further than before, but it crashes > later in the dblink test suite. Will test again... Calls to ldap_init() exhibit the same problem shfolder.dll:SHGetFolderPath() exhibited: it loads and unloads some DLLs, and it manages to unload libpq in passing. There's nothing comparable to the above workaround this time, but I found a more fundamental fix. Each DLL header records the DLL's presumed name, viewable with "objdump -x libpq.dll | grep ^Name". Per the GNU ld documentation, one can override it in a .def file: The optional LIBRARY <name> command indicates the internal name of the output DLL. If `<name>' does not include a suffix, the default library suffix, `.DLL' is appended. libpqdll.def uses "LIBRARY LIBPQ", which yields an internal name of "LIBPQ.dll" under MinGW-w64 i686-4.9.1-release-win32-dwarf-rt_v3-rev1. Our MSVC build process also gives "LIBPQ.dll". Under narwhal's older toolchain, libpq.dll gets a name of just "LIBPQ". The attached patch brings the product of narwhal's toolchain in line with the others. I don't know by what magic wldap32.dll:ldap_init() and shfolder.dll:SHGetFolderPath() care about finding ".dll" in the names of loaded libraries, but it does fix the bug. This erases the impetus for my recent commit 53566fc. I'm inclined to keep that commit in the name of consistency with the MSVC build, but one could argue for reverting its 9.4 counterpart. I don't feel strongly either way, so I expect to let 2f51f42 stand. FYI, the problem is reproducible in a 32-bit PostgreSQL build running on 32-bit or 64-bit Windows Server 2003. It does not occur on 64-bit Windows Server 2008, even after marking postgres.exe to run in the compatibility mode for Windows Server 2003. (I did not try 32-bit Windows Server 2008.)
Attachment
Noah Misch <noah@leadboat.com> writes: > Calls to ldap_init() exhibit the same problem shfolder.dll:SHGetFolderPath() > exhibited: it loads and unloads some DLLs, and it manages to unload libpq in > passing. There's nothing comparable to the above workaround this time, but I > found a more fundamental fix. > ... > libpqdll.def uses "LIBRARY LIBPQ", which yields an internal name of > "LIBPQ.dll" under MinGW-w64 i686-4.9.1-release-win32-dwarf-rt_v3-rev1. Our > MSVC build process also gives "LIBPQ.dll". Under narwhal's older toolchain, > libpq.dll gets a name of just "LIBPQ". The attached patch brings the product > of narwhal's toolchain in line with the others. I don't know by what magic > wldap32.dll:ldap_init() and shfolder.dll:SHGetFolderPath() care about finding > ".dll" in the names of loaded libraries, but it does fix the bug. Nice detective work! > This erases the impetus for my recent commit 53566fc. I'm inclined to keep > that commit in the name of consistency with the MSVC build, but one could > argue for reverting its 9.4 counterpart. I don't feel strongly either way, so > I expect to let 2f51f42 stand. Yeah, I lean the same way. The fewer differences in the results of our assorted Windows build processes, the better. regards, tom lane