Thread: mingw check hung
Something happened about 80 hours ago that caused my mingw buildfarm member (gcc 3.4.2 on Win XP Pro SP2) to hang at the check stage. It looks like it's hung in initdb. I wonder if it could be this commit: Log Message: ----------- Make win32 builds always do SetEnvironmentVariable() when doing putenv(). Also, if linked against other versions than the default MSVCRT library (for example the MSVC build which links against MSVCRT80), also update the cache in the default MSVCRT at the same time. I note that the change is not apparently limited to MSVC builds. The MSVC animal that runs on the same machine appears unaffected. I see one other mingw buildfarm member that is having problems that started a few days ago (yak) and another that looks like it is a few hours overdue to report, so it might also be hung (vaquita). cheers andrew
Andrew Dunstan wrote: > > Something happened about 80 hours ago that caused my mingw buildfarm > member (gcc 3.4.2 on Win XP Pro SP2) to hang at the check stage. It > looks like it's hung in initdb. > > I wonder if it could be this commit: > > Log Message: > ----------- > Make win32 builds always do SetEnvironmentVariable() when doing putenv(). > Also, if linked against other versions than the default MSVCRT library > (for example the MSVC build which links against MSVCRT80), also update > the cache in the default MSVCRT at the same time. > > I note that the change is not apparently limited to MSVC builds. The > MSVC animal that runs on the same machine appears unaffected. I see > one other mingw buildfarm member that is having problems that started > a few days ago (yak) and another that looks like it is a few hours > overdue to report, so it might also be hung (vaquita). > > Further to this: I see that vaquita has now reported in, and is happy. Also, I can run happily on my Vista box (vaquita is also a Vista box). I therefore suspect that we have a problem specifically with XP (both dawn_bat and yak are XP boxes). cheers andrew
Andrew Dunstan wrote: > > > Andrew Dunstan wrote: >> >> Something happened about 80 hours ago that caused my mingw buildfarm >> member (gcc 3.4.2 on Win XP Pro SP2) to hang at the check stage. It >> looks like it's hung in initdb. >> >> I wonder if it could be this commit: >> >> Log Message: >> ----------- >> Make win32 builds always do SetEnvironmentVariable() when doing putenv(). >> Also, if linked against other versions than the default MSVCRT library >> (for example the MSVC build which links against MSVCRT80), also update >> the cache in the default MSVCRT at the same time. >> >> I note that the change is not apparently limited to MSVC builds. The >> MSVC animal that runs on the same machine appears unaffected. I see >> one other mingw buildfarm member that is having problems that started >> a few days ago (yak) and another that looks like it is a few hours >> overdue to report, so it might also be hung (vaquita). >> >> > > Further to this: > > I see that vaquita has now reported in, and is happy. Also, I can run > happily on my Vista box (vaquita is also a Vista box). I therefore > suspect that we have a problem specifically with XP (both dawn_bat and > yak are XP boxes). Have you managed to get gdb running on that box, and if so, can you try to grab a stacktrace? If not, try a stacktrace from process explorer. It doesn't actually work with mingw, but it gives you a hint based on DLL exports... //Magnus
Magnus Hagander wrote: > > > Have you managed to get gdb running on that box, and if so, can you try > to grab a stacktrace? If not, try a stacktrace from process explorer. It > doesn't actually work with mingw, but it gives you a hint based on DLL > exports... > > > I'll see what I can do. By the time I get to see the problem Dr Watson already has the process - in fact the run is hanging waiting on a Dr Watson dialog box ;-( I've installed drmingw to handle exceptions instead, so we'll see if that gives us useful info. If not, I'll see what I can do with gdb. cheers andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> >> >> Have you managed to get gdb running on that box, and if so, can you try >> to grab a stacktrace? If not, try a stacktrace from process explorer. It >> doesn't actually work with mingw, but it gives you a hint based on DLL >> exports... >> >> >> > > > I'll see what I can do. By the time I get to see the problem Dr Watson > already has the process - in fact the run is hanging waiting on a Dr > Watson dialog box ;-( There's a command-line parameter to drwatson, iirc, that will make it stop grabbing them automatically. > I've installed drmingw to handle exceptions instead, so we'll see if > that gives us useful info. If not, I'll see what I can do with gdb. Hadn't heard of drmingw, I see how that can be useful :-) //Magnus
Magnus Hagander wrote: > Andrew Dunstan wrote: > > >> I've installed drmingw to handle exceptions instead, so we'll see if >> that gives us useful info. If not, I'll see what I can do with gdb. >> > > Hadn't heard of drwmingw, I see how that can be useful :-) > > >

report from DrMingw is below

cheers

andrew

initdb.exe caused an Access Violation at location 7c91b1fa in module ntdll.dll Writing to location 20202030.

Registers:
eax=20202020 ebx=00000000 ecx=00000000 edx=003eab70 esi=003eab70 edi=00000000
eip=7c91b1fa esp=0022b820 ebp=0022b894 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000206

Call stack:
7C91B1FA  ntdll.dll:7C91B1FA  RtlpWaitForCriticalSection
7C901046  ntdll.dll:7C901046  RtlEnterCriticalSection
77C3F34F  msvcrt.dll:77C3F34F  _popen
00401493  initdb.exe:00401493  popen_check  initdb.c:477
    static FILE * popen_check(
        const char * command = ,
        const char * mode =
    )
    ...
        errno = 0;
        cmdfd = popen(command, mode);
    >   if (cmdfd == NULL)
            fprintf(stderr, _("%s: could not execute command\"%s\": %s\n"),
                    progname, command, strerror(errno));
    ...
00404DA0  initdb.exe:00404DA0  main  initdb.c:1650
    int main(
        int argc = 7,
        char * * argv = &0x003e3d21
    )
    ...
        DEVNULL);
    >   PG_CMD_OPEN;
        for (line = sysviews_setup; *line != NULL; line++)
    ...
004011E7  initdb.exe:004011E7
00401238  initdb.exe:00401238
7C817067  kernel32.dll:7C817067  RegisterWaitForInputIdle
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> Andrew Dunstan wrote: >> >> >>> I've installed drmingw to handle exceptions instead, so we'll see if >>> that gives us useful info. If not, I'll see what I can do with gdb. >>> >> >> Hadn't heard of drwmingw, I see how that can be useful :-) >> >> >> > > report from DrMingw is below > > Further data point: The suspect patch is quite definitely the source of the problem. I undid the configure changes and surrounded the additions to port/win32.h with #ifdef WIN32_ONLY_COMPILER ... #endif. Result: the problem disappeared, and "make check" completed perfectly. cheers andrew
Andrew Dunstan wrote: > > > Andrew Dunstan wrote: >> >> >> Magnus Hagander wrote: >>> Andrew Dunstan wrote: >>> >>> >>>> I've installed drmingw to handle exceptions instead, so we'll see if >>>> that gives us useful info. If not, I'll see what I can do with gdb. >>>> >>> >>> Hadn't heard of drwmingw, I see how that can be useful :-) >>> >>> >>> >> >> report from DrMingw is below >> >> > Further data point: > > The suspect patch is quite definitely the source of the problem. I undid > the configure changes and surrounded the additions to port/win32.h with > #ifdef WIN32_ONLY_COMPILER ... #endif. Result: the problem disappeared, > and "make check" completed perfectly. Per discussion I looked at just reverting that part, but that won't work. If we do that, the call to SetEnvironmentVariable() will not be run, which certainly isn't right. The problem has to be in win32env.c. I originally thought we accidentally called the putenv function twice in this case, but that code seems properly #ifdef:ed to MSVC. I'm not sure I trust the crash point at all - is this compiled with debug info enabled? It seems like a *very* strange line to crash on... I can't spot the error right off :-( Can you try to see if it's the putenv() or the unsetenv() that gets broken? (by making sure just one of them gets replaced) //Magnus
Magnus Hagander <magnus@hagander.net> writes: > Andrew Dunstan wrote: >> The suspect patch is quite definitely the source of the problem. > I can't spot the error right off :-( Can you try to see if it's the > putenv() or the unsetenv() that gets broken? Are we sure pgwin32_unsetenv works in this environment? (Or worse, maybe it's trying to use port/unsetenv.c?) regards, tom lane
Magnus Hagander wrote: > Per discussion I looked at just reverting that part, but that won't > work. If we do that, the call to SetEnvironmentVariable() will not be > run, which certainly isn't right.. > > The problem has to be in win32env.c. I originally thought we > accidentally called the putenv function twice in this case, but that > code seems properly #ifdef:ed to MSVC. > > I'm not sure I trust the crash point at all - is this compiled with > debug info enabled? It seems like a *very* strange line to crash on... > > I can't spot the error right off :-( Can you try to see if it's the > putenv() or the unsetenv() that gets broken? (by making sure just one of > them get replaced) > > //Magnus Hi guys, Don't know if this is relevant at all, but it reminds me of a problem I had with environment variables in PostGIS with MingW. It was something along the lines of environment variables set in a MingW program using putenv() for PGPORT, PGHOST etc. weren't visible to a MSVC-compiled libpq but were to a MingW-compiled libpq. It's fairly easy to knock up a quick test program in C to verify this. I eventually gave up and just built a connection string instead - for reference the final patch is here http://postgis.refractions.net/pipermail/postgis-commits/2008-January/000199.html. I appreciate it may not be 100% relevant, but I thought I'd flag it up as possibly being a fault with the MingW putenv implementation. HTH, Mark. -- Mark Cave-Ayland Sirius Corporation - The Open Source Experts http://www.siriusit.co.uk T: +44 870 608 0063
Mark Cave-Ayland wrote: > Magnus Hagander wrote: > >> Per discussion I looked at just reverting that part, but that won't >> work. If we do that, the call to SetEnvironmentVariable() will not be >> run, which certainly isn't right.. >> >> The problem has to be in win32env.c. I originally thought we >> accidentally called the putenv function twice in this case, but that >> code seems properly #ifdef:ed to MSVC. >> >> I'm not sure I trust the crash point at all - is this compiled with >> debug info enabled? It seems like a *very* strange line to crash on... >> >> I can't spot the error right off :-( Can you try to see if it's the >> putenv() or the unsetenv() that gets broken? (by making sure just one of >> them get replaced) >> >> //Magnus > > Hi guys, > > Don't know if this is relevant at all, but it reminds me of a problem I > had with environment variables in PostGIS with MingW. It was something > along the lines of environment variables set in a MingW program using > putenv() for PGPORT, PGHOST etc. weren't visible to a MSVC-compiled > libpq but were to a MingW-compiled libpq. It's fairly easy to knock up a > quick test program in C to verify this. That's the reason for this patch to go in in the first place. That has been fixed. It also seems to have caused crashes on mingw, which was not expected :-) It's not actually a fault with mingw putenv, it's just that those go into the cached environment only. //Magnus
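[Editor's sketch] The point Magnus makes above, that a putenv() call lands only in the calling runtime's cached copy of the environment, can be illustrated with a toy model. This is not Windows, MinGW, or PostgreSQL code: `EnvTable`, `Runtime`, `runtime_putenv` and `patched_putenv` are all hypothetical names, and the model assumes (as described in the thread, not verified here) that each C runtime keeps its own cache alongside the process-level environment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy model only: none of these types or functions exist in Windows,
 * MinGW, or PostgreSQL. */
#define MAX_VARS 16
#define MAX_LEN  64

typedef struct
{
    char names[MAX_VARS][MAX_LEN];
    char values[MAX_VARS][MAX_LEN];
    int  count;
} EnvTable;

/* One C runtime with its private cached copy of the environment. */
typedef struct
{
    EnvTable cache;
} Runtime;

/* Look up name; returns its value, or NULL if it is not in the table. */
const char *env_get(const EnvTable *t, const char *name)
{
    int i;

    for (i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return t->values[i];
    return NULL;
}

/* Insert or overwrite name=value; 0 on success, -1 if it does not fit. */
int env_set(EnvTable *t, const char *name, const char *value)
{
    int i;

    assert(t != NULL && name != NULL && value != NULL);
    if (strlen(name) >= MAX_LEN || strlen(value) >= MAX_LEN)
        return -1;
    for (i = 0; i < t->count; i++)
    {
        if (strcmp(t->names[i], name) == 0)
        {
            strcpy(t->values[i], value);
            return 0;
        }
    }
    if (t->count == MAX_VARS)
        return -1;
    strcpy(t->names[t->count], name);
    strcpy(t->values[t->count], value);
    t->count++;
    return 0;
}

/* Models a putenv() that only touches the calling runtime's cache. */
int runtime_putenv(Runtime *rt, const char *name, const char *value)
{
    return env_set(&rt->cache, name, value);
}

/* Models the patched behaviour from the commit message: update the
 * process-level environment and every runtime cache passed in. */
int patched_putenv(EnvTable *process_env, Runtime **runtimes, int nruntimes,
                   const char *name, const char *value)
{
    int i;

    if (env_set(process_env, name, value) != 0)
        return -1;
    for (i = 0; i < nruntimes; i++)
        if (runtime_putenv(runtimes[i], name, value) != 0)
            return -1;
    return 0;
}
```

In this model, a `runtime_putenv()` on one `Runtime` leaves a second `Runtime` and the process table unchanged, which mirrors Mark's PGPORT/PGHOST observation; `patched_putenv()` makes the value visible everywhere.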
Tom Lane wrote: > Magnus Hagander <magnus@hagander.net> writes: > >> Andrew Dunstan wrote: >> >>> The suspect patch is quite definitely the source of the problem. >>> > > >> I can't spot the error right off :-( Can you try to see if it's the >> putenv() or the unsetenv() that gets broken? >> > > Are we sure pgwin32_unsetenv works in this environment? (Or worse, > maybe it's trying to use port/unsetenv.c?) > > > It is the pgwin32_unsetenv() call that is causing the trouble somehow. That much I have just managed to isolate. cheers andrew
Andrew Dunstan wrote: > > > Tom Lane wrote: >> Magnus Hagander <magnus@hagander.net> writes: >> >>> Andrew Dunstan wrote: >>> >>>> The suspect patch is quite definitely the source of the problem. >>>> >> >> >>> I can't spot the error right off :-( Can you try to see if it's the >>> putenv() or the unsetenv() that gets broken? >>> >> >> Are we sure pgwin32_unsetenv works in this environment? (Or worse, >> maybe it's trying to use port/unsetenv.c?) >> >> >> > > It is the pgwin32_unsetenv() call that is causing the trouble somehow. > That much I have just managed to isolate. > > Specifically, it's the SetEnvironmentVariable() call from pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled things work just fine. cheers andrew
Andrew Dunstan wrote: > > > Andrew Dunstan wrote: >> >> >> Tom Lane wrote: >>> Magnus Hagander <magnus@hagander.net> writes: >>> >>>> Andrew Dunstan wrote: >>>> >>>>> The suspect patch is quite definitely the source of the problem. >>>>> >>> >>> >>>> I can't spot the error right off :-( Can you try to see if it's the >>>> putenv() or the unsetenv() that gets broken? >>>> >>> >>> Are we sure pgwin32_unsetenv works in this environment? (Or worse, >>> maybe it's trying to use port/unsetenv.c?) >>> >>> >> >> It is the pgwin32_unsetenv() call that is causing the trouble somehow. >> That much I have just managed to isolate. >> >> > > > Specifically, it's the SetEnvironmentVariable() call from > pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled > things work just fine. That's strange :( What arguments are sent to the function? Since this is an API function, it really shouldn't behave differently between mingw and msvc, so it must be something that goes wrong with the arguments. Also, Tom mentioned earlier that we may be including *two* replacements for unsetenv(), which could be what's causing the problem. Can you check if that is happening and try to disable the one in port/unsetenv.c and see if that changes things? //Magnus
Magnus Hagander wrote: >> Specifically, it's the SetEnvironmentVariable() call from >> pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled >> things work just fine. >> > > That's strange :( What arguments are it sent to the function? Since this > is an API function, it really shouldn't behave differently between mingw > and msvc, so it must be something that goes wrong with the arguments. > > Also, Tom mentioned earlier that we may be including *two* replacements > for unsetenv(), which could be what's causing the problem. Can you check > if that is happening and try to disable the one in port/unsetenv.c and > see if that changes things? > > > I've already ruled out that hypothesis by forcing the call direct to pgwin32_unsetenv() instead of relying on the macro, in initdb.c. There are only two such calls in initdb.c: the arguments are "LC_ALL" and "PGCLIENTENCODING". I wonder if this version of SetEnvironmentVariable is sufficiently dumb that it fails badly if given a NULL second argument for a value that is not in fact in the environment (as I would normally expect of these on Windows)? cheers andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >>> Specifically, it's the SetEnvironmentVariable() call from >>> pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled >>> things work just fine. >>> >> >> That's strange :( What arguments are it sent to the function? Since this >> is an API function, it really shouldn't behave differently between mingw >> and msvc, so it must be something that goes wrong with the arguments. >> >> Also, Tom mentioned earlier that we may be including *two* replacements >> for unsetenv(), which could be what's causing the problem. Can you check >> if that is happening and try to disable the one in port/unsetenv.c and >> see if that changes things? >> >> >> > > I've already ruled out that hypothesis by forcing the call direct to > pgwin32_unsetenv() instead of relying on the macro, in initdb.c. > > There are only two such calls in initdb.c: the arguments are "LC_ALL" > and "PGCLIENTENCODING". > > I wonder if this version of SetEnvironmentVariable is sufficiently dumb > that it fails badly if given a NULL second argument for a value that is > not in fact in the environment (as I would normally expect of these on > Windows)? But that should be a win32 API call. It's not a runtime call. So it should be identical between mingw and msvc! Try removing the code that sets it to NULL if it's empty string. Having it as empty string made it fail on MSVC, and the API documentation says it should be NULL, but maybe mingw is somehow intercepting the call and breaking it... //Magnus
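[Editor's sketch] The call path under discussion (an unsetenv wrapper going through the putenv wrapper, which turns an empty value into NULL before calling SetEnvironmentVariable()) can be sketched portably by computing the arguments without making the OS call. This is not the win32env.c source: `EnvCall`, `plan_env_call` and `plan_unsetenv` are hypothetical names, and building a "NAME=" string inside the unsetenv path is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not PostgreSQL source: describes the arguments a
 * putenv() wrapper would hand to SetEnvironmentVariable(). */
typedef struct
{
    char        name[128];  /* text before '=' */
    const char *value;      /* text after '=', or NULL meaning "delete" */
} EnvCall;

/*
 * Split "NAME=value" into name and value. An empty value is turned into
 * NULL; this is the step Magnus asks Andrew to try removing.
 * Returns 0 on success, -1 on malformed input.
 */
int plan_env_call(const char *envval, EnvCall *out)
{
    const char *eq;
    size_t      len;

    assert(envval != NULL && out != NULL);

    eq = strchr(envval, '=');
    if (eq == NULL || eq == envval)
        return -1;              /* no '=' at all, or an empty name */

    len = (size_t) (eq - envval);
    if (len >= sizeof(out->name))
        return -1;              /* name too long for this sketch */

    memcpy(out->name, envval, len);
    out->name[len] = '\0';

    /* Empty string after '=' becomes NULL ("remove the variable"). */
    out->value = (eq[1] == '\0') ? NULL : eq + 1;
    return 0;
}

/* Models the unsetenv path: build "NAME=" in buf and reuse the putenv path. */
int plan_unsetenv(const char *name, char *buf, size_t buflen, EnvCall *out)
{
    size_t len;

    assert(name != NULL && buf != NULL && out != NULL);

    len = strlen(name);
    if (len + 2 > buflen)
        return -1;              /* need room for '=' and the terminator */

    memcpy(buf, name, len);
    buf[len] = '=';
    buf[len + 1] = '\0';
    return plan_env_call(buf, out);
}
```

For the two names Andrew mentions, `plan_unsetenv("LC_ALL", ...)` and `plan_unsetenv("PGCLIENTENCODING", ...)` both produce a NULL value, i.e. the SetEnvironmentVariable(name, NULL) call being debugged.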
Magnus Hagander wrote: > Andrew Dunstan wrote: > >> Magnus Hagander wrote: >> >>>> Specifically, it's the SetEnvironmentVariable() call from >>>> pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled >>>> things work just fine. >>>> >>>> >>> That's strange :( What arguments are it sent to the function? Since this >>> is an API function, it really shouldn't behave differently between mingw >>> and msvc, so it must be something that goes wrong with the arguments. >>> >>> Also, Tom mentioned earlier that we may be including *two* replacements >>> for unsetenv(), which could be what's causing the problem. Can you check >>> if that is happening and try to disable the one in port/unsetenv.c and >>> see if that changes things? >>> >>> >>> >>> >> I've already ruled out that hypothesis by forcing the call direct to >> pgwin32_unsetenv() instead of relying on the macro, in initdb.c. >> >> There are only two such calls in initdb.c: the arguments are "LC_ALL" >> and "PGCLIENTENCODING". >> >> I wonder if this version of SetEnvironmentVariable is sufficiently dumb >> that it fails badly if given a NULL second argument for a value that is >> not in fact in the environment (as I would normally expect of these on >> Windows)? >> > > But that should be a win32 API call. It's not a runtime call. So it > should be identical between mingw and msvc! > > Try removing the code that sets it to NULL if it's empty string. Having > it as empty string made it fail on MSVC, and the API documentation says > it should be NULL, but maybe mingw is somehow intercepting the call and > breaking it... > > >

Mingw is just passing the call on.

You're right. When I comment out the NULL assignment, it all works.

MSDN says this (<http://msdn.microsoft.com/en-us/library/z46c489x.aspx>):

    If the value parameter is not empty and the environment variable named by the variable parameter does not exist, the environment variable is created and assigned the contents of value. Solely for purposes of this operation, value is considered empty if it is a null reference (Nothing in Visual Basic), contains a zero-length string, or contains an initial hexadecimal zero character (0x00).

    If variable contains a non-initial hexadecimal zero character, the characters before the zero character are considered the environment variable name and all subsequent characters are ignored.

    If value contains a non-initial hexadecimal zero character, the characters before the zero character are assigned to the environment variable and all subsequent characters are ignored.

    If value is empty and the environment variable named by variable exists, the environment variable is deleted. If variable does not exist, no error occurs even though the operation cannot be performed.

So it looks like we could remove that NULL assignment happily and expect the right thing to be done.

cheers

andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> Andrew Dunstan wrote: >> >>> Magnus Hagander wrote: >>> >>>>> Specifically, it's the SetEnvironmentVariable() call from >>>>> pgwin32_putenv() called from pgwin32_unsetenv(). When this is disabled >>>>> things work just fine. >>>>> >>>> That's strange :( What arguments are it sent to the function? Since >>>> this >>>> is an API function, it really shouldn't behave differently between >>>> mingw >>>> and msvc, so it must be something that goes wrong with the arguments. >>>> >>>> Also, Tom mentioned earlier that we may be including *two* replacements >>>> for unsetenv(), which could be what's causing the problem. Can you >>>> check >>>> if that is happening and try to disable the one in port/unsetenv.c and >>>> see if that changes things? >>>> >>>> >>>> >>> I've already ruled out that hypothesis by forcing the call direct to >>> pgwin32_unsetenv() instead of relying on the macro, in initdb.c. >>> >>> There are only two such calls in initdb.c: the arguments are "LC_ALL" >>> and "PGCLIENTENCODING". >>> >>> I wonder if this version of SetEnvironmentVariable is sufficiently dumb >>> that it fails badly if given a NULL second argument for a value that is >>> not in fact in the environment (as I would normally expect of these on >>> Windows)? >>> >> >> But that should be a win32 API call. It's not a runtime call. So it >> should be identical between mingw and msvc! >> >> Try removing the code that sets it to NULL if it's empty string. Having >> it as empty string made it fail on MSVC, and the API documentation says >> it should be NULL, but maybe mingw is somehow intercepting the call and >> breaking it... >> >> >> > > Mingw is just passing the call on. > > You're right. When I comment out the NULL assignment, it all works. 
> > MSDN says this (<http://msdn.microsoft.com/en-us/library/z46c489x.aspx>): > > If the value parameter is not empty and the environment variable > named by the variable parameter does not exist, the environment > variable is created and assigned the contents of value. Solely for > purposes of this operation, value is considered empty if it is a > null reference (Nothing in Visual Basic), contains a zero-length > string, or contains an initial hexadecimal zero character (0x00). > > If variable contains a non-initial hexadecimal zero character, the > characters before the zero character are considered the environment > variable name and all subsequent characters are ignored. > > If value contains a non-initial hexadecimal zero character, the > characters before the zero character are assigned to the environment > variable and all subsequent characters are ignored. > > If value is empty and the environment variable named by variable > exists, the environment variable is deleted. If variable does not > exist, no error occurs even though the operation cannot be performed. > > > So it looks like we could remove that NULL assignment happily and expect > the right thing to be done. I'm doing training all day today, but I can hopefully look at it this weekend if you haven't already. However, I do recall *adding* that part specifically for MSVC compatibility - I got a crash without it. Perhaps we need to #ifdef it on mingw, but I'd like to understand *why*, since it's just an API call... Are we *sure*, btw, that this is actually a mingw issue, and not something else in the environment? Could you try a MSVC compiled binary on the same machine? //Magnus
Magnus Hagander wrote: > > Are we *sure*, btw, that this is actually a mingw issue, and not > something else in the environment? Could you try a MSVC compiled binary > on the same machine? > My MSVC buildfarm animal runs on the same machine, and does not suffer the same problem. cheers andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> >> Are we *sure*, btw, that this is actually a mingw issue, and not >> something else in the environment? Could you try a MSVC compiled binary >> on the same machine? >> > > My MSVC buildfarm animal runs on the same machine, and does not suffer > the same problem. Meh. Stupid mingw :-) So how about we #ifdef out that NULL setting based on WIN32_ONLY_COMPILER, does that seem reasonable? //Magnus
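[Editor's sketch] The compile-time guard Magnus proposes could look roughly like the following. `value_for_os_call` is a hypothetical helper, not a function from the PostgreSQL tree; the only name taken from the thread is the `WIN32_ONLY_COMPILER` macro, which Andrew used earlier to exclude MinGW builds.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: choose the value pointer that would be handed to
 * SetEnvironmentVariable(). With WIN32_ONLY_COMPILER defined (non-MinGW
 * Windows builds, per the thread) an empty string is mapped to NULL;
 * otherwise the string is passed through unchanged.
 */
const char *value_for_os_call(const char *value)
{
    assert(value != NULL);

#ifdef WIN32_ONLY_COMPILER
    /* MSVC-style builds: empty string becomes NULL, meaning "delete". */
    if (value[0] == '\0')
        return NULL;
#endif

    /* MinGW builds, and any build without the macro: pass it through. */
    return value;
}
```

Compiled without the macro, an empty string comes back as the same non-NULL pointer, which matches the behaviour Andrew reported as working on his XP machine.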
Magnus Hagander wrote: > Andrew Dunstan wrote: > >> Magnus Hagander wrote: >> >>> Are we *sure*, btw, that this is actually a mingw issue, and not >>> something else in the environment? Could you try a MSVC compiled binary >>> on the same machine? >>> >>> >> My MSVC buildfarm animal runs on the same machine, and does not suffer >> the same problem. >> > > Meh. Stupid mingw :-) > > So how about we #ifdef out that NULL setting based on > WIN32_ONLY_COMPILER, does that seem reasonable? > > > The odd thing is that it doesn't seem to affect Vista, only XP. Anyway, yes, I think that would be OK. How do we then test to see if the original problem is still fixed? cheers andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> Andrew Dunstan wrote: >> >>> Magnus Hagander wrote: >>> >>>> Are we *sure*, btw, that this is actually a mingw issue, and not >>>> something else in the environment? Could you try a MSVC compiled >>>> binary >>>> on the same machine? >>>> >>> My MSVC buildfarm animal runs on the same machine, and does not suffer >>> the same problem. >>> >> >> Meh. Stupid mingw :-) >> >> So how about we #ifdef out that NULL setting based on >> WIN32_ONLY_COMPILER, does that seem reasonable? >> >> >> > > The odd thing is that it doesn't seem to affect Vista, only XP. > > Anyway, yes, I think that would be OK. How do we then test to see if > the original problem is still fixed? > > Further proof that this is a Windows version issue: I took the problem build from my XP and put it on my Vista box: the same build that causes a problem on XP runs perfectly on Vista. Go figure. Maybe we need a version check at runtime? That would be icky. cheers andrew
Andrew Dunstan wrote: > > Anyway, yes, I think that would be OK. How do we then test to see if > > the original problem is still fixed? > > > > > > Further proof that this is a Windows version issue: I took the problem > build from my XP and put it on my Vista box: the same build that causes > a problem on XP runs perfectly on Vista. Go figure. Maybe we need a > version check at runtime? That would be icky. At a minimum we need to document this behavior in a source code comment. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Andrew Dunstan wrote: > > > Andrew Dunstan wrote: >> >> >> Magnus Hagander wrote: >>> Andrew Dunstan wrote: >>> >>>> Magnus Hagander wrote: >>>> >>>>> Are we *sure*, btw, that this is actually a mingw issue, and not >>>>> something else in the environment? Could you try a MSVC compiled >>>>> binary >>>>> on the same machine? >>>>> >>>> My MSVC buildfarm animal runs on the same machine, and does not suffer >>>> the same problem. >>>> >>> Meh. Stupid mingw :-) >>> >>> So how about we #ifdef out that NULL setting based on >>> WIN32_ONLY_COMPILER, does that seem reasonable? >>> >> The odd thing is that it doesn't seem to affect Vista, only XP. >> >> Anyway, yes, I think that would be OK. How do we then test to see if >> the original problem is still fixed? >> > Further proof that this is a Windows version issue: I took the problem > build from my XP and put it on my Vista box: the same build that causes > a problem on XP runs perfectly on Vista. Go figure. Maybe we need a > version check at runtime? That would be icky. So in the end, does the crash come from the call SetEnvironmentVariable(.., NULL) on mingw-XP (or older)? I'm also interested in this issue and want to know the cause. However, is it necessary to call SetEnvironmentVariable() in the first place? My original patch doesn't contain a SetEnvironmentVariable call in pg_unsetenv() because _putenv() seems to call SetEnvironmentVariable internally. regards, Hiroshi Inoue
Hiroshi Inoue wrote: > > Eventually does the crash come from the call SetEnvironemntVariable > (.., NULL) on mingw-XP(or older?)? > I'm also interested in this issue and want to know the cause. > > The debugger shows that we actually fail on a popen() call in initdb. However, if we replace the calls to SetEnvironmentVariable("foo",NULL) with calls to SetEnvironmentVariable("foo","") then there is no failure. My theory is that on XP somehow the former is corrupting the environment such that when popen() tries to copy the environment for the new child process, it barfs. cheers andrew
Andrew Dunstan wrote: > > > Hiroshi Inoue wrote: >> >> Eventually does the crash come from the call SetEnvironemntVariable >> (.., NULL) on mingw-XP(or older?)? >> I'm also interested in this issue and want to know the cause. >> >> > > The debugger shows that we actually fail on a popen() call in intdb. > However, if we replace the calls to SetEnvironmentVariable("foo",NULL) > with calls to SetEnvironmentVariable("foo","") then there is no failure. > My theory is that on XP somehow the former is corrupting the environment > such that when popen() tries to copy the environment for the new child > process, it barfs. Well, XP only does it when it's built with mingw! Or is this actually dependent on if the binary is run under msys or cmd? //Magnus
Hiroshi Inoue wrote: > Andrew Dunstan wrote: >> >> >> Andrew Dunstan wrote: >>> >>> >>> Magnus Hagander wrote: >>>> Andrew Dunstan wrote: >>>> >>>>> Magnus Hagander wrote: >>>>> >>>>>> Are we *sure*, btw, that this is actually a mingw issue, and not >>>>>> something else in the environment? Could you try a MSVC compiled >>>>>> binary >>>>>> on the same machine? >>>>>> >>>>> My MSVC buildfarm animal runs on the same machine, and does not suffer >>>>> the same problem. >>>>> >>>> Meh. Stupid mingw :-) >>>> >>>> So how about we #ifdef out that NULL setting based on >>>> WIN32_ONLY_COMPILER, does that seem reasonable? >>>> >>> The odd thing is that it doesn't seem to affect Vista, only XP. >>> >>> Anyway, yes, I think that would be OK. How do we then test to see if >>> the original problem is still fixed? >>> >> Further proof that this is a Windows version issue: I took the problem >> build from my XP and put it on my Vista box: the same build that >> causes a problem on XP runs perfectly on Vista. Go figure. Maybe we >> need a version check at runtime? That would be icky. > > Eventually does the crash come from the call SetEnvironemntVariable > (.., NULL) on mingw-XP(or older?)? > I'm also interested in this issue and want to know the cause. > > However is it necessary to call SetEnvironmentVariable() in the first > place? My original patch doesn't contain SetEnvironmentVariable call > in pg_unsetenv() because _putenv() seems to call SetEnvironmentVariable > internally. It's because I factored in another place where we *did* call it explicitly. Perhaps this code was put in for compatibility with some old version of mingw or something? If everything works if we remove that call in both msvc and mingw, we can just do that, yes. It still doesn't really explain *why* it crashes though. //Magnus
Hi. > Well, XP only does it when it's built with mingw! > > Or is this actually dependent on if the binary is run under msys or cmd? Both show the problem. http://winpg.jp/~saito/pg_bug/20090124/ Then, if the SetEnvironmentVariable call that Andrew-san pointed to is removed, the problem goes away... very strange. Regards, Hiroshi Saito
Magnus Hagander wrote: > Andrew Dunstan wrote: > >> Hiroshi Inoue wrote: >> >>> Eventually does the crash come from the call SetEnvironemntVariable >>> (.., NULL) on mingw-XP(or older?)? >>> I'm also interested in this issue and want to know the cause. >>> >>> >>> >> The debugger shows that we actually fail on a popen() call in intdb. >> However, if we replace the calls to SetEnvironmentVariable("foo",NULL) >> with calls to SetEnvironmentVariable("foo","") then there is no failure. >> My theory is that on XP somehow the former is corrupting the environment >> such that when popen() tries to copy the environment for the new child >> process, it barfs. >> > > Well, XP only does it when it's built with mingw! > > Or is this actually dependent on if the binary is run under msys or cmd? > > > Even weirder. It has now started working. For no apparent reason. I am seriously confused. cheers andrew
Andrew Dunstan wrote: > > Even weirder. It has now started working. For no apparent reason. I am > seriously confused. > > I spoke too soon :-( http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=dawn_bat&dt=2009-01-31%2016:28:16 cheers andrew
Andrew Dunstan wrote: > > > Magnus Hagander wrote: >> Andrew Dunstan wrote: >> >>> Hiroshi Inoue wrote: >>> >>>> Eventually does the crash come from the call SetEnvironemntVariable >>>> (.., NULL) on mingw-XP(or older?)? >>>> I'm also interested in this issue and want to know the cause. >>>> >>>> >>>> >>> The debugger shows that we actually fail on a popen() call in intdb. >>> However, if we replace the calls to SetEnvironmentVariable("foo",NULL) >>> with calls to SetEnvironmentVariable("foo","") then there is no failure. >>> My theory is that on XP somehow the former is corrupting the environment >>> such that when popen() tries to copy the environment for the new child >>> process, it barfs. >>> >> >> Well, XP only does it when it's built with mingw! >> >> Or is this actually dependent on if the binary is run under msys or cmd? >> >> >> > > Even weirder. It has now started working. For no apparent reason. I am > seriously confused. This is just strange :S We could #ifdef out that thing on mingw, but I'm still worried that it will not work in all cases. I'd like to think there's a reason that thing was in there in the first place. Hmm. Actually, if I look at how things were before, I think we only called SetEnvironmentVariable() in case we set a variable, and never if we removed one. I'm not sure that's correct behavior, but it's apparently non-crashing behavior. Perhaps we need to restore that one? I'd be in favor of restoring it for both mingw and msvc in that case - that way we keep the platforms as close to each other as possible. Comments? //Magnus
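[Editor's sketch] The behaviour Magnus proposes restoring (call SetEnvironmentVariable() only when a variable is being set, never when one is being removed) can be sketched with an injected callback standing in for the OS call, so the logic runs on any platform. All names here (`set_var_fn`, `apply_env_assignment`, `recording_setter`) are hypothetical, and the sketch deliberately leaves out whatever the real code does with the C runtime's own putenv.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for SetEnvironmentVariable(); returns nonzero on success. */
typedef int (*set_var_fn) (const char *name, const char *value);

/* Test double: records calls instead of touching a real environment. */
int  recorded_calls = 0;
char recorded_name[128];

int recording_setter(const char *name, const char *value)
{
    (void) value;               /* only the name is recorded here */
    strncpy(recorded_name, name, sizeof(recorded_name) - 1);
    recorded_name[sizeof(recorded_name) - 1] = '\0';
    recorded_calls++;
    return 1;
}

/*
 * Apply "NAME=value". Returns 1 if the OS-level setter was invoked,
 * 0 if it was skipped because the value is empty (a removal), and
 * -1 on malformed input or setter failure.
 */
int apply_env_assignment(const char *envval, set_var_fn set_var)
{
    char        name[128];
    const char *eq;
    size_t      len;

    assert(envval != NULL && set_var != NULL);

    eq = strchr(envval, '=');
    if (eq == NULL || eq == envval)
        return -1;              /* no '=' at all, or an empty name */

    len = (size_t) (eq - envval);
    if (len >= sizeof(name))
        return -1;              /* name too long for this sketch */

    memcpy(name, envval, len);
    name[len] = '\0';

    if (eq[1] == '\0')
        return 0;               /* removal: skip the OS-level call */

    return set_var(name, eq + 1) ? 1 : -1;
}
```

With `recording_setter` as the callback, "PGCLIENTENCODING=UTF8" reaches the setter once, while "LC_ALL=" never does, so the NULL-value call that the thread ties to the XP crash is not made at all.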
Magnus Hagander wrote: > Hmm. Actually, if I look at how things were before, I think we only > called SetEnvironmentVariable() in case we set a variable, and never if > we removed one. I'm not sure that's correct behavior, but it's > apparently non-crashing behavior. Perhaps we need to restore that one? > > I'd be in favor of restoring it for both mingw and msvc in that case - > that way we keep the platforms as close to each other as possible. > > Comments? > > > works for me. cheers andrew
On Mon, Feb 02, 2009 at 07:37:46AM -0500, Andrew Dunstan wrote: > > > Magnus Hagander wrote: > >Hmm. Actually, if I look at how things were before, I think we only > >called SetEnvironmentVariable() in case we set a variable, and never if > >we removed one. I'm not sure that's correct behavior, but it's > >apparently non-crashing behavior. Perhaps we need to restore that one? > > > >I'd be in favor of restoring it for both mingw and msvc in that case - > >that way we keep the platforms as close to each other as possible. > > > >Comments? > > > > > > > > works for me. Patch applied for this. //Magnus