Thread: mingw configure failure workaround
The really horrid hack below fixes for me a MINGW/MSys problem that
otherwise occurs inconsistently (fails on different links, and pretends
to have succeeded), but reliably (every run at least one link will not
actually have happened).

There are 2 parts - first we loop a few times until we succeed, and
second after the loop we test that we have actually succeeded, and
complain loudly otherwise.

The second part seems well worth doing. Nobody has yet come up with a
reasonable alternative to the first part (other than making the user do
it by hand, which defeats the whole purpose of configure).

So, the questions are: what parts of this should we do?
1) failure test only or
2) loop plus failure test or
3) nothing

And if not 3), is there some autoconf wizard out there who can help do
this properly? It would probably take me many hours to work out, as I
have never touched the beast.

cheers

andrew

Index: configure
===================================================================
RCS file: /projects/cvsroot/pgsql-server/configure,v
retrieving revision 1.351
diff -c -w -r1.351 configure
*** configure 27 Apr 2004 20:09:27 -0000 1.351
--- configure 29 Apr 2004 20:17:06 -0000
***************
*** 19141,19151 ****
--- 19141,19160 ----
  esac

  # Make a symlink if possible; otherwise try a hard link.
+ for linktry in 1 2 3 4 5; do
  ln -s $ac_rel_source $ac_dest 2>/dev/null ||
  ln $srcdir/$ac_source $ac_dest ||
  { { echo "$as_me:$LINENO: error: cannot link $ac_dest to $srcdir/$ac_source" >&5
  echo "$as_me: error: cannot link $ac_dest to $srcdir/$ac_source" >&2;}
  { (exit 1); exit 1; }; }
+ test -e $ac_dest && break
+ done
+ test -e $ac_dest ||
+ { { echo "$as_me:$LINENO: error: failed to link $ac_dest to $srcdir/$ac_source"
+ >&5
+ echo "$as_me: error: failed to link $ac_dest to $srcdir/$ac_source" >&2;}
+ { (exit 1); exit 1; }; }
  done
  _ACEOF
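For readers who want to try the two-part idea outside of configure, here
is a minimal sketch as a standalone shell function. It is not the patch
above and not anything autoconf generates: the name link_with_retry, its
argument order, and the default of 5 attempts are assumptions made for
illustration. It deliberately ignores ln's exit status, since the report
is that ln may claim success without creating anything, and relies only
on "test -e" of the destination.

```shell
#!/bin/sh
# link_with_retry SRC DEST [TRIES]
# Hypothetical helper (not part of configure or autoconf).
# Part one: retry the link a few times.
# Part two: verify the destination exists and complain loudly if not.
link_with_retry() {
    src=$1
    dest=$2
    tries=${3:-5}          # assumed default, mirroring "1 2 3 4 5"
    i=0
    while [ "$i" -lt "$tries" ]; do
        i=$((i + 1))
        # Symlink first, then hard link. The exit status is ignored on
        # purpose; only the existence check below is trusted.
        ln -s "$src" "$dest" 2>/dev/null ||
            ln "$src" "$dest" 2>/dev/null ||
            :
        # Stop as soon as the destination really exists.
        [ -e "$dest" ] && return 0
    done
    # Nothing usable was created after all attempts.
    echo "error: failed to link $dest to $src" >&2
    return 1
}
```

A caller would use it as "link_with_retry /abs/path/source dest || exit 1".
Absolute source paths are the safer choice here, because a relative
symlink target is resolved from the directory that holds the link.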
Andrew Dunstan <andrew@dunslane.net> writes:
> And if not 3), is there some autoconf wizard out there who can help do
> this properly? It would probably take me many hours to work out, as I
> have never touched the beast.

Obviously, or you would know that configure is a generated file that
there is no point in editing by hand.

The real issue in my mind is why is "ln" unreliable in mingw? I cannot
see any point in a retry kluge when we do not know what's really going
on.

			regards, tom lane
Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>And if not 3), is there some autoconf wizard out there who can help do
>>this properly? It would probably take me many hours to work out, as I
>>have never touched the beast.
>
>Obviously, or you would know that configure is a generated file that
>there is no point in editing by hand.

er ... that's why I asked how to do it properly. I simply included the
diff to show what I had been able to make work, not because I wanted it
applied.

>The real issue in my mind is why is "ln" unreliable in mingw? I cannot
>see any point in a retry kluge when we do not know what's really going
>on.

I'm still trying to find out. But I don't see why this is different from
the kludge we already have for unlink, and that one is right inside
postgresql. In fact, it's more or less the same solution.

At the very least, until we can find a better solution we should have
something like the checking part of what I did. We've seen quite a
number of obscure failure reports that have all been traced back to this
failure, which is currently quite unreported by configure.

cheers

andrew
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> The real issue in my mind is why is "ln" unreliable in mingw? I cannot
>> see any point in a retry kluge when we do not know what's really going
>> on.

> I'm still trying to find out. But I don't see why this is different from
> the kludge we already have for unlink, and that one is right inside
> postgresql.

It's different because we know why we need that one: we understand the
cause of the behavior and we therefore can have some confidence that the
kluge will fix it (or not, as the case may be). I have zero confidence
in looping five times around an "ln" call.

			regards, tom lane
Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>
>>Tom Lane wrote:
>>
>>>The real issue in my mind is why is "ln" unreliable in mingw? I cannot
>>>see any point in a retry kluge when we do not know what's really going
>>>on.
>
>>I'm still trying to find out. But I don't see why this is different from
>>the kludge we already have for unlink, and that one is right inside
>>postgresql.
>
>It's different because we know why we need that one: we understand the
>cause of the behavior and we therefore can have some confidence that the
>kluge will fix it (or not, as the case may be). I have zero confidence
>in looping five times around an "ln" call.

Even if we don't do that can we *please* put in something that detects
the error, and tells the user what they will have to do to fix it?
Failing in a situation which we know we can detect and not telling the
user is intolerable, IMNSHO.

cheers

andrew
Andrew Dunstan wrote:
> >It's different because we know why we need that one: we understand the
> >cause of the behavior and we therefore can have some confidence that the
> >kluge will fix it (or not, as the case may be). I have zero confidence
> >in looping five times around an "ln" call.
>
> Even if we don't do that can we *please* put in something that detects
> the error, and tells the user what they will have to do to fix it?
> Failing in a situation which we know we can detect and not telling the
> user is intolerable, IMNSHO.

Agreed. At a minimum we have to throw an error and tell them to run it
again.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Andrew Dunstan wrote:
> Even if we don't do that can we *please* put in something that
> detects the error, and tells the user what they will have to do to
> fix it? Failing in a situation which we know we can detect and not
> telling the user is intolerable, IMNSHO.

Can you try a more recent version of autoconf and see if that behaves
more tolerably?
Peter Eisentraut wrote:

>Andrew Dunstan wrote:
>
>>Even if we don't do that can we *please* put in something that
>>detects the error, and tells the user what they will have to do to
>>fix it? Failing in a situation which we know we can detect and not
>>telling the user is intolerable, IMNSHO.
>
>Can you try a more recent version of autoconf and see if that behaves
>more tolerably?

tested with autoconf 2.59.

Unfortunately, it does not. It does try to copy if a link fails, unlike
what we have now:

  ln -s $ac_rel_source $ac_dest 2>/dev/null ||
    ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
    cp -p $srcdir/$ac_source $ac_dest ||

We don't have the last line, which must have been added since autoconf
2.53.

However, the problem is that the first line will actually appear to have
succeeded, i.e. MSys's ln is lying to us ;-(

This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its
status.m4, which I guess is what we'd need to override (is that
possible?) if we are going to detect the failure, or maybe there's some
more magical way that in my unfamiliarity with autoconf I am unaware of.

cheers

andrew
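To make the "ln is lying" point concrete, the following is a rough
sketch of the same symlink / hard link / copy chain with an existence
check bolted on afterwards. It is not the code autoconf emits; the
function name link_or_copy and its two-argument form are assumptions for
illustration only. The idea is that if a link step reports success but
nothing usable exists at the destination, the check notices and falls
back to a plain copy before giving up.

```shell
#!/bin/sh
# link_or_copy SRC DEST
# Hypothetical helper (not generated by autoconf).
link_or_copy() {
    src=$1
    dest=$2
    # Same order as the chain quoted above: symlink, hard link, copy.
    # The final ":" discards the exit status, which is not trusted.
    ln -s "$src" "$dest" 2>/dev/null ||
        ln "$src" "$dest" 2>/dev/null ||
        cp -p "$src" "$dest" 2>/dev/null ||
        :
    # A link step may have "succeeded" without producing a usable file.
    if [ ! -e "$dest" ]; then
        rm -f "$dest"                        # clear a dangling symlink, if any
        cp -p "$src" "$dest" 2>/dev/null || :
    fi
    # Final verdict comes from the filesystem, not from ln.
    [ -e "$dest" ] || {
        echo "error: cannot link or copy $src to $dest" >&2
        return 1
    }
}
```

The trade-off is that a silent fallback to cp hides the underlying
problem, so a real implementation might also want to log which step
actually produced the file.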
Andrew Dunstan wrote:
> However, the problem is that the first line will actually appear to
> have succeeded, i.e. MSys's ln is lying to us ;-(

Then msys needs to be fixed. There is certainly a bunch of
autoconfiscated software that gets compiled on mingw/msys every day. I
would like to know why we are the only ones having this problem. Has
anyone contacted the msys authors about this?

> This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its
> status.m4, which I guess is what we'd need to override (is that
> possible?)

No

> if we are going to detect the failure, or maybe there's
> some more magical way that in my unfamiliarity with autoconf I am
> unaware of.

No
Peter Eisentraut wrote:

>Andrew Dunstan wrote:
>
>>However, the problem is that the first line will actually appear to
>>have succeeded, i.e. MSys's ln is lying to us ;-(
>
>Then msys needs to be fixed. There is certainly a bunch of
>autoconfiscated software that gets compiled on mingw/msys every day. I
>would like to know why we are the only ones having this problem. Has
>anyone contacted the msys authors about this?

I don't know - I recall hearing something, but I have found no trace. I
will follow it up, but I do not think this absolves us of all
responsibility. We work around all sorts of problems on all sorts of
platforms.

>>This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its
>>status.m4, which I guess is what we'd need to override (is that
>>possible?)
>
>No

I will take your word for it, but see below.

>>if we are going to detect the failure, or maybe there's
>>some more magical way that in my unfamiliarity with autoconf I am
>>unaware of.
>
>No

"No" is our answer too often.

A lot of reading and some experimentation showed that putting this in
configure.in:

  AC_OUTPUT_COMMANDS([
  for linktarget in src/backend/port/dynloader.c \
      src/backend/port/pg_sema.c src/backend/port/pg_shmem.c \
      src/include/dynloader.h src/include/pg_config_os.h \
      src/Makefile.port ; do
    test -e $linktarget || echo " ***" link for $linktarget failed - please fix by hand
  done
  ])

yielded results looking like this:

  config.status: executing default-1 commands
   *** link for src/backend/port/pg_shmem.c failed - please fix by hand
   *** link for src/include/dynloader.h failed - please fix by hand

Which is more or less what I wanted as a minimum.

cheers

andrew
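The check itself is plain shell, so it can be tried on its own without
regenerating configure. Below is a small sketch of the same loop wrapped
in a function; the name check_link_targets is made up for this example,
and unlike the macro body above it also returns a non-zero status so a
caller can stop instead of only printing a warning. That extra behavior
is an assumption about what one might want, not something proposed in
the thread.

```shell
#!/bin/sh
# check_link_targets FILE...
# Hypothetical helper: report every expected link target that does not
# exist, and return failure if any were missing.
check_link_targets() {
    missing=0
    for linktarget in "$@"; do
        if [ ! -e "$linktarget" ]; then
            echo "*** link for $linktarget failed - please fix by hand" >&2
            missing=$((missing + 1))
        fi
    done
    # Succeeds only when every target was found.
    [ "$missing" -eq 0 ]
}
```

From the top of a configured build tree it could be run as, for example,
"check_link_targets src/include/dynloader.h src/Makefile.port || exit 1".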
Andrew Dunstan wrote:

> Peter Eisentraut wrote:
>
>> Andrew Dunstan wrote:
>>
>>> Even if we don't do that can we *please* put in something that
>>> detects the error, and tells the user what they will have to do to
>>> fix it? Failing in a situation which we know we can detect and not
>>> telling the user is intolerable, IMNSHO.
>>
>> Can you try a more recent version of autoconf and see if that behaves
>> more tolerably?
>
> tested with autoconf 2.59.
>
> Unfortunately, it does not. It does try to copy if a link fails,
> unlike what we have now:
>
>   ln -s $ac_rel_source $ac_dest 2>/dev/null ||
>     ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
>     cp -p $srcdir/$ac_source $ac_dest ||
>
> We don't have the last line, which must have been added since autoconf
> 2.53.

I was ahead of myself. It does appear to work (tested on the platform I
was using to get reliable failure, with autoconf 2.56 from the MSysDTK).
I'm damned if I know why, though.

I still think we should consider the little error detection macro I just
posted.

cheers

andrew
> tested with autoconf 2.59.
>
> Unfortunately, it does not. It does try to copy if a link
> fails, unlike what we have now:
>
>   ln -s $ac_rel_source $ac_dest 2>/dev/null ||
>     ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
>     cp -p $srcdir/$ac_source $ac_dest ||
>
> We don't have the last line, which must have been added since
> autoconf 2.53.
>
> However, the problem is that the first line will actually
> appear to have succeeded, i.e. MSys's ln is lying to us ;-(

Ok, how's this for a really ugly solution:

* Provide our own ln (in the form of a shellscript, even)
* Make sure this one gets in ahead of the system supplied one in the
  path (from the code above it looks like it's not calling it with a
  specific path, so just force-adding something to the path of
  configure should work?)

This ln can then do a cp directly, and not even bother trying the mingw
ln function which we know will only do cp anyway if it succeeds.

If there is a less ugly solution to be had, by all means stay away from
this one :-)

//Magnus
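As a rough illustration of what such a replacement might look like, here
is a sketch written as a shell function. The name ln_via_cp and the set
of flags it accepts are assumptions; nothing like this was posted in the
thread. To use it as proposed, the body would go in an executable script
named ln, in a directory placed ahead of the system one in PATH, ending
with a call such as: ln_via_cp "$@".

```shell
#!/bin/sh
# ln_via_cp [-s] [-f] SRC DEST
# Hypothetical stand-in for ln that always copies instead of linking.
ln_via_cp() {
    # Accept and discard the common link flags; a copy has no link type.
    while [ $# -gt 0 ]; do
        case $1 in
            -s|-f|-sf|-fs) shift ;;
            --) shift; break ;;
            *) break ;;
        esac
    done
    if [ $# -ne 2 ]; then
        echo "usage: ln [-s] [-f] SRC DEST" >&2
        return 2
    fi
    # Preserve mode and timestamps, as "cp -p" does in the autoconf chain.
    cp -p "$1" "$2"
}
```

One caveat with this approach: a relative symlink source is normally
resolved from the directory holding the link, while cp resolves it from
the current directory. So a call in the style of "ln -s $ac_rel_source
$ac_dest" may well fail here and return non-zero, in which case the
quoted chain would simply move on to its next alternative.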