Thread: mingw configure failure workaround

mingw configure failure workaround

From
Andrew Dunstan
Date:
The really horrid hack below fixes for me a MINGW/MSys problem that 
otherwise occurs inconsistently (fails on different links, and pretends 
to have succeeded), but reliably (every run at least one link will not 
actually have happened).

There are 2 parts - first we loop a few times until we succeed, and 
second after the loop we test that we have actually succeeded, and 
complain loudly otherwise.

The second part seems well worth doing. Nobody has yet come up with a 
reasonable alternative to the first part (other than making the user do 
it by hand, which defeats the whole purpose of configure).

So, the questions are: what parts of this should we do?
1) failure test only or
2) loop plus failure test or
3) nothing

And if not 3), is there some autoconf wizard out there who can help do 
this properly? It would probably take me many hours to work out, as I 
have never touched the beast.

cheers

andrew


Index: configure
===================================================================
RCS file: /projects/cvsroot/pgsql-server/configure,v
retrieving revision 1.351
diff -c -w -r1.351 configure
*** configure   27 Apr 2004 20:09:27 -0000      1.351
--- configure   29 Apr 2004 20:17:06 -0000
***************
*** 19141,19151 ****
--- 19141,19160 ----
  esac
  # Make a symlink if possible; otherwise try a hard link.
+ for linktry in 1 2 3 4 5; do
  ln -s $ac_rel_source $ac_dest 2>/dev/null ||
    ln $srcdir/$ac_source $ac_dest ||
    { { echo "$as_me:$LINENO: error: cannot link $ac_dest to $srcdir/$ac_source" >&5
  echo "$as_me: error: cannot link $ac_dest to $srcdir/$ac_source" >&2;}
     { (exit 1); exit 1; }; }
+   test -e $ac_dest && break
+ done
+   test -e $ac_dest ||
+ { { echo "$as_me:$LINENO: error: failed to link $ac_dest to $srcdir/$ac_source" >&5
+ echo "$as_me: error: failed to link $ac_dest to $srcdir/$ac_source" >&2;}
+    { (exit 1); exit 1; }; }
  done
  _ACEOF
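
For anyone who wants to try the idea outside the generated configure script, here is a minimal standalone sketch of the two parts described above: retry a few times, then verify. The function name link_with_retry and its variables are invented for illustration and are not part of the patch or of autoconf. It assumes SRC is an absolute path, so the symlink resolves from any directory.

```shell
# Hypothetical helper: try to link SRC to DEST up to five times, then verify.
# Usage: link_with_retry SRC DEST   (SRC is assumed to be an absolute path)
link_with_retry() {
    lwr_src=$1
    lwr_dest=$2
    for lwr_try in 1 2 3 4 5; do
        # Symlink first, then hard link. The exit status is deliberately
        # ignored, since the report is that ln can claim success falsely.
        ln -s "$lwr_src" "$lwr_dest" 2>/dev/null ||
            ln "$lwr_src" "$lwr_dest" 2>/dev/null ||
            true
        # Trust only what is actually on disk.
        test -e "$lwr_dest" && break
    done
    if test -e "$lwr_dest"; then
        return 0
    fi
    echo "error: failed to link $lwr_dest to $lwr_src" >&2
    return 1
}
```

Even if the loop is dropped, the final test -e check on its own still turns a silent failure into a reported one.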




Re: mingw configure failure workaround

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> And if not 3), is there some autoconf wizard out there who can help do 
> this properly? It would probably take me many hours to work out, as I 
> have never touched the beast.

Obviously, or you would know that configure is a generated file that
there is no point in editing by hand.

The real issue in my mind is why is "ln" unreliable in mingw?  I cannot
see any point in a retry kluge when we do not know what's really going
on.
        regards, tom lane


Re: mingw configure failure workaround

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>And if not 3), is there some autoconf wizard out there who can help do 
>>this properly? It would probably take me many hours to work out, as I 
>>have never touched the beast.
>>    
>>
>
>Obviously, or you would know that configure is a generated file that
>there is no point in editing by hand.
>

er ... that's why I asked how to do it properly. I simply included the 
diff to show what I had been able to make work, not because I wanted it 
applied.

>
>The real issue in my mind is why is "ln" unreliable in mingw?  I cannot
>see any point in a retry kluge when we do not know what's really going
>on.
>
>    
>

I'm still trying to find out. But I don't see why this is different from 
the kludge we already have for unlink, and that one is right inside 
postgresql. In fact, it's more or less the same solution.

At the very least, until we can find a better solution we should have 
something like the checking part of what I did. We've seen quite a 
number of obscure failure reports that have all been traced back to this 
failure, which is currently quite unreported by configure.

cheers

andrew



Re: mingw configure failure workaround

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> The real issue in my mind is why is "ln" unreliable in mingw?  I cannot
>> see any point in a retry kluge when we do not know what's really going
>> on.

> I'm still trying to find out. But I don't see why this is different from 
> the kludge we already have for unlink, and that one is right inside 
> postgresql.

It's different because we know why we need that one: we understand the
cause of the behavior and we therefore can have some confidence that the
kluge will fix it (or not, as the case may be).  I have zero confidence
in looping five times around an "ln" call.
        regards, tom lane


Re: mingw configure failure workaround

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>Tom Lane wrote:
>>    
>>
>>>The real issue in my mind is why is "ln" unreliable in mingw?  I cannot
>>>see any point in a retry kluge when we do not know what's really going
>>>on.
>>>      
>>>
>
>  
>
>>I'm still trying to find out. But I don't see why this is different from 
>>the kludge we already have for unlink, and that one is right inside 
>>postgresql.
>>    
>>
>
>It's different because we know why we need that one: we understand the
>cause of the behavior and we therefore can have some confidence that the
>kluge will fix it (or not, as the case may be).  I have zero confidence
>in looping five times around an "ln" call.
>
>  
>

Even if we don't do that can we *please* put in something that detects 
the error, and tells the user what they will have to do to fix it? 
Failing in a situation which we know we can detect and not telling the 
user is intolerable, IMNSHO.

cheers

andrew



Re: mingw configure failure workaround

From
Bruce Momjian
Date:
Andrew Dunstan wrote:
> >It's different because we know why we need that one: we understand the
> >cause of the behavior and we therefore can have some confidence that the
> >kluge will fix it (or not, as the case may be).  I have zero confidence
> >in looping five times around an "ln" call.
> >
> >  
> >
> 
> Even if we don't do that can we *please* put in something that detects 
> the error, and tells the user what they will have to do to fix it? 
> Failing in a situation which we know we can detect and not telling the 
> user is intolerable, IMNSHO.

Agreed.  At a minimum we have to throw an error and tell them to run it
again.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
 


Re: mingw configure failure workaround

From
Peter Eisentraut
Date:
Andrew Dunstan wrote:
> Even if we don't do that can we *please* put in something that
> detects the error, and tells the user what they will have to do to
> fix it? Failing in a situation which we know we can detect and not
> telling the user is intolerable, IMNSHO.

Can you try a more recent version of autoconf and see if that behaves 
more tolerably?



Re: mingw configure failure workaround

From
Andrew Dunstan
Date:
Peter Eisentraut wrote:

>Andrew Dunstan wrote:
>  
>
>>Even if we don't do that can we *please* put in something that
>>detects the error, and tells the user what they will have to do to
>>fix it? Failing in a situation which we know we can detect and not
>>telling the user is intolerable, IMNSHO.
>>    
>>
>
>Can you try a more recent version of autoconf and see if that behaves 
>more tolerably?
>
>  
>

tested with autoconf 2.59.

Unfortunately, it does not. It does try to copy if a link fails, unlike 
what we have now:
  ln -s $ac_rel_source $ac_dest 2>/dev/null ||
    ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
    cp -p $srcdir/$ac_source $ac_dest ||
 

We don't have the last line, which must have been added since autoconf 2.53.

However, the problem is that the first line will actually appear to have 
succeeded, i.e. MSys's ln is lying to us ;-(

This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its 
status.m4, which I guess is what we'd need to override (is that 
possible?) if we are going to detect the failure, or maybe there's some 
more magical way that in my unfamiliarity with autoconf I am unaware of.
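
To illustrate the point about the exit status, here is a rough sketch of the same ln -s / ln / cp -p chain where each step is judged by whether the destination exists afterwards rather than by what ln returns. The function name link_or_copy_checked is hypothetical, and this is not what autoconf generates. It assumes SRC is an absolute path.

```shell
# Hypothetical helper: like the "ln -s || ln || cp -p" chain, but each step
# is judged by whether DEST exists afterwards, not by the exit status.
# Usage: link_or_copy_checked SRC DEST   (SRC is assumed to be absolute)
link_or_copy_checked() {
    loc_src=$1
    loc_dest=$2

    ln -s "$loc_src" "$loc_dest" 2>/dev/null || true
    test -e "$loc_dest" && return 0
    rm -f "$loc_dest"    # clear a dangling symlink, if one was left behind

    ln "$loc_src" "$loc_dest" 2>/dev/null || true
    test -e "$loc_dest" && return 0

    cp -p "$loc_src" "$loc_dest" 2>/dev/null || true
    test -e "$loc_dest" && return 0

    echo "error: cannot link or copy $loc_src to $loc_dest" >&2
    return 1
}
```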

cheers

andrew




Re: mingw configure failure workaround

From
Peter Eisentraut
Date:
Andrew Dunstan wrote:
> However, the problem is that the first line will actually appear to
> have succeeded, i.e. MSys's ln is lying to us ;-(

Then msys needs to be fixed.  There is certainly a bunch of 
autoconfiscated software that gets compiled on mingw/msys every day.  I 
would like to know why we are the only ones having this problem.  Has 
anyone contacted the msys authors about this?

> This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its
> status.m4, which I guess is what we'd need to override (is that
> possible?)

No

> if we are going to detect the failure, or maybe there's
> some more magical way that in my unfamiliarity with autoconf I am
> unaware of.

No



Re: mingw configure failure workaround

From
Andrew Dunstan
Date:
Peter Eisentraut wrote:

>Andrew Dunstan wrote:
>  
>
>>However, the problem is that the first line will actually appear to
>>have succeeded, i.e. MSys's ln is lying to us ;-(
>>    
>>
>
>Then msys needs to be fixed.  There is certainly a bunch of 
>autoconfiscated software that gets compiled on mingw/msys every day.  I 
>would like to know why we are the only ones having this problem.  Has 
>anyone contacted the msys authors about this?
>  
>

I don't know -  I recall hearing something, but I have found no trace. I 
will follow it up, but I do not think this absolves us of all 
responsibility. We work around all sorts of problems on all sorts of 
platforms.

>  
>
>>This comes from the autoconf macro _AC_OUTPUT_LINKS defined in its
>>status.m4, which I guess is what we'd need to override (is that
>>possible?)
>>    
>>
>
>No
>  
>

I will take your word for it, but see below.

>  
>
>>if we are going to detect the failure, or maybe there's
>>some more magical way that in my unfamiliarity with autoconf I am
>>unaware of.
>>    
>>
>
>No
>
>
>  
>
"No" is our answer too often.

A lot of reading and some experimentation showed that putting this in 
configure.in:

AC_OUTPUT_COMMANDS([
  for linktarget in src/backend/port/dynloader.c \
      src/backend/port/pg_sema.c src/backend/port/pg_shmem.c \
      src/include/dynloader.h src/include/pg_config_os.h src/Makefile.port ; do
    test -e $linktarget || echo " ***" link for $linktarget failed - please fix by hand
  done
])


yielded results looking like this:

config.status: executing default-1 commands
 *** link for src/backend/port/pg_shmem.c failed - please fix by hand
 *** link for src/include/dynloader.h failed - please fix by hand
 

Which is more or less what I wanted as a minimum.
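
The same check can be run by hand in a build tree, outside of configure. The sketch below is only an illustration of the idea: the function name report_missing_links is made up, and unlike the macro above it also returns a non-zero status so a calling script can stop.

```shell
# Hypothetical helper: warn about every expected link target that does not
# exist, and return non-zero if at least one is missing.
# Usage: report_missing_links FILE...
report_missing_links() {
    rml_missing=0
    for rml_target in "$@"; do
        if ! test -e "$rml_target"; then
            echo " *** link for $rml_target failed - please fix by hand" >&2
            rml_missing=1
        fi
    done
    return "$rml_missing"
}
```

It would be called from the top of the build tree with the same file list as in the macro, for example report_missing_links src/include/dynloader.h src/Makefile.port.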


cheers

andrew




Re: mingw configure failure workaround

From
Andrew Dunstan
Date:
Andrew Dunstan wrote:

> Peter Eisentraut wrote:
>
>> Andrew Dunstan wrote:
>>  
>>
>>> Even if we don't do that can we *please* put in something that
>>> detects the error, and tells the user what they will have to do to
>>> fix it? Failing in a situation which we know we can detect and not
>>> telling the user is intolerable, IMNSHO.
>>>   
>>
>>
>> Can you try a more recent version of autoconf and see if that behaves 
>> more tolerably?
>>
>>  
>>
>
> tested with autoconf 2.59.
>
> Unfortunately, it does not. It does try to copy if a link fails, 
> unlike what we have now:
>
>  ln -s $ac_rel_source $ac_dest 2>/dev/null ||
>    ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
>    cp -p $srcdir/$ac_source $ac_dest ||
>
> We don't have the last line, which must have been added since autoconf 
> 2.53.


I was ahead of myself. It does appear to work (tested on the platform I 
was using to get reliable failure, with autoconf 2.56 from the MSysDTK).

I'm damned if I know why, though.

I still think we should consider the little error detection macro I just 
posted.

cheers

andrew





Re: mingw configure failure workaround

From
"Magnus Hagander"
Date:
> tested with autoconf 2.59.
>
> Unfortunately, it does not. It does try to copy if a link
> fails, unlike what we have now:
>
>   ln -s $ac_rel_source $ac_dest 2>/dev/null ||
>     ln $srcdir/$ac_source $ac_dest 2>/dev/null ||
>     cp -p $srcdir/$ac_source $ac_dest ||
>
> We don't have the last line, which must have been added since
> autoconf 2.53.
>
> However, the problem is that the first line will actually
> appear to have succeeded, i.e. MSys's ln is lying to us ;-(

Ok, how's this for a really ugly solution:

* Provide our own ln (in the form of a shellscript, even)
* Make sure this one gets in ahead of the system supplied one in the
path (from the code above it looks like it's not calling it with a
specific path, so just force-adding something to the path of configure
should work?)

This ln can then do a cp directly, and not even bother trying the mingw
ln function which we know will only do cp anyway if it succeeds.

If there is a less ugly solution to be had, by all means stay away from
this one :-)

//Magnus
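
A rough sketch of what such a replacement could look like follows. The function name ln_as_cp is invented here, and only the two-operand form "ln [options] SOURCE DEST" used by configure is assumed. One detail the copy has to handle: with -s, a relative SOURCE is meant relative to the directory of DEST, so it must be resolved that way before copying. Only POSIX-style absolute paths (leading /) are recognised as absolute.

```shell
# Hypothetical stand-in for ln that always copies.
# Usage: ln_as_cp [-s] [other options] SOURCE DEST
ln_as_cp() {
    lac_symbolic=no
    while test $# -gt 0; do
        case $1 in
            --) shift; break ;;
            -s) lac_symbolic=yes; shift ;;
            -*) shift ;;              # ignore any other option
            *)  break ;;
        esac
    done
    if test $# -ne 2; then
        echo "ln (copy stand-in): expected SOURCE and DEST" >&2
        return 2
    fi
    lac_src=$1
    lac_dest=$2
    # A relative symlink target is interpreted relative to the directory
    # that holds the link, so resolve it the same way before copying.
    if test "$lac_symbolic" = yes; then
        case $lac_src in
            /*) ;;
            *)  lac_src=$(dirname "$lac_dest")/$lac_src ;;
        esac
    fi
    cp -p "$lac_src" "$lac_dest"
}
```

To use it the way described above, the function body would go into an executable script named ln that ends with ln_as_cp "$@", in a directory placed first in PATH before running configure.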