On 12/2/24 04:46, Noah Misch wrote:
> On Mon, Dec 02, 2024 at 02:20:35AM +0100, Tomas Vondra wrote:
>> turaco seems to be having some strange git issues - some of the
>> buildfarm runs fail like this:
>>
>>
>> turaco:REL_16_STABLE [22:41:11] OK
>> Sun Dec 1 22:41:27 2024: buildfarm run for turaco:REL_17_STABLE starting
>> turaco:REL_17_STABLE [22:41:27] checking out source ...
>> Missing checked out branch bf_REL_17_STABLE:
>> fatal: not a git repository (or any parent up to mount point /mnt)
>> Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
>> turaco:REL_17_STABLE [22:41:32] failed at stage pgsql-Git
>> Sun Dec 1 22:41:33 2024: buildfarm run for turaco:HEAD starting
>> turaco:HEAD [22:41:33] checking out source ...
>>
>>
>> I initially suspected this might be due to aging storage (SD card on
>> rpi), but I replaced that, and there's nothing strange in dmesg. Also,
>> other branches seem to be working fine ...
>>
>> Any ideas what could be causing this?
>
> I had this happen ~9 times on the host of my AIX buildfarm members. Example:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mandrill&dt=2024-07-10%2019%3A51%3A28
>
> I figured it was some system problem, so I didn't root-cause it. I carry the
> following workaround in my fork of the buildfarm client code. The unknown
> problem caused failure reports and work stoppage ~4 times before I installed
> this workaround, then logs show the workaround prevented damage 5 times. The
> last "removed intruder .git" log message appeared on 2024-07-23. There was no
> kernel reboot, and logs don't point to buildfarm client processes getting
> involuntary termination, either.
>
Thanks. I suspect some system issue too, but I didn't want to blame the
system without some kind of proof. I applied your patch; let's see if
it helps after a couple of runs.

regards
--
Tomas Vondra