Thread: [HACKERS] frogmouth failures

[HACKERS] frogmouth failures

From: Andrew Dunstan

I've been trying to track down the cause of recent failures at the "make
check" stage on frogmouth, a 32-bit Windows/Mingw instance running on XP.

I couldn't see any obvious reason for the failures, and a reboot didn't
cure the problem.

Then I tried running (offline mode) the serial schedule instead of the
parallel schedule, and it went through with no error. So then I tried
setting MAX_CONNECTIONS=10 and that also worked - see
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth&dt=2017-04-27%2018%3A10%3A08>

I've reverted that setting, but if errors start to occur again we'll
have some slight notion of where to look.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] frogmouth failures

From: Tom Lane

Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> I've been trying to track down the cause of recent failures at the "make
> check" stage on frogmouth, a 32-bit Windows/Mingw instance running on XP.

I've been wondering about that too.

> Then I tried running (offline mode) the serial schedule instead of the
> parallel schedule, and it went through with no error. So then I tried
> setting MAX_CONNECTIONS=10 and that also worked - see
> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth&dt=2017-04-27%2018%3A10%3A08>
> I've reverted that setting, but if errors start to occur again we'll
> have some slight notion of where to look.

Judging by the recent history,
https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=frogmouth&br=HEAD
it's not 100% reproducible.  (Either that, or we un-broke it and re-broke
it within the last week, which seems improbable.)  So unless you made
quite a few successful runs with the lower MAX_CONNECTIONS setting,
I'm dubious that there's really a connection.

Having said that, I won't be a bit surprised if it is some sort of
parallelism effect.  I just don't think one test proves much.

        regards, tom lane
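
The point about a single passing run can be made concrete with a little arithmetic. The sketch below is only illustrative: the failure rate `p` is a hypothetical placeholder (the thread does not state one), and it assumes runs are independent, which may not hold if the failures depend on machine state. It estimates how likely a streak of passes is by chance alone, and how many consecutive passes would be needed before the lower MAX_CONNECTIONS setting looks like a plausible cause.

```python
import math


def prob_all_pass(failure_rate: float, runs: int) -> float:
    """Chance that `runs` independent runs all pass when each run
    fails with probability `failure_rate`."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return (1.0 - failure_rate) ** runs


def runs_needed(failure_rate: float, alpha: float = 0.05) -> int:
    """Smallest number of consecutive passes whose probability of
    occurring by chance is at most `alpha`."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    return math.ceil(math.log(alpha) / math.log(1.0 - failure_rate))


# Hypothetical baseline: assume half of recent runs failed.
# This figure is an assumption for illustration, not taken from the buildfarm.
p = 0.5

print(prob_all_pass(p, 1))  # 0.5: one pass is as likely as a coin flip
print(runs_needed(p))       # 5: passes needed before chance drops to 5% or less
```

Under that assumed rate, one successful run says almost nothing, while a handful of consecutive passes starts to be meaningful. A lower real failure rate would require correspondingly more runs.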



Re: [HACKERS] frogmouth failures

From: Andrew Dunstan

On 04/27/2017 04:30 PM, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
>> I've been trying to track down the cause of recent failures at the "make
>> check" stage on frogmouth, a 32-bit Windows/Mingw instance running on XP.
> I've been wondering about that too.
>
>> Then I tried running (offline mode) the serial schedule instead of the
>> parallel schedule, and it went through with no error. So then I tried
>> setting MAX_CONNECTIONS=10 and that also worked - see
>> <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=frogmouth&dt=2017-04-27%2018%3A10%3A08>
>> I've reverted that setting, but if errors start to occur again we'll
>> have some slight notion of where to look.
> Judging by the recent history,
> https://buildfarm.postgresql.org/cgi-bin/show_history.pl?nm=frogmouth&br=HEAD
> it's not 100% reproducible.  (Either that, or we un-broke it and re-broke
> it within the last week, which seems improbable.)  So unless you made
> quite a few successful runs with the lower MAX_CONNECTIONS setting,
> I'm dubious that there's really a connection.
>
> Having said that, I won't be a bit surprised if it is some sort of
> parallelism effect.  I just don't think one test proves much.
>

I'll leave the MAX_CONNECTIONS=10 setting on for a week and then remove it; that should give us a larger sample.

cheers

andrew 






Re: [HACKERS] frogmouth failures

From: Andres Freund

On 2017-04-27 16:30:35 -0400, Tom Lane wrote:
> Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
> > I've been trying to track down the cause of recent failures at the "make
> > check" stage on frogmouth, a 32-bit Windows/Mingw instance running on XP.
> 
> I've been wondering about that too.

Same here.  Over the years there've been a number of bug reports with
the same error code, so it's not necessarily specific to master.  Could
just be a question of backend spawn rate or such.

- Andres