Thread: Version 4.10 of buildfarm client released.

Version 4.10 of buildfarm client released.

From
Andrew Dunstan
Date:
Version 4.10 of the buildfarm client has been released.

Following GitHub's abandonment of their download feature, releases will 
now be published on the buildfarm server. The latest release will always 
be available at <http://www.pgbuildfarm.org/downloads/latest-client.tgz>. 
This particular release is available at 
<http://www.pgbuildfarm.org/downloads/releases/build-farm-4_10.tgz>.

The main feature of this release is that it does better logging of 
pg_upgrade failures (which is why I hope Heikki applies it to chipmunk 
right away ;-) )

The rest is minor bug fixes and very small enhancements.

cheers

andrew



Re: [Pgbuildfarm-members] Version 4.10 of buildfarm client released.

From
Heikki Linnakangas
Date:
On 11.01.2013 18:38, Andrew Dunstan wrote:
> The main feature of this release is that it does better logging of
> pg_upgrade failures (which is why I hope Heikki applies it to chipmunk
> right away ;-) )

Heh, ok :-)

I've upgraded it, and launched a new buildfarm run, so we'll know more in 
a moment. This box has a very small disk (a 4GB SD card), so it's quite 
possible it simply ran out of disk space.

There was a stray postgres instance running on the box, which I killed:

pgbfarm@raspberrypi ~ $ ps ax | grep pg_upg
 5993 pts/0    S+     0:00 grep --color=auto pg_upg
20200 ?        S      0:00 /home/pgbfarm/buildroot/HEAD/pgsql.8210/contrib/pg_upgrade/tmp_check/install/home/pgbfarm/buildroot/HEAD/inst/bin/postgres -F -c listen_addresses=

The directory /home/pgbfarm/buildroot/HEAD/pgsql.8210 did not exist 
anymore when I looked. Apparently the server was running within an 
already-deleted directory.

- Heikki



Re: [Pgbuildfarm-members] Version 4.10 of buildfarm client released.

From
Tom Lane
Date:
Heikki Linnakangas <hlinnaka@iki.fi> writes:
> There was a stray postgres instance running on the box, which I killed:

FWIW, we've seen an awful lot of persistent buildfarm failures that
seemed to be due to port conflicts with leftover postmasters.  I think
the buildfarm script needs to try harder to ensure that it's killed
everything after a run.  No good ideas how to go about that exactly.
You could look through "ps" output for postmasters, but what if there's
a regular Postgres installation on the same box?  Can we just document
that the buildfarm had better not be run as "postgres"?  (If so, its
attempt to kill an unowned postmaster would fail anyway; else we need
a reliable way to tell which ones to kill.)
        regards, tom lane



Re: [Pgbuildfarm-members] Version 4.10 of buildfarm client released.

From
Andrew Dunstan
Date:
On 01/11/2013 01:39 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> There was a stray postgres instance running on the box, which I killed:
> FWIW, we've seen an awful lot of persistent buildfarm failures that
> seemed to be due to port conflicts with leftover postmasters.  I think
> the buildfarm script needs to try harder to ensure that it's killed
> everything after a run.  No good ideas how to go about that exactly.
> You could look through "ps" output for postmasters, but what if there's
> a regular Postgres installation on the same box?  Can we just document
> that the buildfarm had better not be run as "postgres"?  (If so, its
> attempt to kill an unowned postmaster would fail anyway; else we need
> a reliable way to tell which ones to kill.)


The buildfarm never builds with the standard port unless someone is 
quite perverse indeed. The logic that governs it is:
   $buildport = $PGBuild::conf{base_port};
   if ($branch =~ /REL(\d+)_(\d+)/)
   {
        $buildport += (10 * ($1 - 7)) + $2;
   }
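
To make the arithmetic concrete, here is a small Python transcription of the Perl logic quoted above. It is only an illustration: the function name build_port and the base port 5678 are assumptions chosen for the example, not values taken from the client or from any particular animal's configuration.

```python
import re


def build_port(base_port: int, branch: str) -> int:
    """Mirror the quoted Perl: offset the base port by the branch version.

    `build_port` is a hypothetical name used only for this sketch.
    """
    port = base_port
    # Perl's =~ is an unanchored match, so re.search is the equivalent.
    m = re.search(r"REL(\d+)_(\d+)", branch)
    if m:
        major, minor = int(m.group(1)), int(m.group(2))
        port += 10 * (major - 7) + minor
    return port


# 5678 is an assumed base_port, used purely for illustration.
for branch in ("HEAD", "REL8_4_STABLE", "REL9_2_STABLE"):
    print(branch, build_port(5678, branch))
```

A branch name that does not match the pattern, such as HEAD, keeps the base port unchanged, while each release branch gets its own offset, so branches built on the same machine do not share a port.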
 

Certainly the script should not be run as the standard postgres user.

Part of the trouble with detecting rogue postmasters it might have left 
lying around is that various things like to decide what port to run on, 
so it's not always easy for the buildfarm to know what it should be 
looking for.

For branches >= 9.2 this is somewhat ameliorated by the existence of 
EXTRA_REGRESS_OPTS, although we might need a slight adjustment to 
pg_upgrade's test.sh to stop it from trampling on that willy-nilly.

I'm certainly reluctant to be trying to kill anything we aren't dead 
certain is ours. We could possibly detect very early that there is a 
suspected rogue postmaster.
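
One way such an early check might look is sketched below in Python. It relies on the fact that a running PostgreSQL server creates a Unix socket file named .s.PGSQL.<port> and a matching .lock file in its socket directory. The function name, the default directory /tmp, and the port in the usage line are assumptions for illustration; this is not something the buildfarm client does today.

```python
import os


def suspected_rogue_postmaster(port: int, socket_dir: str = "/tmp") -> bool:
    """Return True if a socket or socket lock file exists for this port.

    Hypothetical helper: a positive result only raises suspicion, since a
    stale file can be left behind by a crashed server.
    """
    sock = os.path.join(socket_dir, f".s.PGSQL.{port}")
    return os.path.exists(sock) or os.path.exists(sock + ".lock")


if __name__ == "__main__":
    # 5700 is an assumed build port, used only as an example.
    if suspected_rogue_postmaster(5700):
        print("something already appears to own port 5700; not starting")
```

The sketch only reports a suspicion and kills nothing, which fits the reluctance to touch any process that is not known for certain to belong to the buildfarm.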

One major source of these rogue processes has almost certainly been this 
piece of logic in pg_ctl:
   * The postmaster should create postmaster.pid very soon after being
   * started.  If it's not there after we've waited 5 or more seconds,
   * assume startup failed and give up waiting.
 

When that happens, pg_ctl fails, and thus so does the buildfarm client, 
but if it has in fact started a postmaster that was just very slow in 
writing its pid file, it has left a postmaster lying around.
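
The shape of the problem can be shown with a simplified polling loop in Python. This is not pg_ctl's code; the function name, the polling interval, and the exact loop structure are assumptions, and only the five-second limit comes from the comment quoted above.

```python
import os
import time


def wait_for_pid_file(pid_path: str, timeout: float = 5.0,
                      poll: float = 0.1) -> bool:
    """Poll for the pid file and give up once `timeout` seconds have passed.

    Hypothetical sketch: returning False means "assume startup failed",
    but nothing here stops a server process that was already launched.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(pid_path):
            return True
        time.sleep(poll)
    # One last look, in case the file appeared during the final sleep.
    return os.path.exists(pid_path)
```

On a slow machine the server may write its pid file only after the loop has given up. The caller then reports a failed start while the server keeps running, which is exactly how an orphan ends up holding the port.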

ISTR we discussed this phenomenon relatively recently, but I can't find 
a reference to it readily. In any case, nothing has changed on that front.

cheers

andrew