Thread: Release 13 of the PostgreSQL BuildFarm client

From: Andrew Dunstan

I have pushed Release 13 of the PostgreSQL BuildFarm client.


Change highlights:

  * add tests for a few cases that were previously missing
  * more fine-grained control over which TAP test sets run
  * --no-keepall can be specified on the command line
  * repair a repo if it crashed during a copy operation
  * make sure the repo is really clean (including free of ignored files)
  * generate stack traces on CYGWIN
  * collect stack traces for TAP tests
  * extract MSVC settings at runtime rather than hard coding them in the
    config file (see below)
  * cross version upgrade tests now run on Windows, both for msys2 and
    MSVC builds
  * add support for inhibit-runs and force-one-run trigger files (see below)
  * add experimental module for running arbitrary out of tree TAP tests
  * adjust if an upstream repo changes the default branch name (see below)
  * add use_discard_caches setting (see below)


MSVC animals are now very much simpler to set up, and to upgrade to a
new compiler. Using the new mechanism, as shown in the sample config
file, all that's needed is to specify a location where the standard
script vcvarsall.bat can be found. The client will then run that script,
extract the settings, and apply them. That means that upgrading to a
new version of Visual Studio entails just a one-line change in the
config file.
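
For readers who want to see the general technique, here is a minimal sketch
of capturing the environment that a batch file sets up. It is written in
Python purely for illustration (the client itself is Perl), and the function
names parse_set_output and msvc_environment are hypothetical, not part of
the buildfarm client. It assumes the usual trick of running the batch file
in a child cmd.exe and then dumping the environment with `set`; the
Windows-only half has not been exercised here.

```python
import subprocess


def parse_set_output(text):
    """Turn the NAME=value lines printed by cmd's `set` into a dict."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        # Skip lines without '=' and cmd's hidden "=C:=..." style entries.
        if sep and name:
            env[name] = value
    return env


def msvc_environment(vcvarsall, arch="x64"):
    """Run vcvarsall.bat in a child cmd.exe and return the environment it sets.

    Windows only. `arch` is an assumed argument value for illustration.
    """
    # `call` runs the batch file; `set` then prints the resulting environment.
    command = f'call "{vcvarsall}" {arch} >NUL && set'
    result = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return parse_set_output(result.stdout)
```

A caller would then copy entries such as INCLUDE, LIB and PATH from the
returned dict into its own environment before invoking the compiler.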

If you put a file called <animalname>.inhibit-runs in the build root,
all runs will be stopped until the file is removed. If you put a file
called <animalname>.force-one-run in the build root, each configured
branch will be forced to run once, and the file will be removed. These
trigger files only apply if you use the run_branches.pl script.
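
The trigger-file behaviour described above can be sketched roughly as
follows. This is an illustration in Python, not the client's Perl code; the
function name check_triggers is hypothetical, and two details are
assumptions of the sketch rather than facts from this announcement: that
inhibit-runs takes precedence when both files exist, and that the
force-one-run file is removed as soon as it is noticed.

```python
from pathlib import Path


def check_triggers(build_root, animal):
    """Return 'inhibit', 'force', or 'normal' for this invocation."""
    root = Path(build_root)
    inhibit = root / f"{animal}.inhibit-runs"
    force = root / f"{animal}.force-one-run"
    if inhibit.exists():
        # Nothing runs until the owner removes the file.
        return "inhibit"
    if force.exists():
        # One-shot trigger: consume it so the next invocation is normal.
        force.unlink()
        return "force"
    return "normal"
```

A wrapper script would skip all branches on "inhibit", run every configured
branch regardless of changes on "force", and fall back to its usual
change detection on "normal".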

The client should transparently deal with any change that is made in the
upstream repository's default branch name. This avoids the need for a
flag day when we eventually change the default branch name for
postgresql, as I assume we will do before long. The branch bf_HEAD which
the client creates now refers to the upstream default whatever it might
be, rather than the hardcoded name 'master'. The code of the SCM module
underwent quite a lot of change in order to make this happen; the
checkout code had become quite long and convoluted and I had to refactor
it somewhat before I was able to make and test this change. The changes
have been fairly extensively tested, but I'm still slightly nervous
about them. Owners are asked to report any issues promptly.
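
One way to discover an upstream's default branch without hardcoding
'master' is to ask the remote which branch its HEAD points at. The sketch
below does that with `git ls-remote --symref`; it is a Python illustration
only, the function names are hypothetical, and it is not a description of
how the client's SCM module actually does it.

```python
import subprocess


def parse_symref(ls_remote_output):
    """Extract the branch name from `git ls-remote --symref <remote> HEAD`."""
    prefix = "refs/heads/"
    for line in ls_remote_output.splitlines():
        # The symref line looks like: "ref: refs/heads/<branch>\tHEAD"
        if line.startswith("ref: "):
            ref = line[len("ref: "):].split("\t", 1)[0]
            if ref.startswith(prefix):
                return ref[len(prefix):]
    return None


def upstream_default_branch(remote="origin", cwd="."):
    """Ask the remote for the branch its HEAD refers to (needs network access)."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", remote, "HEAD"],
        cwd=cwd, capture_output=True, text=True, check=True,
    )
    return parse_symref(result.stdout)
```

A local tracking branch such as bf_HEAD could then be pointed at whatever
name this returns, so a rename upstream needs no change on the animal.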

The `use_discard_caches` setting reflects a change in the way postgres
handles cache-clobber testing - it's now a runtime setting rather than a
compile time setting. On older branches the client sets
"-DCLOBBER_CACHE_ALWAYS" instead. If you use this setting, don't also
specify that define yourself.


Downloads are available at
<https://github.com/PGBuildFarm/client-code/releases> and
<https://buildfarm.postgresql.org/downloads>


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Andrew Dunstan <andrew@dunslane.net> writes:
> I have pushed Release 13 of the PostgreSQL BuildFarm client.
> ...
> the `use_discard_caches` setting reflects a change in the way postgres
> handles this - it's now a runtime setting rather than a compile time
> setting. On older branches it sets "-DCLOBBER_CACHE_ALWAYS". If you use
> this setting don't use that define.

I'd just like to take a moment to recommend that owners of
CLOBBER_CACHE_ALWAYS animals adopt this new way of doing things.

For PG 13 and earlier branches, this makes no difference --- it's
just an indirect way of adding "-DCLOBBER_CACHE_ALWAYS".  However,
for v14 and up, it does not do that but builds the binaries normally.
Then, cache-clobber testing is performed by adding "use_discard_caches=1"
as a GUC setting.  The reason this is useful is that we can skip running
individual tests in cache-clobber mode when we choose to.  As of
the new buildfarm client, this is exploited by skipping clobber mode
for initdb runs that are part of other tests.  (We still run initdb in
clobber mode in the initdb-LOCALE steps, so we aren't losing coverage.)
That makes a noticeable difference in the standard set of tests, but
where it really shines is if you enable TAP testing --- those tests
otherwise spend nearly half their time running initdb :-(.
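
The per-branch behaviour described above can be summarised in a small
decision function. This is a hedged Python sketch, not the client's code:
the function name clobber_plan and the returned dict layout are made up for
illustration, and treating any branch without a REL prefix (such as HEAD)
as v14 or later is an assumption.

```python
import re


def clobber_plan(branch, use_discard_caches):
    """Decide how cache-clobber testing is enabled for a given branch."""
    plan = {"extra_cppflags": [], "runtime_setting": False}
    if not use_discard_caches:
        return plan
    # Matches both REL_13_STABLE and older names such as REL9_6_STABLE.
    match = re.match(r"REL_?(\d+)", branch)
    major = int(match.group(1)) if match else None
    if major is not None and major <= 13:
        # PG 13 and earlier: clobbering is compiled in via the define.
        plan["extra_cppflags"].append("-DCLOBBER_CACHE_ALWAYS")
    else:
        # v14 and up: build normally, enable clobbering per test at run time.
        plan["runtime_setting"] = True
    return plan
```

The point of the second branch is that a runtime switch can be left off for
individual steps, which is what makes skipping clobber mode for embedded
initdb runs possible.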

The last I checked, no buildfarm animals were running with both
--enable-tap-tests and CLOBBER_CACHE_ALWAYS, which was reasonable
because it just took insanely long.  However, if you've got a fast
machine you may find that --enable-tap-tests plus use_discard_caches
is tolerable now ... at least in v14 and later branches.

I don't normally run my animal "sifaka" in cache-clobber mode, but
I did do a couple of runs that way while testing the new buildfarm
client.  Here's a run with buildfarm client v13, --enable-tap-tests,
and use_discard_caches:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-08-02%2022%3A09%3A03

Total time 9:21:52.  That compares well to this previous run with the
v12 client:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16

... total time 16:15:43.

Also of interest is that a month ago, it took twice as long (32:53:24):

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09

I don't have comparable runs from sifaka before that, but extrapolating
from avocet's times, it would have taken ~ 58.5 hours a year ago.
Those reductions are from various other changes we've implemented to
reduce the cost of cache-clobber testing.  Hopefully, these fixes
make it practical to be more ambitious about how much testing can
be done by cache-clobber animals.
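
To put the run times quoted above on a common footing, here is a short
Python sketch that converts them to seconds and computes the ratios. The
figures are the ones from this message; the helper names to_seconds and
speedup are just for illustration.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' duration string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(old, new):
    """How many times longer the old run took than the new one."""
    return old / new


v13_client = to_seconds("9:21:52")    # client v13, TAP tests, use_discard_caches
v12_client = to_seconds("16:15:43")   # previous run with the v12 client
month_ago = to_seconds("32:53:24")    # run from a month earlier
year_ago = 58.5 * 3600                # extrapolated from avocet's times

print(f"v12 client vs v13 client: {speedup(v12_client, v13_client):.2f}x")
print(f"month ago vs v12 client:  {speedup(month_ago, v12_client):.2f}x")
print(f"year ago vs v13 client:   {speedup(year_ago, v13_client):.2f}x")
```

On these numbers the new client accounts for roughly a 1.7x reduction, the
earlier server-side changes for about 2x on top of that, and the
extrapolated year-over-year improvement comes to a bit over 6x.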

            regards, tom lane


