CLOBBER_CACHE_ALWAYS testing (was Re: Release 13 of the PostgreSQL BuildFarm client) - Mailing list pgsql-hackers

From Tom Lane
Subject CLOBBER_CACHE_ALWAYS testing (was Re: Release 13 of the PostgreSQL BuildFarm client)
Msg-id 1714227.1628002897@sss.pgh.pa.us
In response to Release 13 of the PostgreSQL BuildFarm client  (Andrew Dunstan <andrew@dunslane.net>)
List pgsql-hackers
Andrew Dunstan <andrew@dunslane.net> writes:
> I have pushed Release 13 of the PostgreSQL BuildFarm client.
> ...
> the `use_discard_caches` setting reflects a change in the way postgres
> handles this - it's now a runtime setting rather than a compile time
> setting. On older branches it sets "-DCLOBBER_CACHE_ALWAYS". If you use
> this setting don't use that define.

I'd just like to take a moment to recommend that owners of
CLOBBER_CACHE_ALWAYS animals adopt this new way of doing things.

For PG 13 and earlier branches, this makes no difference --- it's
just an indirect way of adding "-DCLOBBER_CACHE_ALWAYS".  However,
for v14 and up, it does not do that but builds the binaries normally.
Then, cache-clobber testing is performed by adding "debug_discard_caches=1"
as a GUC setting.  The reason this is useful is that we can skip running
individual tests in cache-clobber mode when we choose to.  As of
the new buildfarm client, this is exploited by skipping clobber mode
for initdb runs that are part of other tests.  (We still run initdb in
clobber mode in the initdb-LOCALE steps, so we aren't losing coverage.)
That makes a noticeable difference in the standard set of tests, but
where it really shines is if you enable TAP testing --- those tests
otherwise spend nearly half their time running initdb :-(.
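
To make the per-branch behavior concrete, here is a minimal sketch of the
decision described above.  It is not the buildfarm client's code (the client
is written in Perl); the function names and the dictionary layout are
hypothetical, chosen only for illustration.  The only names taken from the
real system are the -DCLOBBER_CACHE_ALWAYS define and the v14 GUC
debug_discard_caches.

```python
# Illustrative sketch only; not the actual buildfarm client logic.
# clobber_plan() and step_gucs() are hypothetical helper names.

def clobber_plan(branch_major: int, use_discard_caches: bool) -> dict:
    """Decide how cache-clobber testing is requested for one branch."""
    plan = {"extra_cppflags": [], "gucs": {}}
    if not use_discard_caches:
        return plan  # ordinary build, ordinary test run
    if branch_major <= 13:
        # Older branches: clobbering is a compile-time option.
        plan["extra_cppflags"].append("-DCLOBBER_CACHE_ALWAYS")
    else:
        # v14 and up: build normally, turn clobbering on at run time.
        plan["gucs"]["debug_discard_caches"] = 1
    return plan


def step_gucs(plan: dict, initdb_inside_other_test: bool) -> dict:
    """GUCs to apply for one step; skip clobbering for embedded initdb runs."""
    if initdb_inside_other_test:
        return {}  # only possible when clobbering is a run-time setting
    return dict(plan["gucs"])


if __name__ == "__main__":
    for major in (13, 14):
        print(major, clobber_plan(major, use_discard_caches=True))
```

Note that the skip in step_gucs() can only help on v14 and later: with a
compile-time define there is nothing to switch off for an individual step.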

The last I checked, no buildfarm animals were running with both
--enable-tap-tests and CLOBBER_CACHE_ALWAYS, which was reasonable
because it just took insanely long.  However, if you've got a fast
machine you may find that --enable-tap-tests plus use_discard_caches
is tolerable now ... at least in v14 and later branches.

I don't normally run my animal "sifaka" in cache-clobber mode, but
I did do a couple of runs that way while testing the new buildfarm
client.  Here's a run with buildfarm client v13, --enable-tap-tests,
and use_discard_caches:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-08-02%2022%3A09%3A03

Total time 9:21:52.  That compares well to this previous run with the
v12 client:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16

... total time 16:15:43.

Also of interest is that a month ago, it took twice as long (32:53:24):

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09

I don't have comparable runs from sifaka before that, but extrapolating
from avocet's times, it would have taken ~ 58.5 hours a year ago.
Those reductions are from various other changes we've implemented to
reduce the cost of cache-clobber testing.  Hopefully, these fixes
make it practical to be more ambitious about how much testing can
be done by cache-clobber animals.
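
For anyone who wants to compare these figures, the following short sketch
converts the run times quoted above to seconds and expresses each as a
multiple of the newest run.  The labels are mine, and the 58.5-hour entry is
the extrapolation from avocet mentioned above, not a measured sifaka run.

```python
# Run times as quoted in this message; labels are informal descriptions.
def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


RUNS = {
    "v13 client, use_discard_caches": to_seconds("9:21:52"),
    "v12 client (2021-07-03)": to_seconds("16:15:43"),
    "v12 client (2021-07-01)": to_seconds("32:53:24"),
    # Extrapolated from avocet's times, not measured on sifaka.
    "a year ago (extrapolated)": int(58.5 * 3600),
}


def ratio_to_newest(label: str) -> float:
    """How many times longer a run took than the newest one."""
    return RUNS[label] / RUNS["v13 client, use_discard_caches"]


if __name__ == "__main__":
    for label in RUNS:
        print(f"{label}: {RUNS[label] / 3600:.2f} h, "
              f"{ratio_to_newest(label):.2f}x the newest run")
```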

            regards, tom lane


