Re: Adding CI to our tree - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: Adding CI to our tree
Date
Msg-id 20220213214213.GS31460@telsasoft.com
Whole thread Raw
In response to Re: Adding CI to our tree  (Andres Freund <andres@anarazel.de>)
Responses Re: Adding CI to our tree  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Feb 12, 2022 at 02:26:25PM -0800, Andres Freund wrote:
> On 2022-02-12 16:06:40 -0600, Justin Pryzby wrote:
> > I had some success with that, but it doesn't seem to be significantly faster -
> > it looks a lot like the tests are not actually running in parallel.

Note that the total test time is close to the sum of the individual test times.
But I think that may be an artifact of how prove is showing/attributing times
to each test (which, if so, is misleading).

> Note that prove unfortunately serializes the test output to be in the order it
> started them, even when tests run concurrently. Extremely unhelpful, but ...

Are you sure ?  When I run it locally, I see:
rm -fr src/test/recovery/tmp_check ; time PERL5LIB=`pwd`/src/test/perl TESTDIR=`pwd`/src/test/recovery
PATH=`pwd`/tmp_install/usr/local/pgsql/bin:$PATHPG_REGRESS=`pwd`/src/test/regress/pg_regress
REGRESS_SHLIB=`pwd`/src/test/regress/regress.soprove --time -j4 --ext '*.pl' `find src -name t`
 
...
[15:34:48] src/bin/scripts/t/101_vacuumdb_all.pl ....................... ok      104 ms ( 0.00 usr  0.00 sys +  2.35
cusr 0.47 csys =  2.82 CPU)
 
[15:34:49] src/bin/scripts/t/090_reindexdb.pl .......................... ok     8894 ms ( 0.06 usr  0.01 sys + 14.45
cusr 3.38 csys = 17.90 CPU)
 
[15:34:50] src/bin/pg_config/t/001_pg_config.pl ........................ ok       79 ms ( 0.00 usr  0.01 sys +  0.23
cusr 0.04 csys =  0.28 CPU)
 
[15:34:50] src/bin/pg_waldump/t/001_basic.pl ........................... ok       35 ms ( 0.00 usr  0.00 sys +  0.26
cusr 0.02 csys =  0.28 CPU)
 
[15:34:51] src/bin/pg_test_fsync/t/001_basic.pl ........................ ok      100 ms ( 0.01 usr  0.00 sys +  0.24
cusr 0.04 csys =  0.29 CPU)
 
[15:34:51] src/bin/pg_archivecleanup/t/010_pg_archivecleanup.pl ........ ok      177 ms ( 0.02 usr  0.00 sys +  0.26
cusr 0.03 csys =  0.31 CPU)
 
[15:34:55] src/bin/scripts/t/100_vacuumdb.pl ........................... ok    11267 ms ( 0.12 usr  0.04 sys + 13.47
cusr 3.20 csys = 16.83 CPU)
 
[15:34:57] src/bin/scripts/t/102_vacuumdb_stages.pl .................... ok     5802 ms ( 0.06 usr  0.01 sys +  7.70
cusr 1.37 csys =  9.14 CPU)
 
...

=> scripts/ stuff, followed by other stuff, followed by more, slower, scripts/ stuff.

But I never saw that in cirrus.

> Isn't this kind of a good test time? I thought earlier your alltaptests target
> took a good bit longer?

The original alltaptests runs in 16m 21s.
https://cirrus-ci.com/task/6679061752709120

2 weeks ago, it was ~14min with your patch to cache initdb.
https://cirrus-ci.com/task/5439320633901056

As I recall, that didn't seem to improve runtime when combined with my parallel
patch.

> One nice bit is that the output is a *lot* easier to read.
> 
> You could try improving the total time by having prove remember slow tests and
> use that file to run the slowest tests first next time. --state slow,save or
> such I believe. Of course we'd have to save that state file...

In a test, this hurt rather than helped (13m 42s).
https://cirrus-ci.com/task/6359167186239488

I'm not surprised - it makes sense to run 10 fast tests at once, but usually
doesn't make sense to run 10 slow tests tests at once (at least a couple of
which are doing something intensive).  It was faster (12m16s) to do it
backwards (fastest tests first).
https://cirrus-ci.com/task/5745115443494912

BTW, does it make sense to remove test_regress_parallel_script ?  The
pg_upgrade run would do the same things, no ?  If so, it might make sense to
run that first.  OTOH, you suggested to run the upgrade tests with checksums
enabled, which seems like a good idea.

Note that in the attached patches, I changed the msvc "warnings" to use "tee".

I don't know how to fix the pipeline test in a less hacky way...

You said the docs build should be a separate task, but then said that it'd be
okay to remove the dependency.  So I did it both ways.  There's currently some
duplication between the docs patch and code coverage patch.

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: Adding CI to our tree
Next
From: Andres Freund
Date:
Subject: Re: [PATCH] Avoid open and lock the table Extendend Statistics (src/backend/commands/statscmds.c)