Re: Do we need to rethink how to parallelize regression tests to speedup CLOBBER_CACHE_ALWAYS? - Mailing list pgsql-hackers

From David Rowley
Subject Re: Do we need to rethink how to parallelize regression tests to speedup CLOBBER_CACHE_ALWAYS?
Msg-id CAApHDvoi8-nH=vSU886pkRHUJZPYQcUd75fFfjZ-=V_gdQSFkA@mail.gmail.com
In response to Re: Do we need to rethink how to parallelize regression tests to speedup CLOBBER_CACHE_ALWAYS?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, 13 May 2021 at 01:50, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> There are a whole lot of cases where test Y depends on an earlier test X.
> Some of those dependencies are annotated in parallel_schedule, but I fear
> most are not.
>
> If we had a full list of such dependencies then we could imagine building
> a job scheduler that would dispatch any script that has no remaining
> dependencies.
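
As a rough illustration of the dispatcher described in the quoted text, the
sketch below uses Python's standard graphlib module to hand out scripts as
soon as their dependencies are done. Nothing like this exists in pg_regress;
the function name and the dependency map are hypothetical, and the map is
exactly the "full list of such dependencies" that is not currently available.

```python
from graphlib import TopologicalSorter


def dispatch_order(deps):
    """Return batches of scripts; each batch has no unmet dependencies."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        # Everything returned here could be started in parallel right now.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Pretend the whole batch finished, which unblocks its dependents.
        sorter.done(*ready)
    return batches


# Hypothetical dependency map: script -> scripts that must finish first.
example_deps = {
    "create_index": {"create_table"},
    "select": {"create_table"},
    "join": {"create_index", "select"},
    "boolean": set(),
}
```

A real scheduler would call done() per script as each one completes rather
than per batch, so that a slow script does not hold back unrelated ones.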

I wonder if it could be done by starting a new parallel group and then
moving existing tests into it, after first verifying that:

1.  The test does not display results from any pg_catalog table, or if
it does, the filter is restrictive enough that there's no possibility
that the results will change due to other sessions changing the
catalogues.
2.  If the test creates any new objects, those objects have names
that are unlikely to conflict with other tests, e.g. no table names
like t1.
3.  The test does not INSERT/DELETE/UPDATE/VACUUM/ALTER/ANALYZE any
tables that are shared by more than one test.
4.  The test does not globally modify the system state, e.g. ALTER
SYSTEM.

We could document in parallel_schedule that tests in this particular
group must meet the above requirements, plus any others I've not
thought about.  That list could be extended as we discover further
conditions.
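
A first pass over the four checks above could be automated with a crude
text screen before anyone reviews a script by hand. The following is a
minimal sketch under loose assumptions: the function name, the regular
expressions, and the "three characters or fewer" notion of a risky table
name are all invented for illustration, and the caller has to supply the
list of shared tables. It will produce false positives and negatives.

```python
import re


def screen_script(sql, shared_tables=()):
    """Return the sorted labels of the checks a test script appears to fail."""
    hits = set()
    # Check 1: any read from a pg_* relation (does not judge the filter).
    if re.search(r"\b(from|join)\s+(pg_catalog\.)?pg_\w+", sql, re.I):
        hits.add("rule 1: reads a catalog")
    # Check 2: very short table names such as t1 are likely to collide.
    if re.search(r"\bcreate\s+table\s+\w{1,3}\b", sql, re.I):
        hits.add("rule 2: short object name")
    # Check 3: writes or maintenance on tables the caller says are shared.
    for table in shared_tables:
        pattern = (
            r"\b(insert\s+into|delete\s+from|update|vacuum|analyze|alter\s+table)"
            r"\s+" + re.escape(table) + r"\b"
        )
        if re.search(pattern, sql, re.I):
            hits.add("rule 3: modifies a shared table")
    # Check 4: ALTER SYSTEM is the example of a global state change.
    if re.search(r"\balter\s+system\b", sql, re.I):
        hits.add("rule 4: changes global state")
    return sorted(hits)
```

A script that comes back with an empty list is only a candidate for the
new group; it would still need a manual read against the same checks.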

I hope that, now that we no longer have serial_schedule and
parallel_schedule is the one source of truth for tests, the comments in
parallel_schedule are more likely to be read and kept up to date.

I imagine there are many tests that could also just be run entirely
inside a single BEGIN; ... COMMIT; block. That would mean any catalogue
changes they made would not be visible to any other test which happens
to query those catalogues.
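
To make the single-transaction idea concrete, here is a minimal sketch of
a wrapper that refuses scripts it cannot safely enclose. The function name
and the pattern list are assumptions for illustration; the list is partial
and only covers statements that cannot run inside a transaction block,
plus explicit transaction control that the wrapper would interfere with.

```python
import re

# Statements that PostgreSQL rejects inside a transaction block, and
# transaction control commands. Partial list, assumed for illustration.
UNWRAPPABLE = re.compile(
    r"\b(vacuum|alter\s+system|create\s+database|drop\s+database|"
    r"create\s+tablespace|create\s+index\s+concurrently|"
    r"begin|commit|rollback)\b",
    re.I,
)


def wrap_in_transaction(sql):
    """Return sql wrapped in BEGIN/COMMIT, or None if that looks unsafe."""
    if UNWRAPPABLE.search(sql):
        return None
    return "BEGIN;\n" + sql.rstrip() + "\nCOMMIT;\n"
```

The check is deliberately conservative: a PL/pgSQL body containing BEGIN
is rejected as well. It also cannot detect a script that provokes an error
on purpose, and such a script would not survive wrapping, since the first
error aborts the transaction and every later statement fails.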

David


