Re: Test to dump and restore objects left behind by regression - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: Test to dump and restore objects left behind by regression |
Date | |
Msg-id | CAExHW5s6k4T4MSShhxfzx_YxSywrMqWEBqpLrjpd=9OpsZv_NQ@mail.gmail.com |
In response to | Re: Test to dump and restore objects left behind by regression (Michael Paquier <michael@paquier.xyz>) |
Responses |
Re: Test to dump and restore objects left behind by regression
|
List | pgsql-hackers |
On Fri, Mar 28, 2025 at 7:07 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Thu, Mar 27, 2025 at 06:15:06PM +0100, Alvaro Herrera wrote:
> > BTW another idea to shorten this test's runtime might be to try and
> > identify which of parallel_schedule tests leave objects behind and
> > create a shorter schedule with only those (a possible implementation
> > might keep a list of the slow tests that don't leave any useful object
> > behind, then filter parallel_schedule to exclude those; this ensures
> > test files created in the future are still used.)
>
> I'm not much a fan of approaches that require an extra schedule,
> because this is prone to forget the addition of objects that we'd want
> to cover for the scope of this thread with the dump/restore
> inter-dependencies, failing our goal of having more coverage. And
> history has proven that we are quite bad at maintaining multiple
> schedules for the regression test suite (remember the serial one or
> the standby one in pg_regress?). So we should really do things so as
> the schedules are down to a strict minimum: 1.

I see Alvaro's point about using a different and minimal schedule. We
already have 002_pg_upgrade and 027_stream_ as candidates which could
use schedules other than the default and avoid wasting CPU cycles. But
I also agree with your opinion that maintaining multiple schedules is
painful and prone to errors.

What we could do is create the schedule files automatically during the
build. The automation script will need to know which file to place in
which schedules. That information could either be part of the SQL file
itself or be kept in a separate text file. For example, every SQL file
could carry the following line listing all the schedules that the file
should be part of:

-- schedules: parallel, serial, upgrade

The automation script looks at every .sql file in a given sql
directory and creates the schedule files, each containing all the SQL
files that mention the respective schedule in their "schedules"
annotation. The script would flag SQL files that do not have a
schedules annotation, so any newly added file won't be missed. However,
we will still miss a SQL file if it wasn't part of a given schedule and
later acquired some changes which required it to be added to a new
schedule.

If we go this route, we could make 'make check-tests' better. We could
add another annotation for dependencies, listing all the SQL files that
a given SQL file depends upon. make check-tests would collect all
dependencies, sort them and run all the dependencies as well.

Of course that's out of scope for this patch. We don't have time left
for this in PG 18.

>
> If we're worried about the time taken by the test (spoiler: I am and
> the upgrade tests already show always as last to finish in parallel
> runs), I would recommend to put that under a PG_TEST_EXTRA. I'm OK to
> add the switch to my buildfarm animals if this option is the consensus
> and if it gets into the tree.

I would prefer to run this test by default as Alvaro mentioned
previously. But if that means that we won't get this test committed at
all, I am ok putting it under PG_TEST_EXTRA. (Hence I have kept 0001
and 0002 separate.) But I will be disappointed if the test, which has
unearthed four bugs in a year alone, does not get committed to PG 18
because of this debate.

--
Best Wishes,
Ashutosh Bapat
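
For illustration only, the schedule-generation idea described above could be sketched roughly as follows. This is not part of any posted patch. The function names (collect_schedules, write_schedules), the "<name>_schedule" output file naming, and the choice to emit one "test:" line per file are all assumptions made for this sketch. A real parallel schedule groups several tests on one "test:" line, which this sketch does not attempt.

```python
import re
from collections import defaultdict
from pathlib import Path

# Annotation format taken from the example line in the mail:
#   -- schedules: parallel, serial, upgrade
# [ \t]* is used instead of \s* so a match never spills onto the next line.
SCHEDULES_RE = re.compile(r"^--[ \t]*schedules:[ \t]*(.*)$", re.MULTILINE)


def collect_schedules(sql_dir):
    """Scan sql_dir/*.sql and return (schedules, unannotated).

    schedules maps a schedule name to the sorted list of test names
    (file stems) that asked to be in it.  unannotated lists the tests
    with no usable annotation, so that new files are not silently missed.
    """
    schedules = defaultdict(list)
    unannotated = []
    for path in sorted(Path(sql_dir).glob("*.sql")):
        match = SCHEDULES_RE.search(path.read_text())
        names = []
        if match is not None:
            names = [n.strip() for n in match.group(1).split(",") if n.strip()]
        if not names:
            unannotated.append(path.stem)
            continue
        for name in names:
            schedules[name].append(path.stem)
    return dict(schedules), unannotated


def write_schedules(schedules, out_dir):
    """Write one hypothetical <name>_schedule file per schedule.

    Each test gets its own 'test:' line, i.e. the tests run one after
    another; grouping tests for parallel execution is left out.
    """
    for name, tests in schedules.items():
        text = "".join(f"test: {t}\n" for t in tests)
        (Path(out_dir) / f"{name}_schedule").write_text(text)
```

A build step could call collect_schedules() on the sql directory, fail if the unannotated list is non-empty, and then call write_schedules() on the result. As noted in the mail, this still would not catch a file whose annotation has become stale.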