Thread: [PATCH] big test separation POC
Dear hackers, Per various discussion about the potential impact of Robins non regression tests, here is a poc patch to separate big tests from others. "paralle_schedule" holds usual tests, "big_schedule" holds big tests. The makefile derives serial_schedule, parallel_big_schedule and serial_big_schedule from the above descriptions. Then: - make check in regress does usual tests - make bigcheck in regress does both usual and big tests I'm not sure about what happens under vpath. I have no opinion about what should be considered a big test. -- Fabien.
Note about the POC patch limitations/questions: - is deriving a schedule with a piece of shell okay? or should perl/python/whatever scripting be better? - the big_schedule is assumed "sequential", i.e. one test per line. maybe it could/should be parallel? - I'm not sure of the "parallel_schedule" and "big_schedule" file names are the best possible choices. - I'm really not sure about VPATH stuff. - I do not understand why the makefile specifies $(srcdir) before local files in some places. - should the "bigcheck" target be accessible from the project root? that is should "make bigcheck" from ../../.. work? - the documentation is not updated, I guess something should be done somewhere. -- Fabien.
On 06/30/2013 02:54 PM, Fabien COELHO wrote: > > Note about the POC patch limitations/questions: > > - is deriving a schedule with a piece of shell okay? > or should perl/python/whatever scripting be better? I would think all we need are the results, i.e. the schedule files, plus some Makefile entries for them. > > - the big_schedule is assumed "sequential", i.e. one test per line. > maybe it could/should be parallel? > > - I'm not sure of the "parallel_schedule" and "big_schedule" > file names are the best possible choices. > > - I'm really not sure about VPATH stuff. This should be totally transparent to VPATH builds. > > - I do not understand why the makefile specifies $(srcdir) before > local files in some places. > For VPATH builds :-) > - should the "bigcheck" target be accessible from the project root? > that is should "make bigcheck" from ../../.. work? > Yes, possibly, but it's not terribly important (for example, the buildfarm does "cd src/test/regress && make <testname>") cheers andrew
>> Note about the POC patch limitations/questions: >> >> - is deriving a schedule with a piece of shell okay? >> or should perl/python/whatever scripting be better? > > I would think all we need are the results, i.e. the schedule files, plus > some Makefile entries for them. You can replicate data, but maintaining a set of files consistently looks like a bad idea to me, because it means that you have to update all replicated data for all changes. The current status is that there are two files, parallel & sequential, so it is not too bad. With big tests that could be 4, so it seems reasonnable to have at least some automatic derivation. >> - I'm really not sure about VPATH stuff. > > This should be totally transparent to VPATH builds. Sure:-) It means that I have not tested that, so it may or may not work. >> - I do not understand why the makefile specifies $(srcdir) before >> local files in some places. > > For VPATH builds :-) Hmmm. That is not what I call "transparent":-) So I understand that derived files should not have them, because they would be put in the build tree instead of the source tree. -- Fabien.
>> - I do not understand why the makefile specifies $(srcdir) before >> local files in some places. > > For VPATH builds :-) Here is a v2 which is more likely to work under VPATH. -- Fabien.
Hi Fabien,
On Mon, Jul 1, 2013 at 10:42 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
Here is a v2 which is more likely to work under VPATH.- I do not understand why the makefile specifies $(srcdir) before
local files in some places.
For VPATH builds :-)
I really appreciate your efforts. I am reviewing your patch.
While testing patch, I found that make installcheck breaks with your patch and gives following error:
============== running regression test queries ==============
pg_regress: could not open file "./serial_schedule" for reading: No such file or directory
looks like you forgot to add entry for serial_schedule.
Following sequence of commands will reproduces this error:
1) ./configure
2) make
3) make install
4) initialize the database cluster (initdb)
5) start the server
6) make installcheck
I will post more review comments if there are any.
--
Regards,
Samrat Revgade
> While testing patch, I found that make installcheck breaks with your patch > and gives following error: Indeed, I did not put the dependency for that target, I really tested "check" & "bigcheck". The attached patch adds the needed dependency for installcheck, and I could run it. I checked that no other target seems to be missing a dependency... -- Fabien.
>> Here is a v2 which is more likely to work under VPATH. Here is a POC v4 which relies on multiple --schedule instead of creating concatenated schedule files. -- Fabien.
On 2013-07-03 21:07:03 +0200, Fabien COELHO wrote: > > >>Here is a v2 which is more likely to work under VPATH. > > Here is a POC v4 which relies on multiple --schedule instead of creating > concatenated schedule files. > > -- > Fabien. > diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile > index d5935b6..8a39f7d 100644 > --- a/src/test/regress/GNUmakefile > +++ b/src/test/regress/GNUmakefile > @@ -86,7 +86,7 @@ regress_data_files = \ > $(wildcard $(srcdir)/output/*.source) \ > $(filter-out $(addprefix $(srcdir)/,$(input_files)),$(wildcard $(srcdir)/sql/*.sql)) \ > $(wildcard $(srcdir)/data/*.data) \ > - $(srcdir)/parallel_schedule $(srcdir)/serial_schedule $(srcdir)/resultmap > + $(srcdir)/parallel_schedule $(srcdir)/parallel_big_schedule $(srcdir)/resultmap > > install-tests: all install install-lib installdirs-tests > $(MAKE) -C $(top_builddir)/contrib/spi install > @@ -137,19 +137,43 @@ tablespace-setup: > ## Run tests > ## > > +# installcheck vs check: > +# - whether test is run against installed or compiled version > +# test schedules: parallel, parallel_big, standby > +# serial schedules can be derived from parallel schedules > + > +derived_schedules = serial_schedule serial_big_schedule > + > +serial_%: parallel_% > + echo "# this file is generated automatically, do not edit!" > $@ > + egrep '^(test|ignore):' $< | \ > + while read op list ; do \ > + for test in $$list ; do \ > + echo "$$op $$test" ; \ > + done ; \ > + done >> $@ > + This won't work on windows all that easily. Maybe we should instead add a "--run-serially" parameter to pg_regress? > -installcheck: all tablespace-setup > - $(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=$(srcdir)/serial_schedule $(EXTRA_TESTS) > +# after installation, serial > +installcheck: all tablespace-setup serial_schedule > + $(pg_regress_installcheck) $(REGRESS_OPTS) \ > + --schedule=serial_schedule $(EXTRA_TESTS) Why do we use the serial schedule for installcheck anyway? Just because of max_connections? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
>> +serial_%: parallel_% >> + echo "# this file is generated automatically, do not edit!" > $@ >> + egrep '^(test|ignore):' $< | \ >> + while read op list ; do \ >> + for test in $$list ; do \ >> + echo "$$op $$test" ; \ >> + done ; \ >> + done >> $@ >> + > > This won't work on windows all that easily. Hmmm. I made the assumption that a system with "gnu make" would also have a shell and basic unix commands available, but it is possibly quite naive. ISTM that there are perl scripts used elsewhere for building postgresql. Would assuming that perl is available be admissible? > Maybe we should instead add a "--run-serially" parameter to pg_regress? I guess that it is quite possible and easy to do. I'm not sure whether it is desirable. >> -installcheck: all tablespace-setup >> - $(pg_regress_installcheck) $(REGRESS_OPTS) --schedule=$(srcdir)/serial_schedule $(EXTRA_TESTS) >> +# after installation, serial >> +installcheck: all tablespace-setup serial_schedule >> + $(pg_regress_installcheck) $(REGRESS_OPTS) \ >> + --schedule=serial_schedule $(EXTRA_TESTS) > > Why do we use the serial schedule for installcheck anyway? Just because > of max_connections? I asked myself the same question, and found no obvious answer. Maybe if a check is run against an installed version, it is assumed that postgresql is being used so the database should not be put under heavy load? -- Fabien.
Hi Fabien, While applying latest version of the patch (regress-big-v4.patch) on latest PostgreSQL version i encountered following errors: a) Using git: $git apply --index regress-big-v4.patch regress-big-v4.patch:10: trailing whitespace. $(srcdir)/parallel_schedule $(srcdir)/parallel_big_schedule $(srcdir)/resultmap regress-big-v4.patch:18: trailing whitespace. # installcheck vs check: regress-big-v4.patch:19: trailing whitespace. # - whether test is run against installed or compiled version regress-big-v4.patch:20: trailing whitespace. # test schedules: parallel, parallel_big, standby regress-big-v4.patch:21: trailing whitespace. # serial schedules can be derived from parallel schedules fatal: git apply: bad git-diff - expected /dev/null on line 97 b) Using patch: $patch -d. -p1 < regress-big-v4.patch (Stripping trailing CRs from patch.) patching file src/test/regress/GNUmakefile (Stripping trailing CRs from patch.) patching file src/test/regress/parallel_big_schedule (Stripping trailing CRs from patch.) patching file src/test/regress/serial_schedule Reversed (or previously applied) patch detected! Assume -R? [n] Is that a problem ?
> While applying latest version of the patch (regress-big-v4.patch) on > latest PostgreSQL version i encountered following errors: [...] > Is that a problem ? Yes and no:-) My understanding is that there is a conflict because of commits between this patch and head: a file that this patch deletes (it is derived by make rules) has been updated. It seems that git is not too good at detecting this and providing a meaningful message. Please find attached an updated version which seemed to work for me. Note that this is really a POC. How to derive a file is under discussion: it has been suggested that the unix shell approach would not work on Windows. I've suggested perl or python (which version?) but I'm not sure that it is okay either. -- Fabien.
Fabien COELHO escribió: > Note that this is really a POC. How to derive a file is under > discussion: it has been suggested that the unix shell approach would > not work on Windows. I've suggested perl or python (which version?) > but I'm not sure that it is okay either. The other option, suggested by Andres somewhere, is to have a new parameter to pg_regress, something like --run-serially. So you would use the same parallel schedule file, but serially instead of following the parallel specification. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On 07/11/2013 09:19 AM, Alvaro Herrera wrote: > Fabien COELHO escribió: > >> Note that this is really a POC. How to derive a file is under >> discussion: it has been suggested that the unix shell approach would >> not work on Windows. I've suggested perl or python (which version?) >> but I'm not sure that it is okay either. > > The other option, suggested by Andres somewhere, is to have a new > parameter to pg_regress, something like --run-serially. So you would > use the same parallel schedule file, but serially instead of following > the parallel specification. Ok, this sounds like it needs a *lot* of discussion before it's a patch.Marking "returned with feedback", and we'll discussit until September (or beyond). -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
> The other option, suggested by Andres somewhere, is to have a new > parameter to pg_regress, something like --run-serially. After looking at the source, ISTM that this option already exists under a different signature: --max-connections 1 > So you would use the same parallel schedule file, but serially instead > of following the parallel specification. Yep. And there is nothing to do, which is even better:-) -- Fabien.