Thread: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members
I had an idea for another way to attack $SUBJECT, now that we have the ability to adjust debug_invalidate_system_caches_always at runtime. Namely, that a lot of time goes into repeated initdb runs (especially if the TAP tests are enabled); but surely we learn little from CCA initdb runs after the first one. We could trim this fat by: (1) Instead of applying CLOBBER_CACHE_ALWAYS as a compile option, add "debug_invalidate_system_caches_always = 1" to the buildfarm's "extra_config" options, which are added to postgresql.conf after initdb. Thus, initdb will run without that option but all the actual test cases will have it. (2) To close the testing gap that now we have *no* CCA coverage of initdb runs, adjust either the buildfarm's initdb-only steps or initdb's 001_initdb.pl TAP script to set "debug_invalidate_system_caches_always = 1" in one of the runs. I think we can make that happen so far as the bootstrap backend is concerned by including "-c debug_invalidate_system_caches_always=1" on its command line; but of course initdb itself has no way to be told to do that. I think we could invent a "-c NAME=VALUE" switch for initdb to tell it to pass down that switch to its child backends. Then there'd have to be some way to tell the calling tests whether to do that. (3) Since this only works in v14 and up, older branches would have to fall back to -DCLOBBER_CACHE_ALWAYS. Perhaps we could improve the buildfarm client script so that buildfarm owners just configure "clobber_cache_testing => 1" and then the script would do the right branch-dependent thing. Of course, we could eliminate the need for branch-dependent logic if we cared to back-patch the addition of the debug_invalidate_system_caches_always GUC, but that's probably a bridge too far. It looks to me like this would cut around an hour off of the roughly-a-day cycle times of the existing CCA animals. None of them run any TAP tests, presumably because that would make their cycle time astronomical, but maybe this change will help make that practical. Thoughts? regards, tom lane
On 6/20/21 6:10 PM, Tom Lane wrote: > (3) Since this only works in v14 and up, older branches would > have to fall back to -DCLOBBER_CACHE_ALWAYS. Perhaps we could > improve the buildfarm client script so that buildfarm owners > just configure "clobber_cache_testing => 1" and then the script > would do the right branch-dependent thing. Maybe. Let's see what you come up with. > > Of course, we could eliminate the need for branch-dependent > logic if we cared to back-patch the addition of the > debug_invalidate_system_caches_always GUC, but that's probably > a bridge too far. Yeah, I think so. > > It looks to me like this would cut around an hour off of the > roughly-a-day cycle times of the existing CCA animals. None > of them run any TAP tests, presumably because that would make > their cycle time astronomical, but maybe this change will help > make that practical. > It might. I'm fairly sure there are a lot of repetitive cycles wasted in the TAP tests, quite apart from initdb. We've become rather profligate in our use of time and resources. cheers adrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 6/20/21 6:10 PM, Tom Lane wrote: >> (3) Since this only works in v14 and up, older branches would >> have to fall back to -DCLOBBER_CACHE_ALWAYS. Perhaps we could >> improve the buildfarm client script so that buildfarm owners >> just configure "clobber_cache_testing => 1" and then the script >> would do the right branch-dependent thing. > Maybe. Let's see what you come up with. Here's a couple of draft-quality patches --- one for initdb, one for the buildfarm --- to implement this idea. These are just lightly tested; in particular I've not had the patience to run full BF cycles to see how much is actually saved. regards, tom lane diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml index afd344b4c0..3077530c7b 100644 --- a/doc/src/sgml/ref/initdb.sgml +++ b/doc/src/sgml/ref/initdb.sgml @@ -388,6 +388,17 @@ PostgreSQL documentation Other, less commonly used, options are also available: <variablelist> + <varlistentry> + <term><option>--clobber-cache</option></term> + <listitem> + <para> + Run the bootstrap backend with the + <literal>debug_invalidate_system_caches_always=1</literal> option. + This takes a very long time and is only of use for deep debugging. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>-d</option></term> <term><option>--debug</option></term> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c index 152d21e88b..0945d70061 100644 --- a/src/bin/initdb/initdb.c +++ b/src/bin/initdb/initdb.c @@ -202,6 +202,9 @@ static bool authwarning = false; static const char *boot_options = "-F"; static const char *backend_options = "--single -F -O -j -c search_path=pg_catalog -c exit_on_error=true"; +/* Additional switches to pass to backend (either boot or standalone) */ +static char *extra_options = ""; + static const char *const subdirs[] = { "global", "pg_wal/archive_status", @@ -962,12 +965,12 @@ test_config_settings(void) test_buffs = MIN_BUFS_FOR_CONNS(test_conns); snprintf(cmd, sizeof(cmd), - "\"%s\" --boot -x0 %s " + "\"%s\" --boot -x0 %s %s " "-c max_connections=%d " "-c shared_buffers=%d " "-c dynamic_shared_memory_type=%s " "< \"%s\" > \"%s\" 2>&1", - backend_exec, boot_options, + backend_exec, boot_options, extra_options, test_conns, test_buffs, dynamic_shared_memory_type, DEVNULL, DEVNULL); @@ -998,12 +1001,12 @@ test_config_settings(void) } snprintf(cmd, sizeof(cmd), - "\"%s\" --boot -x0 %s " + "\"%s\" --boot -x0 %s %s " "-c max_connections=%d " "-c shared_buffers=%d " "-c dynamic_shared_memory_type=%s " "< \"%s\" > \"%s\" 2>&1", - backend_exec, boot_options, + backend_exec, boot_options, extra_options, n_connections, test_buffs, dynamic_shared_memory_type, DEVNULL, DEVNULL); @@ -1403,11 +1406,11 @@ bootstrap_template1(void) unsetenv("PGCLIENTENCODING"); snprintf(cmd, sizeof(cmd), - "\"%s\" --boot -x1 -X %u %s %s %s", + "\"%s\" --boot -x1 -X %u %s %s %s %s", backend_exec, wal_segment_size_mb * (1024 * 1024), data_checksums ? "-k" : "", - boot_options, + boot_options, extra_options, debug ? "-d 5" : ""); @@ -2263,6 +2266,7 @@ usage(const char *progname) printf(_(" -X, --waldir=WALDIR location for the write-ahead log directory\n")); printf(_(" --wal-segsize=SIZE size of WAL segments, in megabytes\n")); printf(_("\nLess commonly used options:\n")); + printf(_(" --clobber-cache use cache-clobbering debug option\n")); printf(_(" -d, --debug generate lots of debugging output\n")); printf(_(" -L DIRECTORY where to find the input files\n")); printf(_(" -n, --no-clean do not clean up after errors\n")); @@ -2863,8 +2867,8 @@ initialize_data_directory(void) fflush(stdout); snprintf(cmd, sizeof(cmd), - "\"%s\" %s template1 >%s", - backend_exec, backend_options, + "\"%s\" %s %s template1 >%s", + backend_exec, backend_options, extra_options, DEVNULL); PG_CMD_OPEN; @@ -2943,6 +2947,7 @@ main(int argc, char *argv[]) {"wal-segsize", required_argument, NULL, 12}, {"data-checksums", no_argument, NULL, 'k'}, {"allow-group-access", no_argument, NULL, 'g'}, + {"clobber-cache", no_argument, NULL, 14}, {NULL, 0, NULL, 0} }; @@ -3084,6 +3089,11 @@ main(int argc, char *argv[]) case 'g': SetDataDirectoryCreatePerm(PG_DIR_MODE_GROUP); break; + case 14: + extra_options = psprintf("%s %s", + extra_options, + "-c debug_invalidate_system_caches_always=1"); + break; default: /* getopt_long already emitted a complaint */ fprintf(stderr, _("Try \"%s --help\" for more information.\n"), diff --git a/build-farm.conf.sample b/build-farm.conf.sample index 7e80451..d7e6a62 100644 --- a/build-farm.conf.sample +++ b/build-farm.conf.sample @@ -98,6 +98,11 @@ my $confdir = File::Spec->rel2abs(File::Basename::dirname(__FILE__)); --leak-check=no --gen-suppressions=all --error-limit=no} ), + # if true run tests with debug_invalidate_system_caches_always=1, or + # the equivalent on older branches. Do not set CLOBBER_CACHE_ALWAYS + # if you use this. + use_clobber_cache => undef, + # path to directory with auxiliary web script # if relative, the must be relative to buildroot/branch # Now only used on older Msys installations diff --git a/run_build.pl b/run_build.pl index 9ed6056..5c945b0 100755 --- a/run_build.pl +++ b/run_build.pl @@ -178,6 +178,7 @@ my ( $wait_timeout, $use_accache, $use_valgrind, $valgrind_options, $use_installcheck_parallel, $max_load_avg, + $use_clobber_cache, $archive_reports ) = @PGBuild::conf{ @@ -186,7 +187,7 @@ my ( use_vpath tar_log_cmd using_msvc extra_config make_jobs core_file_glob ccache_failure_remove wait_timeout use_accache use_valgrind valgrind_options use_installcheck_parallel max_load_avg - archive_reports) + use_clobber_cache archive_reports) }; $ts_prefix = sprintf('%s:%-13s ', $animal, $branch); @@ -651,6 +652,19 @@ if ($extra_config && $extra_config->{DEFAULT}) } } +if ($use_clobber_cache && ($branch eq 'HEAD' || $branch ge 'REL_14')) +{ + if (!exists $extra_config->{$branch}) + { + $extra_config->{$branch} = ["debug_invalidate_system_caches_always = 1"]; + } + else + { + push(@{ $extra_config->{$branch} }, + "debug_invalidate_system_caches_always = 1"); + } +} + if ($extra_config && $extra_config->{$branch}) { my $tmpname = "$tmpdir/bfextra.conf"; @@ -1340,9 +1354,15 @@ sub initdb chdir $installdir; + my $initdbopts = qq{-A trust -U buildfarm --locale=$locale}; + + if ($use_clobber_cache && ($branch eq 'HEAD' || $branch ge 'REL_14')) + { + $initdbopts .= " --clobber-cache"; + } + @initout = - run_log( - qq{"bin/initdb" -A trust -U buildfarm --locale=$locale data-$locale}); + run_log(qq{"bin/initdb" $initdbopts data-$locale}); my $status = $? >> 8; @@ -2370,6 +2390,17 @@ sub configure } } } + if ($use_clobber_cache && $branch ne 'HEAD' && $branch lt 'REL_14') + { + if (defined $env->{CPPFLAGS}) + { + $env->{CPPFLAGS} .= " -DCLOBBER_CACHE_ALWAYS"; + } + else + { + $env->{CPPFLAGS} = "-DCLOBBER_CACHE_ALWAYS"; + } + } my $envstr = ""; while (my ($key, $val) = each %$env)
On 6/22/21 5:11 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 6/20/21 6:10 PM, Tom Lane wrote: >>> (3) Since this only works in v14 and up, older branches would >>> have to fall back to -DCLOBBER_CACHE_ALWAYS. Perhaps we could >>> improve the buildfarm client script so that buildfarm owners >>> just configure "clobber_cache_testing => 1" and then the script >>> would do the right branch-dependent thing. >> Maybe. Let's see what you come up with. > Here's a couple of draft-quality patches --- one for initdb, one > for the buildfarm --- to implement this idea. These are just > lightly tested; in particular I've not had the patience to run > full BF cycles to see how much is actually saved. > > Looks OK for the buildfarm patch. I wonder if we just want to run initdb once with --clobber-cache instead of once per locale? cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > Looks OK for the buildfarm patch. I wonder if we just want to run initdb > once with --clobber-cache instead of once per locale? I thought about that, but I'm not sure it's appropriate for the buildfarm client to be making that decision. I do not think any of the CCA animals run more than one locale anyway, so it's likely moot. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > On 6/22/21 5:11 PM, Tom Lane wrote: >> Here's a couple of draft-quality patches --- one for initdb, one >> for the buildfarm --- to implement this idea. These are just >> lightly tested; in particular I've not had the patience to run >> full BF cycles to see how much is actually saved. > Looks OK for the buildfarm patch. I wonder if we just want to run initdb > once with --clobber-cache instead of once per locale? So, where do we want to go with these? I'm inclined to argue that it's okay to sneak the initdb change into v14, on the grounds that it's needed to fully exploit the change from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always. Without it, there is no way to do CCA testing on the bootstrap process except by reverting to the old hard-wired way of doing things. Having pushed that, we could try out the buildfarm side of the change and verify it's okay. regards, tom lane
On 7/1/21 11:01 AM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> On 6/22/21 5:11 PM, Tom Lane wrote: >>> Here's a couple of draft-quality patches --- one for initdb, one >>> for the buildfarm --- to implement this idea. These are just >>> lightly tested; in particular I've not had the patience to run >>> full BF cycles to see how much is actually saved. >> Looks OK for the buildfarm patch. I wonder if we just want to run initdb >> once with --clobber-cache instead of once per locale? > So, where do we want to go with these? > > I'm inclined to argue that it's okay to sneak the initdb change into > v14, on the grounds that it's needed to fully exploit the change > from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always. > Without it, there is no way to do CCA testing on the bootstrap process > except by reverting to the old hard-wired way of doing things. > > Having pushed that, we could try out the buildfarm side of the > change and verify it's okay. > > Seems reasonable. I don't have a CCA animal any more, but I guess I could set up a test. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 7/1/21 11:01 AM, Tom Lane wrote: >> I'm inclined to argue that it's okay to sneak the initdb change into >> v14, on the grounds that it's needed to fully exploit the change >> from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always. >> Without it, there is no way to do CCA testing on the bootstrap process >> except by reverting to the old hard-wired way of doing things. >> >> Having pushed that, we could try out the buildfarm side of the >> change and verify it's okay. > Seems reasonable. I don't have a CCA animal any more, but I guess I > could set up a test. I can run a test here --- I'll commandeer sifaka for awhile, since that's the fastest animal I have. regards, tom lane
I wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Seems reasonable. I don't have a CCA animal any more, but I guess I >> could set up a test. > I can run a test here --- I'll commandeer sifaka for awhile, > since that's the fastest animal I have. Done, and here's the results: Traditional way (-DCLOBBER_CACHE_ALWAYS): 32:53:24 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09 New way: 16:15:43 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16 That's running sifaka's full test schedule including TAP tests, which ordinarily takes it a shade under 10 minutes. The savings on a non-TAP run would be a lot less, of course, thanks to fewer initdb invocations. Although I lacked the patience to run a full back-branch cycle with -DCLOBBER_CACHE_ALWAYS, I did verify that the patch correctly injects that #define when running an old branch. So I think it's ready to go into the buildfarm, modulo any cosmetic work you might want to do. regards, tom lane
On 7/3/21 6:59 PM, Tom Lane wrote: > I wrote: >> Andrew Dunstan <andrew@dunslane.net> writes: >>> Seems reasonable. I don't have a CCA animal any more, but I guess I >>> could set up a test. >> I can run a test here --- I'll commandeer sifaka for awhile, >> since that's the fastest animal I have. > Done, and here's the results: > > Traditional way (-DCLOBBER_CACHE_ALWAYS): 32:53:24 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09 > > New way: 16:15:43 > https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16 > > That's running sifaka's full test schedule including TAP tests, > which ordinarily takes it a shade under 10 minutes. The savings > on a non-TAP run would be a lot less, of course, thanks to > fewer initdb invocations. > > Although I lacked the patience to run a full back-branch cycle > with -DCLOBBER_CACHE_ALWAYS, I did verify that the patch > correctly injects that #define when running an old branch. > So I think it's ready to go into the buildfarm, modulo any > cosmetic work you might want to do. > > Yeah, I'm looking at it now. A couple of things: I think we should probably call the setting 'use_clobber_cache_always' since that's what it does. And I think we should probably put in a sanity check to make it incompatible with any -DCLOBBER_CACHE_* define in CPPFLAGS. Thoughts? There is one item I want to complete before putting out a new client release - making provision for a change in the name of the default git branch - the aim is that with the new release in place that will be completely seamless whenever it happens and whatever name is chosen. I hope to have that done in a week or so., so the new release would be out in about two weeks, if all goes well. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes: > On 7/3/21 6:59 PM, Tom Lane wrote: >> So I think it's ready to go into the buildfarm, modulo any >> cosmetic work you might want to do. > Yeah, I'm looking at it now. A couple of things: I think we should > probably call the setting 'use_clobber_cache_always' since that's what > it does. And I think we should probably put in a sanity check to make it > incompatible with any -DCLOBBER_CACHE_* define in CPPFLAGS. > Thoughts? No objections here. regards, tom lane