Thread: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

From
Tom Lane
Date:
I had an idea for another way to attack $SUBJECT, now that we have
the ability to adjust debug_invalidate_system_caches_always at
runtime.  Namely, that a lot of time goes into repeated initdb
runs (especially if the TAP tests are enabled); but surely we
learn little from CCA initdb runs after the first one.  We could
trim this fat by:

(1) Instead of applying CLOBBER_CACHE_ALWAYS as a compile option,
add "debug_invalidate_system_caches_always = 1" to the buildfarm's
"extra_config" options, which are added to postgresql.conf after
initdb.  Thus, initdb will run without that option but all the
actual test cases will have it.

(2) To close the testing gap that now we have *no* CCA coverage
of initdb runs, adjust either the buildfarm's initdb-only steps
or initdb's 001_initdb.pl TAP script to set
"debug_invalidate_system_caches_always = 1" in one of the runs.
I think we can make that happen so far as the bootstrap backend is
concerned by including "-c debug_invalidate_system_caches_always=1"
on its command line; but of course initdb itself has no way to be
told to do that.  I think we could invent a "-c NAME=VALUE" switch
for initdb to tell it to pass down that switch to its child
backends.  Then there'd have to be some way to tell the calling
tests whether to do that.

(3) Since this only works in v14 and up, older branches would
have to fall back to -DCLOBBER_CACHE_ALWAYS.  Perhaps we could
improve the buildfarm client script so that buildfarm owners
just configure "clobber_cache_testing => 1" and then the script
would do the right branch-dependent thing.

Of course, we could eliminate the need for branch-dependent
logic if we cared to back-patch the addition of the
debug_invalidate_system_caches_always GUC, but that's probably
a bridge too far.

It looks to me like this would cut around an hour off of the
roughly-a-day cycle times of the existing CCA animals.  None
of them run any TAP tests, presumably because that would make
their cycle time astronomical, but maybe this change will help
make that practical.

Thoughts?
            regards, tom lane



Re: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

From
Andrew Dunstan
Date:
On 6/20/21 6:10 PM, Tom Lane wrote:
> (3) Since this only works in v14 and up, older branches would
> have to fall back to -DCLOBBER_CACHE_ALWAYS.  Perhaps we could
> improve the buildfarm client script so that buildfarm owners
> just configure "clobber_cache_testing => 1" and then the script
> would do the right branch-dependent thing.


Maybe. Let's see what you come up with.


>
> Of course, we could eliminate the need for branch-dependent
> logic if we cared to back-patch the addition of the
> debug_invalidate_system_caches_always GUC, but that's probably
> a bridge too far.


Yeah, I think so.


>
> It looks to me like this would cut around an hour off of the
> roughly-a-day cycle times of the existing CCA animals.  None
> of them run any TAP tests, presumably because that would make
> their cycle time astronomical, but maybe this change will help
> make that practical.
>

It might. I'm fairly sure there are a lot of repetitive cycles wasted in
the TAP tests, quite apart from initdb. We've become rather profligate
in our use of time and resources.


cheers


adrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Andrew Dunstan <andrew@dunslane.net> writes:
> On 6/20/21 6:10 PM, Tom Lane wrote:
>> (3) Since this only works in v14 and up, older branches would
>> have to fall back to -DCLOBBER_CACHE_ALWAYS.  Perhaps we could
>> improve the buildfarm client script so that buildfarm owners
>> just configure "clobber_cache_testing => 1" and then the script
>> would do the right branch-dependent thing.

> Maybe. Let's see what you come up with.

Here's a couple of draft-quality patches --- one for initdb, one
for the buildfarm --- to implement this idea.  These are just
lightly tested; in particular I've not had the patience to run
full BF cycles to see how much is actually saved.

            regards, tom lane

diff --git a/doc/src/sgml/ref/initdb.sgml b/doc/src/sgml/ref/initdb.sgml
index afd344b4c0..3077530c7b 100644
--- a/doc/src/sgml/ref/initdb.sgml
+++ b/doc/src/sgml/ref/initdb.sgml
@@ -388,6 +388,17 @@ PostgreSQL documentation
     Other, less commonly used, options are also available:

     <variablelist>
+     <varlistentry>
+      <term><option>--clobber-cache</option></term>
+      <listitem>
+       <para>
+        Run the bootstrap backend with the
+        <literal>debug_invalidate_system_caches_always=1</literal> option.
+        This takes a very long time and is only of use for deep debugging.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry>
       <term><option>-d</option></term>
       <term><option>--debug</option></term>
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 152d21e88b..0945d70061 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -202,6 +202,9 @@ static bool authwarning = false;
 static const char *boot_options = "-F";
 static const char *backend_options = "--single -F -O -j -c search_path=pg_catalog -c exit_on_error=true";

+/* Additional switches to pass to backend (either boot or standalone) */
+static char *extra_options = "";
+
 static const char *const subdirs[] = {
     "global",
     "pg_wal/archive_status",
@@ -962,12 +965,12 @@ test_config_settings(void)
         test_buffs = MIN_BUFS_FOR_CONNS(test_conns);

         snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --boot -x0 %s "
+                 "\"%s\" --boot -x0 %s %s "
                  "-c max_connections=%d "
                  "-c shared_buffers=%d "
                  "-c dynamic_shared_memory_type=%s "
                  "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options,
+                 backend_exec, boot_options, extra_options,
                  test_conns, test_buffs,
                  dynamic_shared_memory_type,
                  DEVNULL, DEVNULL);
@@ -998,12 +1001,12 @@ test_config_settings(void)
         }

         snprintf(cmd, sizeof(cmd),
-                 "\"%s\" --boot -x0 %s "
+                 "\"%s\" --boot -x0 %s %s "
                  "-c max_connections=%d "
                  "-c shared_buffers=%d "
                  "-c dynamic_shared_memory_type=%s "
                  "< \"%s\" > \"%s\" 2>&1",
-                 backend_exec, boot_options,
+                 backend_exec, boot_options, extra_options,
                  n_connections, test_buffs,
                  dynamic_shared_memory_type,
                  DEVNULL, DEVNULL);
@@ -1403,11 +1406,11 @@ bootstrap_template1(void)
     unsetenv("PGCLIENTENCODING");

     snprintf(cmd, sizeof(cmd),
-             "\"%s\" --boot -x1 -X %u %s %s %s",
+             "\"%s\" --boot -x1 -X %u %s %s %s %s",
              backend_exec,
              wal_segment_size_mb * (1024 * 1024),
              data_checksums ? "-k" : "",
-             boot_options,
+             boot_options, extra_options,
              debug ? "-d 5" : "");


@@ -2263,6 +2266,7 @@ usage(const char *progname)
     printf(_("  -X, --waldir=WALDIR       location for the write-ahead log directory\n"));
     printf(_("      --wal-segsize=SIZE    size of WAL segments, in megabytes\n"));
     printf(_("\nLess commonly used options:\n"));
+    printf(_("      --clobber-cache       use cache-clobbering debug option\n"));
     printf(_("  -d, --debug               generate lots of debugging output\n"));
     printf(_("  -L DIRECTORY              where to find the input files\n"));
     printf(_("  -n, --no-clean            do not clean up after errors\n"));
@@ -2863,8 +2867,8 @@ initialize_data_directory(void)
     fflush(stdout);

     snprintf(cmd, sizeof(cmd),
-             "\"%s\" %s template1 >%s",
-             backend_exec, backend_options,
+             "\"%s\" %s %s template1 >%s",
+             backend_exec, backend_options, extra_options,
              DEVNULL);

     PG_CMD_OPEN;
@@ -2943,6 +2947,7 @@ main(int argc, char *argv[])
         {"wal-segsize", required_argument, NULL, 12},
         {"data-checksums", no_argument, NULL, 'k'},
         {"allow-group-access", no_argument, NULL, 'g'},
+        {"clobber-cache", no_argument, NULL, 14},
         {NULL, 0, NULL, 0}
     };

@@ -3084,6 +3089,11 @@ main(int argc, char *argv[])
             case 'g':
                 SetDataDirectoryCreatePerm(PG_DIR_MODE_GROUP);
                 break;
+            case 14:
+                extra_options = psprintf("%s %s",
+                                         extra_options,
+                                         "-c debug_invalidate_system_caches_always=1");
+                break;
             default:
                 /* getopt_long already emitted a complaint */
                 fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
diff --git a/build-farm.conf.sample b/build-farm.conf.sample
index 7e80451..d7e6a62 100644
--- a/build-farm.conf.sample
+++ b/build-farm.conf.sample
@@ -98,6 +98,11 @@ my $confdir = File::Spec->rel2abs(File::Basename::dirname(__FILE__));
           --leak-check=no --gen-suppressions=all   --error-limit=no}
     ),

+    # if true run tests with debug_invalidate_system_caches_always=1, or
+    # the equivalent on older branches.  Do not set CLOBBER_CACHE_ALWAYS
+    # if you use this.
+    use_clobber_cache => undef,
+
     # path to directory with auxiliary web script
     # if relative, the must be relative to buildroot/branch
     # Now only used on older Msys installations
diff --git a/run_build.pl b/run_build.pl
index 9ed6056..5c945b0 100755
--- a/run_build.pl
+++ b/run_build.pl
@@ -178,6 +178,7 @@ my (
     $wait_timeout,              $use_accache,
     $use_valgrind,              $valgrind_options,
     $use_installcheck_parallel, $max_load_avg,
+    $use_clobber_cache,
     $archive_reports
   )
   = @PGBuild::conf{
@@ -186,7 +187,7 @@ my (
       use_vpath tar_log_cmd using_msvc extra_config make_jobs core_file_glob
       ccache_failure_remove wait_timeout use_accache
       use_valgrind valgrind_options use_installcheck_parallel max_load_avg
-      archive_reports)
+      use_clobber_cache archive_reports)
   };

 $ts_prefix = sprintf('%s:%-13s ', $animal, $branch);
@@ -651,6 +652,19 @@ if ($extra_config && $extra_config->{DEFAULT})
     }
 }

+if ($use_clobber_cache && ($branch eq 'HEAD' || $branch ge 'REL_14'))
+{
+    if (!exists $extra_config->{$branch})
+    {
+    $extra_config->{$branch} = ["debug_invalidate_system_caches_always = 1"];
+    }
+    else
+    {
+    push(@{ $extra_config->{$branch} },
+         "debug_invalidate_system_caches_always = 1");
+    }
+}
+
 if ($extra_config && $extra_config->{$branch})
 {
     my $tmpname = "$tmpdir/bfextra.conf";
@@ -1340,9 +1354,15 @@ sub initdb

     chdir $installdir;

+    my $initdbopts = qq{-A trust -U buildfarm --locale=$locale};
+
+    if ($use_clobber_cache && ($branch eq 'HEAD' || $branch ge 'REL_14'))
+    {
+        $initdbopts .= " --clobber-cache";
+    }
+
     @initout =
-      run_log(
-        qq{"bin/initdb" -A trust -U buildfarm --locale=$locale data-$locale});
+      run_log(qq{"bin/initdb" $initdbopts data-$locale});

     my $status = $? >> 8;

@@ -2370,6 +2390,17 @@ sub configure
             }
         }
     }
+    if ($use_clobber_cache && $branch ne 'HEAD' && $branch lt 'REL_14')
+    {
+        if (defined $env->{CPPFLAGS})
+        {
+        $env->{CPPFLAGS} .= " -DCLOBBER_CACHE_ALWAYS";
+        }
+        else
+        {
+        $env->{CPPFLAGS} = "-DCLOBBER_CACHE_ALWAYS";
+        }
+    }

     my $envstr = "";
     while (my ($key, $val) = each %$env)

Re: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

From
Andrew Dunstan
Date:
On 6/22/21 5:11 PM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 6/20/21 6:10 PM, Tom Lane wrote:
>>> (3) Since this only works in v14 and up, older branches would
>>> have to fall back to -DCLOBBER_CACHE_ALWAYS.  Perhaps we could
>>> improve the buildfarm client script so that buildfarm owners
>>> just configure "clobber_cache_testing => 1" and then the script
>>> would do the right branch-dependent thing.
>> Maybe. Let's see what you come up with.
> Here's a couple of draft-quality patches --- one for initdb, one
> for the buildfarm --- to implement this idea.  These are just
> lightly tested; in particular I've not had the patience to run
> full BF cycles to see how much is actually saved.
>
>             



Looks OK for the buildfarm patch. I wonder if we just want to run initdb
once with --clobber-cache instead of once per locale?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Andrew Dunstan <andrew@dunslane.net> writes:
> Looks OK for the buildfarm patch. I wonder if we just want to run initdb
> once with --clobber-cache instead of once per locale?

I thought about that, but I'm not sure it's appropriate for the buildfarm
client to be making that decision.  I do not think any of the CCA animals
run more than one locale anyway, so it's likely moot.

            regards, tom lane



Andrew Dunstan <andrew@dunslane.net> writes:
> On 6/22/21 5:11 PM, Tom Lane wrote:
>> Here's a couple of draft-quality patches --- one for initdb, one
>> for the buildfarm --- to implement this idea.  These are just
>> lightly tested; in particular I've not had the patience to run
>> full BF cycles to see how much is actually saved.

> Looks OK for the buildfarm patch. I wonder if we just want to run initdb
> once with --clobber-cache instead of once per locale?

So, where do we want to go with these?

I'm inclined to argue that it's okay to sneak the initdb change into
v14, on the grounds that it's needed to fully exploit the change
from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always.
Without it, there is no way to do CCA testing on the bootstrap process
except by reverting to the old hard-wired way of doing things.

Having pushed that, we could try out the buildfarm side of the
change and verify it's okay.

            regards, tom lane



Re: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

From
Andrew Dunstan
Date:
On 7/1/21 11:01 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> On 6/22/21 5:11 PM, Tom Lane wrote:
>>> Here's a couple of draft-quality patches --- one for initdb, one
>>> for the buildfarm --- to implement this idea.  These are just
>>> lightly tested; in particular I've not had the patience to run
>>> full BF cycles to see how much is actually saved.
>> Looks OK for the buildfarm patch. I wonder if we just want to run initdb
>> once with --clobber-cache instead of once per locale?
> So, where do we want to go with these?
>
> I'm inclined to argue that it's okay to sneak the initdb change into
> v14, on the grounds that it's needed to fully exploit the change
> from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always.
> Without it, there is no way to do CCA testing on the bootstrap process
> except by reverting to the old hard-wired way of doing things.
>
> Having pushed that, we could try out the buildfarm side of the
> change and verify it's okay.
>
>             



Seems reasonable. I don't have a CCA animal any more, but I guess I
could set up a test.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Andrew Dunstan <andrew@dunslane.net> writes:
> On 7/1/21 11:01 AM, Tom Lane wrote:
>> I'm inclined to argue that it's okay to sneak the initdb change into
>> v14, on the grounds that it's needed to fully exploit the change
>> from CLOBBER_CACHE_ALWAYS to debug_invalidate_system_caches_always.
>> Without it, there is no way to do CCA testing on the bootstrap process
>> except by reverting to the old hard-wired way of doing things.
>> 
>> Having pushed that, we could try out the buildfarm side of the
>> change and verify it's okay.

> Seems reasonable. I don't have a CCA animal any more, but I guess I
> could set up a test.

I can run a test here --- I'll commandeer sifaka for awhile,
since that's the fastest animal I have.

            regards, tom lane



I wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Seems reasonable. I don't have a CCA animal any more, but I guess I
>> could set up a test.

> I can run a test here --- I'll commandeer sifaka for awhile,
> since that's the fastest animal I have.

Done, and here's the results:

Traditional way (-DCLOBBER_CACHE_ALWAYS): 32:53:24
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09

New way: 16:15:43
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16

That's running sifaka's full test schedule including TAP tests,
which ordinarily takes it a shade under 10 minutes.  The savings
on a non-TAP run would be a lot less, of course, thanks to
fewer initdb invocations.

Although I lacked the patience to run a full back-branch cycle
with -DCLOBBER_CACHE_ALWAYS, I did verify that the patch
correctly injects that #define when running an old branch.
So I think it's ready to go into the buildfarm, modulo any
cosmetic work you might want to do.

            regards, tom lane



Re: Reducing the cycle time for CLOBBER_CACHE_ALWAYS buildfarm members

From
Andrew Dunstan
Date:
On 7/3/21 6:59 PM, Tom Lane wrote:
> I wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> Seems reasonable. I don't have a CCA animal any more, but I guess I
>>> could set up a test.
>> I can run a test here --- I'll commandeer sifaka for awhile,
>> since that's the fastest animal I have.
> Done, and here's the results:
>
> Traditional way (-DCLOBBER_CACHE_ALWAYS): 32:53:24
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-01%2018%3A06%3A09
>
> New way: 16:15:43
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sifaka&dt=2021-07-03%2004%3A02%3A16
>
> That's running sifaka's full test schedule including TAP tests,
> which ordinarily takes it a shade under 10 minutes.  The savings
> on a non-TAP run would be a lot less, of course, thanks to
> fewer initdb invocations.
>
> Although I lacked the patience to run a full back-branch cycle
> with -DCLOBBER_CACHE_ALWAYS, I did verify that the patch
> correctly injects that #define when running an old branch.
> So I think it's ready to go into the buildfarm, modulo any
> cosmetic work you might want to do.
>
>             




Yeah, I'm looking at it now. A couple of things: I think we should
probably call the setting 'use_clobber_cache_always' since that's what
it does. And I think we should probably put in a sanity check to make it
incompatible with any -DCLOBBER_CACHE_* define in CPPFLAGS.


Thoughts?


There is one item  I want to complete before putting out a new client
release - making provision for a change in the name of the default git
branch - the aim is that with the new release in place that will be
completely seamless whenever it happens and whatever name is chosen. I
hope to have that done in a week or so., so the new release would be out
in about two weeks, if all goes well.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com




Andrew Dunstan <andrew@dunslane.net> writes:
> On 7/3/21 6:59 PM, Tom Lane wrote:
>> So I think it's ready to go into the buildfarm, modulo any
>> cosmetic work you might want to do.

> Yeah, I'm looking at it now. A couple of things: I think we should
> probably call the setting 'use_clobber_cache_always' since that's what
> it does. And I think we should probably put in a sanity check to make it
> incompatible with any -DCLOBBER_CACHE_* define in CPPFLAGS.

> Thoughts?

No objections here.

            regards, tom lane