[HACKERS] WIP patch for avoiding duplicate initdb runs during "make check" - Mailing list pgsql-hackers

From Tom Lane
Subject [HACKERS] WIP patch for avoiding duplicate initdb runs during "make check"
Date
Msg-id 19567.1499016819@sss.pgh.pa.us
Whole thread Raw
Responses Re: [HACKERS] WIP patch for avoiding duplicate initdb runs during "make check"
Re: [HACKERS] WIP patch for avoiding duplicate initdb runs during"make check"
List pgsql-hackers
Yesterday I spent a bit of time on an idea that we've talked about
before, which is to not run initdb over and over again in contexts like
"make check-world", or even just during "make check" in contrib or in the
recovery or subscription tests.  The idea would be to do it once and
then copy the created data directory tree into place for each test
run.  Thus we might trade this:

$ time initdb -D $PGDATA -N
...
real    0m1.235s
user    0m0.896s
sys     0m0.341s

for this:

$ time cp -a $PGDATA junk
real    0m0.146s
user    0m0.008s
sys     0m0.136s

The attached patch is far from being committable, but it's complete enough
to get an idea of what sort of overall performance gain we might expect.
On my workstation, I've lately been doing check-world like this:
    time make -s check-world -j4 PROVE_FLAGS='-j4'
and what I see is that HEAD takes about this long:
    real    2m1.514s
    user    2m8.113s
    sys     1m15.993s
and with this patch I get
    real    1m42.549s
    user    0m42.888s
    sys     0m46.982s
(These are median-of-3-runs; the real time bounces around a fair bit,
the CPU times not very much.)

On my laptop, a simple non-parallelized "time make check-world" takes
    real    9m12.813s
    user    2m15.113s
    sys     1m25.631s
and with patch
    real    8m3.785s
    user    1m19.024s
    sys     1m13.707s

So those are respectable gains, but not earth-shattering.  I'm not
sure if it's worth putting more work into it right now.  The main
thing that's needed to make it committable is to get rid of the
system("cp -a ...") call in favor of something more portable.
I looked longingly at our existing copydir() function but it seems
pretty tied to the backend environment.  It wouldn't be hard to
make a version for frontend, but I'm not going to expend that work
right now unless people are excited enough to want to commit this
now rather than later (or not at all).

(Other unfinished work: teaching the MSVC scripts to use this,
and teaching pg_upgrade's test script to use it.)

            regards, tom lane

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index dc8a89a..a28a03a 100644
*** a/src/Makefile.global.in
--- b/src/Makefile.global.in
*************** ifeq ($(MAKELEVEL),0)
*** 332,337 ****
--- 332,338 ----
      rm -rf '$(abs_top_builddir)'/tmp_install
      $(MKDIR_P) '$(abs_top_builddir)'/tmp_install/log
      $(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install
>'$(abs_top_builddir)'/tmp_install/log/install.log2>&1 
+     '$(abs_top_builddir)/tmp_install$(bindir)/initdb' -D '$(abs_top_builddir)/tmp_install/proto_pgdata' --no-clean
--no-sync-A trust >'$(abs_top_builddir)'/tmp_install/log/initdb.log 2>&1 
  endif
      $(if $(EXTRA_INSTALL),for extra in $(EXTRA_INSTALL); do $(MAKE) -C '$(top_builddir)'/$$extra
DESTDIR='$(abs_top_builddir)'/tmp_installinstall >>'$(abs_top_builddir)'/tmp_install/log/install.log || exit; done) 
  endif
*************** $(if $(filter $(PORTNAME),darwin),DYLD_L
*** 355,361 ****
  endef

  define with_temp_install
! PATH="$(abs_top_builddir)/tmp_install$(bindir):$$PATH" $(call
add_to_path,$(ld_library_path_var),$(abs_top_builddir)/tmp_install$(libdir))
  endef

  ifeq ($(enable_tap_tests),yes)
--- 356,362 ----
  endef

  define with_temp_install
! PATH="$(abs_top_builddir)/tmp_install$(bindir):$$PATH" $(call
add_to_path,$(ld_library_path_var),$(abs_top_builddir)/tmp_install$(libdir))
PG_PROTO_DATADIR='$(abs_top_builddir)/tmp_install/proto_pgdata'
  endef

  ifeq ($(enable_tap_tests),yes)
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index f4fa600..22c1a72 100644
*** a/src/test/perl/PostgresNode.pm
--- b/src/test/perl/PostgresNode.pm
*************** sub init
*** 404,411 ****
      mkdir $self->backup_dir;
      mkdir $self->archive_dir;

!     TestLib::system_or_bail('initdb', '-D', $pgdata, '-A', 'trust', '-N',
!         @{ $params{extra} });
      TestLib::system_or_bail($ENV{PG_REGRESS}, '--config-auth', $pgdata);

      open my $conf, '>>', "$pgdata/postgresql.conf";
--- 404,426 ----
      mkdir $self->backup_dir;
      mkdir $self->archive_dir;

!     # If we're using default initdb parameters, and the top-level "make check"
!     # created a prototype data directory for us, just copy that.
!     if (!defined($params{extra}) &&
!         defined($ENV{PG_PROTO_DATADIR}) &&
!         -d $ENV{PG_PROTO_DATADIR})
!     {
!         rmdir($pgdata);
!         RecursiveCopy::copypath($ENV{PG_PROTO_DATADIR}, $pgdata);
!         chmod(0700, $pgdata);
!     }
!     else
!     {
!         TestLib::system_or_bail('initdb', '-D', $pgdata,
!                                 '--no-clean', '--no-sync', '-A', 'trust',
!                                 @{ $params{extra} });
!     }
!
      TestLib::system_or_bail($ENV{PG_REGRESS}, '--config-auth', $pgdata);

      open my $conf, '>>', "$pgdata/postgresql.conf";
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index abb742b..04d7fb9 100644
*** a/src/test/regress/pg_regress.c
--- b/src/test/regress/pg_regress.c
*************** regression_main(int argc, char *argv[],
*** 2214,2219 ****
--- 2214,2221 ----

      if (temp_instance)
      {
+         char       *pg_proto_datadir;
+         struct stat ppst;
          FILE       *pg_conf;
          const char *env_wait;
          int            wait_seconds;
*************** regression_main(int argc, char *argv[],
*** 2243,2262 ****
          if (!directory_exists(buf))
              make_directory(buf);

!         /* initdb */
!         header(_("initializing database system"));
!         snprintf(buf, sizeof(buf),
!                  "\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync%s%s > \"%s/log/initdb.log\" 2>&1",
!                  bindir ? bindir : "",
!                  bindir ? "/" : "",
!                  temp_instance,
!                  debug ? " --debug" : "",
!                  nolocale ? " --no-locale" : "",
!                  outputdir);
!         if (system(buf))
          {
!             fprintf(stderr, _("\n%s: initdb failed\nExamine %s/log/initdb.log for the reason.\nCommand was: %s\n"),
progname,outputdir, buf); 
!             exit(2);
          }

          /*
--- 2245,2287 ----
          if (!directory_exists(buf))
              make_directory(buf);

!         /* see if we should run initdb or just copy a prototype datadir */
!         pg_proto_datadir = getenv("PG_PROTO_DATADIR");
!         if (!nolocale &&
!             pg_proto_datadir &&
!             stat(pg_proto_datadir, &ppst) == 0 &&
!             S_ISDIR(ppst.st_mode))
          {
!             /* copy prototype */
!             header(_("copying prototype data directory"));
!             snprintf(buf, sizeof(buf),
!                      "cp -a \"%s\" \"%s/data\" > \"%s/log/initdb.log\" 2>&1",
!                      pg_proto_datadir,
!                      temp_instance,
!                      outputdir);
!             if (system(buf))
!             {
!                 fprintf(stderr, _("\n%s: cp failed\nExamine %s/log/initdb.log for the reason.\nCommand was: %s\n"),
progname,outputdir, buf); 
!                 exit(2);
!             }
!         }
!         else
!         {
!             /* run initdb */
!             header(_("initializing database system"));
!             snprintf(buf, sizeof(buf),
!                      "\"%s%sinitdb\" -D \"%s/data\" --no-clean --no-sync%s%s > \"%s/log/initdb.log\" 2>&1",
!                      bindir ? bindir : "",
!                      bindir ? "/" : "",
!                      temp_instance,
!                      debug ? " --debug" : "",
!                      nolocale ? " --no-locale" : "",
!                      outputdir);
!             if (system(buf))
!             {
!                 fprintf(stderr, _("\n%s: initdb failed\nExamine %s/log/initdb.log for the reason.\nCommand was:
%s\n"),progname, outputdir, buf); 
!                 exit(2);
!             }
          }

          /*

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Race-like failure in recovery/t/009_twophase.pl