Re: Race to build pg_isolation_regress in "make -j check-world" - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Race to build pg_isolation_regress in "make -j check-world"
Msg-id 20180102013956.GA898314@rfd.leadboat.com
In response to Re: Race to build pg_isolation_regress in "make -j check-world"  (Noah Misch <noah@leadboat.com>)
List pgsql-hackers
On Sat, Dec 16, 2017 at 08:06:05PM -0800, Noah Misch wrote:
> On Mon, Nov 06, 2017 at 12:07:52AM -0800, Noah Misch wrote:

> I now see similar trouble from multiple
> "make" processes running "make -C contrib/test_decoding install" concurrently.
> This is a risk for any directory named in an EXTRA_INSTALL variable of more
> than one makefile.  Under the right circumstances, this would affect
> contrib/hstore and others in addition to contrib/test_decoding.  That brings
> me back to the locking idea:
> 
> > The problem of multiple "make" processes in a directory (especially src/port)
> > shows up elsewhere.  In a cleaned tree, "make -j -C src/bin" or "make -j
> > installcheck-world" will do it.  For more-prominent use cases, src/Makefile
> > prevents this with ".NOTPARALLEL:" and building first the directories that are
> > frequent submake targets.  Perhaps we could fix the general problem with
> > directory locking; targets that call "$(MAKE) -C FOO" would first sleep until
> > FOO's lock is available.  That could be tricky to make robust.
> 
> If one is willing to assume that a lock-holding process never crashes, locking
> in a shell script is simple: mkdir to lock, rmdir to unlock.  I don't want to
> assume that.  The bakery algorithm provides convenient opportunities for
> checking whether the last locker crashed; I have attached a shell script
> demonstrating this approach.  Better ideas?  Otherwise, I'll look into
> integrating this design into the makefiles.
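For reference, the mkdir/rmdir approach mentioned in the quote can be sketched in C (the lock directory name here is illustrative).  mkdir() is atomic, so whichever process creates the directory first holds the lock; as the quote notes, a crashed holder leaves the directory behind forever, which is exactly the assumption being rejected:

```c
/* Minimal sketch of mkdir()/rmdir() locking; assumes the holder never
 * crashes, since a crash leaves the lock directory in place. */
#include <errno.h>
#include <sys/stat.h>
#include <unistd.h>

static int
lock_with_mkdir(const char *lockdir)
{
    while (mkdir(lockdir, 0700) != 0)
    {
        if (errno != EEXIST)
            return -1;          /* unexpected failure */
        usleep(100 * 1000);     /* held by another process; retry shortly */
    }
    return 0;
}

static void
unlock_with_rmdir(const char *lockdir)
{
    rmdir(lockdir);
}
```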

Performance has been the principal snare.  I value "make -j" being fast when
there's little to rebuild, but that shell script approach slowed an empty
build by 340% (GNU/Linux w/ SSD) to 2300% (Cygwin).  In a build having nothing
to do, merely adding a no-op wrapper around "make -C" (does nothing but
execvp() the real GNU make) slowed the build by over 10%[1].  To get what I
considered decent performance took several design changes:

1. Replace the shell script implementation of the bakery algorithm with a C
   program that issues fcntl(F_SETLK) on the target directory's makefile.

2. Stop re-entering widely-referenced directories like src/common and src/port
   dozens of times per build.  Instead, enter them before any "make -C", then
   assume they're built if $(MAKELEVEL) is nonzero.  This reduced the total
   number of "make -C" calls and lock acquisitions from 276 to 147.

3. Lock only directories known to be entered more than once per top-level
   target.  Preliminarily, this reduced the 147 lock acquisitions to 25.

I regret that (3) entails ongoing maintenance of a whitelist of such
directories; adding a "make -C" call can expand the list.  However, I can
automate verification of that whitelist.  If I abandon (3), an empty build in
an NFS-mounted source directory slows from 6.4s to 22s; with all three design
changes intact, it slows to 6.6s.  I think that justifies keeping (3).

I considered abandoning (1).  An empty build would then slow from 6.6s to 7.7s
on NFS-mounted sources, and it would slow from 2.1s to 6.8s on Cygwin.  (I
expect similar on MSYS.)  That is perhaps acceptable.  It would save a few
lines of code and perhaps avoid portability snags.  Unlike abandoning (3), I
would not expect long-term maintenance savings.  For systems with low PID
entropy[2], (1) also eliminates unreliable code for detecting that a lock
holder has died.  I plan to keep all three design changes.


The non-performance problems that arose were easy to fix:

a. In addition to avoiding overlapping "make install" invocations in each
   directory, we must ensure no directory is installed more than once per
   tmp_install.  Otherwise, the second "make install" could unlink or
   overwrite a file while another process runs a test that uses the file.  I
   fixed this with stamp files like we use elsewhere in Makefile.global.

b. Our "make -C" call graph has cycles[3].  To avoid deadlocks, I accumulated
   lock-holding PIDs in an environment variable.  If the holder of a desired
   lock appears in that variable, then that holder is sleeping until after the
   present process completes.  We can then proceed without a lock acquisition.

Comments?  Otherwise, I'll look at finishing the patch series.

Thanks,
nm


[1] I was surprised to see that GNU make is this efficient at determining
    there's nothing to rebuild.

[2] For example, Cygwin reuses PIDs as often as every 645 processes.  Also,
    Cygwin processes have both a Cygwin PID and a Windows PID, and kill() can
    match either PID.  The shell script relied on kill(PID, 0) to deduce that
    a lock holder had died.  We may have ended up needing a secondary factor,
    like process start time, on such systems.

[3] For example, src/backend depends on libpgport_srv.a, and src/port depends
    on submake-errcodes of src/backend.

