Re: "pg_ctl: the PID file ... is empty" at end of make check - Mailing list pgsql-hackers

From Tom Lane
Subject Re: "pg_ctl: the PID file ... is empty" at end of make check
Date
Msg-id 9628.1543379310@sss.pgh.pa.us
In response to "pg_ctl: the PID file ... is empty" at end of make check  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: "pg_ctl: the PID file ... is empty" at end of make check  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
Thomas Munro <thomas.munro@enterprisedb.com> writes:
> Today I saw a one-off case of $SUBJECT, on macOS.  I can't reproduce
> it, but I noticed exactly the same thing on longfin the other day:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-25%2005%3A39%3A04

I trawled the buildfarm logs and discovered a second instance of exactly
the same thing:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=longfin&dt=2018-11-19%2018%3A37%3A00

There have not been any other occurrences in the past 3 months, which is
as far back as I went.  (lorikeet has half a dozen occurrences of "could
not stop postmaster", which is what I was grepping for, but they all
are associated with that machine's intermittent postmaster crashes.)

So that rules out the flaky-hardware theory: the 2018-11-19 occurrence
predates longfin's hardware transplant.

Also, I don't think I believe the OS-bug idea either, given that you
saw it on 10.14.0.  longfin's been running 10.14.something since
2018-09-26, and has accumulated circa 200 runs since then just on HEAD,
never mind the back branches.  It'd be pretty unlikely to see it only
in the past week, and only on HEAD, if it were an OS bug introduced two
months ago.

So my theory is we broke something in HEAD a couple weeks ago.  But what?

The fsync changes you made are suspiciously close to this issue (i.e., one
could explain it as written data not getting out), and were committed in
the right time frame, but that change didn't affect writes to
postmaster.pid, did it?

            regards, tom lane

