"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> A look at the code shows that it is merely trying to run psql, and
>> if psql reports that it can connect to the specified port, then
>> pg_regress thinks the postmaster started OK. Of course, psql was
>> really reporting that it could connect to the other instance's
>> postmaster.
> Clearly picking unique ports for `make check` is the ultimate
> solution, but I'm curious whether this would have been caught sooner
> with less effort if the pg_ctl TODO titled "Have the postmaster
> write a random number to a file on startup that pg_ctl checks
> against the contents of a pg_ping response on its initial connection
> (without login)" had been implemented.
It would certainly make the failure more transparent. As I mentioned,
there are previous buildfarm failures that look like they might be
caused by a similar conflict, but it's seldom possible to be sure.
A cross-check like that would be much safer.

BTW, I don't know why anyone would think that "a random number" would
offer any advantage here. I'd use the postmaster PID, which is
guaranteed to be unique across the space that you're worried about.
In fact, you could implement this off the existing postmaster.pid,
no need for any new file. What's lacking is the pg_ping protocol.
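
A minimal sketch of the cross-check described above, in Python for brevity. It assumes only that the first line of postmaster.pid in the data directory holds the postmaster's PID. The `reported_pid` argument stands in for whatever a pg_ping-style reply would carry; no such protocol exists yet, so that part and both function names are hypothetical.

```python
from pathlib import Path


def read_postmaster_pid(data_dir):
    """Return the PID stored on the first line of postmaster.pid.

    Returns None if the file is missing, unreadable, or does not
    start with an integer.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        with pid_file.open() as f:
            first_line = f.readline().strip()
    except OSError:
        return None
    try:
        return int(first_line)
    except ValueError:
        return None


def is_expected_postmaster(data_dir, reported_pid):
    """Check that the server we reached is the one we started.

    reported_pid is the PID a (hypothetical) pg_ping reply would
    return from the server listening on the port we connected to.
    """
    expected = read_postmaster_pid(data_dir)
    return expected is not None and expected == reported_pid
```

With a check like this, a caller such as pg_regress would treat a successful connection as "started OK" only when `is_expected_postmaster()` returns true, so a reply from another instance's postmaster on the same port would be reported as a failure instead of passing silently.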
regards, tom lane