On 01/13/2013 10:58 AM, Kevin Grittner wrote:
> Andrew Dunstan wrote:
>
>> Part of the trouble with detecting rogue postmasters it might have left
>> lying around is that various things like to decide what port to run on,
>> so it's not always easy for the buildfarm to know what it should be
>> looking for.
> For Linux, perhaps some form of lsof with the +D option? Maybe?:
>
> lsof +D "$PGDATA" -Fp | grep -E '^p[0-9]{1,5}$' | cut -c2- | xargs kill -9
>
This actually won't help. In most cases the relevant data directory has
long disappeared out from under the rogue postmaster as part of
buildfarm cleanup. Also, lsof is not universally available. We try to
avoid creating new dependencies if possible.

Yesterday I committed a change that will let the buildfarm client ensure
that all the tests it runs are run on the configured build port.

Given that, we should be able to reliably detect a rogue postmaster
by testing for the existence of a socket at /tmp/.s.PGSQL.$buildport.
Certainly, having something there will cause a failure. I currently have
this test running both before a run starts and after it finishes on the
buildfarm development instance (crake), using perl's -S operator. If it
fails there will be a buildfarm failure on stage Pre-run-port-check or
Post-run-port-check.
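
As a rough shell sketch of that check (the buildfarm client itself does
this in perl with -S; the function name port_is_free and the optional
socket directory argument are made up here for illustration only):

```shell
#!/bin/sh
# Hypothetical helper, not actual buildfarm code.
# Returns 0 when no postmaster socket exists for the given port,
# 1 when one does. The socket directory defaults to /tmp.
port_is_free() {
    buildport=$1
    socket_dir=${2:-/tmp}
    sock="$socket_dir/.s.PGSQL.$buildport"

    # -S is true only if the path exists and is a socket.
    if [ -S "$sock" ]; then
        echo "socket $sock exists; possible rogue postmaster" >&2
        return 1
    fi
    return 0
}
```

It would be called as "port_is_free $buildport || exit 1" before a run
and again after it finishes.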

For the pre-run check I'm not inclined to do anything. If there's a
pre-existing listener on the required port it's an error and we'll just
abort, before we even try a checkout let alone anything else.

For the post-run check, we could possibly do something like

    fuser -k /tmp/.s.PGSQL.$buildport

although that's not portable either ;-( .
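
A guarded version of that idea might look something like the sketch
below. The function name kill_port_listener and the socket directory
argument are invented for illustration, and it assumes a fuser that
supports -k (which sends SIGKILL by default), so it is no more portable
than the one-liner:

```shell
#!/bin/sh
# Hypothetical helper, not actual buildfarm code.
# Kills whatever is holding the postmaster socket for the given port,
# if such a socket exists. The socket directory defaults to /tmp.
kill_port_listener() {
    buildport=$1
    socket_dir=${2:-/tmp}
    sock="$socket_dir/.s.PGSQL.$buildport"

    # Nothing to do when no socket is present.
    [ -S "$sock" ] || return 0

    # fuser is not available everywhere, so probe for it first.
    if ! command -v fuser >/dev/null 2>&1; then
        echo "fuser not found; cannot kill listener on $sock" >&2
        return 2
    fi

    # -k signals the processes using the socket (SIGKILL by default).
    fuser -k "$sock"
}
```

Since SIGKILL gives the postmaster no chance to clean up, the socket
file would likely be left behind and need removing separately.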

None of this helps for msvc or mingw builds where there's no unix
socket, and I'll have to come up with another idea. But it's a start.

cheers

andrew