Re: A couple of random BF failures in kerberosCheck - Mailing list pgsql-hackers

From Andrew Dunstan
Subject Re: A couple of random BF failures in kerberosCheck
Date
Msg-id 680020e8-b659-2405-1d94-76696ba7747f@2ndQuadrant.com
Whole thread Raw
In response to Re: A couple of random BF failures in kerberosCheck  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: A couple of random BF failures in kerberosCheck
List pgsql-hackers
On 8/3/19 6:42 PM, Tom Lane wrote:
> I wrote:
>> * kerberos/t/001_auth.pl just blithely assumes that it can pick
>> any random port above 48K and that's guaranteed to be free.
>> Maybe we should split out the code in get_new_node for finding
>> a free TCP port, so we can call it here?
> I've confirmed that the reason it's failing on my machine is exactly
> that krb5kdc tries to bind to a socket that is still in TIME_WAIT state.
> Also, it looks like the socket is typically one that was used by the
> GSSAPI client side (no surprise, the test leaves a lot more of those
> than the one server socket), so we'd have no record of it even if we
> were somehow saving state from prior runs.
>
> So I propose the attached patch, which seems to fix this for me.
>
> The particular case I'm looking at (running these tests in a tight
> loop) is of course not that interesting, but I argue that it's just
> increasing the odds of failure enough that I can isolate the cause.
> A buildfarm animal running both kerberos and ldap tests is almost
> certainly at risk of such a failure with low probability.
>
> (Still don't know what actually happened in those two buildfarm
> failures, though.)
>
>             


Looks good. A couple of minor nits:


. since we're exporting the name there's no need to document it as a
class method. I'd remove the "PostgresNode->" from the couple of places
you have it in the docco. You're not actually calling it that way
anywhere, and indeed doing so ends up passing 'PostgresNode' as a
useless parameter to the subroutine. This is different from calling it
with a qualified name (PostgresNode::get_free_port()).

. in the inner loop we should probably exit the loop if we set found to
0. There's no point testing other addresses in that case. Something like
"last unless found;" would do the trick.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: pglz performance
Next
From: "Jonathan S. Katz"
Date:
Subject: Re: First draft of back-branch release notes is done