On 8/3/19 6:42 PM, Tom Lane wrote:
> I wrote:
>> * kerberos/t/001_auth.pl just blithely assumes that it can pick
>> any random port above 48K and that's guaranteed to be free.
>> Maybe we should split out the code in get_new_node for finding
>> a free TCP port, so we can call it here?
> I've confirmed that the reason it's failing on my machine is exactly
> that krb5kdc tries to bind to a socket that is still in TIME_WAIT state.
> Also, it looks like the socket is typically one that was used by the
> GSSAPI client side (no surprise, the test leaves a lot more of those
> than the one server socket), so we'd have no record of it even if we
> were somehow saving state from prior runs.
>
> So I propose the attached patch, which seems to fix this for me.
>
> The particular case I'm looking at (running these tests in a tight
> loop) is of course not that interesting, but I argue that it's just
> increasing the odds of failure enough that I can isolate the cause.
> A buildfarm animal running both kerberos and ldap tests is almost
> certainly at risk of such a failure with low probability.
>
> (Still don't know what actually happened in those two buildfarm
> failures, though.)
>
>
Looks good. A couple of minor nits:
. since we're exporting the name there's no need to document it as a
class method. I'd remove the "PostgresNode->" from the couple of places
you have it in the docco. You're not actually calling it that way
anywhere, and indeed doing so ends up passing 'PostgresNode' as a
useless parameter to the subroutine. This is different from calling it
with a qualified name (PostgresNode::get_free_port()).
. in the inner loop we should probably exit the loop if we set found to
0. There's no point testing other addresses in that case. Something like
"last unless found;" would do the trick.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services