Thread: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
My buildfarm animals dromedary and prairiedog have been failing since
around 9AM EDT on Sunday.  The buildfarm script output isn't very
detailed:

getting branches of interest (https://buildfarm.postgresql.org/branches_of_interest.txt) at ./run_branches.pl line 129.

but trying it manually yields

$ curl https://buildfarm.postgresql.org/branches_of_interest.txt
curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

The same thing works fine on newer machines though, as does fetching with
http: instead of https:.  Have we done something recently to create an
incompatibility with old SSL stacks?

            regards, tom lane
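
The hex string in that curl message is a packed OpenSSL error code. A minimal sketch of unpacking it follows, assuming the pre-3.0 OpenSSL layout (8 bits of library, 12 bits of function, 12 bits of reason) and assuming that reason values of 1000 and above are a received TLS alert number plus 1000. The function name `decode_openssl_err` is made up for illustration and is not part of any tool mentioned in the thread.

```shell
# Hypothetical helper: unpack an OpenSSL (pre-3.0) packed error code.
# Assumed layout: 8 bits library | 12 bits function | 12 bits reason.
decode_openssl_err() {
    code=$((0x$1))
    lib=$(( (code >> 24) & 0xff ))
    func=$(( (code >> 12) & 0xfff ))
    reason=$(( code & 0xfff ))
    printf 'lib=%d func=%d reason=%d' "$lib" "$func" "$reason"
    # Assumption: reasons >= 1000 encode "TLS alert number + 1000".
    if [ "$reason" -ge 1000 ]; then
        printf ' alert=%d' "$((reason - 1000))"
    fi
    printf '\n'
}

decode_openssl_err 1407742E
```

For the code above this prints `lib=20 func=119 reason=1070 alert=70`. Alert 70 is `protocol_version` in the TLS specification, which agrees with the "tlsv1 alert protocol version" text: the server rejected the protocol version the client offered, rather than its certificate.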


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Alvaro Herrera
Date:
On 2018-Jul-16, Tom Lane wrote:

> My buildfarm animals dromedary and prairiedog have been failing since
> around 9AM EDT on Sunday.  The buildfarm script output isn't very
> detailed:
> 
> getting branches of interest (https://buildfarm.postgresql.org/branches_of_inte\
> rest.txt) at ./run_branches.pl line 129.
> 
> but trying it manually yields
> 
> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
> 
> The same thing works fine on newer machines though, as does fetching with
> http: instead of https:.  Have we done something recently to create an
> incompatibility with old SSL stacks?

Yeah, there were a few updates that day at 11am UTC; particularly the
ca-certificates package was updated (to version 20161130+nmu1+deb9u1).
I don't know why this would be significant (is the server trying to
verify the client's cert?), but here's the changelog:

ca-certificates (20161130+nmu1+deb9u1) stretch; urgency=medium

  * debian/ca-certificates.postinst:
    Prevent postinst failure on read-only /usr/local. Closes: #843722
  * debian/control:
    Remove Christian Perrier from uploaders at his request. Closes: #894070
  * mozilla/{certdata.txt,nssckbi.h}:
    Update Mozilla certificate authority bundle to version 2.22.
    Closes: #858064
    The following certificate authorities were added (+):
    + "AC RAIZ FNMT-RCM"
    + "Amazon Root CA 1"
    + "Amazon Root CA 2"
    + "Amazon Root CA 3"
    + "Amazon Root CA 4"
    + "D-TRUST Root CA 3 2013"
    + "GDCA TrustAUTH R5 ROOT"
    + "LuxTrust Global Root 2"
    + "SSL.com EV Root Certification Authority ECC"
    + "SSL.com EV Root Certification Authority RSA R2"
    + "SSL.com Root Certification Authority ECC"
    + "SSL.com Root Certification Authority RSA"
    + "Symantec Class 1 Public Primary Certification Authority - G4"
    + "Symantec Class 1 Public Primary Certification Authority - G6"
    + "Symantec Class 2 Public Primary Certification Authority - G4"
    + "Symantec Class 2 Public Primary Certification Authority - G6"
    + "TrustCor ECA-1"
    + "TrustCor RootCert CA-1"
    + "TrustCor RootCert CA-2"
    + "TUBITAK Kamu SM SSL Kok Sertifikasi - Surum 1"
    The following certificate authorities were removed (-):
    - "ACEDICOM Root"
    - "AddTrust Public Services Root"
    - "AddTrust Qualified Certificates Root"
    - "ApplicationCA - Japanese Government"
    - "Buypass Class 2 CA 1"
    - "CA Disig Root R1"
    - "Certinomis - Autorité Racine"
    - "China Internet Network Information Center EV Certificates Root"
    - "CNNIC ROOT"
    - "Comodo Secure Services root"
    - "Comodo Trusted Services root"
    - "DST ACES CA X6"
    - "EBG Elektronik Sertifika Hizmet Saglayicisi"
    - "Equifax Secure CA"
    - "Equifax Secure eBusiness CA 1"
    - "Equifax Secure Global eBusiness CA"
    - "GeoTrust Global CA 2"
    - "IGC/A"
    - "Juur-SK"
    - "Microsec e-Szigno Root CA"
    - "PSCProcert"
    - "Root CA Generalitat Valenciana"
    - "RSA Security 2048 v3"
    - "Security Communication EV RootCA1"
    - "S-TRUST Authentication and Encryption Root CA 2005 PN"
    - "Swisscom Root CA 1"
    - "Swisscom Root EV CA 2"
    - "TUBITAK UEKAE Kok Sertifika Hizmet Saglayicisi - Surum 3"
    - "TURKTRUST Certificate Services Provider Root 2007"
    - "TÜRKTRUST Elektronik Sertifika Hizmet Sağlayıcısı H6"
    - "UTN USERFirst Hardware Root CA"
    - "Verisign Class 1 Public Primary Certification Authority"
    - "Verisign Class 2 Public Primary Certification Authority - G2"
    - "Verisign Class 3 Public Primary Certification Authority"
    - "WellsSecure Public Root Certificate Authority"

 -- Michael Shuler <michael@pbandjelly.org>  Sat, 07 Jul 2018 01:08:40 +0200


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2018-Jul-16, Tom Lane wrote:
>> My buildfarm animals dromedary and prairiedog have been failing since
>> around 9AM EDT on Sunday. ... Have we done something recently to create an
>> incompatibility with old SSL stacks?

> Yeah, there were a few updates that day at 11am UTC; particularly the
> ca-certificates package was updated (to version 20161130+nmu1+deb9u1).

Ah, that sounds plausibly related.  Guess I need a certificate update
on those machines.  Thanks!

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 7:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>> On 2018-Jul-16, Tom Lane wrote:
>>> My buildfarm animals dromedary and prairiedog have been failing since
>>> around 9AM EDT on Sunday. ... Have we done something recently to create an
>>> incompatibility with old SSL stacks?
>
>> Yeah, there were a few updates that day at 11am UTC; particularly the
>> ca-certificates package was updated (to version 20161130+nmu1+deb9u1).
>
> Ah, that sounds plausibly related.  Guess I need a certificate update
> on those machines.  Thanks!

We also changed some of the server setup so there is now a haproxy that's doing the SSL termination. So there is probably a slightly different configuration of available SSL algorithms and such as well. It might be either one of those two; both changes happened not too far apart on that day.

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Adrian Klaver
Date:
On 07/16/2018 08:31 PM, Tom Lane wrote:
> My buildfarm animals dromedary and prairiedog have been failing since
> around 9AM EDT on Sunday.  The buildfarm script output isn't very
> detailed:
> 
> getting branches of interest (https://buildfarm.postgresql.org/branches_of_inte\
> rest.txt) at ./run_branches.pl line 129.
> 
> but trying it manually yields
> 
> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
> 
> The same thing works fine on newer machines though, as does fetching with
> http: instead of https:.  Have we done something recently to create an
> incompatibility with old SSL stacks?

Maybe something to do with this?:

https://blog.pcisecuritystandards.org/are-you-ready-for-30-june-2018-sayin-goodbye-to-ssl-early-tls


> 
>             regards, tom lane
> 


-- 
Adrian Klaver
adrian.klaver@aklaver.com


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 3:20 PM, Adrian Klaver <adrian.klaver@aklaver.com> wrote:
> On 07/16/2018 08:31 PM, Tom Lane wrote:
>> My buildfarm animals dromedary and prairiedog have been failing since
>> around 9AM EDT on Sunday.  The buildfarm script output isn't very
>> detailed:
>>
>> getting branches of interest (https://buildfarm.postgresql.org/branches_of_inte\
>> rest.txt) at ./run_branches.pl line 129.
>>
>> but trying it manually yields
>>
>> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
>> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
>>
>> The same thing works fine on newer machines though, as does fetching with
>> http: instead of https:.  Have we done something recently to create an
>> incompatibility with old SSL stacks?
>
> Maybe something to do with this?:
>
> https://blog.pcisecuritystandards.org/are-you-ready-for-30-june-2018-sayin-goodbye-to-ssl-early-tls

Our buildfarm does not require PCI classification :P


--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 7:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> My buildfarm animals dromedary and prairiedog have been failing since
>>> around 9AM EDT on Sunday. ... Have we done something recently to create
>>> an incompatibility with old SSL stacks?

> We also changed some of the server setup so there is now a haproxy that's
> doing the SSL termination. So there is probably a slightly different
> configuration of available SSL algorithms and such as well. It might be
> either one of those two, both changes happened not too far apart on that
> day.

Hm.  Closer investigation suggests that there's something else wrong.
While, as I said, curl works for non-SSL connections:

$ curl http://buildfarm.postgresql.org/branches_of_interest.txt
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

doing the same thing the way the buildfarm script does it does not work:

$ perl -MLWP::Simple -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
500 Can't connect to buildfarm.postgresql.org:80 (No route to host)
<URL:http://buildfarm.postgresql.org/branches_of_interest.txt>

That's on dromedary's host with perl 5.10.0.  Even weirder, it
*does* work on prairiedog's host with perl 5.8.3.  I think that the
latter installation is newer and hence may have newer copies of
some CPAN-supplied modules, but I'm not sure how to debug further.

Also, on prairiedog's host, this is what I get for the https case:

$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
500 Can't connect to buildfarm.postgresql.org:443 <URL:https://buildfarm.postgresql.org/branches_of_interest.txt>

which isn't terribly informative but it doesn't look like an SSL
certificate failure.

I've temporarily revived prairiedog by changing its config to report
to http not https.  But dromedary is dead in the water until this
gets sorted.

BTW, Noah's AIX critters may be suffering from the same problem;
I'd have expected them to report in by now on recent HEAD changes...

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Alvaro Herrera
Date:
On 2018-Jul-17, Tom Lane wrote:


> doing the same thing the way the buildfarm script does it does not work:
> 
> $ perl -MLWP::Simple -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host)
> <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>

I don't know if Varnish catches https calls as well as http, but if it
does, it could very well be related.  A Varnish cache was added recently
to buildfarm.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 7:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> My buildfarm animals dromedary and prairiedog have been failing since
>>>> around 9AM EDT on Sunday. ... Have we done something recently to create
>>>> an incompatibility with old SSL stacks?
>
>> We also changed some of the server setup so there is now a haproxy that's
>> doing the SSL termination. So there is probably a slightly different
>> configuration of available SSL algorithms and such as well. It might be
>> either one of those two, both changes happened not too far apart on that
>> day.
>
> Hm.  Closer investigation suggests that there's something else wrong.
> While, as I said, curl works for non-SSL connections:
>
> $ curl http://buildfarm.postgresql.org/branches_of_interest.txt
> REL9_3_STABLE
> REL9_4_STABLE
> REL9_5_STABLE
> REL9_6_STABLE
> REL_10_STABLE
> REL_11_STABLE
> HEAD
>
> doing the same thing the way the buildfarm script does it does not work:
>
> $ perl -MLWP::Simple -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host) <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>

OK, that's just weird. It's failing to connect on port *80* with a "No route to host" error? That sounds more like it would be on a network layer?

I could understand many weird errors on it, but no route to host seems extremely weird. Almost indicates it would be connecting to the wrong IP.

> That's on dromedary's host with perl 5.10.0.  Even weirder, it
> *does* work on prairiedog's host with perl 5.8.3.  I think that the
> latter installation is newer and hence may have newer copies of
> some CPAN-supplied modules, but I'm not sure how to debug further.
>
> Also, on prairiedog's host, this is what I get for the https case:
>
> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
> 500 Can't connect to buildfarm.postgresql.org:443 <URL:https://buildfarm.postgresql.org/branches_of_interest.txt>
>
> which isn't terribly informative but it doesn't look like an SSL
> certificate failure.

That one I believe more in since it could be because of SSL issues. What do you get with curl on that one?


--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 7:08 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2018-Jul-17, Tom Lane wrote:
>
>> doing the same thing the way the buildfarm script does it does not work:
>>
>> $ perl -MLWP::Simple -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
>> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host) <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>
>
> I don't know if Varnish catches https calls as well as http, but if it
> does, it could very well be related.  A Varnish cache was added recently
> to buildfarm.

https is terminated in haproxy and relayed from there. Varnish doesn't speak native https.

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Also, on prairiedog's host, this is what I get for the https case:
>>
>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>> https://buildfarm.postgresql.org/branches_of_interest.txt");'
>> 500 Can't connect to buildfarm.postgresql.org:443 <URL:https://buildfarm.
>> postgresql.org/branches_of_interest.txt>
>>
>> which isn't terribly informative but it doesn't look like an SSL
>> certificate failure.

> That one I believe more in since it could be because of SSL issues. What do
> you get with curl on that one?

Both machines show the same behavior with curl:

$ curl https://buildfarm.postgresql.org/branches_of_interest.txt
curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version
$ curl http://buildfarm.postgresql.org/branches_of_interest.txt
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

Now, curl is the OS-supplied one and probably isn't sharing any userspace
infrastructure at all with prairiedog's Perl stack.  On the other hand,
dromedary is using Apple's perl installation so it's possible that it
shares root-certificate infrastructure with curl.

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Also, on prairiedog's host, this is what I get for the https case:
>>>
>>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>>> https://buildfarm.postgresql.org/branches_of_interest.txt");'
>>> 500 Can't connect to buildfarm.postgresql.org:443 <URL:https://buildfarm.
>>> postgresql.org/branches_of_interest.txt>
>>>
>>> which isn't terribly informative but it doesn't look like an SSL
>>> certificate failure.
>
>> That one I believe more in since it could be because of SSL issues. What do
>> you get with curl on that one?
>
> Both machines show the same behavior with curl:
>
> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert protocol version

Ah. Some googling shows that does seem to indicate an old version of OpenSSL.

The old config rejected sslv2 and sslv3, but allowed tlsv1.

The new one refuses both tlsv1 and tlsv1.1, allowing only tlsv1.2.

As a check if this might be it, I have at least temporarily removed that restriction. Can you try again now?

> $ curl http://buildfarm.postgresql.org/branches_of_interest.txt
> REL9_3_STABLE
> REL9_4_STABLE
> REL9_5_STABLE
> REL9_6_STABLE
> REL_10_STABLE
> REL_11_STABLE
> HEAD
>
> Now, curl is the OS-supplied one and probably isn't sharing any userspace
> infrastructure at all with prairiedog's Perl stack.  On the other hand,
> dromedary is using Apple's perl installation so it's possible that it
> shares root-certificate infrastructure with curl.

 

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Both machines show the same behavior with curl:
>> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
>> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert
>> protocol version

> Ah. Some googling shows that does seem to indicate an old version of
> OpenSSL.
> The old config rejected sslv2 and sslv3, but allowed tlsv1.
> The new one refuses both tlsv1 and tlsv1.1, allowing only tlsv1.2.
> As a check if this might be it, I have at least temporarily removed that
> restriction. Can you try again now?

Same results, both via curl and via perl.

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 7:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 7:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Both machines show the same behavior with curl:
>>> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
>>> curl: (35) error:1407742E:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert
>>> protocol version
>
>> Ah. Some googling shows that does seem to indicate an old version of
>> OpenSSL.
>> The old config rejected sslv2 and sslv3, but allowed tlsv1.
>> The new one refuses both tlsv1 and tlsv1.1, allowing only tlsv1.2.
>> As a check if this might be it, I have at least temporarily removed that
>> restriction. Can you try again now?
>
> Same results, both via curl and via perl.

Ha. I changed the client config instead of the server :/ Sorry about that, once more?

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 7:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Magnus Hagander <magnus@hagander.net> writes:
>>> The old config rejected sslv2 and sslv3, but allowed tlsv1.
>>> The new one refuses both tlsv1 and tlsv1.1, allowing only tlsv1.2.
>>> As a check if this might be it, I have at least temporarily removed that
>>> restriction. Can you try again now?

>> Same results, both via curl and via perl.

> Ha. I changed the client config instead of the server :/ Sorry about that,
> once more?

Better.  On prairiedog, I now get

$ curl https://buildfarm.postgresql.org/branches_of_interest.txt
curl: (60) SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
More details here: http://curl.haxx.se/docs/sslcerts.html

curl performs SSL certificate verification by default, using a "bundle"
 of Certificate Authority (CA) public keys (CA certs). The default
 bundle is named curl-ca-bundle.crt; you can specify an alternate file
 using the --cacert option.
If this HTTPS server uses a certificate signed by a CA represented in
 the bundle, the certificate verification probably failed due to a
 problem with the certificate (it might be expired, or the name might
 not match the domain name in the URL).
If you'd like to turn off curl's verification of the certificate, use
 the -k (or --insecure) option.

and with -k it actually works:

$ curl -k https://buildfarm.postgresql.org/branches_of_interest.txt
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

and what's more useful for the purpose at hand, so does perl:

$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

The fact that Perl is happy may have something to do with my having just
updated Mozilla::CA on these machines, which so far as I can find is Perl's
only source of root certs.  But curl is using the OS' keystore which of
course is horribly behind the times.

The results on dromedary are even more interesting:

$ curl https://buildfarm.postgresql.org/branches_of_interest.txt
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

(So, system keystore less out of date here...)

$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
500 Can't connect to buildfarm.postgresql.org:80 (No route to host)
<URL:http://buildfarm.postgresql.org/branches_of_interest.txt>

$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD

I have no idea what to make of the fact that http: still fails with this
perl version.  But I think we've conclusively proven that the problem with
https: is down to these machines trying to use tlsv1.

So the next question is what to do about it.  Is tls < 1.2 officially
deprecated these days, or was that configuration change just accidental?

I can probably restore these machines to functionality by updating
whichever Perl module knows about TLS (anyone know which that is?),
so if you want to undo the config change, it's OK by me.  But other
owners of ancient buildfarm critters might be less happy about it.

            regards, tom lane
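
The two curl failures in this thread, (35) and (60), fail at different stages: 35 is curl's exit code for a failed SSL/TLS handshake, and 60 is its exit code for a server certificate that could not be verified against the local CA store. As a rough sketch, a wrapper can turn curl's exit status into a hint about which stage failed. The function name `classify_curl_exit` and the wording of the hints are illustrative assumptions, not anything the buildfarm client does.

```shell
# Hypothetical helper: map a curl exit status to a likely failure stage.
# Only codes relevant to this thread are listed; the rest fall through.
classify_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "dns: could not resolve host" ;;
        7)  echo "network: could not connect to host" ;;
        35) echo "tls-handshake: protocol version or cipher mismatch is likely" ;;
        60) echo "tls-cert: server certificate not trusted by the local CA store" ;;
        *)  echo "other: see the curl manual for exit code $1" ;;
    esac
}

# Typical use (needs network access, so it is left commented out here):
#   curl -sS -o /dev/null https://buildfarm.postgresql.org/branches_of_interest.txt
#   classify_curl_exit $?
classify_curl_exit 35
classify_curl_exit 60
```

Read this way, prairiedog's move from (35) to (60) after the server change means the handshake now succeeds and only the stale system keystore is still in the way, which is why `-k` makes the fetch work.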


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 7:51 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

<snip>

> The results on dromedary are even more interesting:
>
> $ curl https://buildfarm.postgresql.org/branches_of_interest.txt
> REL9_3_STABLE
> REL9_4_STABLE
> REL9_5_STABLE
> REL9_6_STABLE
> REL_10_STABLE
> REL_11_STABLE
> HEAD
>
> (So, system keystore less out of date here...)
>
> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host) <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>
>
> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
> REL9_3_STABLE
> REL9_4_STABLE
> REL9_5_STABLE
> REL9_6_STABLE
> REL_10_STABLE
> REL_11_STABLE
> HEAD
>
> I have no idea what to make of the fact that http: still fails with this

Yeah, that part is super weird. Do we know if that worked before? Or has it been using https for a while?

> perl version.  But I think we've conclusively proven that the problem with
> https: is down to these machines trying to use tlsv1.
>
> So the next question is what to do about it.  Is tls < 1.2 officially
> deprecated these days, or was that configuration change just accidental?

It absolutely is. I actually thought we had already blocked that in the *previous* setup, but clearly we hadn't :)

That said, the buildfarm doesn't really do things that are that sensitive.  So we can probably turn it off on that individual machine if we have to. Right now our config management will flip the configuration right back shortly, but I can probably get that sorted out pretty easily.

> I can probably restore these machines to functionality by updating
> whichever Perl module knows about TLS (anyone know which that is?),
> so if you want to undo the config change, it's OK by me.  But other
> owners of ancient buildfarm critters might be less happy about it.

I think what you'd need is a new version of openssl.

But it might be hard to get in on all of them. Let's see if we can turn off the restriction for a while, and see if the other BF animals also recover.

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> The results on dromedary are even more interesting:
>> 
>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>> http://buildfarm.postgresql.org/branches_of_interest.txt");'
>> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host) <URL:
>> http://buildfarm.postgresql.org/branches_of_interest.txt>

> Yeah, that part is super weird. Do we know if that worked before? Or has it
> been using https for a while?

It looks like I installed Perl https support on that machine on
2017-01-14, so I'd guess dromedary has been using https since then.

>> I can probably restore these machines to functionality by updating
>> whichever Perl module knows about TLS (anyone know which that is?),
>> so if you want to undo the config change, it's OK by me.  But other
>> owners of ancient buildfarm critters might be less happy about it.

> I think what you'd need is a new version of openssl.

Yeah, I'd just come to that conclusion after researching things a bit
(although it looks like IO::Socket::SSL has some relevant fixes too).

> But it might be hard to get in on all of them. Let's see if we can turn off
> the restriction for a while, and see if the other BF animals also recover.

The bigger issue here is that if we force buildfarm members to run
openssl >= x.y, I'd say that's tantamount to desupporting openssl < x.y.
Are we ready to desupport versions that don't have TLS 1.2?  I think
that might well be reasonable to do in HEAD, but I'm less enthused about
it for the back branches.

            regards, tom lane
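
The "openssl >= x.y" above leaves the version open. As a rough sketch only, and assuming something the thread does not state (that TLS 1.2 support first shipped in OpenSSL 1.0.1), an animal owner could check an OpenSSL version string like this. The function name `openssl_has_tls12` is hypothetical, and the check is not meaningful for LibreSSL or other forks that use their own version numbering.

```shell
# Hypothetical helper: does an OpenSSL version string imply TLS 1.2 support?
# Assumption: TLS 1.2 first appeared in OpenSSL 1.0.1.
openssl_has_tls12() {
    ver=${1%%[!0-9.]*}      # "0.9.8zh" -> "0.9.8", "1.0.1e-fips" -> "1.0.1"
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    patch=${patch%%.*}
    [ "$major" -gt 1 ] && return 0
    [ "$major" -eq 1 ] && { [ "$minor" -gt 0 ] || [ "$patch" -ge 1 ]; } && return 0
    return 1
}

# On a real machine the argument could come from:
#   openssl version | awk '{print $2}'
for v in 0.9.8zh 1.0.0t 1.0.1e 1.1.1w; do
    if openssl_has_tls12 "$v"; then
        echo "$v: TLS 1.2 capable"
    else
        echo "$v: too old for TLS 1.2"
    fi
done
```

Note that the Perl stack may link against a different OpenSSL build than the `openssl` binary on the PATH, so a check like this is only a first hint.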


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Magnus Hagander
Date:


On Tue, Jul 17, 2018 at 8:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 8:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> The results on dromedary are even more interesting:
>>>
>>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>>> http://buildfarm.postgresql.org/branches_of_interest.txt");'
>>> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host) <URL:
>>> http://buildfarm.postgresql.org/branches_of_interest.txt>
>
>> Yeah, that part is super weird. Do we know if that worked before? Or has it
>> been using https for a while?
>
> It looks like I installed Perl https support on that machine on
> 2017-01-14, so I'd guess dromedary has been using https since then.

So it could be something else. I have no idea what it would be though, since port 80 seems to work from elsewhere.

>>> I can probably restore these machines to functionality by updating
>>> whichever Perl module knows about TLS (anyone know which that is?),
>>> so if you want to undo the config change, it's OK by me.  But other
>>> owners of ancient buildfarm critters might be less happy about it.
>
>> I think what you'd need is a new version of openssl.
>
> Yeah, I'd just come to that conclusion after researching things a bit
> (although it looks like IO::Socket::SSL has some relevant fixes too).
>
>> But it might be hard to get in on all of them. Let's see if we can turn off
>> the restriction for a while, and see if the other BF animals also recover.
>
> The bigger issue here is that if we force buildfarm members to run
> openssl >= x.y, I'd say that's tantamount to desupporting openssl < x.y.
> Are we ready to desupport versions that don't have TLS 1.2?  I think
> that might well be reasonable to do in HEAD, but I'm less enthused about
> it for the back branches.

Yeah, that's definitely a bigger problem.

We could always use http for those and not https. But surely that's *worse* than using an https that's considered insecure. Completely skipping it must be worse... And I don't think separating out the site into "submissions can do 1.0 but viewers can only do 1.2+" is reasonable, given that the only things that actually pass credentials *are* the submissions.

--

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Magnus Hagander <magnus@hagander.net> writes:
> On Tue, Jul 17, 2018 at 8:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>>> http://buildfarm.postgresql.org/branches_of_interest.txt");'
>>> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host)
>>> <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>

> So it could be something else. I have no idea what it would be though,
> since port 80 seems to work from elsewhere.

Oh, and before you decide to get back in the water ... I just tried this
stuff from my RHEL6 server.  curl is fine, but:

$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
REL9_3_STABLE
REL9_4_STABLE
REL9_5_STABLE
REL9_6_STABLE
REL_10_STABLE
REL_11_STABLE
HEAD
$ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
500 Can't connect to buildfarm.postgresql.org:443 (connect: Network is unreachable)
<URL:https://buildfarm.postgresql.org/branches_of_interest.txt>

Now I'm completely confused.  This is going through a different ISP
and significantly different network path to reach rackspace, but
that shouldn't have anything to do with it?  *Something* is darn
weird here.

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Stefan Kaltenbrunner
Date:
On 07/17/2018 08:58 PM, Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Tue, Jul 17, 2018 at 8:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>>> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("
>>>> http://buildfarm.postgresql.org/branches_of_interest.txt");'
>>>> 500 Can't connect to buildfarm.postgresql.org:80 (No route to host)
>>>> <URL:http://buildfarm.postgresql.org/branches_of_interest.txt>
> 
>> So it could be something else. I have no idea what it would be though,
>> since port 80 seems to work from elsewhere.
> 
> Oh, and before you decide to get back in the water ... I just tried this
> stuff from my RHEL6 server.  curl is fine, but:
> 
> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("http://buildfarm.postgresql.org/branches_of_interest.txt");'
> REL9_3_STABLE
> REL9_4_STABLE
> REL9_5_STABLE
> REL9_6_STABLE
> REL_10_STABLE
> REL_11_STABLE
> HEAD
> $ perl -MLWP::Simple -MLWP::Protocol::https -e 'LWP::Simple::getprint("https://buildfarm.postgresql.org/branches_of_interest.txt");'
> 500 Can't connect to buildfarm.postgresql.org:443 (connect: Network is unreachable)
> <URL:https://buildfarm.postgresql.org/branches_of_interest.txt>
> 
> Now I'm completely confused.  This is going through a different ISP
> and significantly different network path to reach rackspace, but
> that shouldn't have anything to do with it?  *Something* is darn
> weird here.

given it does not yet seem to have been discussed in this thread - ipv4
vs ipv6 (either direct or indirect through CGN or similar technologies)?


Stefan


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> On 07/17/2018 08:58 PM, Tom Lane wrote:
>> Now I'm completely confused.  This is going through a different ISP
>> and significantly different network path to reach rackspace, but
>> that shouldn't have anything to do with it?  *Something* is darn
>> weird here.

> given it does not yet seem to have been discussed in this thread - ipv4
> vs ipv6 (either direct or indirect through CGN or similar technologies)?

Good thought, but at least at my end it's all IPv4, and traceroute
doesn't suggest there's anything else between.  The RHEL6 box sees
this traceroute:

$ traceroute buildfarm.postgresql.org
traceroute to buildfarm.postgresql.org (174.143.35.217), 30 hops max, 60 byte packets
 1  router1.sss.pgh.pa.us (192.168.168.5)  3.705 ms  4.465 ms  5.528 ms
 2  192.168.252.29 (192.168.252.29)  18.278 ms  19.285 ms  20.719 ms
 3  gw.aspStation.net (66.207.128.1)  22.197 ms  23.351 ms  24.578 ms
 4  144.232.10.211 (144.232.10.211)  25.990 ms  27.226 ms  28.582 ms
 5  144.232.10.210 (144.232.10.210)  29.884 ms  31.063 ms  33.532 ms
 6  144.232.14.7 (144.232.14.7)  32.215 ms  46.337 ms  44.983 ms
 7  144.232.15.174 (144.232.15.174)  46.679 ms 144.232.14.10 (144.232.14.10)  34.810 ms  35.586 ms
 8  144.232.15.121 (144.232.15.121)  35.414 ms  34.871 ms  35.634 ms
 9  sl-above1-722053-0.sprintlink.net (144.228.205.158)  35.691 ms  35.101 ms  35.580 ms
10  ae1.cr2.dca2.us.zip.zayo.com (64.125.20.121)  35.510 ms  34.947 ms  35.528 ms
11  ae27.cs2.dca2.us.eth.zayo.com (64.125.30.248)  83.111 ms  82.605 ms  87.833 ms
12  ae4.cs2.lga5.us.eth.zayo.com (64.125.29.30)  98.348 ms  97.706 ms  97.689 ms
13  ae3.cs2.ord2.us.eth.zayo.com (64.125.29.213)  85.612 ms  85.198 ms  85.710 ms
14  ae5.cs2.den5.us.eth.zayo.com (64.125.29.216)  85.700 ms  85.143 ms  85.800 ms
15  ae7.cs2.den5.us.eth.zayo.com (64.125.31.237)  91.259 ms  90.791 ms  91.310 ms
16  ae28.er1.dfw2.us.zip.zayo.com (64.125.26.15)  85.946 ms  69.331 ms  69.892 ms
17  128.177.70.86.IPYX-076520-900-ZYO.zip.zayo.com (128.177.70.86)  69.761 ms  87.568 ms  88.741 ms
18  * * *
19  74.205.108.121 (74.205.108.121)  85.957 ms be42.coreb.dfw1.rackspace.net (74.205.108.125)  85.334 ms be41.corea.dfw1.rackspace.net (74.205.108.113)  85.820 ms
20  core5-coreb.dfw1.rackspace.net (74.205.108.27)  76.332 ms po1.CoreA.core6.dfw1.rackspace.net (72.32.111.13)  84.456 ms po2.CoreB.core6.dfw1.rackspace.net (72.32.111.15)  77.344 ms
21  core5-aggr313a.dfw1.rackspace.net (67.192.56.63)  83.224 ms  82.713 ms  71.320 ms
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

while the buildfarm critters are going through verizon:

$ traceroute buildfarm.postgresql.org
traceroute to buildfarm.postgresql.org (174.143.35.217), 64 hops max, 40 byte packets
 1  router2 (192.168.168.2)  5.031 ms  4.108 ms  4.195 ms
 2  * * *
 3  b3309.pitbpa-lcr-22.verizon-gni.net (130.81.28.70)  133.770 ms  8.250 ms  8.069 ms
 4  * * *
 5  * * *
 6  0.et-7-3-0.br1.iad8.alter.net (140.222.239.83)  16.920 ms 0.et-5-1-5.br1.iad8.alter.net (140.222.0.65)  16.547 ms  14.896 ms
 7  xe-2-1-0.er2.iad10.us.zip.zayo.com (64.125.13.173)  15.706 ms  14.482 ms  14.481 ms
 8  ae1.cr2.dca2.us.zip.zayo.com (64.125.20.121)  15.731 ms  17.841 ms  17.265 ms
 9  ae27.cs2.dca2.us.eth.zayo.com (64.125.30.248)  68.536 ms  68.973 ms  67.971 ms
10  ae4.cs2.lga5.us.eth.zayo.com (64.125.29.30)  67.021 ms  68.627 ms  67.130 ms
11  ae3.cs2.ord2.us.eth.zayo.com (64.125.29.213)  67.277 ms  67.727 ms  68.079 ms
12  ae5.cs2.den5.us.eth.zayo.com (64.125.29.216)  66.520 ms  69.493 ms  73.931 ms
13  ae7.cs2.den5.us.eth.zayo.com (64.125.31.237)  193.146 ms  66.854 ms  67.034 ms
14  ae28.er1.dfw2.us.zip.zayo.com (64.125.26.15)  149.815 ms  67.910 ms  70.480 ms
15  128.177.70.86.ipyx-076520-900-zyo.zip.zayo.com (128.177.70.86)  68.615 ms  78.398 ms  70.127 ms
16  * * *
17  be41.coreb.dfw1.rackspace.net (74.205.108.117)  73.871 ms 74.205.108.121 (74.205.108.121)  67.374 ms  70.223 ms
18  core5-corea.dfw1.rackspace.net (74.205.108.11)  72.848 ms po1.corea.core6.dfw1.rackspace.net (72.32.111.13)  69.176 ms core5-corea.dfw1.rackspace.net (74.205.108.11)  70.037 ms
19  po2.core6.aggr313a.rackspace.net (72.32.111.211)  203.537 ms  70.130 ms core5-aggr313a.dfw1.rackspace.net (67.192.56.63)  70.854 ms
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

Also, if the issue is somewhere in between, that fails to explain why
"curl" works but not perl.

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Stefan Kaltenbrunner
Date:
On 07/17/2018 09:22 PM, Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> On 07/17/2018 08:58 PM, Tom Lane wrote:
>>> Now I'm completely confused.  This is going through a different ISP
>>> and significantly different network path to reach rackspace, but
>>> that shouldn't have anything to do with it?  *Something* is darn
>>> weird here.
> 
>> given it does not yet seem to have been discussed in this thread - ipv4
>> vs ipv6 (either direct or indirect through CGN or similar technologies)?
> 
> Good thought, but at least at my end it's all IPv4, and traceroute
> doesn't suggest there's anything else between.  The RHEL6 box sees
> this traceroute:
> 
> $ traceroute buildfarm.postgresql.org
> traceroute to buildfarm.postgresql.org (174.143.35.217), 30 hops max, 60 byte packets
>  1  router1.sss.pgh.pa.us (192.168.168.5)  3.705 ms  4.465 ms  5.528 ms
>  2  192.168.252.29 (192.168.252.29)  18.278 ms  19.285 ms  20.719 ms
>  3  gw.aspStation.net (66.207.128.1)  22.197 ms  23.351 ms  24.578 ms
>  4  144.232.10.211 (144.232.10.211)  25.990 ms  27.226 ms  28.582 ms
>  5  144.232.10.210 (144.232.10.210)  29.884 ms  31.063 ms  33.532 ms
>  6  144.232.14.7 (144.232.14.7)  32.215 ms  46.337 ms  44.983 ms
>  7  144.232.15.174 (144.232.15.174)  46.679 ms 144.232.14.10 (144.232.14.10)  34.810 ms  35.586 ms
>  8  144.232.15.121 (144.232.15.121)  35.414 ms  34.871 ms  35.634 ms
>  9  sl-above1-722053-0.sprintlink.net (144.228.205.158)  35.691 ms  35.101 ms  35.580 ms
> 10  ae1.cr2.dca2.us.zip.zayo.com (64.125.20.121)  35.510 ms  34.947 ms  35.528 ms
> 11  ae27.cs2.dca2.us.eth.zayo.com (64.125.30.248)  83.111 ms  82.605 ms  87.833 ms
> 12  ae4.cs2.lga5.us.eth.zayo.com (64.125.29.30)  98.348 ms  97.706 ms  97.689 ms
> 13  ae3.cs2.ord2.us.eth.zayo.com (64.125.29.213)  85.612 ms  85.198 ms  85.710 ms
> 14  ae5.cs2.den5.us.eth.zayo.com (64.125.29.216)  85.700 ms  85.143 ms  85.800 ms
> 15  ae7.cs2.den5.us.eth.zayo.com (64.125.31.237)  91.259 ms  90.791 ms  91.310 ms
> 16  ae28.er1.dfw2.us.zip.zayo.com (64.125.26.15)  85.946 ms  69.331 ms  69.892 ms
> 17  128.177.70.86.IPYX-076520-900-ZYO.zip.zayo.com (128.177.70.86)  69.761 ms  87.568 ms  88.741 ms
> 18  * * *
> 19  74.205.108.121 (74.205.108.121)  85.957 ms be42.coreb.dfw1.rackspace.net (74.205.108.125)  85.334 ms be41.corea.dfw1.rackspace.net (74.205.108.113)  85.820 ms
> 20  core5-coreb.dfw1.rackspace.net (74.205.108.27)  76.332 ms po1.CoreA.core6.dfw1.rackspace.net (72.32.111.13)  84.456 ms po2.CoreB.core6.dfw1.rackspace.net (72.32.111.15)  77.344 ms
> 21  core5-aggr313a.dfw1.rackspace.net (67.192.56.63)  83.224 ms  82.713 ms  71.320 ms
> 22  * * *
> 23  * * *
> 24  * * *
> 25  * * *
> 26  * * *
> 27  * * *
> 28  * * *
> 29  * * *
> 30  * * *
> 
> while the buildfarm critters are going through verizon:
> 
> $ traceroute buildfarm.postgresql.org
> traceroute to buildfarm.postgresql.org (174.143.35.217), 64 hops max, 40 byte packets
>  1  router2 (192.168.168.2)  5.031 ms  4.108 ms  4.195 ms
>  2  * * *
>  3  b3309.pitbpa-lcr-22.verizon-gni.net (130.81.28.70)  133.770 ms  8.250 ms  8.069 ms
>  4  * * *
>  5  * * *
>  6  0.et-7-3-0.br1.iad8.alter.net (140.222.239.83)  16.920 ms 0.et-5-1-5.br1.iad8.alter.net (140.222.0.65)  16.547 ms  14.896 ms
>  7  xe-2-1-0.er2.iad10.us.zip.zayo.com (64.125.13.173)  15.706 ms  14.482 ms  14.481 ms
>  8  ae1.cr2.dca2.us.zip.zayo.com (64.125.20.121)  15.731 ms  17.841 ms  17.265 ms
>  9  ae27.cs2.dca2.us.eth.zayo.com (64.125.30.248)  68.536 ms  68.973 ms  67.971 ms
> 10  ae4.cs2.lga5.us.eth.zayo.com (64.125.29.30)  67.021 ms  68.627 ms  67.130 ms
> 11  ae3.cs2.ord2.us.eth.zayo.com (64.125.29.213)  67.277 ms  67.727 ms  68.079 ms
> 12  ae5.cs2.den5.us.eth.zayo.com (64.125.29.216)  66.520 ms  69.493 ms  73.931 ms
> 13  ae7.cs2.den5.us.eth.zayo.com (64.125.31.237)  193.146 ms  66.854 ms  67.034 ms
> 14  ae28.er1.dfw2.us.zip.zayo.com (64.125.26.15)  149.815 ms  67.910 ms  70.480 ms
> 15  128.177.70.86.ipyx-076520-900-zyo.zip.zayo.com (128.177.70.86)  68.615 ms  78.398 ms  70.127 ms
> 16  * * *
> 17  be41.coreb.dfw1.rackspace.net (74.205.108.117)  73.871 ms 74.205.108.121 (74.205.108.121)  67.374 ms  70.223 ms
> 18  core5-corea.dfw1.rackspace.net (74.205.108.11)  72.848 ms po1.corea.core6.dfw1.rackspace.net (72.32.111.13)  69.176 ms core5-corea.dfw1.rackspace.net (74.205.108.11)  70.037 ms
> 19  po2.core6.aggr313a.rackspace.net (72.32.111.211)  203.537 ms  70.130 ms core5-aggr313a.dfw1.rackspace.net (67.192.56.63)  70.854 ms
> 20  * * *
> 21  * * *
> 22  * * *
> 23  * * *
> 24  * * *
> 25  * * *
> 26  * * *
> 27  * * *
> 28  * * *
> 29  * * *
> 30  * * *
> 
> Also, if the issue is somewhere in between, that fails to explain why
> "curl" works but not perl.

not sure that proves that v4 vs v6 is out entirely, there could be other
factors involved (like your isp mapping v4 to v6 on the router(!) maybe
combined with dns64/dns46 related tricks).
It should still be possible to figure out where exactly the "network
unreachable" comes from - whether it is something the local tcp/ip stack
generates or something that comes in as an icmp-error from remote using
tcpdump or similar.



Stefan


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> On 07/17/2018 09:22 PM, Tom Lane wrote:
>> Also, if the issue is somewhere in between, that fails to explain why
>> "curl" works but not perl.

> not sure that proves that v4 vs v6 is out entirely, there could be other
> factors involved (like your isp mapping v4 to v6 on the router(!) maybe
> combined with dns64/dns46 related tricks).
> It should still be possible to figure out where exactly the "network
> unreachable" comes from - whether it is something the local tcp/ip stack
> generates or something that comes in as an icmp-error from remote using
> tcpdump or similar.

Good idea ... tcpdump says *nothing at all* is happening during the
https request, which led me to try strace'ing the perl run, and that
tells the tale:

[ lots of setup omitted ]
socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
sendto(3, "\370-\1\0\0\1\0\0\0\0\0\0\tbuildfarm\npostgresq"..., 42, MSG_NOSIGNAL, NULL, 0) = 42
poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
ioctl(3, FIONREAD, [70])                = 0
recvfrom(3, "\370-\201\200\0\1\0\1\0\0\0\0\tbuildfarm\npostgresq"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70
close(3)                                = 0
socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 3
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffe430799c0) = -1 EINVAL (Invalid argument)
lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffe430799c0) = -1 EINVAL (Invalid argument)
lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
bind(3, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
connect(3, {sa_family=AF_INET6, sin6_port=htons(443), inet_pton(AF_INET6, "2001:4800:1501:1::217", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
close(3)                                = 0
write(2, "500 Can't connect to buildfarm.p"..., 83500 Can't connect to buildfarm.postgresql.org:443 (connect: Network is unreachable)) = 83
write(2, " <URL:https://buildfarm.postgres"..., 65 <URL:https://buildfarm.postgresql.org/branches_of_interest.txt>
) = 65

So for some reason, perl's https support is trying to bind to the IPv6
address of buildfarm.postgresql.org, even though no IPv6 support is
configured at all on this machine.  I wonder how long that's been going
on?  Has anything about the machine's DNS entries changed recently?
(Also, "ssh buildfarm.postgresql.org" binds to IPv4 just fine.)

Also, checking the equally inexplicable failure on dromedary, it looks
like the explanation might be similar there, only reversed: the http:
request produces zero interface traffic, suggesting that it's getting
mapped to an IPv6 address.  I don't seem to have a working strace
equivalent on that machine so it's harder to be sure about it.

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Stefan Kaltenbrunner
Date:
On 07/17/2018 10:14 PM, Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> On 07/17/2018 09:22 PM, Tom Lane wrote:
>>> Also, if the issue is somewhere in between, that fails to explain why
>>> "curl" works but not perl.
> 
>> not sure that proves that v4 vs v6 is out entirely, there could be other
>> factors involved (like your isp mapping v4 to v6 on the router(!) maybe
>> combined with dns64/dns46 related tricks).
>> It should still be possible to figure out where exactly the "network
>> unreachable" comes from - whether it is something the local tcp/ip stack
>> generates or something that comes in as an icmp-error from remote using
>> tcpdump or similar.
> 
> Good idea ... tcpdump says *nothing at all* is happening during the
> https request, which led me to try strace'ing the perl run, and that
> tells the tale:
> 
> [ lots of setup omitted ]
> socket(PF_INET, SOCK_DGRAM|SOCK_NONBLOCK, IPPROTO_IP) = 3
> connect(3, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, 16) = 0
> poll([{fd=3, events=POLLOUT}], 1, 0)    = 1 ([{fd=3, revents=POLLOUT}])
> sendto(3, "\370-\1\0\0\1\0\0\0\0\0\0\tbuildfarm\npostgresq"..., 42, MSG_NOSIGNAL, NULL, 0) = 42
> poll([{fd=3, events=POLLIN}], 1, 5000)  = 1 ([{fd=3, revents=POLLIN}])
> ioctl(3, FIONREAD, [70])                = 0
> recvfrom(3, "\370-\201\200\0\1\0\1\0\0\0\0\tbuildfarm\npostgresq"..., 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70
> close(3)                                = 0
> socket(PF_INET6, SOCK_STREAM, IPPROTO_TCP) = 3
> ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffe430799c0) = -1 EINVAL (Invalid argument)
> lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
> ioctl(3, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0x7ffe430799c0) = -1 EINVAL (Invalid argument)
> lseek(3, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
> fcntl(3, F_SETFD, FD_CLOEXEC)           = 0
> bind(3, {sa_family=AF_INET6, sin6_port=htons(0), inet_pton(AF_INET6, "::", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = 0
> connect(3, {sa_family=AF_INET6, sin6_port=htons(443), inet_pton(AF_INET6, "2001:4800:1501:1::217", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 ENETUNREACH (Network is unreachable)
> close(3)                                = 0
> write(2, "500 Can't connect to buildfarm.p"..., 83500 Can't connect to buildfarm.postgresql.org:443 (connect: Network is unreachable)) = 83
> write(2, " <URL:https://buildfarm.postgres"..., 65 <URL:https://buildfarm.postgresql.org/branches_of_interest.txt>
> ) = 65
> 
> So for some reason, perl's https support is trying to bind to the IPv6
> address of buildfarm.postgresql.org, even though no IPv6 support is
> configured at all on this machine.  I wonder how long that's been going
> on?  Has anything about the machine's DNS entries changed recently?
> (Also, "ssh buildfarm.postgresql.org" binds to IPv4 just fine.)
> 
> Also, checking the equally inexplicable failure on dromedary, it looks
> like the explanation might be similar there, only reversed: the http:
> request produces zero interface traffic, suggesting that it's getting
> mapped to an IPv6 address.  I don't seem to have a working strace
> equivalent on that machine so it's harder to be sure about it.

I don't think there have been any recent changes on (DNS) v6 for
brentalia - afaiks in our internal revision control we have had v6 on
that box for at least 2 years now.
However could it be that whatever DNS resolver those boxes are using
just started to return AAAAs as well (the strsize in the strace output
is not large enough to see the actual response from the local resolver)
- like as part of your ISP enabling v6?

Also note that the bind() call does actually return 0 so not sure it is
perl to blame that it tries a connect() as well...



Stefan


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> On 07/17/2018 10:14 PM, Tom Lane wrote:
>> So for some reason, perl's https support is trying to bind to the IPv6
>> address of buildfarm.postgresql.org, even though no IPv6 support is
>> configured at all on this machine.  I wonder how long that's been going
>> on?  Has anything about the machine's DNS entries changed recently?
>> (Also, "ssh buildfarm.postgresql.org" binds to IPv4 just fine.)

> I don't think there have been any recent changes on (DNS) v6 for
> brentalia - afaiks in our internal revision control we have had v6 on
> that box for at least 2 years now.
> However could it be that whatever DNS resolver those boxes are using
> just started to return AAAAs as well (the strsize in the strace output
> is not large enough to see the actual response from the local resolver)

The nameserver is one I run locally, and the only change it's seen lately
is RHEL6's occasional security updates.  I don't think that's where the
issue came in.

The full nameserver interaction is

sendto(3, "\x21\x86\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42

recvfrom(3, "\x21\x86\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01\xc0\x0c\x00\x1c\x00\x01\x00\x00\x06\xc1\x00\x10\x20\x01\x48\x00\x15\x01\x00\x01\x00\x00\x00\x00\x00\x00\x02\x17", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70

I don't have anything handy like wireshark installed on this machine, but
I see the hex for buildfarm's IPv6 address in that response, and *not*
the hex for its IPv4 address.  Conversely, when I try the http: URL,
I see a different query and only the IPv4 address in the response:

sendto(3, "\xa8\x93\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42

recvfrom(3, "\xa8\x93\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01\xc0\x0c\x00\x01\x00\x01\x00\x00\x01\xd5\x00\x04\xae\x8f\x23\xd9", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 58

It looks like Perl is specifically asking for AAAA in preference to A
records, but only for https:.  Weird.
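
Without wireshark, the two replies can still be decoded by hand. What follows is only a minimal sketch, assuming Python 3 is available; the helper names (skip_name, decode_response) are made up for illustration, and the byte strings are the two recvfrom() payloads shown in the strace output. It copes with simple one-question replies like these, not with DNS in general.

```python
import ipaddress
import struct

# Record type numbers from the DNS specs: 1 = A (IPv4), 28 = AAAA (IPv6).
QTYPE_NAMES = {1: "A", 28: "AAAA"}

# "buildfarm.postgresql.org" in DNS wire format (length-prefixed labels).
QNAME = b"\x09buildfarm\x0apostgresql\x03org\x00"

# The two recvfrom() payloads from the strace output.
AAAA_REPLY = (
    b"\x21\x86\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00" + QNAME +
    b"\x00\x1c\x00\x01"
    b"\xc0\x0c\x00\x1c\x00\x01\x00\x00\x06\xc1\x00\x10"
    b"\x20\x01\x48\x00\x15\x01\x00\x01\x00\x00\x00\x00\x00\x00\x02\x17"
)
A_REPLY = (
    b"\xa8\x93\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00" + QNAME +
    b"\x00\x01\x00\x01"
    b"\xc0\x0c\x00\x01\x00\x01\x00\x00\x01\xd5\x00\x04"
    b"\xae\x8f\x23\xd9"
)


def skip_name(packet, offset):
    """Return the offset just past the DNS name that starts at offset."""
    while True:
        length = packet[offset]
        if length == 0:              # zero-length root label ends the name
            return offset + 1
        if length & 0xC0 == 0xC0:    # two-byte compression pointer ends it too
            return offset + 2
        offset += 1 + length         # skip this label and its length byte


def decode_response(packet):
    """Return (question type, list of answer addresses) for a simple reply."""
    _id, _flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", packet[:12])
    offset = 12
    qtype = None
    for _ in range(qdcount):
        offset = skip_name(packet, offset)
        qtype, _qclass = struct.unpack("!2H", packet[offset:offset + 4])
        offset += 4
    addresses = []
    for _ in range(ancount):
        offset = skip_name(packet, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack(
            "!2HIH", packet[offset:offset + 10])
        offset += 10
        rdata = packet[offset:offset + rdlength]
        offset += rdlength
        if rtype in (1, 28):         # A or AAAA: rdata is a packed address
            addresses.append(str(ipaddress.ip_address(rdata)))
    return QTYPE_NAMES.get(qtype, str(qtype)), addresses


print(decode_response(AAAA_REPLY))   # ('AAAA', ['2001:4800:1501:1::217'])
print(decode_response(A_REPLY))      # ('A', ['174.143.35.217'])
```

The decoded addresses match the ones seen elsewhere in the thread: the IPv6 address in the failing connect() call and the IPv4 address that traceroute reports.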

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Steve Atkins
Date:
> On Jul 17, 2018, at 2:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> The nameserver is one I run locally, and the only change it's seen lately
> is RHEL6's occasional security updates.  I don't think that's where the
> issue came in.
>
> The full nameserver interaction is
>
> sendto(3, "\x21\x86\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42

00 1c is the query type AAAA (28), so this is requesting the AAAA record for buildfarm.postgresql.org


>
> recvfrom(3, "\x21\x86\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01\xc0\x0c\x00\x1c\x00\x01\x00\x00\x06\xc1\x00\x10\x20\x01\x48\x00\x15\x01\x00\x01\x00\x00\x00\x00\x00\x00\x02\x17", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70
>
> I don't have anything handy like wireshark installed on this machine, but
> I see the hex for buildfarm's IPv6 address in that response, and *not*
> the hex for its IPv4 address.  Conversely, when I try the http: URL,
> I see a different query and only the IPv4 address in the response:
>
> sendto(3, "\xa8\x93\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42

and 00 01 is A.

>
> recvfrom(3, "\xa8\x93\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01\xc0\x0c\x00\x01\x00\x01\x00\x00\x01\xd5\x00\x04\xae\x8f\x23\xd9", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 58
>
> It looks like Perl is specifically asking for AAAA in preference to A
> records, but only for https:.  Weird.

Rather weird.

Cheers,
  Steve






Re: buildfarm server suddenly not talking to old SSL stacks?

From
Stefan Kaltenbrunner
Date:
On 07/17/2018 11:29 PM, Tom Lane wrote:
> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>> On 07/17/2018 10:14 PM, Tom Lane wrote:
>>> So for some reason, perl's https support is trying to bind to the IPv6
>>> address of buildfarm.postgresql.org, even though no IPv6 support is
>>> configured at all on this machine.  I wonder how long that's been going
>>> on?  Has anything about the machine's DNS entries changed recently?
>>> (Also, "ssh buildfarm.postgresql.org" binds to IPv4 just fine.)
> 
>> I don't think there have been any recent changes on (DNS) v6 for
>> brentalia - afaiks in our internal revision control we have had v6 on
>> that box for at least 2 years now.
>> However could it be that whatever DNS resolver those boxes are using
>> just started to return AAAAs as well (the strsize in the strace output
>> is not large enough to see the actual response from the local resolver)
> 
> The nameserver is one I run locally, and the only change it's seen lately
> is RHEL6's occasional security updates.  I don't think that's where the
> issue came in.
> 
> The full nameserver interaction is
> 
> sendto(3, "\x21\x86\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42
> 
> recvfrom(3, "\x21\x86\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01\xc0\x0c\x00\x1c\x00\x01\x00\x00\x06\xc1\x00\x10\x20\x01\x48\x00\x15\x01\x00\x01\x00\x00\x00\x00\x00\x00\x02\x17", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70
> 
> I don't have anything handy like wireshark installed on this machine, but
> I see the hex for buildfarm's IPv6 address in that response, and *not*
> the hex for its IPv4 address.  Conversely, when I try the http: URL,
> I see a different query and only the IPv4 address in the response:
> 
> sendto(3, "\xa8\x93\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42
> 
> recvfrom(3, "\xa8\x93\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01\xc0\x0c\x00\x01\x00\x01\x00\x00\x01\xd5\x00\x04\xae\x8f\x23\xd9", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 58
> 
> It looks like Perl is specifically asking for AAAA in preference to A
> records, but only for https:.  Weird.

not really weird I think - the buildfarm uses LWP and for SSL support it
might use (iirc) either Crypt::SSLeay (older versions before unbundling
of LWP::Protocol::https) or IO::Socket::SSL which has this in its docs:

"Please be aware that with the IPv6 capable super classes, it will look 
first for the IPv6 address of a given hostname. If the resolver provides 
an IPv6 address, but the host cannot be reached by IPv6, there will be 
no automatic fallback to IPv4. To avoid these problems you can enforce 
IPv4 for a specific socket by using the Domain or Family option with the 
value AF_INET as described in IO::Socket::IP. Alternatively you can 
enforce IPv4 globally by loading IO::Socket::SSL with the option 
'inet4', in which case it will use the IPv4 only class IO::Socket::INET 
as the super class."

So maybe removing the IO::Socket::INET6 superclass/package from the 
system will get it working (or hacking the buildfarm script).
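
The missing behaviour described in that documentation, moving on to the next address when one family is unreachable, can be sketched as follows. This is a Python illustration only, assuming the resolver hands back both AAAA and A results; connect_with_fallback is a hypothetical helper, not anything in LWP, IO::Socket::SSL or the buildfarm client.

```python
import socket


def connect_with_fallback(host, port, timeout=5.0):
    """Try each address getaddrinfo() returns until one of them connects."""
    last_error = None
    # getaddrinfo() may return IPv6 and IPv4 candidates, in resolver order.
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = None
        try:
            sock = socket.socket(family, socktype, proto)
            sock.settimeout(timeout)
            sock.connect(sockaddr)
            return sock              # first reachable address wins
        except OSError as exc:       # e.g. ENETUNREACH on an IPv4-only host
            last_error = exc
            if sock is not None:
                sock.close()
    raise last_error or OSError("no usable address for %s" % host)
```

A caller would use it as `sock = connect_with_fallback("buildfarm.postgresql.org", 443)`. On a machine without IPv6 connectivity the AAAA candidate would fail quickly and the A candidate would be tried next. Python's own socket.create_connection() loops over the candidates in much the same way.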



Stefan


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Andrew Dunstan
Date:


On Wed, Jul 18, 2018 at 2:57 AM, Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> wrote:
> On 07/17/2018 11:29 PM, Tom Lane wrote:
>> Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
>>> On 07/17/2018 10:14 PM, Tom Lane wrote:
>>>> So for some reason, perl's https support is trying to bind to the IPv6
>>>> address of buildfarm.postgresql.org, even though no IPv6 support is
>>>> configured at all on this machine.  I wonder how long that's been going
>>>> on?  Has anything about the machine's DNS entries changed recently?
>>>> (Also, "ssh buildfarm.postgresql.org" binds to IPv4 just fine.)
>>
>>> I don't think there have been any recent changes on (DNS) v6 for
>>> brentalia - afaiks in our internal revision control we have had v6 on
>>> that box for at least 2 years now.
>>> However could it be that whatever DNS resolver those boxes are using
>>> just started to return AAAAs as well (the strsize in the strace output
>>> is not large enough to see the actual response from the local resolver)
>>
>> The nameserver is one I run locally, and the only change it's seen lately
>> is RHEL6's occasional security updates.  I don't think that's where the
>> issue came in.
>>
>> The full nameserver interaction is
>>
>> sendto(3, "\x21\x86\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42
>>
>> recvfrom(3, "\x21\x86\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x1c\x00\x01\xc0\x0c\x00\x1c\x00\x01\x00\x00\x06\xc1\x00\x10\x20\x01\x48\x00\x15\x01\x00\x01\x00\x00\x00\x00\x00\x00\x02\x17", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 70
>>
>> I don't have anything handy like wireshark installed on this machine, but
>> I see the hex for buildfarm's IPv6 address in that response, and *not*
>> the hex for its IPv4 address.  Conversely, when I try the http: URL,
>> I see a different query and only the IPv4 address in the response:
>>
>> sendto(3, "\xa8\x93\x01\x00\x00\x01\x00\x00\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01", 42, MSG_NOSIGNAL, NULL, 0) = 42
>>
>> recvfrom(3, "\xa8\x93\x81\x80\x00\x01\x00\x01\x00\x00\x00\x00\x09\x62\x75\x69\x6c\x64\x66\x61\x72\x6d\x0a\x70\x6f\x73\x74\x67\x72\x65\x73\x71\x6c\x03\x6f\x72\x67\x00\x00\x01\x00\x01\xc0\x0c\x00\x01\x00\x01\x00\x00\x01\xd5\x00\x04\xae\x8f\x23\xd9", 1024, 0, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("127.0.0.1")}, [16]) = 58
>>
>> It looks like Perl is specifically asking for AAAA in preference to A
>> records, but only for https:.  Weird.
>
> not really weird I think - the buildfarm uses LWP and for SSL support it might use (iirc) either Crypt::SSLeay (older versions before unbundling of LWP::Protocol::https) or IO::Socket::SSL which has this in its docs:
>
> "Please be aware that with the IPv6 capable super classes, it will look first for the IPv6 address of a given hostname. If the resolver provides an IPv6 address, but the host cannot be reached by IPv6, there will be no automatic fallback to IPv4. To avoid these problems you can enforce IPv4 for a specific socket by using the Domain or Family option with the value AF_INET as described in IO::Socket::IP. Alternatively you can enforce IPv4 globally by loading IO::Socket::SSL with the option 'inet4', in which case it will use the IPv4 only class IO::Socket::INET as the super class."
>
> So maybe removing the IO::Socket::INET6 superclass/package from the system will get it working (or hacking the buildfarm script).





Tom, please see if adding this at the top of the failing script fixes it:

    use IO::Socket::SSL qw (inet);

cheers

andrew

Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom, please see if adding this at the top of the failing script fixes it:
>     use IO::Socket::SSL qw (inet);

No, that doesn't work at all, but

      use IO::Socket::SSL qw (inet4);

does fix it.  Not sure how far that helps though --- we'd not want to put
that in the buildfarm client would we?

Some more detail: tracing shows that IO::Socket::INET6 is getting used,
and that contains code that purports to make the correct decision between
IPv6 and IPv4, but it's going wrong.  It looks like what it *actually*
does is make sure that both the local and remote addresses can be resolved
in the same address family.  I think that the local address is probably
"localhost", which RHEL6 will helpfully resolve as either 127.0.0.1 or ::1
regardless of whether there's any other support for IPv6 anyplace,
allowing INET6 to predict that the connection will work ... which it
doesn't, but the code doesn't want to retry after failing that step.

Perhaps I could fix this by rejiggering things so that localhost only
resolves as 127.0.0.1, but I don't really want to muck with that.
Removing the perl-IO-Socket-INET6 package would be less invasive.
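
To see how localhost (or any other name) resolves on a given machine, one could ask the resolver which address families it reports. A minimal Python sketch; address_families is a hypothetical helper, and the result depends entirely on the local /etc/hosts and resolver configuration.

```python
import socket


def address_families(host):
    """Return the set of address families the resolver reports for host."""
    families = set()
    for family, *_rest in socket.getaddrinfo(
            host, None, type=socket.SOCK_STREAM):
        families.add(family)         # e.g. socket.AF_INET, socket.AF_INET6
    return families
```

Calling `address_families("localhost")` and getting back both socket.AF_INET and socket.AF_INET6 on a box with no IPv6 connectivity would be consistent with the guess above, though it would not prove that this is what IO::Socket::INET6 is tripping over.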

            regards, tom lane


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Stefan Kaltenbrunner
Date:
On 07/20/2018 01:11 AM, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> Tom, please see if adding this at the top of the failing script fixes it:
>>      use IO::Socket::SSL qw (inet);
> 
> No, that doesn't work at all, but
> 
>        use IO::Socket::SSL qw (inet4);
> 
> does fix it.  Not sure how far that helps though --- we'd not want to put
> that in the buildfarm client would we?

Maybe a more general option to force IPv4 or IPv6, akin to the -4 and -6
flags that most Unix networking utilities support, would be useful?

On the other hand, I wonder whether passing "MultiHomed" to the
IO::Socket::INET6 constructor behind LWP's back might work, though the
docs are pretty light on details of its actual behaviour:

https://metacpan.org/pod/release/SHLOMIF/IO-Socket-INET6-2.72/lib/IO/Socket/INET6.pm#CONSTRUCTOR


> 
> Some more detail: tracing shows that IO::Socket::INET6 is getting used,
> and that contains code that purports to make the correct decision between
> IPv6 and IPv4, but it's going wrong.  It looks like what it *actually*
> does is make sure that both the local and remote addresses can be resolved
> in the same address family.  I think that the local address is probably
> "localhost", which RHEL6 will helpfully resolve as either 127.0.0.1 or ::1
> regardless of whether there's any other support for IPv6 anyplace,
> allowing INET6 to predict that the connection will work ... which it
> doesn't, but the code doesn't want to retry after failing that step.
> 
> Perhaps I could fix this by rejiggering things so that localhost only
> resolves as 127.0.0.1, but I don't really want to muck with that.
> Removing the perl-IO-Socket-INET6 package would be less invasive.

Yeah, the removal seems easier, but do you actually know yet why the
system started behaving differently in that regard?



Stefan


Re: buildfarm server suddenly not talking to old SSL stacks?

From
Tom Lane
Date:
Stefan Kaltenbrunner <stefan@kaltenbrunner.cc> writes:
> Maybe a more general option to force IPv4 or IPv6, akin to the -4 and -6
> flags that most Unix networking utilities support, would be useful?

+1
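
One possible shape for such an option, sketched in Python for illustration only: the buildfarm client has no such flags as far as this thread says, and parse_family is a hypothetical helper. The idea is just to map -4 and -6 onto an address family, and to leave the choice to the resolver otherwise.

```python
import argparse
import socket

def parse_family(argv):
    """Map hypothetical -4 / -6 flags onto a socket address family."""
    parser = argparse.ArgumentParser(description="hypothetical client options")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-4", dest="family", action="store_const",
                       const=socket.AF_INET, help="force IPv4")
    group.add_argument("-6", dest="family", action="store_const",
                       const=socket.AF_INET6, help="force IPv6")
    # With neither flag given, let the resolver return any family.
    parser.set_defaults(family=socket.AF_UNSPEC)
    return parser.parse_args(argv).family
```

The returned value could then be passed as the family argument to socket.getaddrinfo(), so that only addresses of the requested type are tried.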

> On the other hand, I wonder whether passing "MultiHomed" to the
> IO::Socket::INET6 constructor behind LWP's back might work, though the
> docs are pretty light on details of its actual behaviour:

No, I already looked at the code :-(.  MultiHomed allows it to try
multiple IP addresses obtained from getaddrinfo, but it's already made
up its mind whether to use IPv4 or IPv6, and only addresses of the
given type will be tried.  (The loop logic looks more than slightly
broken, too, at least in the 2.56 version I've got here. I do not think
the author was very clear on whether he needed to handle multiple local
addresses or multiple remote addresses, but AFAICS it will only work
in the unlikely case that you've got *both*, because it loops through
both getaddrinfo results in lockstep.)
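
To make the lockstep problem concrete, here is a small Python model. It is one reading of the description above, not the module's 2.56 source; the function names are hypothetical and the addresses come from the 192.0.2.0/24 documentation range.

```python
def lockstep_attempts(local_addrs, remote_addrs):
    """Pair the i-th local address with the i-th remote address.
    The walk stops as soon as either list runs out."""
    return list(zip(local_addrs, remote_addrs))

def remote_first_attempts(local_addrs, remote_addrs):
    """Try every remote address, binding to the first local one if any."""
    local = local_addrs[0] if local_addrs else None
    return [(local, remote) for remote in remote_addrs]

local_addrs = ["192.0.2.10"]                              # one local address
remote_addrs = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]    # three candidates

tried_lockstep = lockstep_attempts(local_addrs, remote_addrs)
tried_all = remote_first_attempts(local_addrs, remote_addrs)
```

With a single local address the lockstep walk attempts only the first remote address and never reaches the other two, which is why it would only help when both lists happen to have several entries.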

> Yeah, the removal seems easier, but do you actually know yet why the
> system started behaving differently in that regard?

I don't know that it ever was different.  I've never tried to run the
buildfarm client on this machine; I just happened to try the manual
getprint(".../branches_of_interest.txt") invocation that I'd also been
testing on my buildfarm hosts.  Presumably, the RHEL/Fedora machines
that are in the buildfarm have different network environments where it's
not a problem.

            regards, tom lane