Re: [BUGS] Replication to Postgres 10 on Windows is broken - Mailing list pgsql-bugs

From Tom Lane
Subject Re: [BUGS] Replication to Postgres 10 on Windows is broken
Msg-id 5067.1502034740@sss.pgh.pa.us
In response to Re: [BUGS] Replication to Postgres 10 on Windows is broken  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
I wrote:
> Gut instinct says that the reason this case fails when other tools
> can connect successfully is that libpqwalreceiver is the only tool
> that uses PQconnectStart/PQconnectPoll rather than a plain
> PQconnectdb, and that there is some behavioral difference between
> connectDBComplete's wait loop and libpqrcv_connect's wait loop that
> OpenSSL is sensitive to --- but only on Windows, and maybe only on
> particular OpenSSL versions.

On closer inspection, I take that back.  This can't be directly
OpenSSL's fault, because those error messages come out before libpq
has invoked OpenSSL at all; in particular we see

2017-08-03 10:49:41 UTC [2108]: [1-1] user=,db=,app=,client= FATAL:  could not connect to the primary server: could not send data to server: Socket is not connected (0x00002749/10057)
        could not send SSL negotiation packet: Socket is not connected (0x00002749/10057)

and "could not send SSL negotiation packet" certainly must occur
before we've asked OpenSSL to do anything.
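
For anyone reading that log line: the parenthesized pair is one number
shown in hex and in decimal, and 10057 is the Winsock error WSAENOTCONN.
A minimal sketch of that check in plain C follows. The names
parse_hex_code and MY_WSAENOTCONN are made up for this illustration (the
macro is a local stand-in so the snippet builds without the Windows
headers); nothing here is libpq code.

```c
#include <stdlib.h>

/* Local stand-in for Winsock's WSAENOTCONN (10057); hypothetical name. */
#define MY_WSAENOTCONN 10057

/*
 * Hypothetical helper: convert the hex half of a "(0x00002749/10057)"
 * pair into a number.  strtol() with base 16 accepts the "0x" prefix.
 */
static long
parse_hex_code(const char *hex)
{
    return strtol(hex, NULL, 16);
}

/* Report whether a parsed code is the "socket is not connected" error. */
static int
is_not_connected(long code)
{
    return code == MY_WSAENOTCONN;
}
```

Calling parse_hex_code("0x00002749") gives 10057, which matches the
decimal half of the pair, and is_not_connected() returns 1 for it.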

What seems likely to me at this point is that the changes in
PQconnectPoll() to support multiple hosts are somehow responsible.
The failure must still be tied to libpqwalreceiver's different wait loop,
but the details are unclear.

It would likely be useful to add some debug logging to PQconnectPoll
to find out what set of addresses it's seeing and whether this failure
occurs after having advanced over some of them.
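
To give an idea of the kind of output I mean, here is a standalone
sketch that resolves a host and prints each candidate address in order.
It is not libpq code and not a proposed patch: the function name
log_candidate_addresses is hypothetical, it uses the POSIX
getaddrinfo/getnameinfo calls (a Windows build would need the Winsock
headers instead), and real logging inside PQconnectPoll would have to
walk libpq's own address list rather than do a fresh lookup.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper, not part of libpq: resolve host/port, print every
 * candidate address to "out" in the order returned, and return how many
 * were found (or -1 if the lookup failed).
 */
static int
log_candidate_addresses(const char *host, const char *port, FILE *out)
{
    struct addrinfo hints;
    struct addrinfo *list = NULL;
    struct addrinfo *cur;
    int         count = 0;
    int         rc;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* accept both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;    /* TCP only, one entry per address */

    rc = getaddrinfo(host, port, &hints, &list);
    if (rc != 0)
    {
        fprintf(out, "lookup of \"%s\" failed: %s\n", host, gai_strerror(rc));
        return -1;
    }

    for (cur = list; cur != NULL; cur = cur->ai_next)
    {
        char        text[256];

        /* Render the address numerically; fall back to a marker on error. */
        if (getnameinfo(cur->ai_addr, cur->ai_addrlen, text, sizeof(text),
                        NULL, 0, NI_NUMERICHOST) != 0)
            strcpy(text, "???");

        count++;
        fprintf(out, "candidate %d: family=%d addr=%s\n",
                count, cur->ai_family, text);
    }

    freeaddrinfo(list);
    return count;
}
```

Logging the index of the address being tried next to each connection
error would show whether the failure happens on the first candidate or
only after advancing past an earlier one.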
        regards, tom lane

