Thread: Problem with streaming replication over SSL

Problem with streaming replication over SSL

From
"Albe Laurenz"
Date:
I have streaming replication configured over SSL, and
there seems to be a problem with SSL renegotiation.

This is from the primary's log:

2012-11-06 00:13:10.990 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,10,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08P01,"SSL renegotiation failure",,,,,,,,,"walreceiver"

2012-11-06 00:13:10.998 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,11,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08P01,"SSL error: unexpected record",,,,,,,,,"walreceiver"

2012-11-06 00:13:10.998 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,12,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08006,"could not send data to client: Connection reset by peer",,,,,,,,,"walreceiver"

This is what the standby has to say:

2012-11-06 00:13:11.001 CET,,,26789,,509843df.68a5,2,,2012-11-05 23:55:27 CET,,0,FATAL,XX000,"could not receive data from WAL stream: SSL error: sslv3 alert unexpected message
",,,,,,,,,""
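
For anyone reading these entries: they are in PostgreSQL's csvlog format, and the standby's message contains an embedded newline, so line-based tools can split it incorrectly. The following is a minimal sketch of reading such entries with Python's csv module. The column names follow the csvlog layout documented for 9.1 (23 columns); the helper name parse_csvlog and the SAMPLE text (two entries copied from above) are only illustrative.

```python
import csv
import io

# Column order of the csvlog format as documented for PostgreSQL 9.1.
CSVLOG_COLUMNS = [
    "log_time", "user_name", "database_name", "process_id",
    "connection_from", "session_id", "session_line_num", "command_tag",
    "session_start_time", "virtual_transaction_id", "transaction_id",
    "error_severity", "sql_state_code", "message", "detail", "hint",
    "internal_query", "internal_query_pos", "context", "query",
    "query_pos", "location", "application_name",
]


def parse_csvlog(text):
    """Parse csvlog text into a list of dicts, one per log entry.

    The csv module handles quoted fields that span several lines,
    which a plain line-by-line split would not.
    """
    reader = csv.reader(io.StringIO(text, newline=""))
    return [dict(zip(CSVLOG_COLUMNS, row)) for row in reader if row]


# Two entries copied from the logs above (primary, then standby).
SAMPLE = (
    '2012-11-06 00:13:10.990 CET,"replication","",5204,"10.153.109.3:49889",'
    '509843df.1454,10,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,'
    'LOG,08P01,"SSL renegotiation failure",,,,,,,,,"walreceiver"\n'
    '2012-11-06 00:13:11.001 CET,,,26789,,509843df.68a5,2,,'
    '2012-11-05 23:55:27 CET,,0,FATAL,XX000,'
    '"could not receive data from WAL stream: SSL error: '
    'sslv3 alert unexpected message\n"'
    ',,,,,,,,,""\n'
)

entries = parse_csvlog(SAMPLE)
for entry in entries:
    print(entry["log_time"], entry["error_severity"],
          entry["sql_state_code"], entry["message"].strip())
```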

This is PostgreSQL 9.1.3 on RHEL 6, openssl-1.0.0-20.el6.x86_64,
kernel 2.6.32-220.el6.x86_64.


After that, streaming replication reconnects and resumes working.

Is this an oversight in the replication protocol, or is this
working as designed?

Yours,
Laurenz Albe


Re: Problem with streaming replication over SSL

From
Magnus Hagander
Date:
On Tue, Nov 6, 2012 at 10:47 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> I have streaming replication configured over SSL, and
> there seems to be a problem with SSL renegotiation.
>
> This is from the primary's log:
>
> 2012-11-06 00:13:10.990 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,10,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08P01,"SSL renegotiation failure",,,,,,,,,"walreceiver"
>
> 2012-11-06 00:13:10.998 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,11,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08P01,"SSL error: unexpected record",,,,,,,,,"walreceiver"
>
> 2012-11-06 00:13:10.998 CET,"replication","",5204,"10.153.109.3:49889",509843df.1454,12,"streaming 1E3/76D64000",2012-11-05 23:55:27 CET,4/0,0,LOG,08006,"could not send data to client: Connection reset by peer",,,,,,,,,"walreceiver"
>
> This is what the standby has to say:
>
> 2012-11-06 00:13:11.001 CET,,,26789,,509843df.68a5,2,,2012-11-05 23:55:27 CET,,0,FATAL,XX000,"could not receive data from WAL stream: SSL error: sslv3 alert unexpected message
> ",,,,,,,,,""
>
> This is PostgreSQL 9.1.3 on RHEL 6, openssl-1.0.0-20.el6.x86_64,
> kernel 2.6.32-220.el6.x86_64.
>
> After that, streaming replication reconnects and resumes working.
>
> Is this an oversight in the replication protocol, or is this
> working as designed?


This sounds a lot like the general issue with SSL renegotiation, just that it tends to show itself more often on replication connections since they don't disconnect very often...

Have you tried disabling SSL renegotiation on the connection (ssl_renegotiation_limit=0)? If that helps, then the SSL library on one of the ends still has the problem with renegotiation...
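
As a sketch of what that test would look like: the parameter in question is ssl_renegotiation_limit, which is given in kilobytes and defaults to 512MB; a value of 0 disables renegotiation. The fragment below assumes the setting is changed in postgresql.conf on the primary (it is the server side that triggers the renegotiation) and that the configuration is reloaded afterwards.

```conf
# postgresql.conf on the primary (assumed location of the change)
ssl = on

# 0 disables SSL renegotiation entirely; the default is 512MB.
ssl_renegotiation_limit = 0
```

This is a diagnostic step rather than a fix: if the errors stop, the renegotiation itself is what fails.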

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Problem with streaming replication over SSL

From
"Albe Laurenz"
Date:
Magnus Hagander wrote:
>> I have streaming replication configured over SSL, and
>> there seems to be a problem with SSL renegotiation.
[...]
>> After that, streaming replication reconnects and resumes working.
>>
>> Is this an oversight in the replication protocol, or is this
>> working as designed?

> This sounds a lot like the general issue with SSL renegotiation, just
> that it tends to show itself more often on replication connections
> since they don't disconnect very often...
>
> Have you tried disabling SSL renegotiation on the connection
> (ssl_renegotiation_limit=0)? If that helps, then the SSL library on
> one of the ends still has the problem with renegotiation...

It can hardly be the CVE-2009-3555 renegotiation problem.

Both machines have OpenSSL 1.0.0, and RFC 5746 was implemented in
0.9.8m.

But I'll try to test if normal connections have the problem too.

Yours,
Laurenz Albe


Re: Problem with streaming replication over SSL

From
Magnus Hagander
Date:
On Tue, Nov 6, 2012 at 12:47 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> Magnus Hagander wrote:
>>> I have streaming replication configured over SSL, and
>>> there seems to be a problem with SSL renegotiation.
> [...]
>>> After that, streaming replication reconnects and resumes working.
>>>
>>> Is this an oversight in the replication protocol, or is this
>>> working as designed?
>
>> This sounds a lot like the general issue with SSL renegotiation, just
>> that it tends to show itself more often on replication connections
>> since they don't disconnect very often...
>>
>> Have you tried disabling SSL renegotiation on the connection
>> (ssl_renegotiation_limit=0)? If that helps, then the SSL library on
>> one of the ends still has the problem with renegotiation...
>
> It can hardly be the CVE-2009-3555 renegotiation problem.
>
> Both machines have OpenSSL 1.0.0, and RFC 5746 was implemented in
> 0.9.8m.

It certainly *sounds* like that problem though. Maybe Red Hat carried along the broken fix? It would surprise me, but given that it's OpenSSL, not hugely so :)

It would be worth trying with ssl_renegotiation_limit=0 to see if the problem goes away.

> But I'll try to test if normal connections have the problem too.

That would be a useful datapoint. All settings around this *should* happen at a lower layer than the difference between a replication connection and a regular one, but it would be good to confirm it.


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Re: Problem with streaming replication over SSL

From
"Albe Laurenz"
Date:
Magnus Hagander wrote:
>>>> I have streaming replication configured over SSL, and
>>>> there seems to be a problem with SSL renegotiation.
>> [...]
>>>> After that, streaming replication reconnects and resumes working.
>>>>
>>>> Is this an oversight in the replication protocol, or is this
>>>> working as designed?

>>> This sounds a lot like the general issue with SSL renegotiation, just
>>> that it tends to show itself
>>> more often on replication connections since they don't disconnect very
>>> often...
>>>
>>> Have you tried disabling SSL renegotiation on the connection
>>> (ssl_renegotiation_limit=0)? If that helps, then
>>> the SSL library on one of the ends still has the problem with
>>> renegotiation...

>> It can hardly be the CVE-2009-3555 renegotiation problem.
>>
>> Both machines have OpenSSL 1.0.0, and RFC 5746 was implemented
>> in 0.9.8m.

> It certainly *sounds* like that problem though. Maybe Red Hat carried
> along the broken fix? It would surprise me, but given that it's
> OpenSSL, not hugely so :)
>
> It would be worth trying with ssl_renegotiation_limit=0 to see if the
> problem goes away.

I tried, and that makes the problem go away.
This is to be expected of course, because no
renegotiation will take place with that setting.

>> But I'll try to test if normal connections have the problem too.

> That would be a useful datapoint. All settings around this *should*
> happen at a lower layer than the difference between a replication
> connection and a regular one, but it would be good to confirm it.

I tried, and a normal data connection does not have the
problem.  I transferred more than 0.5 GB of data (at which
point renegotiation should take place), and there was no error.
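
To spell out the arithmetic behind the 0.5 GB figure: it corresponds to the default ssl_renegotiation_limit of 512MB. Below is a minimal sketch of that threshold check. It is a simplified model, not the server's actual logic: it assumes renegotiation is triggered once each time the connection's byte count passes a multiple of the limit, and the function name renegotiations_expected is made up for illustration.

```python
# Default ssl_renegotiation_limit in PostgreSQL 9.1 is 512MB.
DEFAULT_LIMIT_BYTES = 512 * 1024 * 1024


def renegotiations_expected(bytes_transferred, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return how many renegotiations a connection should have gone through.

    Simplified model (an assumption): one renegotiation each time the
    transferred byte count crosses a multiple of the limit. A limit of 0
    means renegotiation is disabled.
    """
    if limit_bytes == 0:
        return 0
    return bytes_transferred // limit_bytes


# A transfer of slightly more than 0.5 GB crosses the default limit once.
print(renegotiations_expected(600 * 1024 * 1024))
# With the limit set to 0, no renegotiation takes place at all.
print(renegotiations_expected(600 * 1024 * 1024, limit_bytes=0))
```

Under this model, a normal connection that moves a little over 0.5 GB should have renegotiated once, which is why the absence of an error there is a meaningful comparison.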

Does it make sense to try and take a stack trace of the
problem, on primary or standby?

Yours,
Laurenz Albe

Re: Problem with streaming replication over SSL

From
"Albe Laurenz"
Date:
I wrote:
>Magnus Hagander wrote:
>>>>> I have streaming replication configured over SSL, and
>>>>> there seems to be a problem with SSL renegotiation.
>>> [...]
>>>>> After that, streaming replication reconnects and resumes working.
>>>>>
>>>>> Is this an oversight in the replication protocol, or is this
>>>>> working as designed?

>>> It can hardly be the CVE-2009-3555 renegotiation problem.
>>> Both machines have OpenSSL 1.0.0, and RFC 5746 was implemented
>>> in 0.9.8m.

>> It would be worth trying with ssl_renegotiation_limit=0 to see if the
>> problem goes away.

> I tried, and that makes the problem go away.
> This is to be expected of course, because no
> renegotiation will take place with that setting.

>>> But I'll try to test if normal connections have the problem too.

>> That would be a useful datapoint. All settings around this *should*
>> happen at a lower layer than the difference between a replication
>> connection and a regular one, but it would be good to confirm it.
> 
> I tried, and a normal data connection does not have the
> problem.  I transferred more than 0.5 GB of data (at which
> point renegotiation should take place), and there was no error.
> 
> Does it make sense to try and take a stack trace of the
> problem, on primary or standby?

FWIW, I collected a stack trace:

#0  libpqrcv_receive (timeout=100, type=0x7fff0e1d9d2f "w\002", buffer=0x7fff0e1d9d20, len=0x7fff0e1d9d28)
    at libpqwalreceiver.c:366
#1  0x0000000000601797 in WalReceiverMain () at walreceiver.c:311
#2  0x00000000004ae1a1 in AuxiliaryProcessMain (argc=2, argv=0x7fff0e1d9dc0) at bootstrap.c:433
#3  0x00000000005ee0c3 in StartChildProcess (type=WalReceiverProcess) at postmaster.c:4504
#4  0x00000000005f13c1 in sigusr1_handler (postgres_signal_arg=<value optimized out>) at postmaster.c:4300
#5  <signal handler called>
#6  0x00000037934de2d3 in __select_nocancel () from /lib64/libc.so.6
#7  0x00000000005ef54a in ServerLoop () at postmaster.c:1415
#8  0x00000000005f21de in PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1116
#9  0x0000000000594398 in main (argc=5, argv=0x1596bf0) at main.c:199

The only thing that sticks out to me is that walreceiver
is running inside a signal handler -- could that cause a problem
with OpenSSL?

Yours,
Laurenz Albe