I wrote:
>Magnus Hagander wrote:
>>>>> I have streaming replication configured over SSL, and
>>>>> there seems to be a problem with SSL renegotiation.
>>> [...]
>>>>> After that, streaming replication reconnects and resumes working.
>>>>>
>>>>> Is this an oversight in the replication protocol, or is this
>>>>> working as designed?
>>> It can hardly be the CVE-2009-3555 renegotiation problem.
>>> Both machines have OpenSSL 1.0.0, and RFC 5746 was implemented
>>> in 0.9.8m.
>> It would be worth trying with ssl_renegotiation_limit=0 to see if the
>> problem goes away.
> I tried, and that makes the problem go away.
> This is to be expected of course, because no
> renegotiation will take place with that setting.
>>> But I'll try to test if normal connections have the problem too.
>> That would be a useful datapoint. All settings around this *should*
>> happen at a lower layer than the difference between a replication
>> connection and a regular one, but it would be good to confirm it.
>
> I tried, and a normal data connection does not have the
> problem. I transferred more than 0.5 GB of data (at which
> point renegotiation should take place), and there was no error.
>
> Does it make sense to try and take a stack trace of the
> problem, on primary or standby?
FWIW, I collected a stack trace:
#0 libpqrcv_receive (timeout=100, type=0x7fff0e1d9d2f "w\002", buffer=0x7fff0e1d9d20, len=0x7fff0e1d9d28) at libpqwalreceiver.c:366
#1 0x0000000000601797 in WalReceiverMain () at walreceiver.c:311
#2 0x00000000004ae1a1 in AuxiliaryProcessMain (argc=2, argv=0x7fff0e1d9dc0) at bootstrap.c:433
#3 0x00000000005ee0c3 in StartChildProcess (type=WalReceiverProcess) at postmaster.c:4504
#4 0x00000000005f13c1 in sigusr1_handler (postgres_signal_arg=<value optimized out>) at postmaster.c:4300
#5 <signal handler called>
#6 0x00000037934de2d3 in __select_nocancel () from /lib64/libc.so.6
#7 0x00000000005ef54a in ServerLoop () at postmaster.c:1415
#8 0x00000000005f21de in PostmasterMain (argc=<value optimized out>, argv=<value optimized out>) at postmaster.c:1116
#9 0x0000000000594398 in main (argc=5, argv=0x1596bf0) at main.c:199
The only thing that sticks out to me is that walreceiver
is running inside a signal handler -- could that cause a problem
with OpenSSL?
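
To illustrate what I mean by "running inside a signal handler", here is a
minimal sketch (not PostgreSQL code) of a process that forks a child from
within a SIGUSR1 handler, the way frames #3 and #4 of the trace suggest.
The names child_main, on_sigusr1 and child_frames_when_forked_from_handler
are made up for this example. Note that Python runs handlers between
bytecodes rather than in true asynchronous signal context, so this only
mirrors the shape of the call stack, not the C-level signal mask or any
OpenSSL behaviour.

```python
import os
import signal
import traceback


def child_frames_when_forked_from_handler():
    """Fork a child from inside a SIGUSR1 handler; return the child's frame names."""
    read_fd, write_fd = os.pipe()

    def child_main():
        # Hypothetical stand-in for WalReceiverMain(): record the call stack
        # the child is running under and send it to the parent.
        names = [frame.name for frame in traceback.extract_stack()]
        os.write(write_fd, ",".join(names).encode())

    def on_sigusr1(signum, frame):
        # Hypothetical stand-in for sigusr1_handler() -> StartChildProcess():
        # the fork happens while the handler is still on the stack.
        pid = os.fork()
        if pid == 0:
            # Child: does its work and exits without ever returning
            # from the handler, like an auxiliary process would.
            try:
                child_main()
            finally:
                os._exit(0)
        os.waitpid(pid, 0)

    previous = signal.signal(signal.SIGUSR1, on_sigusr1)
    try:
        signal.raise_signal(signal.SIGUSR1)  # deliver SIGUSR1 to ourselves
    finally:
        signal.signal(signal.SIGUSR1, previous)

    os.close(write_fd)
    data = os.read(read_fd, 4096)
    os.close(read_fd)
    return data.decode().split(",")


if __name__ == "__main__":
    # Prints the child's frames, outermost first.
    print(child_frames_when_forked_from_handler())
```

In the child, the handler frame sits below the child's main function, just
as sigusr1_handler sits below WalReceiverMain in the trace above. Whether
that matters for OpenSSL is exactly what I don't know.
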
Yours,
Laurenz Albe