Thread: pg_dump 8.4.9 failing after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux

Hi all,

Looking for confirmation there is an issue with pg_dump failing after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.

-bash-4.1$ pg_dump -V
pg_dump (PostgreSQL) 8.4.9

-bash-4.1$ pg_dump -h localhost -C Hogwarts -a -t mafs -f zz
pg_dump: Dumping the contents of table "mafs" failed: PQgetCopyData() failed.
pg_dump: Error message from server: SSL error: unexpected message
pg_dump: The command was: COPY public.mafs (hugo_symbol, 
...
...
analysis_id) TO stdout;


This is only happening on 2 tables in this database.  The same database can be backed up remotely with pgAdmin3.app from a Mac.

As stated I am fairly sure the cause was the upgrade of openssl as it started to fail the next day:
Jun 16 05:18:25 qcmg-database1 yum[2965]: Updated: openssl-1.0.1e-30.el6_6.11.x86_64

Douglas Stetner <stetner@icloud.com> writes:
> Looking for confirmation there is an issue with pg_dump failing after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.

Quick thought --- did you restart the Postgres service after upgrading
openssl?  If not, your server is still using the old library version,
while pg_dump would be running the new version on the client side.
I don't know exactly what was done to openssl in the last round of
revisions, but maybe there is some sort of version compatibility issue.

Also, you really ought to be running something newer than PG 8.4.9.

            regards, tom lane
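
One way to answer the restart question on a Linux host is to look at the running server's memory map: a process that loaded a library before the package was replaced usually still maps the old file, which the kernel lists with a "(deleted)" suffix. The following is a minimal sketch of that check, not something taken from this thread. The function name `stale_ssl_mappings` and the sample paths are hypothetical, and the exact path format in `/proc/<pid>/maps` may differ between kernels.

```python
import re


def stale_ssl_mappings(maps_text):
    """Return paths of libssl/libcrypto mappings that are marked '(deleted)'.

    maps_text is the content of /proc/<pid>/maps for the process to inspect.
    """
    stale = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (may be absent).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a pathname
        path = fields[5].strip()
        if path.endswith("(deleted)") and re.search(r"lib(ssl|crypto)", path):
            stale.add(path[: -len("(deleted)")].rstrip())
    return sorted(stale)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_maps.py <postmaster pid>
    with open("/proc/%s/maps" % sys.argv[1]) as f:
        for path in stale_ssl_mappings(f.read()):
            print("still mapping old library:", path)
```

An empty result suggests the process is not holding on to a replaced OpenSSL library, although it does not prove that client and server are running identical builds.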


>
> On 18 Jun 2015, at 02:06 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Douglas Stetner <stetner@icloud.com> writes:
>> Looking for confirmation there is an issue with pg_dump failing after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.
>
> Quick thought --- did you restart the Postgres service after upgrading
> openssl?  If not, your server is still using the old library version,
> while pg_dump would be running the new version on the client side.
> I don't know exactly what was done to openssl in the last round of
> revisions, but maybe there is some sort of version compatibility issue.
>
> Also, you really ought to be running something newer than PG 8.4.9.
>
>             regards, tom lane


Thanks for the reply Tom.  Unfortunately restart did not help.  Will try an upgrade to 8.4.20 (other software depends on 8.4.x) A remote client with 8.4.20 works, so fingers crossed.

Douglas Stetner
Mobile 0474 082 019
UNIX - Live Free Or Die


Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Douglas Stetner <stetner@icloud.com> writes:
>> Looking for confirmation there is an issue with pg_dump failing after
>> upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.
>
> Quick thought --- did you restart the Postgres service after upgrading
> openssl?  If not, your server is still using the old library version,
> while pg_dump would be running the new version on the client side.
> I don't know exactly what was done to openssl in the last round of
> revisions, but maybe there is some sort of version compatibility issue.
>
> Also, you really ought to be running something newer than PG 8.4.9.

Hi,

I have the same problem with fresh postgresql 9.2.13.
Started after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64

Since then pg_dump aborts after dumping circa 2GB:

pg_dump: [archiver (db)] query failed: SSL error: unexpected message
pg_dump: [archiver (db)] query was: FETCH 100 FROM _pg_dump_cursor

openssl-1.0.1e-30.el6_6.11.x86_64 on both ends (connecting via localhost)

pg_dump via unix socket, without "-h localhost" - there is no problem.

Fetching 2.5 GB of such text dump via https (apache + mod_ssl +
openssl-1.0.1e-30.el6_6.11.x86_64) => wget +
openssl-1.0.1e-30.el6_6.11.x86_64  - there is no problem

Looks like postgresql+ssl issue.

postgres=#  select name,setting,unit from pg_settings where name ~ 'ssl' ;
          name           |              setting              | unit
-------------------------+-----------------------------------+------
 ssl                     | on                                |
 ssl_ca_file             |                                   |
 ssl_cert_file           | server.crt                        |
 ssl_ciphers             | ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH |
 ssl_crl_file            |                                   |
 ssl_key_file            | server.key                        |
 ssl_renegotiation_limit | 524288                            | kB


Any thoughts?

Regards,

--
Piotr Gackiewicz


Douglas Stetner <stetner@icloud.com> writes:
> On 18 Jun 2015, at 02:06 , Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Douglas Stetner <stetner@icloud.com> writes:
>>> Looking for confirmation there is an issue with pg_dump failing after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.

>> Quick thought --- did you restart the Postgres service after upgrading
>> openssl?  If not, your server is still using the old library version,
>> while pg_dump would be running the new version on the client side.
>> I don't know exactly what was done to openssl in the last round of
>> revisions, but maybe there is some sort of version compatibility issue.
>>
>> Also, you really ought to be running something newer than PG 8.4.9.

> Thanks for the reply Tom.  Unfortunately restart did not help.  Will try
> an upgrade to 8.4.20 (other software depends on 8.4.x) A remote client
> with 8.4.20 works, so fingers crossed.

Hm.  The only possibly SSL-relevant patch I see in the 8.4 git history is
this:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_4_BR [74242c23c] 2013-12-05 12:48:28 -0500
Branch: REL9_3_STABLE Release: REL9_3_3 [2a6e1a554] 2013-12-05 12:48:31 -0500
Branch: REL9_2_STABLE Release: REL9_2_7 [41042970b] 2013-12-05 12:48:35 -0500
Branch: REL9_1_STABLE Release: REL9_1_12 [ad910ccdc] 2013-12-05 12:48:37 -0500
Branch: REL9_0_STABLE Release: REL9_0_16 [36352ceb4] 2013-12-05 12:48:41 -0500
Branch: REL8_4_STABLE Release: REL8_4_20 [7635dae55] 2013-12-05 12:48:44 -0500

    Clear retry flags properly in replacement OpenSSL sock_write function.

    Current OpenSSL code includes a BIO_clear_retry_flags() step in the
    sock_write() function.  Either we failed to copy the code correctly, or
    they added this since we copied it.  In any case, lack of the clear step
    appears to be the cause of the server lockup after connection loss reported
    in bug #8647 from Valentine Gogichashvili.  Assume that this is correct
    coding for all OpenSSL versions, and hence back-patch to all supported
    branches.

    Diagnosis and patch by Alexander Kukushkin.

Although the problem that was reported at the time isn't much like yours,
it's possible that this missing step has additional effects with the
latest openssl version; so it's certainly worth trying.

Whether this fixes your immediate issue or not, you really ought to be
using the last available 8.4.x version, which is 8.4.22.

            regards, tom lane


Piotr Gackiewicz wrote:
> Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Douglas Stetner <stetner@icloud.com> writes:
>>> Looking for confirmation there is an issue with pg_dump failing after
>>> upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.
>>
>> Quick thought --- did you restart the Postgres service after upgrading
>> openssl?  If not, your server is still using the old library version,
>> while pg_dump would be running the new version on the client side.
>> I don't know exactly what was done to openssl in the last round of
>> revisions, but maybe there is some sort of version compatibility issue.
>>
>> Also, you really ought to be running something newer than PG 8.4.9.

> I have the same problem with fresh postgresql 9.2.13.
> Started after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64
> 
> Since then pg_dump aborts after dumping circa 2GB:
> 
> pg_dump: [archiver (db)] query failed: SSL error: unexpected message
> pg_dump: [archiver (db)] query was: FETCH 100 FROM _pg_dump_cursor
> 
> openssl-1.0.1e-30.el6_6.11.x86_64 on both ends (connecting via localhost)
> 
> pg_dump via unix socket, without "-h localhost" - there is no problem.
> 
> Fetching 2.5 GB of such text dump via https (apache + mod_ssl +
> openssl-1.0.1e-30.el6_6.11.x86_64) => wget +
> openssl-1.0.1e-30.el6_6.11.x86_64  - there is no problem
> 
> Looks like postgresql+ssl issue.
> 
> postgres=#  select name,setting,unit from pg_settings where name ~ 'ssl' ;
>           name           |              setting              | unit
> -------------------------+-----------------------------------+------
>  ssl                     | on                                |
>  ssl_ca_file             |                                   |
>  ssl_cert_file           | server.crt                        |
>  ssl_ciphers             | ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH |
>  ssl_crl_file            |                                   |
>  ssl_key_file            | server.key                        |
>  ssl_renegotiation_limit | 524288                            | kB
> 
> 
> Any thoughts?

Maybe it has something to do with this OpenSSL bug:
http://rt.openssl.org/Ticket/Display.html?id=3712&user=guest&pass=guest

Basically, OpenSSL fails to handle application data messages during renegotiation.

I have only encountered that when using other SSL libraries together with
OpenSSL, but maybe it can also happen with only OpenSSL.

Just to make sure:
Do you have the same version of OpenSSL on both PostgreSQL client and server?

Yours,
Laurenz Albe

Albe Laurenz <laurenz.albe@wien.gv.at> writes:
> Piotr Gackiewicz wrote:
>>> Douglas Stetner <stetner@icloud.com> writes:
>>>> Looking for confirmation there is an issue with pg_dump failing after
>>>> upgrade to openssl-1.0.1e-30.el6_6.11.x86_64 on redhat linux.

>> I have the same problem with fresh postgresql 9.2.13.
>> Started after upgrade to openssl-1.0.1e-30.el6_6.11.x86_64
>>
>> Since then pg_dump aborts after dumping circa 2GB:
>> pg_dump: [archiver (db)] query failed: SSL error: unexpected message
>> pg_dump: [archiver (db)] query was: FETCH 100 FROM _pg_dump_cursor

I've been able to reproduce this failure with Postgres HEAD, so whatever
it is, it's pretty much independent of our code version.  It was fine with
openssl-1.0.1e-30.el6_6.9.x86_64
but after updating to
openssl-1.0.1e-30.el6_6.11.x86_64
pg_dump fails after about 2GB worth of data transfer.

I find that setting ssl_renegotiation_limit to 0 in postgresql.conf allows
things to work, so it's got something to do with bad renegotiation.  But
curiously, the amount of data dumped before failing is the same whether
ssl_renegotiation_limit is 512MB (the default) or something much smaller
such as 10MB.  In either case we should have successfully completed
several renegotiations before the failure, so I don't think it's solely
a matter of "renegotiation is busted".

> Maybe it has something to do with this OpenSSL bug:
> http://rt.openssl.org/Ticket/Display.html?id=3712&user=guest&pass=guest

That link doesn't work for me :-(

I'm going to file this as a bug with Red Hat.  In the meantime it looks
like we can suggest ssl_renegotiation_limit = 0 as a temporary workaround.

            regards, tom lane
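
The arithmetic behind "several renegotiations before the failure" can be sketched as follows. This is a rough illustration only. It assumes a renegotiation is triggered each time the configured number of kilobytes has been sent and that 1 kB is 1024 bytes. The helper name `renegotiations_before` is hypothetical, and the real trigger points depend on how the server buffers its writes.

```python
KB = 1024  # assumption: the setting's kB unit means 1024 bytes


def renegotiations_before(total_bytes, limit_kb):
    """Approximate number of renegotiations triggered while sending total_bytes.

    A limit of 0 disables renegotiation, which is the workaround suggested above.
    """
    if limit_kb == 0:
        return 0
    return total_bytes // (limit_kb * KB)


two_gb = 2 * 1024**3  # roughly where pg_dump was observed to fail

# 524288 kB is the 512MB default, 10240 kB is the 10MB test value,
# and 0 is the temporary workaround.
for limit_kb in (524288, 10240, 0):
    print(limit_kb, renegotiations_before(two_gb, limit_kb))
```

Under these assumptions both nonzero settings go through multiple renegotiations well before the 2GB mark, which is consistent with the observation that the failure point does not track the limit.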


I wrote:
> I'm going to file this as a bug with Red Hat.  In the meantime it looks
> like we can suggest ssl_renegotiation_limit = 0 as a temporary workaround.

Done at

https://bugzilla.redhat.com/show_bug.cgi?id=1234487

            regards, tom lane


I wrote:
>> I'm going to file this as a bug with Red Hat.  In the meantime it looks
>> like we can suggest ssl_renegotiation_limit = 0 as a temporary workaround.

> Done at
> https://bugzilla.redhat.com/show_bug.cgi?id=1234487

BTW, we should not feel too awful, because it seems this same update has
also broken sendmail, mysql, and probably other services.  Not for the
same reason, but still ...

Red Hat fell down badly on QA'ing this.

            regards, tom lane


Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
>
> Maybe it has something to do with this OpenSSL bug:
> http://rt.openssl.org/Ticket/Display.html?id=3712&user=guest&pass=guest
>
> Basically, OpenSSL fails to handle application data messages during renegotiation.
>
> I have only encountered that when using other SSL libraries together with
> OpenSSL, but maybe it can also happen with only OpenSSL.
>
> Just to make sure:
> Do you have the same version of OpenSSL on both PostgreSQL client and server?

Yep, that's it :

$ psql -h localhost -c "SET ssl_renegotiation_limit='3kB'; SELECT repeat('0123456789', 1800);"
SSL error: unexpected message
connection to server was lost

psql and server share same openssl library on the same host, of course.

--
Piotr Gackiewicz
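
Why this one-liner works as a quick test can be shown with a small size check. This is a sketch under the assumption that the limit's kB unit is 1024 bytes. The helper name `exceeds_limit` is hypothetical, and protocol overhead is ignored.

```python
KB = 1024  # assumption: the setting's kB unit means 1024 bytes


def exceeds_limit(payload_bytes, limit_kb):
    """True if a single result of payload_bytes is larger than the renegotiation limit."""
    return limit_kb > 0 and payload_bytes > limit_kb * KB


# Same string the query builds with repeat('0123456789', 1800).
payload = "0123456789" * 1800

print(len(payload), exceeds_limit(len(payload), 3))
```

The result row alone is several times larger than a 3kB limit, so a single short query should be enough to force a renegotiation without dumping gigabytes of data.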

Piotr Gackiewicz <gacek@intertele.pl> writes:
> Yep, that's it :

> $ psql -h localhost -c "SET ssl_renegotiation_limit='3kB'; SELECT repeat('0123456789', 1800);"
> SSL error: unexpected message
> connection to server was lost

> psql and server share same openssl library on the same host, of course.

Red Hat have confirmed that this was caused by a faulty openssl security
patch in RHEL6 and RHEL7.  They apparently have a fix already, which
I'd expect will ship in a day or two.  Keep an eye on the bugzilla entry
I posted upthread for status.

            regards, tom lane


Piotr Gackiewicz <gacek@intertele.pl> writes:
> $ psql -h localhost -c "SET ssl_renegotiation_limit='3kB'; SELECT repeat('0123456789', 1800);"
> SSL error: unexpected message
> connection to server was lost

BTW, are you using any nondefault SSL settings?  Because I can't reproduce
the failure you show.  In my tests, the value of ssl_renegotiation_limit
does not seem to matter, as long as it's not zero.  What it looks like
is that if we've forced any renegotiations, then once the server has
transmitted more than 2GB, the next server SSL_read() call fails.  The
precise number of previous renegotiations does not matter.

If the above is reproducible for you, there may be more than one bug :-(

            regards, tom lane


I wrote:
> Piotr Gackiewicz <gacek@intertele.pl> writes:
>> $ psql -h localhost -c "SET ssl_renegotiation_limit='3kB'; SELECT repeat('0123456789', 1800);"
>> SSL error: unexpected message
>> connection to server was lost

> BTW, are you using any nondefault SSL settings?  Because I can't reproduce
> the failure you show.

Oh, scratch that: I do reproduce that in PG <= 9.3, just not in 9.4 or
HEAD.  Apparently our renegotiation rewrite in 9.4 affects this.

            regards, tom lane


Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> Piotr Gackiewicz <gacek@intertele.pl> writes:
>>> $ psql -h localhost -c "SET ssl_renegotiation_limit='3kB'; SELECT repeat('0123456789', 1800);"
>>> SSL error: unexpected message
>>> connection to server was lost
>
>> BTW, are you using any nondefault SSL settings?  Because I can't reproduce
>> the failure you show.
>
> Oh, scratch that: I do reproduce that in PG <= 9.3, just not in 9.4 or
> HEAD.  Apparently our renegotiation rewrite in 9.4 affects this.

I have even more surprises:
9.4.4 passes test above (9.2.13 does not).
But 9.4.4 pg_dump over ssl still breaks, this time with slightly different error:

$ pg_dump --column-inserts -h localhost4 db > db.dump
pg_dump: [archiver (db)] query failed: connection not open
pg_dump: [archiver (db)] query was: FETCH 100 FROM _pg_dump_cursor

In this case it broke after dumping 1.7 GB, but this is completely different
data from my previous 9.2.13 tests.

Could it really be two different bugs, as you suspected?
:-/

Regards,

--
Piotr Gackiewicz