Thread: libpq3 + ssl memory leak

libpq3 + ssl memory leak

From
Olaf Hartmann
Date:
============================================================================
                         POSTGRESQL BUG REPORT TEMPLATE
============================================================================


Your name               : Olaf Hartmann
Your email address      : hao@telemedsys.de


System Configuration
---------------------
   Architecture (example: Intel Pentium)         : i386

   Operating System (example: Linux 2.0.26 ELF)  : linux-2.4.21

   PostgreSQL version (example: PostgreSQL-7.3.4): PostgreSQL-7.4

   Compiler used (example:  gcc 2.95.2)          : gcc 3.3.2


Please enter a FULL description of your problem:
------------------------------------------------

We expirience a memory leak with every connection when using an SSL
encrypted TCP connection to connect a postgres database.
A simple code example is included in this mail. It simply opens and
closes a database connection. (You also need a postgresql server set up
for ssl.)

'valgrind' detects the main memory leak (which gets bigger with every
connection) to be in libcrypto.so.0.9.7. But you can also verify the
increased memory consumption using 'ps axl', if you add a 'sleep(..);'
at the end of the program.


Please describe a way to repeat the problem.   Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------


/*
  * begin leak.c
  */

#include <stdio.h>
#include <postgresql/libpq-fe.h>

int main()
{
   int i;
   PGconn *conn;

   for (i = 0; i < 10; i ++)
   {
     conn = PQconnectdb("host=... dbname=template1 ... requiressl=1");
     if (PQstatus(conn) == CONNECTION_BAD)
     {
       fprintf(stderr, "Error: %s", PQerrorMessage(conn));
       PQfinish(conn);
       exit(1);
     }
     PQfinish(conn);
   }
   return 0;
}

/*
  * end leak.c
  */


$ gcc leak.c -lpq -o leak


$ valgrind --leak-check=yes --error-limit=no ./leak


If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------

Re: libpq3 + ssl memory leak

From
Neil Conway
Date:
Olaf Hartmann <hao@telemedsys.de> writes:
> We expirience a memory leak with every connection when using an SSL
> encrypted TCP connection to connect a postgres database.

I can verify this locally. Unfortunately, my copy of valgrind doesn't
seem to be picking up the debugging symbols for OpenSSL, which makes
it difficult to track down if this is a fault in OpenSSL or libpq.

If someone else wants to tackle this, go ahead -- my plate is full
until around December 10th. Otherwise I'll look into it then.

-Neil

Re: libpq3 + ssl memory leak

From
Neil Conway
Date:
Neil Conway <neilc@samurai.com> writes:
> I can verify this locally. Unfortunately, my copy of valgrind doesn't
> seem to be picking up the debugging symbols for OpenSSL

Ok, some progress. I installed a development snapshot of OpenSSL, and
confirmed that the problem still occurs there. The full valgrind leak
trace call stack is:

==14598== 8316 bytes in 99 blocks are definitely lost in loss record 5 of 7
==14598==    at 0x4002CA5D: malloc (vg_replace_malloc.c:153)
==14598==    by 0x4040340C: default_malloc_ex (mem.c:79)
==14598==    by 0x40403A93: CRYPTO_malloc (mem.c:304)
==14598==    by 0x4048F8A4: asn1_item_ex_combine_new (tasn_new.c:171)
==14598==    by 0x4048F5E4: ASN1_item_ex_new (tasn_new.c:83)
==14598==    by 0x40491A91: ASN1_item_ex_d2i (tasn_dec.c:318)
==14598==    by 0x40491423: ASN1_item_d2i (tasn_dec.c:115)
==14598==    by 0x404897DD: d2i_X509 (x_x509.c:125)
==14598==    by 0x403AAF22: ssl3_get_server_certificate (s3_clnt.c:837)
==14598==    by 0x403A9E87: ssl3_connect (s3_clnt.c:295)
==14598==    by 0x403B89AD: SSL_connect (ssl_lib.c:829)
==14598==    by 0x4025C5FA: open_client_SSL (fe-secure.c:898)
==14598==    by 0x4025B62B: pqsecure_open_client (fe-secure.c:253)
==14598==    by 0x4024D496: PQconnectPoll (fe-connect.c:1518)
==14598==    by 0x4024CBB0: connectDBComplete (fe-connect.c:1060)
==14598==    by 0x4024BB8D: PQconnectdb (fe-connect.c:248)
==14598==    by 0x804860E: main (leak.c:11)
==14598==    by 0x40279DA5: __libc_start_main (libc-start.c:242)
==14598==    by 0x8048540: ??? (start.S:102)

(The line numbers are against the snapshot I'm using, which you can
get from ftp://ftp.openssl.org/snapshot/openssl-SNAP-20031211.tar.gz)

A cursory look at the libpq SSL code doesn't reveal anything obvious
that we're doing wrong (e.g. SSL_new() calls seem to be matched up
with SSL_free() calls AFAICS). The problem doesn't seem to occur if I
link against 7.3's libpq.so, so this does seem to be a libpq bug (and
not an OpenSSL leak).

BTW Sean, do you have any insight into the problem?

I'll continue working on it and let the list know if I learn anything
new.

-Neil

Re: libpq3 + ssl memory leak

From
Neil Conway
Date:
Okay, I've attached a patch that fixes the problem for me. The problem
turned out to be pretty simple: the PostgreSQL code (both backend and
frontend SSL support) was calling SSL_get_peer_certificate() without
properly free'ing its return value.

I haven't actually confirmed the backend memory leak, but it should be
readily reproduceable (the same OpenSSL API call is made and the
return value is never free'd).

Olaf, can you please test the attached patch? (Against CVS HEAD, but
should apply easily enough to 7.4.0) At the very least you'll need to
rebuild libpq and ensure that the test app picks up the new .so, but
testing the backend with the patch applied would also be helpful.

This fix needs to be in 7.4.1, so if anyone wants to review this
patch, that would be great.

-Neil

P.S. I added an assertion in the backend code to help catch any other
memory leaks in this area. I didn't add an equivalent one to the
frontend code, because libpq doesn't seem to be setup for
assertions. When this is applied to 7.4.1, we probably shouldn't
include the assertion at the risk of suffering backend crashes.


Attachment

Re: libpq3 + ssl memory leak

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> This fix needs to be in 7.4.1, so if anyone wants to review this
> patch, that would be great.

Looks reasonable from here, but confirmation from the complainant that
it fixes his problem would be real good ...

> P.S. I added an assertion in the backend code to help catch any other
> memory leaks in this area. I didn't add an equivalent one to the
> frontend code, because libpq doesn't seem to be setup for
> assertions.

Yeah, we can't assume that a decent support mechanism exists in an
arbitrary client application.

> When this is applied to 7.4.1, we probably shouldn't
> include the assertion at the risk of suffering backend crashes.

Assert away.  People who are concerned about assertions destabilizing
their systems don't build with asserts enabled in the first place.

            regards, tom lane

Re: libpq3 + ssl memory leak

From
Tom Lane
Date:
Neil Conway <neilc@samurai.com> writes:
> Okay, I've attached a patch that fixes the problem for me. The problem
> turned out to be pretty simple: the PostgreSQL code (both backend and
> frontend SSL support) was calling SSL_get_peer_certificate() without
> properly free'ing its return value.

Applied to 7.4 and HEAD.

            regards, tom lane

Re: libpq3 + ssl memory leak

From
Bruce Momjian
Date:
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > Okay, I've attached a patch that fixes the problem for me. The problem
> > turned out to be pretty simple: the PostgreSQL code (both backend and
> > frontend SSL support) was calling SSL_get_peer_certificate() without
> > properly free'ing its return value.
>
> Applied to 7.4 and HEAD.

Good. I have given up getting an update from the original bug reporter.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073