Thread: libpq3 + ssl memory leak
============================================================================ POSTGRESQL BUG REPORT TEMPLATE ============================================================================ Your name : Olaf Hartmann Your email address : hao@telemedsys.de System Configuration --------------------- Architecture (example: Intel Pentium) : i386 Operating System (example: Linux 2.0.26 ELF) : linux-2.4.21 PostgreSQL version (example: PostgreSQL-7.3.4): PostgreSQL-7.4 Compiler used (example: gcc 2.95.2) : gcc 3.3.2 Please enter a FULL description of your problem: ------------------------------------------------ We expirience a memory leak with every connection when using an SSL encrypted TCP connection to connect a postgres database. A simple code example is included in this mail. It simply opens and closes a database connection. (You also need a postgresql server set up for ssl.) 'valgrind' detects the main memory leak (which gets bigger with every connection) to be in libcrypto.so.0.9.7. But you can also verify the increased memory consumption using 'ps axl', if you add a 'sleep(..);' at the end of the program. Please describe a way to repeat the problem. Please try to provide a concise reproducible example, if at all possible: ---------------------------------------------------------------------- /* * begin leak.c */ #include <stdio.h> #include <postgresql/libpq-fe.h> int main() { int i; PGconn *conn; for (i = 0; i < 10; i ++) { conn = PQconnectdb("host=... dbname=template1 ... requiressl=1"); if (PQstatus(conn) == CONNECTION_BAD) { fprintf(stderr, "Error: %s", PQerrorMessage(conn)); PQfinish(conn); exit(1); } PQfinish(conn); } return 0; } /* * end leak.c */ $ gcc leak.c -lpq -o leak $ valgrind --leak-check=yes --error-limit=no ./leak If you know how this problem might be fixed, list the solution below: ---------------------------------------------------------------------
Olaf Hartmann <hao@telemedsys.de> writes: > We expirience a memory leak with every connection when using an SSL > encrypted TCP connection to connect a postgres database. I can verify this locally. Unfortunately, my copy of valgrind doesn't seem to be picking up the debugging symbols for OpenSSL, which makes it difficult to track down if this is a fault in OpenSSL or libpq. If someone else wants to tackle this, go ahead -- my plate is full until around December 10th. Otherwise I'll look into it then. -Neil
Neil Conway <neilc@samurai.com> writes: > I can verify this locally. Unfortunately, my copy of valgrind doesn't > seem to be picking up the debugging symbols for OpenSSL Ok, some progress. I installed a development snapshot of OpenSSL, and confirmed that the problem still occurs there. The full valgrind leak trace call stack is: ==14598== 8316 bytes in 99 blocks are definitely lost in loss record 5 of 7 ==14598== at 0x4002CA5D: malloc (vg_replace_malloc.c:153) ==14598== by 0x4040340C: default_malloc_ex (mem.c:79) ==14598== by 0x40403A93: CRYPTO_malloc (mem.c:304) ==14598== by 0x4048F8A4: asn1_item_ex_combine_new (tasn_new.c:171) ==14598== by 0x4048F5E4: ASN1_item_ex_new (tasn_new.c:83) ==14598== by 0x40491A91: ASN1_item_ex_d2i (tasn_dec.c:318) ==14598== by 0x40491423: ASN1_item_d2i (tasn_dec.c:115) ==14598== by 0x404897DD: d2i_X509 (x_x509.c:125) ==14598== by 0x403AAF22: ssl3_get_server_certificate (s3_clnt.c:837) ==14598== by 0x403A9E87: ssl3_connect (s3_clnt.c:295) ==14598== by 0x403B89AD: SSL_connect (ssl_lib.c:829) ==14598== by 0x4025C5FA: open_client_SSL (fe-secure.c:898) ==14598== by 0x4025B62B: pqsecure_open_client (fe-secure.c:253) ==14598== by 0x4024D496: PQconnectPoll (fe-connect.c:1518) ==14598== by 0x4024CBB0: connectDBComplete (fe-connect.c:1060) ==14598== by 0x4024BB8D: PQconnectdb (fe-connect.c:248) ==14598== by 0x804860E: main (leak.c:11) ==14598== by 0x40279DA5: __libc_start_main (libc-start.c:242) ==14598== by 0x8048540: ??? (start.S:102) (The line numbers are against the snapshot I'm using, which you can get from ftp://ftp.openssl.org/snapshot/openssl-SNAP-20031211.tar.gz) A cursory look at the libpq SSL code doesn't reveal anything obvious that we're doing wrong (e.g. SSL_new() calls seem to be matched up with SSL_free() calls AFAICS). The problem doesn't seem to occur if I link against 7.3's libpq.so, so this does seem to be a libpq bug (and not an OpenSSL leak). BTW Sean, do you have any insight into the problem? I'll continue working on it and let the list know if I learn anything new. -Neil
Okay, I've attached a patch that fixes the problem for me. The problem turned out to be pretty simple: the PostgreSQL code (both backend and frontend SSL support) was calling SSL_get_peer_certificate() without properly free'ing its return value. I haven't actually confirmed the backend memory leak, but it should be readily reproduceable (the same OpenSSL API call is made and the return value is never free'd). Olaf, can you please test the attached patch? (Against CVS HEAD, but should apply easily enough to 7.4.0) At the very least you'll need to rebuild libpq and ensure that the test app picks up the new .so, but testing the backend with the patch applied would also be helpful. This fix needs to be in 7.4.1, so if anyone wants to review this patch, that would be great. -Neil P.S. I added an assertion in the backend code to help catch any other memory leaks in this area. I didn't add an equivalent one to the frontend code, because libpq doesn't seem to be setup for assertions. When this is applied to 7.4.1, we probably shouldn't include the assertion at the risk of suffering backend crashes.
Attachment
Neil Conway <neilc@samurai.com> writes: > This fix needs to be in 7.4.1, so if anyone wants to review this > patch, that would be great. Looks reasonable from here, but confirmation from the complainant that it fixes his problem would be real good ... > P.S. I added an assertion in the backend code to help catch any other > memory leaks in this area. I didn't add an equivalent one to the > frontend code, because libpq doesn't seem to be setup for > assertions. Yeah, we can't assume that a decent support mechanism exists in an arbitrary client application. > When this is applied to 7.4.1, we probably shouldn't > include the assertion at the risk of suffering backend crashes. Assert away. People who are concerned about assertions destabilizing their systems don't build with asserts enabled in the first place. regards, tom lane
Neil Conway <neilc@samurai.com> writes: > Okay, I've attached a patch that fixes the problem for me. The problem > turned out to be pretty simple: the PostgreSQL code (both backend and > frontend SSL support) was calling SSL_get_peer_certificate() without > properly free'ing its return value. Applied to 7.4 and HEAD. regards, tom lane
Tom Lane wrote: > Neil Conway <neilc@samurai.com> writes: > > Okay, I've attached a patch that fixes the problem for me. The problem > > turned out to be pretty simple: the PostgreSQL code (both backend and > > frontend SSL support) was calling SSL_get_peer_certificate() without > > properly free'ing its return value. > > Applied to 7.4 and HEAD. Good. I have given up getting an update from the original bug reporter. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073