Thread: Fix error handling in be_tls_open_server()
Hi, A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). Here is a fix. Also handle an error from X509_NAME_print_ex(). AFAICS, the error "SSL certificate's distinguished name contains embedded null" could not be reached at all, because XN_FLAG_RFC2253 passed to X509_NAME_print_ex() ensures that null bytes are escaped. Best regards, -- Sergey Shinderuk https://postgrespro.com/
Attachment
> On 1 Aug 2023, at 16:44, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). This has the smell of a theoretical problem, I can't really imagine a certificate where which would produce this. Have you been able to trigger it? Wouldn't a better fix be to error out on len == -1 as in the attached, maybe with a "Shouldn't happen" comment? -- Daniel Gustafsson
Attachment
On Wed, Aug 23, 2023 at 6:23 AM Daniel Gustafsson <daniel@yesql.se> wrote: > This has the smell of a theoretical problem, I can't really imagine a > certificate where which would produce this. Have you been able to trigger it? > > Wouldn't a better fix be to error out on len == -1 as in the attached, maybe > with a "Shouldn't happen" comment? Using "cert clientname=DN" in the HBA should let you issue Subjects without Common Names. Or, if you're using a certificate to authorize the connection rather than authenticate the user (for example "scram-sha-256 clientcert=verify-ca" in your HBA), then the certs you distribute could even be SAN-only with a completely empty Subject. --Jacob
> On 23 Aug 2023, at 23:47, Jacob Champion <jchampion@timescale.com> wrote: > > On Wed, Aug 23, 2023 at 6:23 AM Daniel Gustafsson <daniel@yesql.se> wrote: >> This has the smell of a theoretical problem, I can't really imagine a >> certificate where which would produce this. Have you been able to trigger it? >> >> Wouldn't a better fix be to error out on len == -1 as in the attached, maybe >> with a "Shouldn't happen" comment? > > Using "cert clientname=DN" in the HBA should let you issue Subjects > without Common Names. Or, if you're using a certificate to authorize > the connection rather than authenticate the user (for example > "scram-sha-256 clientcert=verify-ca" in your HBA), then the certs you > distribute could even be SAN-only with a completely empty Subject. That's a good point, didn't think about those. In those cases the original patch by Sergey seems along the right lines. -- Daniel Gustafsson
On 23.08.2023 16:23, Daniel Gustafsson wrote: >> On 1 Aug 2023, at 16:44, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > >> A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). > > This has the smell of a theoretical problem, I can't really imagine a > certificate where which would produce this. Have you been able to trigger it? I triggered a crash by generating a certificate without a CN and forcing malloc to return NULL when called from X509_NAME_print_ex or BIO_get_mem_ptr with gdb. Initially I tried to trigger a crash by generating a certificate without a CN and with a DN contaning the null byte. But as I said, the error condition "SSL certificate's distinguished name contains embedded null" isn't really reachable, because XN_FLAG_RFC2253 escapes null bytes. -- Sergey Shinderuk https://postgrespro.com/
> On 24 Aug 2023, at 10:11, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > > On 23.08.2023 16:23, Daniel Gustafsson wrote: >>> On 1 Aug 2023, at 16:44, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: >>> A static analyzer reported a possible pfree(NULL) in be_tls_open_server(). >> This has the smell of a theoretical problem, I can't really imagine a >> certificate where which would produce this. Have you been able to trigger it? > > I triggered a crash by generating a certificate without a CN and forcing malloc to return NULL when called from X509_NAME_print_exor BIO_get_mem_ptr with gdb. Can you extend the patch with that certificate and a test using it? The certificates are generated from config files kept in the repo in src/test/ssl in order to be reproducible. -- Daniel Gustafsson
On 24.08.2023 11:38, Daniel Gustafsson wrote: >> On 24 Aug 2023, at 10:11, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: >> I triggered a crash by generating a certificate without a CN and forcing malloc to return NULL when called from X509_NAME_print_exor BIO_get_mem_ptr with gdb. > > Can you extend the patch with that certificate and a test using it? The > certificates are generated from config files kept in the repo in src/test/ssl > in order to be reproducible. But I need to force malloc to fail at the right time during the test. I don't think I can make a stable and portable test from this. -- Sergey Shinderuk https://postgrespro.com/
On Thu, Aug 24, 2023 at 12:13:18PM +0300, Sergey Shinderuk wrote: > But I need to force malloc to fail at the right time during the test. I > don't think I can make a stable and portable test from this. LD_PRELOAD is the only thing I can think about, but that's very fancy. Even with that, having a certificate with a NULL peer_cn could prove to be useful in the SSL suite to stress more patterns around it? -- Michael
Attachment
On Thu, Aug 24, 2023 at 6:25 PM Michael Paquier <michael@paquier.xyz> wrote: > LD_PRELOAD is the only thing I can think about, but that's very fancy. > Even with that, having a certificate with a NULL peer_cn could prove > to be useful in the SSL suite to stress more patterns around it? +1. Last we tried it, OpenSSL didn't want to create a certificate with an embedded null, but maybe things have changed? --Jacob
On 28.08.2023 18:12, Jacob Champion wrote: > On Thu, Aug 24, 2023 at 6:25 PM Michael Paquier <michael@paquier.xyz> wrote: >> LD_PRELOAD is the only thing I can think about, but that's very fancy. >> Even with that, having a certificate with a NULL peer_cn could prove >> to be useful in the SSL suite to stress more patterns around it? > > +1. Last we tried it, OpenSSL didn't want to create a certificate with > an embedded null, but maybe things have changed? > To embed a null byte into the Subject, I first generated a regular certificate request in the DER (binary) format, then manually inserted null into the file and recomputed the checksum. Like this: https://security.stackexchange.com/a/58845 I'll try to add a client certificate lacking a CN to the SSL test suite. -- Sergey Shinderuk https://postgrespro.com/
> On 28 Aug 2023, at 19:39, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > I'll try to add a client certificate lacking a CN to the SSL test suite. Certificates can be regenerated with the buildsystem, which ideally would apply to this cert as well, but if that's not feasible we can perhaps accept a static one with build information detailed in the README. Having such a cert could for sure be interesting for testing. Awaiting resolution on this, I propose we go ahead with the original patch from this thread. Any objections to that? -- Daniel Gustafsson
On Mon, Sep 18, 2023 at 02:35:28PM +0200, Daniel Gustafsson wrote: > Certificates can be regenerated with the buildsystem, which ideally would apply > to this cert as well, but if that's not feasible we can perhaps accept a static > one with build information detailed in the README. Having such a cert could > for sure be interesting for testing. WFM, but I'd prefer something that would be generated with the makefile rules. These are so handy when it comes to regenerate all these certs.. > Awaiting resolution on this, I propose we go ahead with the original patch from > this thread. Any objections to that? I was wondering for a few seconds if you talked about the one posted on [1], which would break the case where X509_NAME_get_text_by_NID() fails if there's a valid bio, but you mean the one at the top of the thread in [2], of course :) One doubt that I have is if we shouldn't let X509_NAME_print_ex() be as it is now, and not force a failure on the bio if this calls fails. [1]: https://www.postgresql.org/message-id/E3921399-FAE7-4B1F-B1BF-B3357DDC9F19@yesql.se [2]: https://www.postgresql.org/message-id/8db5374d-32e0-6abb-d402-40762511eff2@postgrespro.ru -- Michael
Attachment
On 19.09.2023 03:54, Michael Paquier wrote: > One doubt that I have is if we shouldn't let X509_NAME_print_ex() be > as it is now, and not force a failure on the bio if this calls fails. If malloc fails inside X509_NAME_print_ex, then we will be left with empty port->peer_dn. Here is a gdb session showing this: (gdb) b X509_NAME_print_ex Breakpoint 1 at 0x7f539f6c0cf0 (gdb) c Continuing. Breakpoint 1, 0x00007f539f6c0cf0 in X509_NAME_print_ex () from /lib/x86_64-linux-gnu/libcrypto.so.3 (gdb) bt #0 0x00007f539f6c0cf0 in X509_NAME_print_ex () from /lib/x86_64-linux-gnu/libcrypto.so.3 #1 0x000056026d2fbe8d in be_tls_open_server (port=port@entry=0x56026ed5d730) at be-secure-openssl.c:635 #2 0x000056026d2eefa5 in secure_open_server (port=port@entry=0x56026ed5d730) at be-secure.c:118 #3 0x000056026d3dc412 in ProcessStartupPacket (port=port@entry=0x56026ed5d730, ssl_done=ssl_done@entry=false, gss_done=gss_done@entry=false) at postmaster.c:2065 #4 0x000056026d3dcd8e in BackendInitialize (port=port@entry=0x56026ed5d730) at postmaster.c:4377 #5 0x000056026d3def6a in BackendStartup (port=port@entry=0x56026ed5d730) at postmaster.c:4155 #6 0x000056026d3df115 in ServerLoop () at postmaster.c:1781 #7 0x000056026d3e0645 in PostmasterMain (argc=argc@entry=20, argv=argv@entry=0x56026ec5a0d0) at postmaster.c:1465 #8 0x000056026d2fcd7c in main (argc=20, argv=0x56026ec5a0d0) at main.c:198 (gdb) b malloc Breakpoint 2 at 0x7f539eca5120: file ./malloc/malloc.c, line 3287. (gdb) c Continuing. Breakpoint 2, __GI___libc_malloc (bytes=4) at ./malloc/malloc.c:3287 3287 ./malloc/malloc.c: No such file or directory. (gdb) bt #0 __GI___libc_malloc (bytes=4) at ./malloc/malloc.c:3287 #1 0x00007f539f6f6e09 in BUF_MEM_grow_clean () from /lib/x86_64-linux-gnu/libcrypto.so.3 #2 0x00007f539f6e4fb8 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #3 0x00007f539f6d22fb in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #4 0x00007f539f6d5c06 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #5 0x00007f539f6d5d37 in BIO_write () from /lib/x86_64-linux-gnu/libcrypto.so.3 #6 0x00007f539f6bdb41 in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #7 0x00007f539f6c0b7d in ?? () from /lib/x86_64-linux-gnu/libcrypto.so.3 #8 0x000056026d2fbe8d in be_tls_open_server (port=port@entry=0x56026ed5d730) at be-secure-openssl.c:635 #9 0x000056026d2eefa5 in secure_open_server (port=port@entry=0x56026ed5d730) at be-secure.c:118 #10 0x000056026d3dc412 in ProcessStartupPacket (port=port@entry=0x56026ed5d730, ssl_done=ssl_done@entry=false, gss_done=gss_done@entry=false) at postmaster.c:2065 #11 0x000056026d3dcd8e in BackendInitialize (port=port@entry=0x56026ed5d730) at postmaster.c:4377 #12 0x000056026d3def6a in BackendStartup (port=port@entry=0x56026ed5d730) at postmaster.c:4155 #13 0x000056026d3df115 in ServerLoop () at postmaster.c:1781 #14 0x000056026d3e0645 in PostmasterMain (argc=argc@entry=20, argv=argv@entry=0x56026ec5a0d0) at postmaster.c:1465 #15 0x000056026d2fcd7c in main (argc=20, argv=0x56026ec5a0d0) at main.c:198 (gdb) return 0 Make __GI___libc_malloc return now? (y or n) y #0 0x00007f539f6f6e09 in BUF_MEM_grow_clean () from /lib/x86_64-linux-gnu/libcrypto.so.3 (gdb) delete Delete all breakpoints? (y or n) y (gdb) c Continuing. And in the server log we see: DEBUG: SSL connection from DN:"" CN:"ssltestuser" While in the normal case we get: DEBUG: SSL connection from DN:"CN=ssltestuser" CN:"ssltestuser"\ Probably we shouldn't ignore the error from X509_NAME_print_ex? -- Sergey Shinderuk https://postgrespro.com/
> On 19 Sep 2023, at 10:06, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > > On 19.09.2023 03:54, Michael Paquier wrote: >> One doubt that I have is if we shouldn't let X509_NAME_print_ex() be >> as it is now, and not force a failure on the bio if this calls fails. > > If malloc fails inside X509_NAME_print_ex, then we will be left with empty port->peer_dn. Looking at the OpenSSL code, there a other (albeit esoteric) errors that return -1 as well. I agree that we should handle this error. X509_NAME_print_ex is not documented to return -1 in OpenSSL 1.0.2 but reading the code it's clear that it does, so checking for -1 is safe for all supported OpenSSL versions (supported by us that is). Attached is a v2 on top of HEAD with commit message etc, which I propose to backpatch to v15 where it was introduced. -- Daniel Gustafsson
Attachment
On 20.09.2023 11:42, Daniel Gustafsson wrote: > Attached is a v2 on top of HEAD with commit message etc, which I propose to > backpatch to v15 where it was introduced. > There is a typo: upon en error. Otherwise, looks good to me. Thank you. -- Sergey Shinderuk https://postgrespro.com/
> On 20 Sep 2023, at 10:55, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: > > On 20.09.2023 11:42, Daniel Gustafsson wrote: >> Attached is a v2 on top of HEAD with commit message etc, which I propose to >> backpatch to v15 where it was introduced. > There is a typo: upon en error. Otherwise, looks good to me. Thank you. Thanks, will fix before pushing. -- Daniel Gustafsson
> On 20 Sep 2023, at 10:58, Daniel Gustafsson <daniel@yesql.se> wrote: >> On 20 Sep 2023, at 10:55, Sergey Shinderuk <s.shinderuk@postgrespro.ru> wrote: >> There is a typo: upon en error. Otherwise, looks good to me. Thank you. > > Thanks, will fix before pushing. I've applied this down to v15 now and the buildfarm have green builds on all three branches, thanks for the submission! -- Daniel Gustafsson