Thread: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference: 17326
Logged by: James Pang
Email address: chaolpan@cisco.com
PostgreSQL version: 13.4
Operating system: RHEL8.4
Description:

We need SSL enabled for our production environment. When I tested renewing an SSL certificate and then ran pg_reload_conf(), Postgres crashed. Even with the same certificate and SSL parameters, running pg_reload_conf() often leads to a Postgres crash. For example:

=# select name,setting from pg_settings where name like 'ssl_%' order by name;
                  name                  |                setting
----------------------------------------+---------------------------------------
 ssl_ca_file                            | /var/lib/pgsql/sslcerts/awstestca.crt
 ssl_cert_file                          | /var/lib/pgsql/sslcerts/server.crt
 ssl_ciphers                            | HIGH:MEDIUM:+3DES:!aNULL
 ssl_crl_file                           |
 ssl_dh_params_file                     |
 ssl_ecdh_curve                         | prime256v1
 ssl_key_file                           | /var/lib/pgsql/sslcerts/server.key
 ssl_library                            | OpenSSL
 ssl_max_protocol_version               |
 ssl_min_protocol_version               | TLSv1.2
 ssl_passphrase_command                 |
 ssl_passphrase_command_supports_reload | off
 ssl_prefer_server_ciphers              | on
(13 rows)

=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

=# select pg_reload_conf();
FATAL: terminating connection due to unexpected postmaster exit
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
"James Pang (chaolpan)"
Date:
The Postgres logs show:

2021-12-08 03:57:55.826 UTC::@:[1291058]:[9-1]:2021-12-08 03:33:21 UTC:LOG: received SIGHUP, reloading configuration files
2021-12-08 03:58:02.832 UTC::@:[1291058]:[10-1]:2021-12-08 03:33:21 UTC:LOG: received SIGHUP, reloading configuration files
2021-12-08 03:58:03.143 UTC:10.240.212.242(58646):jamet@jamet:[1291076]:[9-1]:2021-12-08 03:33:24 UTC:testsubLOG: disconnection: session time: 0:24:38.967 user=jamet database=jamet host=10.240.212.242 port=58646
2021-12-08 03:58:03.147 UTC:[local]:postgres@jamet:[1291397]:[3-1]:2021-12-08 03:57:02 UTC:psqlFATAL: terminating connection due to unexpected postmaster exit
2021-12-08 03:58:03.147 UTC:[local]:postgres@jamet:[1291397]:[4-1]:2021-12-08 03:57:02 UTC:psqlLOG: disconnection: session time: 0:01:00.405 user=postgres database=jamet host=[local]

James
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Dmitry Dolgov
Date:
> On Wed, Dec 08, 2021 at 06:22:11AM +0000, James Pang (chaolpan) wrote:
> From postgres logs , it show
> [...]
> 2021-12-08 03:58:03.147 UTC:[local]:postgres@jamet:[1291397]:[3-1]:2021-12-08 03:57:02 UTC:psqlFATAL: terminating connection due to unexpected postmaster exit

Hi,

Thanks for reporting the issue. Any chance to get a stack trace corresponding to the crash, e.g. like in [1]?

[1]: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
"James Pang (chaolpan)"
Date:
It looks like this issue is related to the "set_user" extension. I removed all extensions, and pg_reload_conf() works without issue. When I installed and enabled the "set_user" extension, the issue was reproduced.

shared_preload_libraries = 'orafce,pgaudit,pg_cron,pg_stat_statements,pg_prewarm,set_user'
#set_user
set_user.superuser_whitelist = '+dba'
#set_user.superuser_allowlist = '+dba'
set_user.block_log_statement=on
set_user.nosuperuser_target_whitelist = ''
#set_user.nosuperuser_target_allowlist = ''

I will try to get the stack trace and post an update.

James
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
"James Pang (chaolpan)"
Date:
I tried to install debuginfo and get the stack.

1. Using the core dump:

]$ gdb -q -c /pgdata/core.1317550.sig11.1639122870s /usr/pgsql-13/bin/postgres
Reading symbols from /usr/pgsql-13/bin/postgres...Reading symbols from .gnu_debugdata for /usr/pgsql-13/bin/postgres...(no debugging symbols found)...done.
(no debugging symbols found)...done.

warning: Can't open file (null) during file-backed mapping note processing

warning: Can't open file (null) during file-backed mapping note processing

warning: Can't open file (null) during file-backed mapping note processing
[New LWP 1317550]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/pgsql-13/bin/postgres'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f72e3290094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1

2. From the gdb log of the running process:

Program received signal SIGHUP, Hangup.
0x00007f4fb438e25b in select () from /lib64/libc.so.6
Continuing.

Program received signal SIGHUP, Hangup.
0x00007f4fb438e25b in select () from /lib64/libc.so.6
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007f4fb5eef094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

Should I install debug info for the set_user module too?

Thanks,

James
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Dmitry Dolgov
Date:
> On Fri, Dec 10, 2021 at 09:05:19AM +0000, James Pang (chaolpan) wrote:
> try to install debug_info and get stack,
> [...]
> Should I install debug info for set_user module too?

Eventually yes, but judging from the logs you've posted ("/usr/pgsql-13/bin/postgres...(no debugging symbols found)") the debugging symbols for postgres itself are not there yet. Do you get a meaningful stack trace from the coredump with the `bt` command right now?
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
"James Pang (chaolpan)"
Date:
1. Attach gdb to postgres:

]# ps -ef | grep postgres
postgres 8790 1 4 06:53 ? 00:00:00 /usr/pgsql-13/bin/postgres
# gdb -p 8790
...
Attaching to process 8790
Reading symbols from /usr/pgsql-13/bin/postgres...Reading symbols from .gnu_debugdata for /usr/pgsql-13/bin/postgres...(no debugging symbols found)...done.

2. Start another psql session and run pg_reload_conf():

jamet=# select pg_reload_conf();
 pg_reload_conf
----------------
 t
(1 row)

Edit postgresql.conf to change ssl_certificate parameter ,

3. Continue in gdb:

(gdb) cont
Continuing.
[Detaching after fork from child process 8828]

Program received signal SIGHUP, Hangup.
0x00007ff49879d25b in select () from /lib64/libc.so.6
(gdb) cont
Continuing.

4. In the psql session, run pg_reload_conf() again:

$ psql
select pg_reload_conf();

5. gdb receives SIGSEGV:

(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007ff49a2fe094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
(gdb) bt
#0 0x00007ff49a2fe094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
#1 0x00007ff49a30824f in asn1_primitive_free.localalias () from /lib64/libcrypto.so.1.1
#2 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#3 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#4 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#5 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#6 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#7 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#8 0x00007ff49a3085d9 in ASN1_item_free () from /lib64/libcrypto.so.1.1
#9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
#10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
#12 0x000000000068b6b8 in be_tls_init ()
#13 0x00000000007271e1 in SIGHUP_handler ()
#14 <signal handler called>
#15 0x00007ff49879d25b in select () from /lib64/libc.so.6
#16 0x000000000072a20c in ServerLoop ()
#17 0x000000000072bd10 in PostmasterMain ()
#18 0x00000000004869a0 in main ()
(gdb) cont
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

Thanks,

James
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Michael Paquier
Date:
On Mon, Dec 13, 2021 at 07:06:16AM +0000, James Pang (chaolpan) wrote:
> Edit postgresql.conf to change ssl_certificate parameter ,

Do you mean ssl_cert_file here? Also, something that's not completely clear to me is if this is a problem with a vanilla PostgreSQL instance or if this is related to the pgaudit extension set_user, as it has been mentioned as one potential origin of the problem upthread, but you are not telling if this is the case here. So what do you have for shared_preload_libraries in this crash?

> #9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
> #10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
> #11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
> #12 0x000000000068b6b8 in be_tls_init ()
> #13 0x00000000007271e1 in SIGHUP_handler ()

Why is secure_initialize() not showing up in this stack? That would be the caller of be_tls_init() in the SIGHUP handler. The version of OpenSSL you are linking your binaries to would be useful here. That would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?
--
Michael
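For readers comparing their own backtraces against the one discussed above, the check Michael performs by eye (is a given function present among the frames?) can be sketched in a few lines of Python. This is only an illustrative helper, not part of PostgreSQL or gdb: the names `frame_functions` and `has_frame` are hypothetical, and the regular expression assumes the plain `#N 0xADDR in func ()` layout that gdb's `bt` printed in this thread.

```python
import re

# Matches one gdb frame header such as
#   "#12 0x000000000068b6b8 in be_tls_init ()"  or  "#14 <signal handler called>".
# The address part is optional because gdb omits it for some frames.
FRAME = re.compile(
    r"#(\d+)\s+(?:0x[0-9a-fA-F]+\s+in\s+)?"
    r"([A-Za-z_][\w.]*|<signal handler called>)"
)

def frame_functions(bt_text):
    """Return the function name of every frame, innermost first."""
    return [m.group(2) for m in FRAME.finditer(bt_text)]

def has_frame(bt_text, func):
    """True if a frame for `func` appears anywhere in the backtrace."""
    return func in frame_functions(bt_text)

if __name__ == "__main__":
    sample = (
        "#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1\n"
        "#12 0x000000000068b6b8 in be_tls_init ()\n"
        "#13 0x00000000007271e1 in SIGHUP_handler ()\n"
    )
    print(frame_functions(sample))
    print(has_frame(sample, "secure_initialize"))
```

Run against the frames quoted in this message, the helper reports that `be_tls_init` is present and `secure_initialize` is not, which is exactly the discrepancy being asked about. It says nothing about why the frame is missing.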
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Dmitry Dolgov
Date:
> On Mon, Dec 13, 2021 at 08:10:57PM +0900, Michael Paquier wrote:
> [...]
> Why is secure_initialize() not showing up in this stack? That would
> be the caller of be_tls_init() in the SIGHUP handler. The version of
> OpenSSL you are linking your binaries to would be useful here. That
> would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?

I think I can actually reproduce the issue. In my case the stack is fine, it contains secure_initialize, and overall it looks like some sort of memory corruption -- at least openssl gets a segfault because it can't access some memory address it tries to verify in asn1_primitive_free. Not sure yet why, investigating.
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Dmitry Dolgov
Date:
> On Tue, Dec 14, 2021 at 04:46:04PM +0100, Dmitry Dolgov wrote:
> [...]
> I think I can actually reproduce the issue. In my case the stack is
> fine, it contains secure_initialize, and overall it looks like some sort
> of memory corruption -- at least openssl gets segfault because it can't
> access some memory address it tries to verify in asn1_primitive_free.
> Not sure yet why, investigating.

After a short investigation, it looks like it's a set_user problem. The extension has a duplicated set of parameters, where one is the actual set and the other one is "deprecated options". If I have both sets set simultaneously in the configuration (e.g. set_user.superuser_whitelist and set_user.superuser_allowlist), then on SIGHUP, in set_config_option / PGC_STRING branch / makeDefault condition, something weird happens after set_extra_field, and after this point the ssl context memory seems to be corrupted. Right before that, an assign_hook from set_user is invoked to do something around "deprecated" options, which is why it looks suspicious. As soon as no "deprecated" options are left in the config, the issue disappears.
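To check a configuration for the situation described above, something like the following Python sketch may help. It is not part of set_user or PostgreSQL. It assumes that the `*_whitelist` names are the "deprecated" ones and the `*_allowlist` names their replacements (the thread names both pairs but does not say which side is deprecated), and the function names `active_settings` and `check_set_user` are hypothetical. The parser is deliberately naive: it treats everything after `#` as a comment and does not follow `include` directives.

```python
import re

# Parameter pairs mentioned in this thread. Which side is "deprecated"
# is an assumption, not something the thread states explicitly.
DEPRECATED_TO_CURRENT = {
    "set_user.superuser_whitelist": "set_user.superuser_allowlist",
    "set_user.nosuperuser_target_whitelist": "set_user.nosuperuser_target_allowlist",
}

NAME = re.compile(r"[A-Za-z_][\w.]*")

def active_settings(conf_text):
    """Return the names of all parameters set on uncommented lines."""
    names = set()
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # naive: ignores '#' inside quotes
        match = NAME.match(line)
        if match:
            names.add(match.group(0).lower())
    return names

def check_set_user(conf_text):
    """Return (deprecated options in use, deprecated options whose
    replacement is also set), both as sorted lists."""
    active = active_settings(conf_text)
    deprecated = sorted(n for n in DEPRECATED_TO_CURRENT if n in active)
    both = sorted(n for n in deprecated if DEPRECATED_TO_CURRENT[n] in active)
    return deprecated, both

if __name__ == "__main__":
    sample = """
    set_user.superuser_whitelist = '+dba'
    set_user.superuser_allowlist = '+dba'
    set_user.block_log_statement=on
    """
    print(check_set_user(sample))
```

An empty first list corresponds to the state in which, per the observation above, the crash no longer appeared in Dmitry's reproduction. Whether that holds for other setups is not established in this thread.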
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
Michael Paquier
Date:
On Tue, Dec 14, 2021 at 06:36:54PM +0100, Dmitry Dolgov wrote:
> something weird happens after set_extra_field, and after this point ssl
> context memory seems to be corrupted. Right before that an assign_hook
> from set_user is invoked to do something around "deprecated" options,
> that's why it looks suspicious. As soon as no "deprecated" options left
> in the config the issue disappears.

Hmm, okay. Thanks. I have no idea if this extension is doing something it should not, but I'd like to keep in mind that there could be something that could be improved in core depending on what this module is trying to achieve. At least that's a possibility.
--
Michael
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters
From
"James Pang (chaolpan)"
Date:
It's a new project that need security compliance , SSL is a MUST here , and pgaudit,set_user is installed here too to meetingthe compliance request. We test renew SSL certificate, and change the ssl_cert_file and ssl_key_file parameter torenewed ssl certificates. ssl = on ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL' ssl_crl_file = '' #ssl_min_protocol_version = 'TLSv1.2' ssl_ca_file = '/var/lib/pgsql/sslrenew/idtrca.cer' #ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres-109798.crt' #ssl_key_file = '/var/lib/pgsql/sslrenew/postgres-109798.key' ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres014-110388.crt' ssl_key_file = '/var/lib/pgsql/sslrenew/postgres014-11038.key' -- shared_preload_libraries = 'orafce,pgaudit,pg_cron,pg_stat_statements,pg_prewarm,set_user' pgaudit.log_catalog='on' pgaudit.log_level='log' pgaudit.log_parameter=on pgaudit.log_statement_once=off pgaudit.log='all, -misc' pgaudit.log='ddl,role' pgaudit.role='postgres,jamet' #set_user set_user.superuser_whitelist = '+dba' #set_user.superuser_allowlist = '+dba' set_user.block_log_statement=on #set_user.nosuperuser_target_whitelist = '' set_user.nosuperuser_target_allowlist = '' #pre_warm pg_prewarm.autoprewarm = true pg_prewarm.autoprewarm_interval = 600 the Operating system got some security hardening too, too meet compliance requirement. The OpenSSL 1.1.1g with FIPS enabled. $ openssl version OpenSSL 1.1.1g FIPS 21 Apr 2020 Yes, interesting thing is when I remove all extensions and try the test again, then install orafce, pg_background, pgaudit,looks like not reproduced the issue, until install set_user rpm it's ok, but when create extension again, reproducedthe issue. 
=# \dx
                         List of installed extensions
        Name        | Version |   Schema   | Description
--------------------+---------+------------+----------------------------------------------------------------------------------------------
 amcheck            | 1.2     | public     | functions for verifying relation integrity
 orafce             | 3.15    | public     | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
 pageinspect        | 1.8     | public     | inspect the contents of database pages at a low level
 pg_background      | 1.0     | public     | Run SQL queries in the background
 pg_buffercache     | 1.3     | public     | examine the shared buffer cache
 pg_cron            | 1.4     | public     | Job scheduler for PostgreSQL
 pg_freespacemap    | 1.2     | public     | examine the free space map (FSM)
 pg_permissions     | 1.1     | public     | view object permissions and compare them with the desired state
 pg_stat_statements | 1.8     | public     | track planning and execution statistics of all SQL statements executed
 pgaudit            | 1.5     | public     | provides auditing functionality
 pgstattuple        | 1.5     | public     | show tuple-level statistics
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgres_fdw       | 1.0     | public     | foreign-data wrapper for remote PostgreSQL servers
 set_user           | 3.0     | public     | similar to SET ROLE but with added logging
(14 rows)

Thanks,

James

-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com>
Sent: Tuesday, December 14, 2021 11:46 PM
To: Michael Paquier <michael@paquier.xyz>
Cc: James Pang (chaolpan) <chaolpan@cisco.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

> On Mon, Dec 13, 2021 at 08:10:57PM +0900, Michael Paquier wrote:
> On Mon, Dec 13, 2021 at 07:06:16AM +0000, James Pang (chaolpan) wrote:
> > Edit postgresql.conf to change ssl_certificate parameter ,
>
> Do you mean ssl_cert_file here? Also, something that's not completely
> clear to me is if this is a problem with a vanilla PostgreSQL instance
> or if this is related to the pgaudit extension set_user, as it has
> been mentioned as one potential origin of the problem upthread, but
> you are not telling if this is the case here. So what do you have for
> shared_preload_libraries in this crash?
>
> > #9 0x00007ff49a78059c in ssl_cert_clear_certs () from
> > /lib64/libssl.so.1.1
> > #10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
> > #11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
> > #12 0x000000000068b6b8 in be_tls_init ()
> > #13 0x00000000007271e1 in SIGHUP_handler ()
>
> Why is secure_initialize() not showing up in this stack? That would
> be the caller of be_tls_init() in the SIGHUP handler. The version of
> OpenSSL you are linking your binaries to would be useful here. That
> would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?

I think I can actually reproduce the issue. In my case the stack is fine, it contains secure_initialize, and overall it looks like some sort of memory corruption -- at least openssl gets segfault because it can't access some memory address it tries to verify in asn1_primitive_free. Not sure yet why, investigating.
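
For anyone else trying to reproduce this, the sequence from the report (calling pg_reload_conf() repeatedly until the postmaster exits) could be scripted. This is only a rough sketch and not something posted in the thread: it assumes psql is on PATH and can connect with the default connection settings, and the function names psql_reload and reload_until_failure are made up for illustration.

```python
import subprocess
import time
from typing import Callable


def psql_reload() -> bool:
    """Run SELECT pg_reload_conf() once through psql.

    Assumption: psql is on PATH and the default connection settings
    (PGHOST, PGUSER, PGDATABASE, ...) reach the server under test.
    Returns True only if psql exited cleanly and printed 't'.
    """
    result = subprocess.run(
        ["psql", "-X", "-Atc", "select pg_reload_conf()"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() == "t"


def reload_until_failure(
    reload: Callable[[], bool] = psql_reload,
    max_attempts: int = 100,
    pause: float = 1.0,
) -> int:
    """Call reload() up to max_attempts times.

    Returns the 1-based attempt number of the first failed call,
    or 0 if every attempt succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        if not reload():
            return attempt
        # SIGHUP is handled asynchronously, so give the postmaster
        # a moment before the next connection attempt.
        time.sleep(pause)
    return 0
```

Because each attempt opens a new connection, a postmaster that died while handling the previous SIGHUP would show up as a failed attempt here rather than as a dropped session, which is slightly different from the interactive psql session in the original report.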