BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

Started by PG Bug reporting form over 4 years ago, 13 messages, bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 17326
Logged by: James Pang
Email address: chaolpan@cisco.com
PostgreSQL version: 13.4
Operating system: RHEL8.4
Description:

We need SSL enabled for our production environment. When I tested renewing
an SSL certificate and then ran pg_reload_conf(), Postgres crashed. Even with
the same certificate and SSL parameters, running pg_reload_conf() often leads
to a Postgres crash. For example:

=# select name,setting from pg_settings where name like 'ssl_%' order by name;
                  name                  |                setting
----------------------------------------+---------------------------------------
 ssl_ca_file                            | /var/lib/pgsql/sslcerts/awstestca.crt
 ssl_cert_file                          | /var/lib/pgsql/sslcerts/server.crt
 ssl_ciphers                            | HIGH:MEDIUM:+3DES:!aNULL
 ssl_crl_file                           |
 ssl_dh_params_file                     |
 ssl_ecdh_curve                         | prime256v1
 ssl_key_file                           | /var/lib/pgsql/sslcerts/server.key
 ssl_library                            | OpenSSL
 ssl_max_protocol_version               |
 ssl_min_protocol_version               | TLSv1.2
 ssl_passphrase_command                 |
 ssl_passphrase_command_supports_reload | off
 ssl_prefer_server_ciphers              | on
(13 rows)

=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

=# select pg_reload_conf();
FATAL: terminating connection due to unexpected postmaster exit
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

#2 James Pang (chaolpan)
chaolpan@cisco.com
In reply to: PG Bug reporting form (#1)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

The Postgres logs show:
2021-12-08 03:57:55.826 UTC::@:[1291058]:[9-1]:2021-12-08 03:33:21 UTC:LOG: received SIGHUP, reloading configuration files
2021-12-08 03:58:02.832 UTC::@:[1291058]:[10-1]:2021-12-08 03:33:21 UTC:LOG: received SIGHUP, reloading configuration files
2021-12-08 03:58:03.143 UTC:10.240.212.242(58646):jamet@jamet:[1291076]:[9-1]:2021-12-08 03:33:24 UTC:testsubLOG: disconnection: session time: 0:24:38.967 user=jamet database=jamet host=10.240.212.242 port=58646
2021-12-08 03:58:03.147 UTC:[local]:postgres@jamet:[1291397]:[3-1]:2021-12-08 03:57:02 UTC:psqlFATAL: terminating connection due to unexpected postmaster exit
2021-12-08 03:58:03.147 UTC:[local]:postgres@jamet:[1291397]:[4-1]:2021-12-08 03:57:02 UTC:psqlLOG: disconnection: session time: 0:01:00.405 user=postgres database=jamet host=[local]

James

#3 Dmitry Dolgov
9erthalion6@gmail.com
In reply to: James Pang (chaolpan) (#2)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

Hi,

Thanks for reporting the issue. Any chance of getting a stack trace
corresponding to the crash, e.g. as described in [1]?

[1]: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

#4 James Pang (chaolpan)
chaolpan@cisco.com
In reply to: Dmitry Dolgov (#3)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

It looks like this issue is related to the "set_user" extension. When I removed all extensions, pg_reload_conf() worked without issue. When I installed and enabled the "set_user" extension, the issue was reproduced.
shared_preload_libraries = 'orafce,pgaudit,pg_cron,pg_stat_statements,pg_prewarm,set_user'
#set_user
set_user.superuser_whitelist = '+dba'
#set_user.superuser_allowlist = '+dba'
set_user.block_log_statement=on
set_user.nosuperuser_target_whitelist = ''
#set_user.nosuperuser_target_allowlist = ''

I will try to get a stack trace and post it.

James

#5 James Pang (chaolpan)
chaolpan@cisco.com
In reply to: James Pang (chaolpan) (#4)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

I tried to install debug info and get a stack trace.
1. Using the core dump:
$ gdb -q -c /pgdata/core.1317550.sig11.1639122870s /usr/pgsql-13/bin/postgres
Reading symbols from /usr/pgsql-13/bin/postgres...Reading symbols from .gnu_debugdata for /usr/pgsql-13/bin/postgres...(no debugging symbols found)...done.
(no debugging symbols found)...done.

warning: Can't open file (null) during file-backed mapping note processing

warning: Can't open file (null) during file-backed mapping note processing

warning: Can't open file (null) during file-backed mapping note processing
[New LWP 1317550]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/usr/pgsql-13/bin/postgres'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007f72e3290094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1

2. From the gdb log:
Program received signal SIGHUP, Hangup.
0x00007f4fb438e25b in select () from /lib64/libc.so.6
Continuing.

Program received signal SIGHUP, Hangup.
0x00007f4fb438e25b in select () from /lib64/libc.so.6
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007f4fb5eef094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

Should I install debug info for set_user module too?

Thanks,

James

#6 Dmitry Dolgov
9erthalion6@gmail.com
In reply to: James Pang (chaolpan) (#5)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Fri, Dec 10, 2021 at 09:05:19AM +0000, James Pang (chaolpan) wrote:
Should I install debug info for set_user module too?

Eventually yes, but judging from the logs you've posted
("/usr/pgsql-13/bin/postgres...(no debugging symbols found)") the
debugging symbols for postgres itself are not there yet. Do you get a
meaningful stack trace from the coredump with the `bt` command right now?
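
For reference, the `bt` request above can also be run non-interactively against a core file. The following is only a minimal sketch: the helper names `gdb_backtrace_command` and `run_backtrace` are hypothetical, the paths are the ones posted earlier in this thread, and it assumes gdb is installed. Without debugging symbols for postgres the resulting trace will still lack detail.

```python
import shlex
import subprocess


def gdb_backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace from a core file."""
    # -q: quiet startup; -batch: exit once the commands have run;
    # -ex bt: run gdb's `bt` command; -c: the core file to examine.
    return ["gdb", "-q", "-batch", "-ex", "bt", "-c", core, binary]


def run_backtrace(binary, core):
    """Run gdb and return whatever it printed (needs gdb and the core file)."""
    result = subprocess.run(
        gdb_backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout


if __name__ == "__main__":
    # Paths taken from the report in this thread; adjust for other systems.
    cmd = gdb_backtrace_command(
        "/usr/pgsql-13/bin/postgres",
        "/pgdata/core.1317550.sig11.1639122870s",
    )
    print(shlex.join(cmd))
```

The command is built as a list so it can be inspected or logged before anything is executed.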

#7 James Pang (chaolpan)
chaolpan@cisco.com
In reply to: Dmitry Dolgov (#6)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

1. Attach gdb to the postmaster:
# ps -ef | grep postgres
postgres 8790 1 4 06:53 ? 00:00:00 /usr/pgsql-13/bin/postgres
# gdb -p 8790
...
Attaching to process 8790
Reading symbols from /usr/pgsql-13/bin/postgres...Reading symbols from .gnu_debugdata for /usr/pgsql-13/bin/postgres...(no debugging symbols found)...done.

2. start another psql session to run pg_reload_conf()
jamet=# select pg_reload_conf();
pg_reload_conf
----------------
t
(1 row)

Edit postgresql.conf to change ssl_certificate parameter ,

3. (gdb) cont
Continuing.
[Detaching after fork from child process 8828]

Program received signal SIGHUP, Hangup.
0x00007ff49879d25b in select () from /lib64/libc.so.6
(gdb) cont
Continuing.

4. Run pg_reload_conf() again from a psql session:
$ psql
select pg_reload_conf();

5. gdb receives SIGSEGV:
(gdb) cont
Continuing.

Program received signal SIGSEGV, Segmentation fault.
0x00007ff49a2fe094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
(gdb) bt
#0 0x00007ff49a2fe094 in asn1_string_embed_free () from /lib64/libcrypto.so.1.1
#1 0x00007ff49a30824f in asn1_primitive_free.localalias () from /lib64/libcrypto.so.1.1
#2 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#3 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#4 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#5 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#6 0x00007ff49a3086b8 in asn1_template_free () from /lib64/libcrypto.so.1.1
#7 0x00007ff49a308376 in asn1_item_embed_free () from /lib64/libcrypto.so.1.1
#8 0x00007ff49a3085d9 in ASN1_item_free () from /lib64/libcrypto.so.1.1
#9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
#10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
#12 0x000000000068b6b8 in be_tls_init ()
#13 0x00000000007271e1 in SIGHUP_handler ()
#14 <signal handler called>
#15 0x00007ff49879d25b in select () from /lib64/libc.so.6
#16 0x000000000072a20c in ServerLoop ()
#17 0x000000000072bd10 in PostmasterMain ()
#18 0x00000000004869a0 in main ()
(gdb) cont
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.

Thanks,

James

#8 Michael Paquier
michael@paquier.xyz
In reply to: James Pang (chaolpan) (#7)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Mon, Dec 13, 2021 at 07:06:16AM +0000, James Pang (chaolpan) wrote:

Edit postgresql.conf to change ssl_certificate parameter ,

Do you mean ssl_cert_file here? Also, something that's not completely
clear to me is if this is a problem with a vanilla PostgreSQL
instance or if this is related to the pgaudit extension set_user, as
it has been mentioned as one potential origin of the problem upthread,
but you are not telling if this is the case here. So what do you have
for shared_preload_libraries in this crash?

#9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
#10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
#12 0x000000000068b6b8 in be_tls_init ()
#13 0x00000000007271e1 in SIGHUP_handler ()

Why is secure_initialize() not showing up in this stack? That would
be the caller of be_tls_init() in the SIGHUP handler. The version of
OpenSSL you are linking your binaries to would be useful here. That
would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?
--
Michael

#9 Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Michael Paquier (#8)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Mon, Dec 13, 2021 at 08:10:57PM +0900, Michael Paquier wrote:
Why is secure_initialize() not showing up in this stack? That would
be the caller of be_tls_init() in the SIGHUP handler. The version of
OpenSSL you are linking your binaries to would be useful here. That
would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?

I think I can actually reproduce the issue. In my case the stack is
fine, it contains secure_initialize, and overall it looks like some sort
of memory corruption -- at least openssl gets segfault because it can't
access some memory address it tries to verify in asn1_primitive_free.
Not sure yet why, investigating.

#10 Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Dmitry Dolgov (#9)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

After a short investigation, it looks like a set_user problem. The
extension has a duplicated set of parameters, where one is the actual set
and the other is the "deprecated options". If I have both sets set
simultaneously in the configuration (e.g. set_user.superuser_whitelist and
set_user.superuser_allowlist), then on SIGHUP, in

set_config_option / PGC_STRING branch / makeDefault condition

something weird happens after set_extra_field, and after this point the SSL
context memory seems to be corrupted. Right before that, an assign_hook
from set_user is invoked to do something around the "deprecated" options,
which is why it looks suspicious. As soon as no "deprecated" options are left
in the config, the issue disappears.
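
To check a configuration for the condition described above, something along these lines could be used. This is only a sketch under stated assumptions: the function names are hypothetical, it reads postgresql.conf as plain text and ignores include files, and it assumes the `*_whitelist` names are the deprecated set and the `*_allowlist` names their replacements.

```python
import re

# An active (uncommented) set_user.<name> setting at the start of a line.
_SETTING = re.compile(r"^\s*(set_user\.\w+)", re.IGNORECASE)


def active_set_user_options(conf_text):
    """Return the names of set_user.* options that are not commented out."""
    names = set()
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            names.add(match.group(1).lower())
    return names


def deprecated_set_user_options(conf_text):
    """Return (deprecated, conflicting) option names as sorted lists.

    deprecated:  active *_whitelist options (assumed to be the old names).
    conflicting: those whose *_allowlist counterpart is active as well.
    """
    active = active_set_user_options(conf_text)
    deprecated = {name for name in active if name.endswith("_whitelist")}
    conflicting = {
        name for name in deprecated
        if name[: -len("_whitelist")] + "_allowlist" in active
    }
    return sorted(deprecated), sorted(conflicting)


if __name__ == "__main__":
    # Settings as posted in this thread.
    sample = """
set_user.superuser_whitelist = '+dba'
#set_user.superuser_allowlist = '+dba'
set_user.block_log_statement=on
set_user.nosuperuser_target_whitelist = ''
#set_user.nosuperuser_target_allowlist = ''
"""
    print(deprecated_set_user_options(sample))
```

Both lists are returned because the thread reports that the issue disappears only once no "deprecated" options are left in the config, so any entry in the first list is worth reviewing.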

#11 Michael Paquier
michael@paquier.xyz
In reply to: Dmitry Dolgov (#10)
Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Tue, Dec 14, 2021 at 06:36:54PM +0100, Dmitry Dolgov wrote:

something weird happens after set_extra_field, and after this point the SSL
context memory seems to be corrupted. Right before that, an assign_hook
from set_user is invoked to do something around the "deprecated" options,
which is why it looks suspicious. As soon as no "deprecated" options are left
in the config, the issue disappears.

Hmm, okay. Thanks. I have no idea if this extension is doing
something it should not, but I'd like to keep in mind that there could
be something that could be improved in core depending on what this
module is trying to achieve. At least that's a possibility.
--
Michael

#12 James Pang (chaolpan)
chaolpan@cisco.com
In reply to: Dmitry Dolgov (#9)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

It's a new project that needs security compliance. SSL is a must here, and pgaudit and set_user are installed too, to meet the compliance requirements. We tested renewing the SSL certificate and changed the ssl_cert_file and ssl_key_file parameters to point to the renewed certificates.
ssl = on
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'

ssl_crl_file = ''
#ssl_min_protocol_version = 'TLSv1.2'
ssl_ca_file = '/var/lib/pgsql/sslrenew/idtrca.cer'
#ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres-109798.crt'
#ssl_key_file = '/var/lib/pgsql/sslrenew/postgres-109798.key'

ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres014-110388.crt'
ssl_key_file = '/var/lib/pgsql/sslrenew/postgres014-11038.key'

--
shared_preload_libraries = 'orafce,pgaudit,pg_cron,pg_stat_statements,pg_prewarm,set_user'
pgaudit.log_catalog='on'
pgaudit.log_level='log'
pgaudit.log_parameter=on
pgaudit.log_statement_once=off
pgaudit.log='all, -misc'
pgaudit.log='ddl,role'
pgaudit.role='postgres,jamet'

#set_user
set_user.superuser_whitelist = '+dba'
#set_user.superuser_allowlist = '+dba'
set_user.block_log_statement=on
#set_user.nosuperuser_target_whitelist = ''
set_user.nosuperuser_target_allowlist = ''

#pre_warm
pg_prewarm.autoprewarm = true
pg_prewarm.autoprewarm_interval = 600

The operating system has had some security hardening too, to meet compliance requirements. OpenSSL is 1.1.1g with FIPS enabled.
$ openssl version
OpenSSL 1.1.1g FIPS 21 Apr 2020

Yes. Interestingly, when I removed all extensions and ran the test again, then installed orafce, pg_background and pgaudit, the issue did not reproduce. It was still fine after installing the set_user RPM, but once I created the extension again, the issue reproduced.

=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+-----------------------------------------------------------------------------------------------
amcheck | 1.2 | public | functions for verifying relation integrity
orafce | 3.15 | public | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
pageinspect | 1.8 | public | inspect the contents of database pages at a low level
pg_background | 1.0 | public | Run SQL queries in the background
pg_buffercache | 1.3 | public | examine the shared buffer cache
pg_cron | 1.4 | public | Job scheduler for PostgreSQL
pg_freespacemap | 1.2 | public | examine the free space map (FSM)
pg_permissions | 1.1 | public | view object permissions and compare them with the desired state
pg_stat_statements | 1.8 | public | track planning and execution statistics of all SQL statements executed
pgaudit | 1.5 | public | provides auditing functionality
pgstattuple | 1.5 | public | show tuple-level statistics
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
postgres_fdw | 1.0 | public | foreign-data wrapper for remote PostgreSQL servers
set_user | 3.0 | public | similar to SET ROLE but with added logging
(14 rows)
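
A related way to narrow this down is to re-enable the preloaded libraries one at a time and repeat the reload test after each step. The following is only a minimal sketch of that idea, not something from the thread: the helper name incremental_preload_settings is hypothetical, and the library order is simply the one from the shared_preload_libraries line quoted above. Note that a change to shared_preload_libraries only takes effect after a server restart.

```python
def incremental_preload_settings(libraries):
    """Return one shared_preload_libraries line per step, adding one library each time."""
    steps = []
    for count in range(1, len(libraries) + 1):
        value = ",".join(libraries[:count])
        steps.append(f"shared_preload_libraries = '{value}'")
    return steps


if __name__ == "__main__":
    # Library order taken from the configuration quoted in this message.
    libs = ["orafce", "pgaudit", "pg_cron", "pg_stat_statements",
            "pg_prewarm", "set_user"]
    for line in incremental_preload_settings(libs):
        print(line)
```

Each printed line would be tried in postgresql.conf in turn (restart, then run pg_reload_conf() several times) until the crash shows up.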

Thanks,

James

-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com>
Sent: Tuesday, December 14, 2021 11:46 PM
To: Michael Paquier <michael@paquier.xyz>
Cc: James Pang (chaolpan) <chaolpan@cisco.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Mon, Dec 13, 2021 at 08:10:57PM +0900, Michael Paquier wrote:
On Mon, Dec 13, 2021 at 07:06:16AM +0000, James Pang (chaolpan) wrote:

Edit postgresql.conf to change ssl_certificate parameter ,

Do you mean ssl_cert_file here? Also, something that's not completely
clear to me is if this is a problem with a vanilla PostgreSQL instance
or if this is related to the pgaudit extension set_user, as it has
been mentioned as one potential origin of the problem upthread, but
you are not telling if this is the case here. So what do you have for
shared_preload_libraries in this crash?

#9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
#10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
#12 0x000000000068b6b8 in be_tls_init ()
#13 0x00000000007271e1 in SIGHUP_handler ()

Why is secure_initialize() not showing up in this stack? That would
be the caller of be_tls_init() in the SIGHUP handler. The version of
OpenSSL you are linking your binaries to would be useful here. That
would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?

I think I can actually reproduce the issue. In my case the stack is fine, it contains secure_initialize, and overall it looks like some sort of memory corruption -- at least openssl gets a segfault because it can't access a memory address it tries to verify in asn1_primitive_free.
Not sure yet why, investigating.

#13James Pang (chaolpan)
chaolpan@cisco.com
In reply to: James Pang (chaolpan) (#12)
RE: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

It's a new project that needs security compliance. SSL is a MUST here, and pgaudit and set_user are installed too, to meet the compliance requirements. We tested renewing the SSL certificate and changed the ssl_cert_file and ssl_key_file parameters to point to the renewed certificate and key.
ssl = on
ssl_ciphers = 'HIGH:MEDIUM:+3DES:!aNULL'

ssl_crl_file = ''
#ssl_min_protocol_version = 'TLSv1.2'
ssl_ca_file = '/var/lib/pgsql/sslrenew/idtrca.cer'
#ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres-109798.crt'
#ssl_key_file = '/var/lib/pgsql/sslrenew/postgres-109798.key'

ssl_cert_file = '/var/lib/pgsql/sslrenew/postgres014-110388.crt'
ssl_key_file = '/var/lib/pgsql/sslrenew/postgres014-11038.key'

--
shared_preload_libraries = 'orafce,pgaudit,pg_cron,pg_stat_statements,pg_prewarm,set_user'
pgaudit.log_catalog='on'
pgaudit.log_level='log'
pgaudit.log_parameter=on
pgaudit.log_statement_once=off
pgaudit.log='all, -misc'
pgaudit.log='ddl,role'
pgaudit.role='postgres,jamet'

#set_user
set_user.superuser_whitelist = '+dba'
#set_user.superuser_allowlist = '+dba'
set_user.block_log_statement=on
#set_user.nosuperuser_target_whitelist = ''
set_user.nosuperuser_target_allowlist = ''

#pre_warm
pg_prewarm.autoprewarm = true
pg_prewarm.autoprewarm_interval = 600

The operating system got some security hardening too, to meet compliance requirements. OpenSSL is 1.1.1g with FIPS enabled.
$ openssl version
OpenSSL 1.1.1g FIPS 21 Apr 2020

Yes, the interesting thing is that when I remove all extensions and run the test again, then install orafce, pg_background and pgaudit, the issue does not seem to reproduce. Up to installing the set_user rpm it is still OK, but when I create the extension again, the issue reproduces.

=# \dx
List of installed extensions
Name | Version | Schema | Description
--------------------+---------+------------+-----------------------------------------------------------------------------------------------
amcheck | 1.2 | public | functions for verifying relation integrity
orafce | 3.15 | public | Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
pageinspect | 1.8 | public | inspect the contents of database pages at a low level
pg_background | 1.0 | public | Run SQL queries in the background
pg_buffercache | 1.3 | public | examine the shared buffer cache
pg_cron | 1.4 | public | Job scheduler for PostgreSQL
pg_freespacemap | 1.2 | public | examine the free space map (FSM)
pg_permissions | 1.1 | public | view object permissions and compare them with the desired state
pg_stat_statements | 1.8 | public | track planning and execution statistics of all SQL statements executed
pgaudit | 1.5 | public | provides auditing functionality
pgstattuple | 1.5 | public | show tuple-level statistics
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
postgres_fdw | 1.0 | public | foreign-data wrapper for remote PostgreSQL servers
set_user | 3.0 | public | similar to SET ROLE but with added logging
(14 rows)

Thanks,

James

-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com>
Sent: Tuesday, December 14, 2021 11:46 PM
To: Michael Paquier <michael@paquier.xyz>
Cc: James Pang (chaolpan) <chaolpan@cisco.com>; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17326: Postgres crashed when pg_reload_conf() with ssl certificate parameters

On Mon, Dec 13, 2021 at 08:10:57PM +0900, Michael Paquier wrote:
On Mon, Dec 13, 2021 at 07:06:16AM +0000, James Pang (chaolpan) wrote:

Edit postgresql.conf to change ssl_certificate parameter ,

Do you mean ssl_cert_file here? Also, something that's not completely
clear to me is if this is a problem with a vanilla PostgreSQL instance
or if this is related to the pgaudit extension set_user, as it has
been mentioned as one potential origin of the problem upthread, but
you are not telling if this is the case here. So what do you have for
shared_preload_libraries in this crash?

#9 0x00007ff49a78059c in ssl_cert_clear_certs () from /lib64/libssl.so.1.1
#10 0x00007ff49a780645 in ssl_cert_free () from /lib64/libssl.so.1.1
#11 0x00007ff49a78a25c in SSL_CTX_free () from /lib64/libssl.so.1.1
#12 0x000000000068b6b8 in be_tls_init ()
#13 0x00000000007271e1 in SIGHUP_handler ()

Why is secure_initialize() not showing up in this stack? That would
be the caller of be_tls_init() in the SIGHUP handler. The version of
OpenSSL you are linking your binaries to would be useful here. That
would be a 1.1.0 or a 1.1.1, no? Any specific minor version letter?

I think I can actually reproduce the issue. In my case the stack is fine, it contains secure_initialize, and overall it looks like some sort of memory corruption -- at least openssl gets a segfault because it can't access a memory address it tries to verify in asn1_primitive_free.
Not sure yet why, investigating.
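
For anyone else trying to reproduce this, the repeated-reload test from the original report can be driven by a small loop. This is only a sketch under assumptions, not the method used by anyone in the thread: the helper name reloads_until_failure and the max_attempts default are made up for illustration, and reload_once stands for whatever callable issues SELECT pg_reload_conf() against the server (for example through a psycopg2 cursor, which is an assumption about the client in use).

```python
def reloads_until_failure(reload_once, max_attempts=100):
    """Call reload_once() repeatedly.

    Returns the 1-based attempt number on which reload_once() raised
    (e.g. because the connection was lost), or None if every attempt
    succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            reload_once()
        except Exception:
            return attempt
    return None


# Hypothetical usage with an already-open database cursor `cur`:
#   failed_at = reloads_until_failure(
#       lambda: cur.execute("SELECT pg_reload_conf()"))
```

In the session shown in the original report, the connection was lost on the third pg_reload_conf() call, so a run like that would return 3.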