LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Started by Mike Yeap about 7 years ago, 27 messages, general
#1Mike Yeap
wkk1020@gmail.com

Hi all, I have encountered a problem related to LDAP authenticated session
with Postgres foreign data wrapper (postgres_fdw).

The server crashed with the following errors, and other active server
processes were terminated as well:

2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: server process (PID 26306) was terminated by signal 11: Segmentation fault

2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: terminating any other active server processes

I can reproduce it in a test server with many other sessions connected:

1. login using non-LDAP-authenticated user, query local & foreign tables -
OK
2. login using LDAP-authenticated user, query local table - OK
3. login using LDAP-authenticated user, query foreign table - ERROR, server
crashes with signal 11: Segmentation fault error when I quit the psql
session

It seems the problem occurs only when the LDAP-authenticated session (which
queried the foreign table) is terminated. In the dmesg log, I can see the
following:

[16385512.182231] traps: postmaster[26306] general protection ip:7f1e758b638c sp:7ffef7ed8858 error:0 in libc-2.17.so[7f1e75836000+1b6000]
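
A minimal sketch of how the dmesg line above can be read, assuming the usual
kernel "traps:" layout (process name, PID, instruction pointer, then the
mapped library with its base address and size). The helper name `parse_trap`
is made up for illustration. It shows that the PID matches the crashed backend
from the server log, and that the faulting address is an offset inside libc.

```python
import re

# Pattern for a kernel "traps:" line in the layout shown in the dmesg output.
TRAP_RE = re.compile(
    r"traps: (?P<comm>\S+)\[(?P<pid>\d+)\].*?ip:(?P<ip>[0-9a-f]+)"
    r".*? in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def parse_trap(line):
    """Return process name, PID, library and fault offset, or None if no match."""
    m = TRAP_RE.search(line)
    if m is None:
        return None
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    return {
        "comm": m.group("comm"),
        "pid": int(m.group("pid")),
        "lib": m.group("lib"),
        # Offset of the faulting instruction relative to the library's load address.
        "offset": ip - base,
    }

line = (
    "[16385512.182231] traps: postmaster[26306] general protection "
    "ip:7f1e758b638c sp:7ffef7ed8858 error:0 in libc-2.17.so[7f1e75836000+1b6000]"
)
info = parse_trap(line)
print(info["pid"], info["lib"], hex(info["offset"]))  # 26306 libc-2.17.so 0x8038c
```

The offset only says where inside libc the fault happened; it does not by
itself identify which library passed libc a bad pointer.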

Has anyone encountered a similar issue?

######################
PostgreSQL version: 10.6
Platform: CentOS Linux
######################

Thank you.

Regards,
Mike Yeap

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Mike Yeap (#1)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap wrote:

I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).

The server crashed with following errors and other active server processes are terminated as well:
2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: server process (PID 26306) was terminated by signal 11: Segmentation fault

2019-02-20 14:53:30.496 SGT [PID=1353 application="" user_name= database= host(port)=] LOG: terminating any other active server processes

I can reproduce it in a test server with many other sessions connected:

1. login using non-LDAP-authenticated user, query local & foreign tables - OK
2. login using LDAP-authenticated user, query local table - OK
3. login using LDAP-authenticated user, query foreign table - ERROR, server crashes with signal 11: Segmentation fault error when I quit the psql session

Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

Any other extensions installed?

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#2)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Mike Yeap wrote:

I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).

Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

And which version is that? (And which version of Postgres?)

Digging around in our git history, I came across this:

Author: Noah Misch <noah@leadboat.com>
Branch: master Release: REL9_5_BR [d7cdf6ee3] 2014-07-22 11:01:03 -0400

Diagnose incompatible OpenLDAP versions during build and test.

With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).

which sounds a fair bit like what you are describing.
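
A minimal sketch of the range check described in that commit message, assuming
only the version bounds quoted above (2.4.24 through 2.4.31, inclusive). The
function names are hypothetical, and this is not the actual "configure" test.
It accepts package-style strings with an optional epoch and release suffix.

```python
import re

# Bounds quoted in the commit message: 2.4.24 through 2.4.31, inclusive.
FLAGGED_MIN = (2, 4, 24)
FLAGGED_MAX = (2, 4, 31)

def parse_version(text):
    """Extract (major, minor, patch) from strings like '2.4.44-21.el7_6' or '1:2.3.43-5.el7'."""
    m = re.match(r"(?:\d+:)?(\d+)\.(\d+)\.(\d+)", text)
    if m is None:
        raise ValueError("unrecognised version string: %r" % text)
    return tuple(int(part) for part in m.groups())

def in_flagged_range(text):
    """True if the OpenLDAP version falls inside the range the commit warns about."""
    return FLAGGED_MIN <= parse_version(text) <= FLAGGED_MAX

for candidate in ("2.4.24", "2.4.31", "2.4.44"):
    print(candidate, in_flagged_range(candidate))
```

A version outside the range only means the build-time warning would not fire;
it says nothing about whether a crash can still occur.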

regards, tom lane

#4Mike Yeap
wkk1020@gmail.com
In reply to: Tom Lane (#3)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

How should I check whether they are linked?

My Postgres version is 10.6 and I have this output for "yum list | grep ldap | sort":

$ yum list | grep ldap | sort

apr-util-ldap.x86_64          1.5.2-6.el7          base
bind-dyndb-ldap.x86_64        11.1-4.el7           base
compat-openldap.i686          1:2.3.43-5.el7       base
compat-openldap.x86_64        1:2.3.43-5.el7       base
cyrus-sasl-ldap.i686          2.1.26-23.el7        base
cyrus-sasl-ldap.x86_64        2.1.26-23.el7        base
freeradius-ldap.x86_64        3.0.13-9.el7_5       base
ipsilon-authldap.noarch       1.0.0-13.el7_3       base
krb5-server-ldap.x86_64       1.15.1-37.el7_6      updates
ldapjdk-javadoc.noarch        4.19-5.el7           base
ldapjdk.noarch                4.19-5.el7           base
mod_ldap.x86_64               2.4.6-88.el7.centos  base
nss-pam-ldapd.i686            0.8.13-16.el7        base
nss-pam-ldapd.x86_64          0.8.13-16.el7        base
openldap-clients.x86_64       2.4.44-21.el7_6      @updates
openldap-devel.i686           2.4.44-21.el7_6      updates
openldap-devel.x86_64         2.4.44-21.el7_6      updates
openldap.i686                 2.4.44-21.el7_6      updates
openldap-servers-sql.x86_64   2.4.44-21.el7_6      updates
openldap-servers.x86_64       2.4.44-21.el7_6      updates
openldap.x86_64               2.4.44-21.el7_6      @updates
openssh-ldap.x86_64           7.4p1-16.el7         base
php-ldap.x86_64               5.4.16-46.el7        base
python-ldap2pg-doc.x86_64     4.11-1.rhel7         pgdg10
python-ldap2pg.x86_64         4.11-1.rhel7         pgdg10
python-ldap.x86_64            2.4.15-2.el7         base
sssd-ldap.x86_64              1.16.2-13.el7_6.5    updates

And in the database where I encountered this issue I have these extensions
installed:

repdb=# \dx
                                List of installed extensions
        Name        | Version |   Schema   |                        Description
--------------------+---------+------------+------------------------------------------------------------
 hstore             | 1.4     | public     | data type for storing sets of (key, value) pairs
 pg_stat_statements | 1.6     | repdb      | track execution statistics of all SQL statements executed
 plpgsql            | 1.0     | pg_catalog | PL/pgSQL procedural language
 postgres_fdw       | 1.0     | repdb      | foreign-data wrapper for remote PostgreSQL servers
 tablefunc          | 1.0     | repdb      | functions that manipulate whole tables, including crosstab
(5 rows)

Thank you.

Regards,
Mike Yeap

On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Laurenz Albe <laurenz.albe@cybertec.at> writes:

Mike Yeap wrote:

I have encountered a problem related to LDAP authenticated session with Postgres foreign data wrapper (postgres_fdw).

Are the "postgres" executable and libpq linked with the same version of OpenLDAP?

And which version is that? (And which version of Postgres?)

Digging around in our git history, I came across this:

Author: Noah Misch <noah@leadboat.com>
Branch: master Release: REL9_5_BR [d7cdf6ee3] 2014-07-22 11:01:03 -0400

Diagnose incompatible OpenLDAP versions during build and test.

With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).

which sounds a fair bit like what you are describing.

regards, tom lane

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mike Yeap (#4)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Mike Yeap <wkk1020@gmail.com> writes:

Are the "postgres" executable and libpq linked with the same version of
OpenLDAP?

How should I check whether they are linked?

"ldd" should show the dependencies of whatever executable or library
you point it at.
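
A minimal sketch of that check, assuming the usual "name => path (address)"
layout of ldd output. The function names are hypothetical. To compare the
server and libpq, the same filter would be run on the server binary and on the
libpq shared library; the libpq path shown in the comment is an assumption for
a PGDG install and may differ on a given system.

```python
import subprocess

def ldap_dependencies(ldd_output):
    """Return {soname: resolved_path} for libldap/liblber entries in ldd output."""
    found = {}
    for line in ldd_output.splitlines():
        parts = line.split()
        # Typical line: "libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x...)"
        if (len(parts) >= 3 and parts[1] == "=>"
                and parts[0].startswith(("libldap", "liblber"))):
            found[parts[0]] = parts[2]
    return found

def run_ldd(path):
    """Run ldd on a binary or shared library and return its text output."""
    return subprocess.run(["ldd", path], capture_output=True, text=True).stdout

# Sample lines in the usual ldd format; real use would be, for example:
#   ldap_dependencies(run_ldd("/usr/pgsql-10/bin/postmaster"))
#   ldap_dependencies(run_ldd("/usr/pgsql-10/lib/libpq.so.5"))  # assumed path
sample = (
    "\tlibm.so.6 => /lib64/libm.so.6 (0x00007eff8a029000)\n"
    "\tlibldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)\n"
    "\tliblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007eff87cb0000)\n"
)
print(ldap_dependencies(sample))
```

If one result names libldap and the other names libldap_r, the two are linked
against different variants of the library.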

regards, tom lane

#6Mike Yeap
wkk1020@gmail.com
In reply to: Tom Lane (#5)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Hi Tom, when I run "ldd /usr/pgsql-10/bin/postmaster" I got this output:

# ldd /usr/pgsql-10/bin/postmaster
linux-vdso.so.1 => (0x00007ffd4ec65000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x00007eff8b5d3000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00007eff8b268000)
libpam.so.0 => /lib64/libpam.so.0 (0x00007eff8b059000)
libssl.so.10 => /lib64/libssl.so.10 (0x00007eff8ade7000)
libcrypto.so.10 => /lib64/libcrypto.so.10 (0x00007eff8a985000)
libgssapi_krb5.so.2 => /lib64/libgssapi_krb5.so.2 (0x00007eff8a738000)
librt.so.1 => /lib64/librt.so.1 (0x00007eff8a530000)
libdl.so.2 => /lib64/libdl.so.2 (0x00007eff8a32b000)
libm.so.6 => /lib64/libm.so.6 (0x00007eff8a029000)
libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)
libicui18n.so.50 => /lib64/libicui18n.so.50 (0x00007eff899d4000)
libicuuc.so.50 => /lib64/libicuuc.so.50 (0x00007eff8965b000)
libsystemd.so.0 => /lib64/libsystemd.so.0 (0x00007eff89633000)
libc.so.6 => /lib64/libc.so.6 (0x00007eff89271000)
/lib64/ld-linux-x86-64.so.2 (0x00007eff8b7f9000)
libz.so.1 => /lib64/libz.so.1 (0x00007eff8905b000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007eff88e35000)
libaudit.so.1 => /lib64/libaudit.so.1 (0x00007eff88c0c000)
libkrb5.so.3 => /lib64/libkrb5.so.3 (0x00007eff88924000)
libcom_err.so.2 => /lib64/libcom_err.so.2 (0x00007eff88720000)
libk5crypto.so.3 => /lib64/libk5crypto.so.3 (0x00007eff884ec000)
libkrb5support.so.0 => /lib64/libkrb5support.so.0 (0x00007eff882de000)
libkeyutils.so.1 => /lib64/libkeyutils.so.1 (0x00007eff880da000)
libresolv.so.2 => /lib64/libresolv.so.2 (0x00007eff87ebf000)
liblber-2.4.so.2 => /lib64/liblber-2.4.so.2 (0x00007eff87cb0000)
libsasl2.so.3 => /lib64/libsasl2.so.3 (0x00007eff87a93000)
libssl3.so => /lib64/libssl3.so (0x00007eff8784f000)
libsmime3.so => /lib64/libsmime3.so (0x00007eff87628000)
libnss3.so => /lib64/libnss3.so (0x00007eff87302000)
libnssutil3.so => /lib64/libnssutil3.so (0x00007eff870d5000)
libplds4.so => /lib64/libplds4.so (0x00007eff86ed1000)
libplc4.so => /lib64/libplc4.so (0x00007eff86ccc000)
libnspr4.so => /lib64/libnspr4.so (0x00007eff86a8d000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007eff86785000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007eff8656f000)
libicudata.so.50 => /lib64/libicudata.so.50 (0x00007eff84f9a000)
libcap.so.2 => /lib64/libcap.so.2 (0x00007eff84d95000)
libselinux.so.1 => /lib64/libselinux.so.1 (0x00007eff84b6e000)
libgcrypt.so.11 => /lib64/libgcrypt.so.11 (0x00007eff848ec000)
libgpg-error.so.0 => /lib64/libgpg-error.so.0 (0x00007eff846e7000)
libdw.so.1 => /lib64/libdw.so.1 (0x00007eff844a0000)
libcap-ng.so.0 => /lib64/libcap-ng.so.0 (0x00007eff84299000)
libcrypt.so.1 => /lib64/libcrypt.so.1 (0x00007eff84062000)
libattr.so.1 => /lib64/libattr.so.1 (0x00007eff83e5c000)
libpcre.so.1 => /lib64/libpcre.so.1 (0x00007eff83bfa000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007eff839e2000)
libbz2.so.1 => /lib64/libbz2.so.1 (0x00007eff837d1000)
libfreebl3.so => /lib64/libfreebl3.so (0x00007eff835ce000)

On the line that has ldap in it:

libldap-2.4.so.2 => /lib64/libldap-2.4.so.2 (0x00007eff89dd4000)

Sorry but in this case what is my libpq?

Regards,
Mike Yeap

On Thu, Feb 21, 2019 at 10:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Mike Yeap <wkk1020@gmail.com> writes:

Are the "postgres" executable and libpq linked with the same version of
OpenLDAP?

How should I check whether they are linked?

"ldd" should show the dependencies of whatever executable or library
you point it at.

regards, tom lane

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Mike Yeap (#4)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Thu, Feb 21, 2019 at 2:42 PM Mike Yeap <wkk1020@gmail.com> wrote:

openldap-clients.x86_64 2.4.44-21.el7_6 @updates
openldap-devel.i686 2.4.44-21.el7_6 updates
openldap-devel.x86_64 2.4.44-21.el7_6 updates
openldap.i686 2.4.44-21.el7_6 updates
openldap-servers-sql.x86_64 2.4.44-21.el7_6 updates
openldap-servers.x86_64 2.4.44-21.el7_6 updates
openldap.x86_64 2.4.44-21.el7_6 @updates

On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).

Clearly 2.4.44 is not in the range 2.4.24 through 2.4.31. Perhaps the
dangerous range is out of date? Hmm, so Noah's analysis[1] says this
is a clash between libldap_r.so (used by libpq) and libldap.so (used
by the server), specifically in destructor/exit code. Curiously, in a
thread about Curl's struggles with this problem, I found a claim[2]
that Debian decided to abandon the non-"_r" variant and just use _r
always. Sure enough, on my Debian buster VM I see a symlink
libldap-2.4.so.2 -> libldap_r-2.4.so.2. So essentially Debian and
friends have already forced Noah's first option on users:

1. Link the backend with libldap_r, so we never face the mismatch. On some
platforms, this means also linking in threading libraries.

FreeBSD and CentOS systems near me have separate libraries still.
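
A minimal sketch of how to tell whether a system ships the two variants as
separate files or as one file behind a symlink, as described for Debian above.
The helper name `same_library` is hypothetical, and the demo builds a throwaway
directory rather than touching real system paths.

```python
import os
import tempfile

def same_library(path_a, path_b):
    """True if both paths resolve to the same file once symlinks are followed."""
    return os.path.realpath(path_a) == os.path.realpath(path_b)

# Demo: recreate the Debian-style layout (libldap -> libldap_r) in a temp dir.
with tempfile.TemporaryDirectory() as d:
    real = os.path.join(d, "libldap_r-2.4.so.2")
    link = os.path.join(d, "libldap-2.4.so.2")
    open(real, "w").close()
    os.symlink("libldap_r-2.4.so.2", link)
    demo_result = same_library(real, link)

print(demo_result)  # True
```

On a real system the two arguments would be the installed libldap and
libldap_r paths, which vary by distribution.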

[1]: /messages/by-id/20140612210219.GA705509@tornado.leadboat.com
[2]: https://www.openldap.org/lists/openldap-technical/201608/msg00094.html

--
Thomas Munro
https://enterprisedb.com

#8Mike Yeap
wkk1020@gmail.com
In reply to: Thomas Munro (#7)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Hi Thomas, does that mean the bug is still there?

Regards,
Mike Yeap

On Mon, Feb 25, 2019 at 4:06 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Thu, Feb 21, 2019 at 2:42 PM Mike Yeap <wkk1020@gmail.com> wrote:

openldap-clients.x86_64 2.4.44-21.el7_6 @updates
openldap-devel.i686 2.4.44-21.el7_6 updates
openldap-devel.x86_64 2.4.44-21.el7_6 updates
openldap.i686 2.4.44-21.el7_6 updates
openldap-servers-sql.x86_64 2.4.44-21.el7_6 updates
openldap-servers.x86_64 2.4.44-21.el7_6 updates
openldap.x86_64 2.4.44-21.el7_6 @updates

On Wed, Feb 20, 2019 at 10:17 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, PostgreSQL
backends can crash at exit. Raise a warning during "configure" based on
the compile-time OpenLDAP version number, and test the crash scenario in
the dblink test suite. Back-patch to 9.0 (all supported versions).

Clearly 2.4.44 is not in the range 2.4.24 through 2.4.31. Perhaps the
dangerous range is out of date? Hmm, so Noah's analysis[1] says this
is a clash between libldap_r.so (used by libpq) and libldap.so (used
by the server), specifically in destructor/exit code. Curiously, in a
thread about Curl's struggles with this problem, I found a claim[2]
that Debian decided to abandon the non-"_r" variant and just use _r
always. Sure enough, on my Debian buster VM I see a symlink
libldap-2.4.so.2 -> libldap_r-2.4.so.2. So essentially Debian and
friends have already forced Noah's first option on users:

1. Link the backend with libldap_r, so we never face the mismatch. On some
platforms, this means also linking in threading libraries.

FreeBSD and CentOS systems near me have separate libraries still.

[1] /messages/by-id/20140612210219.GA705509@tornado.leadboat.com
[2] https://www.openldap.org/lists/openldap-technical/201608/msg00094.html

--
Thomas Munro
https://enterprisedb.com

#9Thomas Munro
thomas.munro@gmail.com
In reply to: Mike Yeap (#8)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:

Hi Thomas, does that mean the bug is still there?

Hi Mike,

I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for. We need to
figure out a proper solution here, though I'm not sure what. Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

--
Thomas Munro
https://enterprisedb.com

#10Mike Yeap
wkk1020@gmail.com
In reply to: Thomas Munro (#9)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Hi Thomas, I see..... guess I can't use LDAP authentication for now, :-(

Hopefully this problem is solved in future version, thank you!

Regards,
Mike Yeap

On Tue, Feb 26, 2019 at 4:12 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:

Hi Thomas, does that mean the bug is still there?

Hi Mike,

I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for. We need to
figure out a proper solution here, though I'm not sure what. Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

--
Thomas Munro
https://enterprisedb.com

#11Thomas Munro
thomas.munro@gmail.com
In reply to: Thomas Munro (#9)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Tue, Feb 26, 2019 at 9:11 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Feb 26, 2019 at 8:17 PM Mike Yeap <wkk1020@gmail.com> wrote:

Hi Thomas, does that mean the bug is still there?

I haven't tried to repro this myself, but it certainly sounds like it.
It also sounds like it would probably go away if you switched to a
Debian-derived distro, instead of a Red Hat-derived distro, but I
doubt that's the kind of advice you were looking for. We need to
figure out a proper solution here, though I'm not sure what. Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

Concretely, why don't we just kill the LDAP_LIBS_FE/LDAP_LIBS_BE
distinction and use a single LDAP_LIBS? Then it'll always match. It
can still be the non-MT variant if you build with
--disable-thread-safety (who does that?), but then it'll be the same
in the server too so that postgres_fdw + ldap works that way too.
Sketch patch attached.

--
Thomas Munro
https://enterprisedb.com

Attachments:

same-lib-ldap-everywhere.patch (application/x-patch, +14 -22)

#12Stephen Frost
sfrost@snowman.net
In reply to: Mike Yeap (#10)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Greetings Mike,

* Mike Yeap (wkk1020@gmail.com) wrote:

Hi Thomas, I see..... guess I can't use LDAP authentication for now, :-(

If you're in an active directory environment, you should really be using
Kerberos for authentication and NOT LDAP anyway. LDAP-based
authentication involves sending the user's password (cleartext) to the
PG server, which is really bad security. Hopefully you're at least
connecting to PG with SSL, and from PG to LDAP with SSL, but you still
run the issue that a compromised server would expose the password of
everyone connecting to that server, and when you're using a centralized
authentication system like LDAP, that one password gets you access to
everything that account has access to.

Thanks!

Stephen

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#9)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro <thomas.munro@gmail.com> writes:

Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

The traditional reason for avoiding that is the risk of a server
process becoming multi-threaded. There are live bugs of that ilk
on Darwin, and we actually have cross-checks for the case in our
code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).

If pthread_is_threaded_np(), or something equivalent, is widely available
then it might be all right to try solving this going forward by switching
to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
to risk it in the back branches though ...

regards, tom lane

#14Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#13)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

Question
for the list: other stuff in the server needs libpthread (SSL, LLVM,
...), so why are we insisting on using non-MT LDAP?

The traditional reason for avoiding that is the risk of a server
process becoming multi-threaded. There are live bugs of that ilk
on Darwin, and we actually have cross-checks for the case in our
code (see HAVE_PTHREAD_IS_THREADED_NP stanzas).

If pthread_is_threaded_np(), or something equivalent, is widely available
then it might be all right to try solving this going forward by switching
to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
to risk it in the back branches though ...

Hmm. Well here is a new data point: it looks like the Red Hat family
of distributions is in the process of making the same decision as
Debian (namely: to expunge the non-MT variant, because it bites
various projects in the same way that it bites us), but they haven't
quite pulled the trigger yet:

https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries

So if we do nothing at all, it seems likely that this problem will
eventually go away by itself on practically all Linux systems, leaving
this unfixed LDAP vs postgres_fdw bug to trip up the other Unix
systems. Bleugh.

I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab. Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API. Since pthread_is_threaded_np() is a
Mac thing, note also that Macs aren't directly exposed to this
particular choice anyway because (at least if you use system-provided
libraries rather than MacPorts et al) libldap.dylib and
libldap_r.dylib are already symlinks to the same Apple voodoo
"/System/Library/Frameworks/LDAP.framework/Versions/A/LDAP".

--
Thomas Munro
https://enterprisedb.com

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#14)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Thomas Munro <thomas.munro@gmail.com> writes:

On Wed, Feb 27, 2019 at 3:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

If pthread_is_threaded_np(), or something equivalent, is widely available
then it might be all right to try solving this going forward by switching
to libldap_r and seeing if anyone hits those cross-checks. I'd be afraid
to risk it in the back branches though ...

Hmm. Well here is a new data point: it looks like the Red Hat family
of distributions is in the process of making the same decision as
Debian (namely: to expunge the non-MT variant, because it bites
various projects in the same way that it bites us), but they haven't
quite pulled the trigger yet:
https://fedoraproject.org/wiki/Changes/OpenLDAPwithoutNonthreadedLibraries

Interesting, but that's going to be a very slow change. That says they'll
pull the trigger in Fedora 30, which I think is due to be released this
spring --- but it won't show up in RHEL till the next major release (8
or maybe even 9 at this point), and the existing major releases have got
10-year support lifespans.

I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab.

Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
usable equivalent.

Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API.

That seems promising, but I'd sure be happier if we could cross-check
that there's still just one thread at the completion of authentication.
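
A minimal sketch of one way to count threads on Linux only, offered as an
illustration of the kind of cross-check being asked for. It is not what
PostgreSQL does and not a POSIX facility: it relies on the Linux /proc
filesystem exposing one entry per thread under /proc/self/task. The function
name is hypothetical.

```python
import os

def os_thread_count():
    """Number of OS-level threads in this process on Linux, or None elsewhere."""
    try:
        # Linux lists one directory per thread of the calling process here.
        return len(os.listdir("/proc/self/task"))
    except OSError:
        # No /proc (or no access): the count is unknown on this platform.
        return None

count = os_thread_count()
if count is None:
    print("thread count unavailable on this platform")
elif count == 1:
    print("process is single-threaded")
else:
    print("process has %d threads" % count)
```

A check like this would see threads created by any library in the process,
which is the property the cross-check is after.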

regards, tom lane

#16Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#15)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Adding Noah to thread.

On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab.

Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
usable equivalent.

I don't see anything like that (the concept doesn't seem very
portable). I couldn't find a way on Glibc (but I'm not saying there
isn't one hiding somewhere). FreeBSD has a thing much like macOS's
(and I think some more BSDs do too); it's set to true by libthr when
the first thread is created, to make libc start locking various stuff.

The macOS one probably isn't a good canary to protect us from OpenLDAP
creating threads since on typical macOS builds we're using Apple's
LDAP thing (which cybersquats libldap.dylib and libldap_r.dylib via
symlinks). So adding a FreeBSD check seems like a good idea, because
at least one FreeBSD system in our buildfarm runs the ldap checks on
real OpenLDAP (elver).

Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API.

That seems promising, but I'd sure be happier if we could cross-check
that there's still just one thread at the completion of authentication.

Ok, here's that patch again with a commit message and with the
configure version warning removed, and a make-sure-we're-not-threaded
patch for FreeBSD.

I'm not sure what to do about the LDAP test in
contrib/dblink/sql/dblink.sql. Do we still want this?

I propose this for master only, for now. I also think it'd be nice to
consider back-patching it after a while, especially since this
reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
for a good while. Hmm, I wonder if it's OK to subtly change library
dependencies in a minor release; I don't see any problem with it since
I expect both variants to be provided by the same package in every
distro but we'd certainly want to highlight this to the package
maintainers if we did it.

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Test-__isthreaded-on-FreeBSD-and-friends.patch (application/octet-stream, +34 -7)
0002-Use-the-same-libldap-variant-in-the-frontend-and-bac.patch (application/octet-stream, +14 -94)

#17Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#16)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Thu, Mar 07, 2019 at 10:45:56AM +1300, Thomas Munro wrote:

On Wed, Feb 27, 2019 at 11:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Thomas Munro <thomas.munro@gmail.com> writes:

I don't see pthread_is_threaded_np() on any non-Apple systems in my
lab.

Yeah, I thought that might be a Mac thing. I wonder if POSIX has any
usable equivalent.

I don't see anything like that (the concept doesn't seem very
portable).

I'm not aware of one.

Clearly libldap_r is *capable* of creating threads: it contains a
function ldap_pvt_thread_create(), and we can see that slapd and other
OpenLDAP things use that, but AFAICT that's a private facility not
intended for end users to call, so there's no danger if you just use
the documented LDAP client API.

That seems promising, but I'd sure be happier if we could cross-check
that there's still just one thread at the completion of authentication.

Ok, here's that patch again with a commit message and with the
configure version warning removed, and a make-sure-we're-not-threaded
patch for FreeBSD.

I'm not sure what to do about the LDAP test in
contrib/dblink/sql/dblink.sql. Do we still want this?

Mike, does the dblink test suite not fail on your system? It's designed to
catch this exact problem.

Has anyone else reproduced this?

I propose this for master only, for now. I also think it'd be nice to
consider back-patching it after a while, especially since this
reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
for a good while. Hmm, I wonder if it's OK to subtly change library
dependencies in a minor release; I don't see any problem with it since
I expect both variants to be provided by the same package in every
distro but we'd certainly want to highlight this to the package
maintainers if we did it.

It's not great to change library dependencies in a minor release. If every
RHEL 7 installation can crash this way, changing the dependencies is probably
the least bad thing.

#18Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#17)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:

Has anyone else reproduced this?

I tried, but could not reproduce this problem on "CentOS Linux release
7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
reported, what yum install is currently serving up). I tried "make
check" in contrib/dblink, and the only strange thing I noticed was
this FATAL error at the top of contrib/dblink/log/postmaster.log:

2019-03-14 03:51:33.058 UTC [20131] LOG: database system is ready to
accept connections
2019-03-14 03:51:33.059 UTC [20135] [unknown] FATAL: the database
system is starting up

I don't see that on other systems and don't understand it.

I also tried a test of my own which I thought corresponded directly to
what Mike described, on both master and REL_10_STABLE. I'll record my
steps here so perhaps someone can see what's missing.

1. Run the regression test under src/test/ldap so that you get some
canned slapd configuration files.
2. cd into src/test/ldap/tmp_check and run "slapd -f slapd.conf -h
ldap://localhost:5555". It should daemonify itself, and run until you
kill it with SIGINT.
3. Put this into pg_hba.conf:
host postgres test1 127.0.0.1/32 ldap ldapserver=localhost ldapport=5555 ldapbasedn="dc=example,dc=net"
4. Create database objects as superuser:
create user test1;
create table t (i int);
grant all on t to test1;
create extension postgres_fdw;
create server foreign_server foreign data wrapper postgres_fdw options
(dbname 'postgres', host '127.0.0.1');
create foreign table ft (i int) server foreign_server options (table_name 't');
create user mapping for test1 server foreign_server options (user
'test1', password 'secret1');
grant all on ft to test1;
5. Now you should be able to log in with "psql -h 127.0.0.1 postgres
test1" and password "secret1", and run queries like: select * from ft;

When exiting the session, I was expecting the backend to crash,
because it had executed libldap.so code during authentication, and
then it had linked in libldap_r.so via libpq.so while connecting via
postgres_fdw. But it doesn't crash. I wonder what is different for
Mike; am I missing something, or is there non-determinism here?

I propose this for master only, for now. I also think it'd be nice to
consider back-patching it after a while, especially since this
reportedly broke on CentOS/RHEL7, a pretty popular OS that'll be around
for a good while. Hmm, I wonder if it's OK to subtly change library
dependencies in a minor release; I don't see any problem with it since
I expect both variants to be provided by the same package in every
distro but we'd certainly want to highlight this to the package
maintainers if we did it.

It's not great to change library dependencies in a minor release. If every
RHEL 7 installation can crash this way, changing the dependencies is probably
the least bad thing.

+1, once we get a repro and/or better understanding.

--
Thomas Munro
https://enterprisedb.com

#19Noah Misch
noah@leadboat.com
In reply to: Thomas Munro (#18)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

On Thu, Mar 14, 2019 at 05:18:49PM +1300, Thomas Munro wrote:

On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:

Has anyone else reproduced this?

I tried, but could not reproduce this problem on "CentOS Linux release
7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
reported, what yum install is currently serving up).

When exiting the session, I was expecting the backend to crash,
because it had executed libldap.so code during authentication, and
then it had linked in libldap_r.so via libpq.so while connecting via
postgres_fdw. But it doesn't crash. I wonder what is different for
Mike; am I missing something, or is there non-determinism here?

The test is deterministic. I'm guessing Mike's system is finding ldap
libraries other than the usual system ones. Mike, would you check as follows?

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" | psql -X &
[1] 2530123
pg_backend_pid
----------------
2530124
(1 row)

LOAD

$ gdb --batch --pid 2530124 -ex 'info sharedlibrary ldap'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ffff6303463 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007ffff65e1ee0 0x00007ffff6613304 Yes (*) /lib64/libldap-2.4.so.2
0x00007fffe998f6d0 0x00007fffe99c3ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.
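
Where gdb is not installed, roughly the same question (are both libldap and libldap_r mapped into one backend?) can be answered on Linux by reading /proc/<pid>/maps. The following is only a sketch and is not from the thread: the function names are made up for illustration, and it assumes library paths contain no spaces.

```shell
# Print the distinct LDAP client libraries named in a maps file.
# Usage: ldap_libs_in_maps /proc/<pid>/maps
ldap_libs_in_maps() {
    # Field 6 of a maps line is the backing file path, when there is one.
    awk '$6 ~ /\/libldap(_r)?[-.]/ { print $6 }' "$1" | sort -u
}

# Succeed only if both the plain and the threaded (_r) variant are mapped.
# The body runs in a subshell so that "libs" does not leak into the caller.
has_both_ldap_variants() (
    libs=$(ldap_libs_in_maps "$1")
    printf '%s\n' "$libs" | grep -q '/libldap[-.]' &&
        printf '%s\n' "$libs" | grep -q '/libldap_r[-.]'
)
```

With the backend PID from pg_backend_pid(), one would run ldap_libs_in_maps /proc/2530124/maps as the postgres OS user or root; seeing both library names corresponds to the two rows in the gdb output above.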

#20Mike Yeap
wkk1020@gmail.com
In reply to: Noah Misch (#19)
Re: LDAP authenticated session terminated by signal 11: Segmentation fault, PostgresSQL server terminates other active server processes

Hi Noah, below is the output from one of the servers having this issue:

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" |
psql -X &
[1] 9731

$ select pg_backend_pid(); load 'dblink'; select pg_sleep(100)
pg_backend_pid
----------------
9732
(1 row)

LOAD

$ gdb --batch --pid 9732 -ex 'info sharedlibrary ldap'

warning: .dynamic section for "/lib64/libldap-2.4.so.2" is not at the
expected address (wrong library or version mismatch?)

warning: .dynamic section for "/lib64/liblber-2.4.so.2" is not at the
expected address (wrong library or version mismatch?)
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f1e7592dcf3 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007f1e7637d0f8 0x00007f1e763ae51c Yes (*) /lib64/libldap-2.4.so.2
0x00007f1d9f2c16d0 0x00007f1d9f2f5ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.

Regards,
Mike Yeap

On Thu, Mar 14, 2019 at 1:42 PM Noah Misch <noah@leadboat.com> wrote:

On Thu, Mar 14, 2019 at 05:18:49PM +1300, Thomas Munro wrote:

On Thu, Mar 7, 2019 at 4:19 PM Noah Misch <noah@leadboat.com> wrote:

Has anyone else reproduced this?

I tried, but could not reproduce this problem on "CentOS Linux release
7.6.1810 (Core)" using OpenLDAP "2.4.44-21.el7_6" (same as Mike
reported, what yum install is currently serving up).

When exiting the session, I was expecting the backend to crash,
because it had executed libldap.so code during authentication, and
then it had linked in libldap_r.so via libpq.so while connecting via
postgres_fdw. But it doesn't crash. I wonder what is different for
Mike; am I missing something, or is there non-determinism here?

The test is deterministic. I'm guessing Mike's system is finding ldap
libraries other than the usual system ones. Mike, would you check as
follows?

$ echo "select pg_backend_pid(); load 'dblink'; select pg_sleep(100)" |
psql -X &
[1] 2530123
pg_backend_pid
----------------
2530124
(1 row)

LOAD

$ gdb --batch --pid 2530124 -ex 'info sharedlibrary ldap'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007ffff6303463 in __epoll_wait_nocancel () from /lib64/libc.so.6
From To Syms Read Shared Object Library
0x00007ffff65e1ee0 0x00007ffff6613304 Yes (*) /lib64/libldap-2.4.so.2
0x00007fffe998f6d0 0x00007fffe99c3ae4 Yes (*) /lib64/libldap_r-2.4.so.2
(*): Shared library is missing debugging information.

#21Noah Misch
noah@leadboat.com
In reply to: Mike Yeap (#20)
#22Thomas Munro
thomas.munro@gmail.com
In reply to: Noah Misch (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#22)
#24Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#23)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#24)
#26Thomas Munro
thomas.munro@gmail.com
In reply to: Tom Lane (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Thomas Munro (#26)