pgsql: Refactor dlopen() support

Started by Peter Eisentraut over 7 years ago, 6 messages
#1 Peter Eisentraut
peter_e@gmx.net

Refactor dlopen() support

Nowadays, all platforms except Windows and older HP-UX have standard
dlopen() support. So having a separate implementation per platform
under src/backend/port/dynloader/ is a bit excessive. Instead, treat
dlopen() like other library functions that happen to be missing
sometimes and put a replacement implementation under src/port/.

Discussion: /messages/by-id/e11a49cb-570a-60b7-707d-7084c8de0e61@2ndquadrant.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/842cb9fa62fc99598086166bdeec9d6ae6e3c50f

Modified Files
--------------
configure | 43 +++++++++++--
configure.in | 8 +--
src/backend/Makefile | 2 +-
src/backend/port/.gitignore | 1 -
src/backend/port/Makefile | 2 +-
src/backend/port/dynloader/aix.c | 7 ---
src/backend/port/dynloader/aix.h | 39 ------------
src/backend/port/dynloader/cygwin.c | 3 -
src/backend/port/dynloader/cygwin.h | 36 -----------
src/backend/port/dynloader/darwin.c | 35 -----------
src/backend/port/dynloader/darwin.h | 8 ---
src/backend/port/dynloader/freebsd.c | 7 ---
src/backend/port/dynloader/freebsd.h | 38 ------------
src/backend/port/dynloader/hpux.c | 68 --------------------
src/backend/port/dynloader/hpux.h | 25 --------
src/backend/port/dynloader/linux.c | 7 ---
src/backend/port/dynloader/linux.h | 38 ------------
src/backend/port/dynloader/netbsd.c | 7 ---
src/backend/port/dynloader/netbsd.h | 38 ------------
src/backend/port/dynloader/openbsd.c | 7 ---
src/backend/port/dynloader/openbsd.h | 38 ------------
src/backend/port/dynloader/solaris.c | 7 ---
src/backend/port/dynloader/solaris.h | 38 ------------
src/backend/port/dynloader/win32.h | 19 ------
src/backend/postmaster/postmaster.c | 1 -
src/backend/utils/fmgr/dfmgr.c | 31 +++++-----
src/include/.gitignore | 1 -
src/include/Makefile | 4 +-
src/include/pg_config.h.in | 8 +++
src/include/pg_config.h.win32 | 8 +++
src/include/port.h | 23 +++++++
src/include/utils/dynamic_loader.h | 25 --------
.../port/dynloader/win32.c => port/dlopen.c} | 72 ++++++++++++++++++++--
src/tools/msvc/Install.pm | 5 +-
src/tools/msvc/Mkvcbuild.pm | 5 +-
src/tools/msvc/Solution.pm | 7 ---
src/tools/msvc/clean.bat | 1 -
37 files changed, 172 insertions(+), 540 deletions(-)

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#1)
Re: pgsql: Refactor dlopen() support

Peter Eisentraut <peter_e@gmx.net> writes:

Refactor dlopen() support

Buildfarm member locust doesn't like this much. I've been able to
reproduce the problem on an old Mac laptop running the same macOS release,
viz 10.5.8. (Note that we're not seeing it on earlier or later releases,
which is odd in itself.) According to my machine, the crash is happening
here:

#0 _PG_init () at plpy_main.c:98
98 *plpython_version_bitmask_ptr |= (1 << PY_MAJOR_VERSION);

and the reason is that the rendezvous variable sometimes contains garbage.
Most sessions correctly see it as initially zero, but sometimes it
contains

(gdb) p plpython_version_bitmask_ptr
$1 = (int *) 0x1d

and I've also seen

(gdb) p plpython_version_bitmask_ptr
$1 = (int *) 0x7f7f7f7f

It's mostly repeatable but not completely so: the 0x1d case seems
to come up every time through the plpython_do test, but I don't
always see the 0x7f7f7f7f case. (Maybe that's a timing artifact?
It takes a variable amount of time to recover from the first crash
in plpython_do, so the rest of the plpython test run isn't exactly
operating in uniform conditions.)

No idea what's going on here, and I'm about out of steam for tonight.

regards, tom lane

#3 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: pgsql: Refactor dlopen() support

On 07/09/2018 08:30, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

Refactor dlopen() support

Buildfarm member locust doesn't like this much. I've been able to
reproduce the problem on an old Mac laptop running the same macOS release,
viz 10.5.8. (Note that we're not seeing it on earlier or later releases,
which is odd in itself.)

Nothing should have changed on macOS except that the intermediate
functions pg_dl*() were replaced by direct calls to dl*(). Very strange.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#3)
Re: pgsql: Refactor dlopen() support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 07/09/2018 08:30, Tom Lane wrote:

Buildfarm member locust doesn't like this much. I've been able to
reproduce the problem on an old Mac laptop running the same macOS release,
viz 10.5.8. (Note that we're not seeing it on earlier or later releases,
which is odd in itself.)

Nothing should have changed on macOS except that the intermediate
functions pg_dl*() were replaced by direct calls to dl*(). Very strange.

Somehow or other, the changes you made in dfmgr.c's #include lines
have made it so that find_rendezvous_variable's local "bool found"
variable is actually of type _Bool (which is word-wide on these
machines). However, hash_search thinks its output variable is
of type pointer to "typedef char bool". The proximate cause of
the observed failure is that find_rendezvous_variable sees "found"
as true when it should not, and thus fails to zero out the variable's
value.

No time to look further right now, but there's something rotten
about the way we're handling bool.

regards, tom lane
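
The width mismatch described above can be reproduced in isolation. The following is a minimal sketch, not PostgreSQL code: callee_sets_flag stands in for a function such as hash_search compiled with a one-byte bool, caller_sees_true stands in for a caller whose bool is wider, and unsigned int is only an assumed stand-in for the word-wide _Bool mentioned in the message. All names here are hypothetical.

```c
#include <assert.h>

/*
 * Stand-in for a callee built with "typedef char bool": it stores
 * exactly one byte through the pointer it receives.
 */
static void
callee_sets_flag(char *foundPtr, char value)
{
	*foundPtr = value;
}

/*
 * Stand-in for a caller whose flag variable is wider than one byte.
 * stale_contents models whatever happened to be on the stack before
 * the call.  Returns 1 if the caller would treat the flag as true.
 */
int
caller_sees_true(unsigned int stale_contents)
{
	unsigned int found = stale_contents;

	/* The sketch only makes sense if the caller's type is wider. */
	assert(sizeof(found) > sizeof(char));

	/* The callee reports "not found", but it only clears one byte. */
	callee_sets_flag((char *) &found, 0);

	/* The remaining bytes still hold the stale contents. */
	return found != 0;
}
```

With stale contents of 0x7f7f7f7f the caller still sees a nonzero flag after the callee has written "false", which would match the symptom of the rendezvous variable not being zeroed. With stale contents of zero the problem stays hidden, which may be why most sessions looked fine.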

#5 Peter Eisentraut
peter.eisentraut@2ndquadrant.com
In reply to: Tom Lane (#4)
Re: pgsql: Refactor dlopen() support

On 07/09/2018 16:19, Tom Lane wrote:

Somehow or other, the changes you made in dfmgr.c's #include lines
have made it so that find_rendezvous_variable's local "bool found"
variable is actually of type _Bool (which is word-wide on these
machines). However, hash_search thinks its output variable is
of type pointer to "typedef char bool". The proximate cause of
the observed failure is that find_rendezvous_variable sees "found"
as true when it should not, and thus fails to zero out the variable's
value.

Ah because dlfcn.h includes stdbool.h. Hmm.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#5)
1 attachment(s)
Re: pgsql: Refactor dlopen() support

Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:

On 07/09/2018 16:19, Tom Lane wrote:

Somehow or other, the changes you made in dfmgr.c's #include lines
have made it so that find_rendezvous_variable's local "bool found"
variable is actually of type _Bool (which is word-wide on these
machines).

Ah because dlfcn.h includes stdbool.h. Hmm.

Yeah, and that's still true as of current macOS, it seems.

I can make the problem go away with the attached patch (borrowed from
similar code in plperl.h). It's kind of grotty but I'm not sure there's
a better way.

regards, tom lane

Attachments:

undo-stdbool-damage-in-dfmgr.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/utils/fmgr/dfmgr.c b/src/backend/utils/fmgr/dfmgr.c
index c2a2572..4a5cc7c 100644
*** a/src/backend/utils/fmgr/dfmgr.c
--- b/src/backend/utils/fmgr/dfmgr.c
***************
*** 18,24 ****
--- 18,34 ----
  
  #ifdef HAVE_DLOPEN
  #include <dlfcn.h>
+ 
+ /*
+  * On macOS, <dlfcn.h> insists on including <stdbool.h>.  If we're not
+  * using stdbool, undef bool to undo the damage.
+  */
+ #ifndef USE_STDBOOL
+ #ifdef bool
+ #undef bool
  #endif
+ #endif
+ #endif							/* HAVE_DLOPEN */
  
  #include "fmgr.h"
  #include "lib/stringinfo.h"
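
The guard in the patch follows a general shape: if a header outside the project's control defines a macro that shadows a project type, undefine it before the project's own definition is used. A minimal sketch of that shape is below. It deliberately uses a made-up name, flag_t, instead of bool, because bool is a keyword in newer C standards and cannot be redefined there. USE_SYSTEM_FLAG is a hypothetical switch playing the role of USE_STDBOOL, and the #define line only pretends to come from a system header.

```c
#include <assert.h>

/* Pretend this line came from a system header we cannot edit. */
#define flag_t int

/*
 * Same guard shape as the patch: when the project does not want the
 * system definition, undo the macro and supply the project's own type.
 */
#ifndef USE_SYSTEM_FLAG
#ifdef flag_t
#undef flag_t
#endif
typedef char flag_t;
#endif

/* Report the width that code after the guard actually sees. */
unsigned long
flag_width(void)
{
	/* After the guard, flag_t is the one-byte project type again. */
	assert(sizeof(flag_t) == sizeof(char));
	return (unsigned long) sizeof(flag_t);
}
```

Without the guard, every use of flag_t after the pretend header would silently become int, which is the same kind of disagreement between caller and callee that the thread traced to bool.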