Rethinking -L switch handling and construction of LDFLAGS
I noticed that if I build with --with-libxml on my Mac platforms,
"make installcheck" stops working for certain contrib modules such
as postgres_fdw. I finally got around to diagnosing the reason why,
and it goes like this:
1. --with-libxml causes configure to include
-L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib
in the LDFLAGS value put into Makefile.global. That's because
"xml2-config --libs" emits that, and we do need it if we want to link
to the platform-supplied libxml2.
2. However, that directory also contains a symlink to the
platform-supplied libpq.
3. When we go to build postgres_fdw.so, the link command line looks like
ccache gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -Wno-unused-command-line-argument -g -O2 -bundle -multiply_defined suppress -o postgres_fdw.so postgres_fdw.o option.o deparse.o connection.o shippable.o -L../../src/port -L../../src/common -L/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk/usr/lib -L/usr/local/ssl/lib -Wl,-dead_strip_dylibs -L../../src/interfaces/libpq -lpq -bundle_loader ../../src/backend/postgres
The details of this might vary depending on your configure options,
but the key point is that the -L/Applications/... switch is before the
-L../../src/interfaces/libpq one. This means that the linker resolves
"-lpq" to the platform-supplied libpq, not the one in the build tree.
We can confirm that with
$ otool -L postgres_fdw.so
postgres_fdw.so:
/usr/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.6.0)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1252.50.4)
So, quite aside from any problems stemming from using a 9.3-vintage libpq
with HEAD client code, we are stuck with a libpq that uses Apple's idea of
the default socket location, rather than what the rest of our build uses.
That explains the failures seen in "make installcheck", which look like
2018-04-01 13:09:48.744 EDT [10758] ERROR: could not connect to server "loopback"
2018-04-01 13:09:48.744 EDT [10758] DETAIL: could not connect to server: No such file or directory
Is the server running locally and accepting
connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?
Of course, /var/pgsql_socket is *not* where my postmaster is putting
its socket.
In short, we need to deal more honestly with the positioning of -L
switches in link commands. Somebody's idea that we could embed
both -L and -l into $(libpq), and then pay basically no attention to
where that ends up in the final link command, is just too simplistic.
I think that we want to establish an ironclad rule that -L switches
referencing directories in our own build tree must appear before -L
switches referencing external libraries.
I don't have a concrete patch to propose yet, but the design idea
I have in mind is to split LDFLAGS into two or more parts, so that
-L switches for the build tree are supposed to be put in the first
part and external -L switches in the second. It'd be sufficient
to have Makefile.global do something like
ifdef PGXS
LDFLAGS_INTERNAL = -L$(libdir)
else
LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
endif
LDFLAGS = $(LDFLAGS_INTERNAL) @LDFLAGS@
and then teach relevant places that they need to add $(libpq) to
LDFLAGS_INTERNAL not LDFLAGS. (Perhaps "BUILD" would be a better keyword
than "INTERNAL" here?) Not sure how that would play exactly with
Makefile.shlib's SHLIB_LINK, but maybe we need SHLIB_LINK_INTERNAL along
with SHLIB_LINK. I'd also like to try to clean up the mess that is
$(libpq_pgport), though I'm not sure just how yet.
Or we could try to create a full separation between -L and -l switches,
ending up with three or more parts for LDFLAGS not just two. But I'm
not sure if that gains anything.
I have no idea whether the MSVC build infrastructure has comparable
problems, and would not be willing to fix it myself if it does.
But I am willing to try to fix this in the gmake infrastructure.
Comments, better ideas?
regards, tom lane
Hi,
On 2018-04-01 13:38:15 -0400, Tom Lane wrote:
In short, we need to deal more honestly with the positioning of -L
switches in link commands. Somebody's idea that we could embed
both -L and -l into $(libpq), and then pay basically no attention to
where that ends up in the final link command, is just too simplistic.
Sounds right.
I don't have a concrete patch to propose yet, but the design idea
I have in mind is to split LDFLAGS into two or more parts, so that
-L switches for the build tree are supposed to be put in the first
part and external -L switches in the second. It'd be sufficient
to have Makefile.global do something likeifdef PGXS
LDFLAGS_INTERNAL = -L$(libdir)
else
LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
endif
LDFLAGS = $(LDFLAGS_INTERNAL) @LDFLAGS@
I'm not sure I like doing this in Makefile.global. We've various files
that extend LDFLAGS in other places, and that's going to be hard if it's
already mushed together again. We end up re-building it from parts in
those files too.
Why don't we change the link commands to reference LDFLAGS_INTERNAL
explicitly? That seems like it'd be cleaner.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-04-01 13:38:15 -0400, Tom Lane wrote:
I don't have a concrete patch to propose yet, but the design idea
I have in mind is to split LDFLAGS into two or more parts, so that
-L switches for the build tree are supposed to be put in the first
part and external -L switches in the second.
I'm not sure I like doing this in Makefile.global. We've various files
that extend LDFLAGS in other places, and that's going to be hard if it's
already mushed together again. We end up re-building it from parts in
those files too.
Yeah, one of the things I'd like to fix is that some of the makefiles,
eg psql's, do
override LDFLAGS := -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport) $(LDFLAGS)
which goes *directly* against this commandment in Makefile.global:
# We want -L for libpgport.a and libpgcommon.a to be first in LDFLAGS. We
# also need LDFLAGS to be a "recursively expanded" variable, else adjustments
# to rpathdir don't work right. So we must NOT do LDFLAGS := something,
# meaning this has to be done first and elsewhere we must only do LDFLAGS +=
# something.
It's a bit surprising that rpath works at all for these makefiles.
But with what I'm imagining here, I think we could replace that with
LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
and thereby preserve the recursively-expanded virginity of both
LDFLAGS_INTERNAL and LDFLAGS. But I've not tried to test anything yet.
Why don't we change the link commands to reference LDFLAGS_INTERNAL
explicitly? That seems like it'd be cleaner.
I'm hesitant to do that because LDFLAGS is a name known to make's
default rules, and I don't want to bet that we're not relying on
those default rules anywhere. I also disagree with the idea that using
"$(LDFLAGS_INTERNAL) $(LDFLAGS)" in every link command we have is better
or less error-prone than just "$(LDFLAGS)". Especially not if we end up
with more than two parts.
regards, tom lane
Hi,
On 2018-04-01 13:55:05 -0400, Tom Lane wrote:
Why don't we change the link commands to reference LDFLAGS_INTERNAL
explicitly? That seems like it'd be cleaner.I'm hesitant to do that because LDFLAGS is a name known to make's
default rules, and I don't want to bet that we're not relying on
those default rules anywhere.
FWIW, postgres builds cleanly with -r -R in MAKELAGS.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2018-04-01 13:55:05 -0400, Tom Lane wrote:
I'm hesitant to do that because LDFLAGS is a name known to make's
default rules, and I don't want to bet that we're not relying on
those default rules anywhere.
FWIW, postgres builds cleanly with -r -R in MAKELAGS.
That's pretty hard to believe. Why would we bother to override every
default rule? Even if it's true today, I would not accept it as project
policy that we must do so. Perhaps more to the point, I would strongly
object to any design in which the standard Make variables don't mean
what the default rules expect them to mean. That's just a recipe for
confusing people and creating hard-to-spot bugs.
regards, tom lane
So here's a draft patch for this. It fixes the originally complained-of
issue about --with-libxml breaking the build on Macs. I've not tested
it very far beyond that plus a sanity cross-check on Linux, but I doubt
there's any point in further manual testing; we might as well just
throw it at the buildfarm.
I'm pretty happy with the way this turned out. It's not a complicated
change, and we're no longer breaking our own rules about how to manage
LDFLAGS adjustments.
Some notes:
* I ended up adding not only LDFLAGS_INTERNAL and SHLIB_LINK_INTERNAL,
but also PG_LIBS_INTERNAL. As things stood, the only uses of PG_LIBS
were for within-tree library references, so that it was fine that the
make rule that uses this variable put it before LDFLAGS. But if anyone
ever tried to use PG_LIBS for the seemingly obvious purpose of referencing
an external library, we'd be right back in hazard land. So that variable
needs to be split as well.
* I documented SHLIB_LINK_INTERNAL and PG_LIBS_INTERNAL in the comments
in pgxs.mk, but not in the user-facing documentation about using PGXS.
This is on the theory that only within-tree modules need either of those
variables, so users don't need them. I could be convinced otherwise,
though.
* Some of these changes aren't really essential, eg the change in
hstore_plperl/Makefile to use SHLIB_LINK_INTERNAL instead of SHLIB_LINK;
since there's no -L switch there, it'd work either way. But I thought
it best to maintain a consistent rule of using the _INTERNAL variable
for within-tree references.
* The change to STD_LDFLAGS in src/common/Makefile is because I noticed
that "pg_config --ldflags" has been printing "-L../../../src/common",
which it surely should not. Apparently we forgot to fix this rule when
we rearranged things to create src/common/. I'm tempted to make the
rule be
STD_LDFLAGS := $(filter-out $(LDFLAGS_INTERNAL),$(LDFLAGS))
in hopes of preventing a repetition of such silliness, but desisted
for the moment, because I'm not quite sure if this is a good idea,
or if it risks taking too much out. Anybody have an opinion about it?
regards, tom lane