`make check` doesn't pass on MacOS Catalina
Hi hackers,
While trying to build PostgreSQL from source (master branch, 95c3a195) on a
MacBook I discovered that `make check` fails:
```
============== removing existing temp instance ==============
============== creating temporary instance ==============
============== initializing database system ==============
============== starting postmaster ==============
sh: line 1: 33559 Abort trap: 6 "psql" -X postgres < /dev/null 2> /dev/null
sh: line 1: 33562 Abort trap: 6 "psql" -X postgres < /dev/null 2> /dev/null
...
sh: line 1: 33742 Abort trap: 6 "psql" -X postgres < /dev/null 2> /dev/null
pg_regress: postmaster did not respond within 60 seconds
Examine /Users/eax/projects/c/postgresql/src/test/regress/log/postmaster.log for the reason
make[1]: *** [check] Error 2
make: *** [check] Error 2
```
A little investigation revealed that pg_regress executes postgres like this:
```
PATH="/Users/eax/projects/c/postgresql/tmp_install/Users/eax/pginstall/bin:$PATH"
DYLD_LIBRARY_PATH="/Users/eax/projects/c/postgresql/tmp_install/Users/eax/pginstall/lib"
"postgres" -D
"/Users/eax/projects/c/postgresql/src/test/regress/./tmp_check/data" -F -c
"listen_addresses=" -k "/Users/eax/pgtmp/pg_regress-S34sXM" >
"/Users/eax/projects/c/postgresql/src/test/regress/log/postmaster.log"
```
... and checks that it's online by executing:
```
PATH="/Users/eax/projects/c/postgresql/tmp_install/Users/eax/pginstall/bin:$PATH"
DYLD_LIBRARY_PATH="/Users/eax/projects/c/postgresql/tmp_install/Users/eax/pginstall/lib"
psql -X postgres
```
The last command fails with:
```
psql: error: connection to server on socket "/tmp/.s.PGSQL.5432" failed: No
such file or directory. Is the server running locally and accepting
connections on that socket?
```
This is because the actual path to the socket is:
```
~/pgtmp/pg_regress-S34sXM/.s.PGSQL.5432
```
While debugging this I also discovered that psql uses the
/usr/lib/libpq.5.dylib library, according to the `image list` command in
LLDB. The library is provided with the system and can't be moved or
deleted. In other words, psql seems to ignore DYLD_LIBRARY_PATH. I've found
instructions [1] suggesting that this is a behavior of macOS System Integrity
Protection (SIP) and describing how it can be disabled. Sadly it made no
difference in my case; psql still ignores DYLD_LIBRARY_PATH.
While I'm still in the process of investigating this, I just wanted to ask
whether anyone developing on macOS has observed anything similar and had any
luck solving the problem. I tried to search through the mailing list but
didn't find anything relevant. The complete script that reproduces the
issue is attached. I'm using the same script on an Ubuntu VM, where it works
just fine.
[1]: https://github.com/rbenv/rbenv/issues/962#issuecomment-275858404
--
Best regards,
Aleksander Alekseev
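The socket mismatch described above can be checked by hand. libpq looks for a file named `.s.PGSQL.<port>` inside the socket directory, so a psql that falls back to a default directory of `/tmp` never sees a server that pg_regress started with `-k` pointing elsewhere. A minimal sketch, assuming a POSIX shell; the helper name `pg_socket_path` is made up for illustration:

```shell
# Hypothetical helper: build the Unix-socket file name that libpq looks for,
# given a socket directory and a port number (the port defaults to 5432).
pg_socket_path() {
    dir=$1
    port=${2:-5432}
    printf '%s/.s.PGSQL.%s\n' "$dir" "$port"
}

# Example: compare the default location with the one pg_regress used.
pg_socket_path /tmp 5432
pg_socket_path "$HOME/pgtmp/pg_regress-S34sXM" 5432
```

If the second file exists while the first does not, passing the directory explicitly, as in `psql -X -h "$HOME/pgtmp/pg_regress-S34sXM" postgres`, should reach the temporary server regardless of which default socket directory is in effect.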
Aleksander Alekseev <aleksander@timescale.com> writes:
> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
> MacBook I discovered that `make check` fails:
This is the usual symptom of not having disabled SIP :-(.
If you don't want to do that, do "make install" before "make check".
regards, tom lane
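The workaround amounts to running the make targets in a different order. A minimal sketch, assuming an already configured source tree and a POSIX shell; the function name and the `DRY_RUN` switch are illustrative and not part of the PostgreSQL build system:

```shell
# Build, install, then run the regression tests, stopping at the first failure.
# Set DRY_RUN=1 to print the commands instead of executing them.
build_install_check() {
    for target in all install check; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "make $target"
        else
            make "$target" || return 1
        fi
    done
}
```

Calling `DRY_RUN=1 build_install_check` only lists the three commands, which makes it easy to confirm the order before running the real thing from the top of the source tree.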
Hi Tom,
Many thanks, running "make install" before "make check" helped.
On Tue, Apr 20, 2021 at 6:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Aleksander Alekseev <aleksander@timescale.com> writes:
>> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
>> MacBook I discovered that `make check` fails:
> This is the usual symptom of not having disabled SIP :-(.
> If you don't want to do that, do "make install" before "make check".
> regards, tom lane
--
Best regards,
Aleksander Alekseev
On 4/20/21 11:02 AM, Tom Lane wrote:
> Aleksander Alekseev <aleksander@timescale.com> writes:
>> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
>> MacBook I discovered that `make check` fails:
> This is the usual symptom of not having disabled SIP :-(.
> If you don't want to do that, do "make install" before "make check".
FYI the buildfarm client has a '--delay-check' option that does exactly
this. It's useful on Alpine Linux as well as MacOS.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Hi hackers,
Thank you very much. I ran into the same problem yesterday. May I
suggest documenting it in the installation guide for the macOS platform?
On 4/21/21, Andrew Dunstan <andrew@dunslane.net> wrote:
> On 4/20/21 11:02 AM, Tom Lane wrote:
>> Aleksander Alekseev <aleksander@timescale.com> writes:
>>> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
>>> MacBook I discovered that `make check` fails:
>> This is the usual symptom of not having disabled SIP :-(.
>> If you don't want to do that, do "make install" before "make check".
> FYI the buildfarm client has a '--delay-check' option that does exactly
> this. It's useful on Alpine Linux as well as MacOS
> cheers
> andrew
> --
> Andrew Dunstan
> EDB: https://www.enterprisedb.com
--
Best Regards,
Xing
Xing GUO <higuoxing@gmail.com> writes:
> Thank you very much. I ran into the same problem yesterday. May I
> suggest documenting it in the installation guide for the macOS platform?
It is documented --- see last para under
https://www.postgresql.org/docs/current/installation-platform-notes.html#INSTALLATION-NOTES-MACOS
regards, tom lane
On 4/21/21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Xing GUO <higuoxing@gmail.com> writes:
>> Thank you very much. I ran into the same problem yesterday. May I
>> suggest documenting it in the installation guide for the macOS platform?
> It is documented --- see last para under
> https://www.postgresql.org/docs/current/installation-platform-notes.html#INSTALLATION-NOTES-MACOS
Thank you! Sorry for my carelessness...
> regards, tom lane
--
Best Regards,
Xing
On Tue, Apr 20, 2021 at 9:06 AM Andrew Dunstan <andrew@dunslane.net> wrote:
> On 4/20/21 11:02 AM, Tom Lane wrote:
>> Aleksander Alekseev <aleksander@timescale.com> writes:
>>> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
>>> MacBook I discovered that `make check` fails:
>> This is the usual symptom of not having disabled SIP :-(.
>> If you don't want to do that, do "make install" before "make check".
> FYI the buildfarm client has a '--delay-check' option that does exactly
> this. It's useful on Alpine Linux as well as MacOS
I was trying to set up a buildfarm animal, and this exact problem led
to a few hours of debugging and hair-pulling. Can the default
behaviour of the buildfarm client be changed to perform `make check` only
after `make install`?
Current buildfarm client code looks something like:
    make();
    make_check() unless $delay_check;
    ... other steps ...
    make_install();
    ... other steps-2...
    make_check() if $delay_check;
There are no comments as to why one should choose to use --delay-check
($delay_check). This email, and the pointer to the paragraph buried in
the docs, shared by Tom, are the only two ways one can understand what
is causing this failure, and how to get around it.
Naive question: What's stopping us from rewriting the code as follows?
    make();
    make_install();
    make_check();
    ... other steps ...
    ... other steps-2...
    # or move make_check() call here
With a quick Google search I could not find why --delay-check is
necessary on Alpine Linux as well; can you please elaborate?
Best regards,
Gurjeet
http://Gurje.et
On 2022-08-06 Sa 06:49, Gurjeet Singh wrote:
> On Tue, Apr 20, 2021 at 9:06 AM Andrew Dunstan <andrew@dunslane.net> wrote:
>> On 4/20/21 11:02 AM, Tom Lane wrote:
>>> Aleksander Alekseev <aleksander@timescale.com> writes:
>>>> While trying to build PostgreSQL from source (master branch, 95c3a195) on a
>>>> MacBook I discovered that `make check` fails:
>>> This is the usual symptom of not having disabled SIP :-(.
>>> If you don't want to do that, do "make install" before "make check".
>> FYI the buildfarm client has a '--delay-check' option that does exactly
>> this. It's useful on Alpine Linux as well as MacOS
> I was trying to set up a buildfarm animal, and this exact problem led
> to a few hours of debugging and hair-pulling. Can the default
> behaviour of the buildfarm client be changed to perform `make check` only
> after `make install`?
> Current buildfarm client code looks something like:
>     make();
>     make_check() unless $delay_check;
>     ... other steps ...
>     make_install();
>     ... other steps-2...
>     make_check() if $delay_check;
> There are no comments as to why one should choose to use --delay-check
> ($delay_check). This email, and the pointer to the paragraph buried in
> the docs, shared by Tom, are the only two ways one can understand what
> is causing this failure, and how to get around it.
> Naive question: What's stopping us from rewriting the code as follows?
>     make();
>     make_install();
>     make_check();
>     ... other steps ...
>     ... other steps-2...
>     # or move make_check() call here
> With a quick Google search I could not find why --delay-check is
> necessary on Alpine Linux as well; can you please elaborate?
I came across this when I was working on setting up some Dockerfiles for
the buildfarm. Apparently LD_LIBRARY_PATH doesn't work on Alpine, at
least out of the box, as it uses a different linker, and "make check"
relies on it (or the moral equivalent) if "make install" hasn't been run.
In general we want to run "make check" as soon as possible after running
"make" on the core code. That's why I didn't simply delay it
unconditionally.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
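The two orderings under discussion can be sketched as a small shell function. This only mirrors the pseudocode quoted above; the stage names are illustrative and are not the buildfarm client's real identifiers (the client itself is written in Perl):

```shell
# Print the order of the stages for a given delay_check setting:
# 1 runs "check" after "install"; anything else runs it right after "make".
plan_stages() {
    delay_check=$1
    echo make
    [ "$delay_check" = 1 ] || echo check
    echo install
    [ "$delay_check" = 1 ] && echo check
    return 0
}

plan_stages 0   # early check: catches a "make check" that needs an install
plan_stages 1   # delayed check: works where library path overrides do not
```

Keeping the early check as the default preserves the ability to notice when "make check" stops working without a prior install, which is the reason given above for not delaying it unconditionally.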
Andrew Dunstan <andrew@dunslane.net> writes:
> On 2022-08-06 Sa 06:49, Gurjeet Singh wrote:
>> There are no comments as to why one should choose to use --delay-check
>> ($delay_check). This email, and the pointer to the paragraph buried in
>> the docs, shared by Tom, are the only two ways one can understand what
>> is causing this failure, and how to get around it.
> In general we want to run "make check" as soon as possible after running
> "make" on the core code. That's why I didn't simply delay it
> unconditionally.
In general --- that is, on non-broken platforms --- "make check"
*should* work without a prior "make install". I am absolutely
not in favor of changing the buildfarm so that it fails to detect
the problem if we break that. But for sure it'd make sense to add
some comments to the wiki and/or sample config file explaining
that you need to set this option on systems X,Y,Z.
On macOS you need to use it if you haven't disabled SIP.
I don't have the details about any other problem platforms.
regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes:
> I came across this when I was working on setting up some Dockerfiles for
> the buildfarm. Apparently LD_LIBRARY_PATH doesn't work on Alpine, at
> least out of the box, as it uses a different linker, and "make check"
> relies on it (or the moral equivalent) if "make install" hasn't been run.
I did some quick googling on this point. We seem not to be the only
project having linking issues on Alpine, and yet it does support
LD_LIBRARY_PATH according to some fairly authoritative-looking pages, eg
https://www.musl-libc.org/doc/1.0.0/manual.html
I suspect the situation is similar to macOS, ie there is some limitation
somewhere on whether LD_LIBRARY_PATH gets passed through. If memory
serves, the problem on SIP-enabled Mac is that DYLD_LIBRARY_PATH is
cleared upon invoking bash, so that we lose it anywhere that "make"
invokes a shell to run a subprogram. (Hmm ... I wonder whether ninja
uses the shell ...) I don't personally care at all about Alpine, but
maybe somebody who does could dig a little harder and characterize
the problem there better.
regards, tom lane
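One way to test the hypothesis about the variable being dropped is to set it, hop through a shell, and see whether it is still there. A minimal sketch, assuming `/bin/sh` exists; the function name and the probe value are made up. On a SIP-enabled Mac one would expect `stripped`, and on most Linux systems `kept`, but that expectation is an assumption rather than something verified here:

```shell
# Report whether a DYLD_LIBRARY_PATH value survives a hop through /bin/sh.
# Prints "kept" if the child shell still sees the value, "stripped" otherwise.
probe_dyld_passthrough() {
    probe=/nonexistent/probe
    seen=$(DYLD_LIBRARY_PATH=$probe /bin/sh -c 'printf %s "${DYLD_LIBRARY_PATH:-}"')
    if [ "$seen" = "$probe" ]; then
        echo kept
    else
        echo stripped
    fi
}

probe_dyld_passthrough
```

The same probe with `LD_LIBRARY_PATH` substituted for `DYLD_LIBRARY_PATH` might help narrow down where the variable gets lost on other platforms.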
Hi,
On 2022-08-06 11:25:09 -0400, Tom Lane wrote:
> (Hmm ... I wonder whether ninja uses the shell ...)
It does, but even if it didn't, we'd use a shell somewhere below perl or
pg_regress :(.
The meson build should still work without disabling SIP; I did the necessary
hackery to set up the rpath equivalent relatively, so both the real install
target and the tmp_install/ should find libraries within themselves.
Greetings,
Andres Freund
On 2022-08-06 Sa 11:25, Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>> I came across this when I was working on setting up some Dockerfiles for
>> the buildfarm. Apparently LD_LIBRARY_PATH doesn't work on Alpine, at
>> least out of the box, as it uses a different linker, and "make check"
>> relies on it (or the moral equivalent) if "make install" hasn't been run.
> I did some quick googling on this point. We seem not to be the only
> project having linking issues on Alpine, and yet it does support
> LD_LIBRARY_PATH according to some fairly authoritative-looking pages, eg
> https://www.musl-libc.org/doc/1.0.0/manual.html
> I suspect the situation is similar to macOS, ie there is some limitation
> somewhere on whether LD_LIBRARY_PATH gets passed through. If memory
> serves, the problem on SIP-enabled Mac is that DYLD_LIBRARY_PATH is
> cleared upon invoking bash, so that we lose it anywhere that "make"
> invokes a shell to run a subprogram. (Hmm ... I wonder whether ninja
> uses the shell ...) I don't personally care at all about Alpine, but
> maybe somebody who does could dig a little harder and characterize
> the problem there better.
We probably should care about Alpine, because it's a good distro to use
as the basis for Docker images, being fairly secure, very small, and
booting very fast.
I'll dig some more, and possibly set up a (docker based) buildfarm instance.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 2022-08-06 Sa 12:10, Andrew Dunstan wrote:
> On 2022-08-06 Sa 11:25, Tom Lane wrote:
>> Andrew Dunstan <andrew@dunslane.net> writes:
>>> I came across this when I was working on setting up some Dockerfiles for
>>> the buildfarm. Apparently LD_LIBRARY_PATH doesn't work on Alpine, at
>>> least out of the box, as it uses a different linker, and "make check"
>>> relies on it (or the moral equivalent) if "make install" hasn't been run.
>> I did some quick googling on this point. We seem not to be the only
>> project having linking issues on Alpine, and yet it does support
>> LD_LIBRARY_PATH according to some fairly authoritative-looking pages, eg
>> https://www.musl-libc.org/doc/1.0.0/manual.html
>> I suspect the situation is similar to macOS, ie there is some limitation
>> somewhere on whether LD_LIBRARY_PATH gets passed through. If memory
>> serves, the problem on SIP-enabled Mac is that DYLD_LIBRARY_PATH is
>> cleared upon invoking bash, so that we lose it anywhere that "make"
>> invokes a shell to run a subprogram. (Hmm ... I wonder whether ninja
>> uses the shell ...) I don't personally care at all about Alpine, but
>> maybe somebody who does could dig a little harder and characterize
>> the problem there better.
> We probably should care about Alpine, because it's a good distro to use
> as the basis for Docker images, being fairly secure, very small, and
> booting very fast.
> I'll dig some more, and possibly set up a (docker based) buildfarm instance.
It appears that LD_LIBRARY_PATH is supported on Alpine but it fails if
chained, which seems somewhat braindead. The regression tests get errors
like this:
+ERROR: could not load library
"/app/buildroot/HEAD/pgsql.build/tmp_install/app/buildroot/HEAD/inst/lib/postgresql/libpqwalreceiver.so":
Error loading shared library libpq.so.5: No such file or directory
(needed by
/app/buildroot/HEAD/pgsql.build/tmp_install/app/buildroot/HEAD/inst/lib/postgresql/libpqwalreceiver.so)
If the check stage is delayed until after the install stage the tests pass.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com