pgsql: Improved parallel make support

Started by Peter Eisentraut about 15 years ago, 30 messages
#1Peter Eisentraut
peter_e@gmx.net

Improved parallel make support

Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.
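
For readers who have not seen the two styles side by side, here is a minimal sketch of the difference the commit message describes. It is not taken from the commit; the directory names and the all-loop target are made up for illustration, and only the all-%-recurse naming mirrors the target names visible in the build logs later in this thread.

```makefile
# Hypothetical subdirectory list, for illustration only.
SUBDIRS = port timezone backend

.PHONY: all all-loop $(SUBDIRS:%=all-%-recurse)

# Dependency style: one target per directory. make itself schedules the
# directories, so -j can run them side by side and -k can carry on with
# the remaining directories after one of them fails.
all: $(SUBDIRS:%=all-%-recurse)

# Static pattern rule; $* is the directory name matched by %.
$(SUBDIRS:%=all-%-recurse): all-%-recurse:
	$(MAKE) -C $* all

# Loop style, kept here only for comparison: a single recipe visits the
# directories one after another. -j cannot overlap directories, and the
# shell loop stops at the first failure even when -k was given.
all-loop:
	for dir in $(SUBDIRS); do $(MAKE) -C $$dir all || exit 1; done
```

With the dependency style, any ordering that the loop used to provide implicitly has to be written down as explicit prerequisites between the per-directory targets.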

Branch
------
master

Details
-------
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=19e231bbdaef792dce22100012b504e2fb72f971

Modified Files
--------------
GNUmakefile.in | 56 +++++----------------
contrib/Makefile | 12 +----
contrib/dblink/Makefile | 1 +
doc/src/sgml/installation.sgml | 5 +-
src/Makefile | 47 +++++------------
src/Makefile.global.in | 47 +++++++++++++++++-
src/Makefile.shlib | 11 ++--
src/backend/Makefile | 4 +-
src/backend/common.mk | 6 +--
src/backend/replication/libpqwalreceiver/Makefile | 3 +-
src/backend/utils/mb/conversion_procs/Makefile | 6 +--
src/bin/Makefile | 3 +-
src/bin/initdb/Makefile | 4 +-
src/bin/pg_config/Makefile | 4 +-
src/bin/pg_controldata/Makefile | 4 +-
src/bin/pg_ctl/Makefile | 4 +-
src/bin/pg_dump/Makefile | 8 ++--
src/bin/pg_resetxlog/Makefile | 4 +-
src/bin/psql/Makefile | 4 +-
src/bin/scripts/Makefile | 20 ++++----
src/interfaces/Makefile | 3 +-
src/interfaces/ecpg/Makefile | 16 ++----
src/interfaces/ecpg/compatlib/Makefile | 9 +++
src/interfaces/ecpg/ecpglib/Makefile | 9 ++--
src/interfaces/ecpg/preproc/Makefile | 7 ++-
src/pl/Makefile | 10 +---
src/test/regress/GNUmakefile | 16 +++---
src/timezone/Makefile | 4 +-
src/tools/findoidjoins/Makefile | 4 +-
src/tools/fsync/Makefile | 4 +-
30 files changed, 156 insertions(+), 179 deletions(-)

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#1)
Re: pgsql: Improved parallel make support

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

cheers

andrew

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: pgsql: Improved parallel make support

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

There are also at least three non-Windows buildfarm members failing like
so:

gmake -C src all
gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake[1]: *** virtual memory exhausted. Stop.
gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake: *** [all-src-recursive] Error 2

I think we may have pushed too far in terms of what actually works
reliably across different make versions.

regards, tom lane

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/12/2010 11:25 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

There are also at least three non-Windows buildfarm members failing like
so:

gmake -C src all
gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake[1]: *** virtual memory exhausted. Stop.
gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake: *** [all-src-recursive] Error 2

I think we may have pushed too far in terms of what actually works
reliably across different make versions.

Yeah, possibly. And now it looks like this has broken the Solaris
buildfarm members too.

I'm curious to know how much all this buys us. One reason I haven't
enabled parallel make in the buildfarm is that it interleaves the
output, which can be a pain. And build speed isn't really the
buildfarm's foremost concern anyway. I know waiting for a build can be
mildly annoying (ccache can be a big help if you're building
repeatedly). But I don't feel we need to squeeze every last pip out of
the build system.

cheers

andrew

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

Andrew Dunstan <andrew@dunslane.net> writes:

I'm curious to know how much all this buys us.

It *would* be nice if "make -k" worked better. I frequently run into
the fact that (with the pre-existing setup) a compile error in the
backend prevented make from proceeding with builds of interfaces/,
bin/, etc, meaning that that work still remains to be done after I've
finished fixing the backend error.

But having said that, I won't shed many tears if we have to revert this.

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make 3.80
...

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

BTW, there's another problem here: "make -j2" on my Mac blows up with
this on stderr:

ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recursive] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recursive] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from gram.y:12101:
scan.c: In function 'yy_try_NUL_trans':
scan.c:16242: warning: unused variable 'yyg'
make: *** [all-src-recursive] Error 2

Consulting stdout shows that indeed it's launched this series of jobs:

make -C backend/utils/mb/conversion_procs all
make -C ascii_and_mic all
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o ascii_and_mic.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so ascii_and_mic.o -L../../../../../../src/port -Wl,-dead_strip_dylibs -bundle_loader ../../../../../../src/backend/postgres

immediately after completing the src/timezone build, before the backend
build is even well begun let alone finished. So the parallel build
dependency interlocks are basically not working. This machine has gmake
3.81.

regards, tom lane

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/13/2010 11:12 AM, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make 3.80
...

Maybe we need to put back make version logging. Interestingly, narwhal,
the mingw machine that has reported, didn't complain. It's running 3.81.

cheers

andrew

#8Dave Page
dpage@pgadmin.org
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, Nov 13, 2010 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error.  I wonder whether they are all using make 3.80

Both my Sparc and Intel Solaris critters have 3.80.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

On lör, 2010-11-13 at 11:06 -0500, Andrew Dunstan wrote:

But I don't feel we need to squeeze every last pip out of
the build system.

Probably not on the buildfarm, but when you are developing, saving 20
seconds or 2 minutes per cycle can lead to hours saved.

#10Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: [COMMITTERS] pgsql: Improved parallel make support

On lör, 2010-11-13 at 11:23 -0500, Tom Lane wrote:

Consulting stdout shows that indeed it's launched this series of jobs:

make -C backend/utils/mb/conversion_procs all
make -C ascii_and_mic all
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o
ascii_and_mic.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so
ascii_and_mic.o -L../../../../../../src/port -Wl,-dead_strip_dylibs
-bundle_loader ../../../../../../src/backend/postgres

immediately after completing the src/timezone build, before the
backend build is even well begun let alone finished. So the parallel
build dependency interlocks are basically not working.

On some platforms, you need to have backend/postgres built before any
dynamically loadable modules. For those platforms, additional
dependencies will be necessary, I suppose.

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#10)
Re: [COMMITTERS] pgsql: Improved parallel make support

Peter Eisentraut <peter_e@gmx.net> writes:

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

regards, tom lane

#13Erik Rijkers
er@xs4all.nl
In reply to: Peter Eisentraut (#10)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, November 13, 2010 18:15, Peter Eisentraut wrote:

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

Just to mention another effect of the recent changes:

make 3.81, Centos 5.5

On a dual quadcore system where I used to build with -j 16, it now only succeeds with -j 8.

(I seem to remember that 16 as opposed to 8 shaved a couple of seconds off, although I'm not quite
sure anymore)

make -j 16 gives:

cc1: error: thread.c: No such file or directory
make[4]: *** [thread.o] Error 1
make[3]: *** [submake-libpq] Error 2
make[2]: *** [all-pg_ctl-recursive] Error 2
make[1]: *** [all-bin-recursive] Error 2
make[1]: *** Waiting for unfinished jobs....
Use of assignment to $[ is deprecated at ./parse.pl line 21.
In file included from gram.y:12101:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16242: warning: unused variable ‘yyg’
Use of assignment to $[ is deprecated at ./check_rules.pl line 18.
make: *** [all-src-recursive] Error 2

( A similar effect I see on a dual core fedora system (2.6.27.5-117.fc10.i686), where -j 16 always
ran, but now it needs -j 4 or less (it also has make 3.81) )

Erik Rijkers

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#12)
Re: [COMMITTERS] pgsql: Improved parallel make support

On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function. I think
I have an idea how to work around it, but I'll need some time.

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#14)
Re: [COMMITTERS] pgsql: Improved parallel make support

On lör, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote:

On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function. I think
I have an idea how to work around it, but I'll need some time.

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.
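
If the "require 3.81" route were taken, a guard along the following lines could make the failure explicit instead of ending in "virtual memory exhausted". This is only a sketch, not something proposed in the thread; it assumes that rejecting every 3.80 build is acceptable, which would also turn away the distribution-patched 3.80 packages mentioned earlier.

```makefile
# Sketch only: stop early under GNU make 3.80.
# MAKE_VERSION is set by GNU make itself; the 3.80% pattern is an
# assumption meant to catch vendor-suffixed 3.80 version strings too.
ifneq ($(filter 3.80%,$(MAKE_VERSION)),)
$(error GNU make 3.81 or newer is required, found $(MAKE_VERSION))
endif
```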

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#15)
Re: [COMMITTERS] pgsql: Improved parallel make support

Peter Eisentraut <peter_e@gmx.net> writes:

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

[ emerges from some grubbing about in the gmake sources... ]
It looks to me like the bug in 3.80 is only triggered when "eval"
expands to a long enough string to trigger reallocation of the variable
buffer. (Ergo, the reason they didn't find it sooner was they only
tested on relatively short strings.)

I wonder whether the bug could be worked around if you did the iteration
on SUBDIRS in a foreach surrounding the eval call, so that each eval
dealt with only one subdir target. This would result in a bit of
redundancy in the generated rules, but that seems tolerable.

regards, tom lane
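
A minimal sketch of the shape suggested above, assuming a simplified rule template. The recurse_one name and the directory list are hypothetical, and this is not the patch that was later committed; it only shows a foreach wrapped around eval so that each eval call expands the rules for a single directory.

```makefile
# Hypothetical subdirectory list, for illustration only.
SUBDIRS = port timezone backend

.PHONY: all

# Rules for ONE subdirectory; $(1) is the directory name.
# $$(MAKE) is doubled so that it survives the expansion done by call
# and is only expanded when the recipe actually runs.
define recurse_one
.PHONY: all-$(1)-recurse
all: all-$(1)-recurse
all-$(1)-recurse:
	$$(MAKE) -C $(1) all
endef

# One short eval per directory, rather than one eval whose argument
# covers the whole SUBDIRS list.
$(foreach dir,$(SUBDIRS),$(eval $(call recurse_one,$(dir))))
```

The price is the redundancy mentioned above: lines such as the .PHONY declaration are emitted once per directory instead of once overall.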

#17Dave Page
dpage@pgadmin.org
In reply to: Peter Eisentraut (#15)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

On lör, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote:

On lör, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On lör, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error.  I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around.  3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"?  Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function.  I think
I have an idea how to work around it, but I'll need some time.

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

3.81 might be a problem for Solaris - unless I pay for a support
contract from Oracle, I'm not going to get any updates from them,
which means I'll have to install a custom build. Now that's no biggie
for me, but it does seem to raise the bar somewhat for users that might
want to build from source.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Page (#17)
Re: [COMMITTERS] pgsql: Improved parallel make support

Dave Page <dpage@pgadmin.org> writes:

On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

3.81 might be a problem for Solaris - unless I pay for a support
contract from Oracle, I'm not going to get any updates from them,
which means I'll have to install a custom build. Now that's no biggie
for me, but it does seem to raise the bar somewhat for users that might
want to build from source.

For another data point, I find make 3.80 in OS X 10.4, while 10.5 and
10.6 have 3.81. 10.4 is certainly behind the curve, but Apple still
seem to be releasing security updates for it.

I was about to draw an analogy to flex -- we are now requiring a version
of flex that's roughly contemporaneous with make 3.81. However, we
don't require flex to build from a tarball, so on second thought that
situation isn't very comparable. Moving the goalposts for make would
definitely affect more people.

On the third hand, gmake is very very easy to install: if you're
capable of building Postgres from source, it's hard to believe that
gmake should scare you off. (I've installed multiple versions on my
ancient HPUX dinosaur, and it's never been any harder than ./configure,
make, make check, make install.)

And on the fourth hand, what we're buying here is pretty marginal for
developers and of no interest whatever for users.

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Nov 14, 2010, at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

+1. The current master branch fails to build on my (rather new) Mac with make -j2. I could upgrade my toolchain but it seems like more trouble than it's worth, not to mention a possible obstacle to new users and developers.

...Robert

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#18)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/14/2010 10:44 AM, Tom Lane wrote:

And on the fourth hand, what we're buying here is pretty marginal for
developers and of no interest whatever for users.

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

+1

cheers

andrew

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
Re: [COMMITTERS] pgsql: Improved parallel make support

Robert Haas <robertmhaas@gmail.com> writes:

On Nov 14, 2010, at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

+1. The current master branch fails to build on my (rather new) Mac
with make -j2.

I complained of the same thing, but AFAICS that's not a make bug;
it's a missing build dependency, which could be fixed if we choose to
keep this infrastructure. It probably ought to be fixed even if we
don't.

regards, tom lane

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#18)
Re: [COMMITTERS] pgsql: Improved parallel make support

I wrote:

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls.

In fact, that does seem to work; I'll commit a patch after testing a
bit more.

We still need someone to add the missing build dependencies so that
make -j is trustworthy again.

regards, tom lane

#23Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#22)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sun, Nov 14, 2010 at 12:13 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I wrote:

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls.

In fact, that does seem to work; I'll commit a patch after testing a
bit more.

We still need someone to add the missing build dependencies so that
make -j is trustworthy again.

Yes, please. This is currently failing for me:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -Werror -bundle -multiply_defined suppress -o
ascii_and_mic.so ascii_and_mic.o -L../../../../../../src/port
-L/opt/local/lib -Wl,-dead_strip_dylibs -Werror -bundle_loader
../../../../../../src/backend/postgres
ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recurse] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#24Bernd Helmle
mailings@oopsware.de
In reply to: Robert Haas (#19)
Re: [COMMITTERS] pgsql: Improved parallel make support

--On 14. November 2010 11:08:13 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

+1. The current master branch fails to build on my (rather new) Mac with
make -j2. I could upgrade my toolchain but it seems like more trouble
than it's worth, not to mention a possible obstacle to new users and
developers.

The same here, too. And it doesn't matter if you use the shipped make
(3.81) or the one from macports (currently 3.82), both are failing with:

ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recurse] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....

--
Thanks

Bernd

#25Peter Eisentraut
peter_e@gmx.net
In reply to: Bernd Helmle (#24)
1 attachment(s)
Re: [COMMITTERS] pgsql: Improved parallel make support

On mån, 2010-11-15 at 11:13 +0100, Bernd Helmle wrote:

--On 14. November 2010 11:08:13 -0500 Robert Haas <robertmhaas@gmail.com>
wrote:

+1. The current master branch fails to build on my (rather new) Mac with
make -j2. I could upgrade my toolchain but it seems like more trouble
than it's worth, not to mention a possible obstacle to new users and
developers.

The same here, too. And it doesn't matter if you use the shipped make
(3.81) or the one from macports (currently 3.82), both are failing with:

ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recurse] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....

Untested, but the following should help you, by partially restoring the
old build order on platforms that need it.

Attachments:

darwin-parallel-make-fix.patch (text/x-patch; charset=UTF-8)
diff --git i/src/Makefile w/src/Makefile
index 0d4a6ee..2a5330a 100644
--- i/src/Makefile
+++ w/src/Makefile
@@ -28,6 +28,13 @@ SUBDIRS = \
 
 $(recurse)
 
+# On platforms that require the backend to be built before dynamically
+# loadable modules, partially constrain the build order for more
+# efficient and robust builds.
+ifdef BE_DLLLIBS
+all-backend/utils/mb/conversion_procs-recurse all-backend/replication/libpqwalreceiver-recurse all-pl-recurse: all-backend-recurse
+endif
+
 install: install-local
 
 install-local: installdirs-local
#26Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#25)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Mon, Nov 15, 2010 at 4:10 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recurse] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....

Untested, but the following should help you, by partially restoring the
old build order on platforms that need it.

Very odd, but this completely blew up the first time I tried it.

In file included from path.c:34:
pg_config_paths.h:2:11: error: missing terminating " character
In file included from path.c:34:
pg_config_paths.h:2: error: missing terminating " character
path.c:49: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’
before ‘static’

That file had a line in it that looked like this:

postgresql"

On a subsequent retry, I got:

gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -Werror -bundle -multiply_defined suppress -o
dict_snowball.so dict_snowball.o api.o utilities.o
stem_ISO_8859_1_danish.o stem_ISO_8859_1_dutch.o
stem_ISO_8859_1_english.o stem_ISO_8859_1_finnish.o
stem_ISO_8859_1_french.o stem_ISO_8859_1_german.o
stem_ISO_8859_1_hungarian.o stem_ISO_8859_1_italian.o
stem_ISO_8859_1_norwegian.o stem_ISO_8859_1_porter.o
stem_ISO_8859_1_portuguese.o stem_ISO_8859_1_spanish.o
stem_ISO_8859_1_swedish.o stem_ISO_8859_2_romanian.o
stem_KOI8_R_russian.o stem_UTF_8_danish.o stem_UTF_8_dutch.o
stem_UTF_8_english.o stem_UTF_8_finnish.o stem_UTF_8_french.o
stem_UTF_8_german.o stem_UTF_8_hungarian.o stem_UTF_8_italian.o
stem_UTF_8_norwegian.o stem_UTF_8_porter.o stem_UTF_8_portuguese.o
stem_UTF_8_romanian.o stem_UTF_8_russian.o stem_UTF_8_spanish.o
stem_UTF_8_swedish.o stem_UTF_8_turkish.o -L../../../src/port
-L/opt/local/lib -Wl,-dead_strip_dylibs -Werror -bundle_loader
../../../src/backend/postgres
ld: file not found: ../../../src/backend/postgres
collect2: ld returned 1 exit status
make[2]: *** [dict_snowball.so] Error 1
make[1]: *** [all-backend/snowball-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#26)
Re: [COMMITTERS] pgsql: Improved parallel make support

Robert Haas <robertmhaas@gmail.com> writes:

Very odd, but this completely blew up the first time I tried it.

In file included from path.c:34:
pg_config_paths.h:2:11: error: missing terminating " character

FWIW, I didn't replicate that, but I did get this during one attempt
with -j4:

/usr/bin/ranlib: archive member: libecpg.a(typename.o) size too large (archive member extends past the end of the file)
ar: internal ranlib command failed
make[5]: *** [libecpg.a] Error 1
make[5]: *** Deleting file `libecpg.a'
make[4]: *** [submake-ecpglib] Error 2
make[3]: *** [all-compatlib-recurse] Error 2
make[3]: *** Waiting for unfinished jobs....
/usr/bin/ranlib: can't stat file output file: libecpg.a (No such file or directory)
ar: internal ranlib command failed
make[4]: *** [libecpg.a] Error 1
make[3]: *** [all-ecpglib-recurse] Error 2
make[2]: *** [all-ecpg-recurse] Error 2
make[1]: *** [all-interfaces-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from gram.y:12101:
scan.c: In function 'yy_try_NUL_trans':
scan.c:16242: warning: unused variable 'yyg'
make: *** [all-src-recurse] Error 2

Examination of the stdout trace makes it appear that two independent
make runs were trying to build src/interfaces/ecpg/ecpglib/libecpg.a
concurrently. I haven't dug into it but I suspect that there are
multiple dependency chains leading to ecpg/ecpglib/. I wonder whether
what you saw was also the result of multiple recursion paths leading
to trying to build the same target at once. If so, that's going to
put a rather serious crimp in the idea of constraining build order
by adding more dependencies.

On a subsequent retry, I got:
ld: file not found: ../../../src/backend/postgres
collect2: ld returned 1 exit status
make[2]: *** [dict_snowball.so] Error 1

Yeah, I got that too, but adding all-backend/snowball-recurse to the
set of dependencies proposed in Peter's patch made it go away.
A cursory search for other appearances of -bundle_loader in the
make output suggests that contrib/ and src/test/regress/ are also
at risk. This leads me to the thought that concentrating knowledge
of this issue in src/Makefile is not the right way to go at it.
And, again, the more paths leading to a make attempt in the same
directory, the worse off we are as far as the first problem goes.
But surely the "make" guys recognized this risk and have a solution?
Otherwise parallel make would be pretty useless.

regards, tom lane
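
On the question of concurrent builds of the same target: a single make process builds any given target at most once, but separate recursive make processes do not coordinate with one another. One way to sidestep the race, sketched below under simplifying assumptions, is to leave only one path into a shared directory and express the ordering in the parent makefile. The three directory names are used purely as an example, and the sketch assumes the client makefiles do not start their own sub-make in the library directory.

```makefile
# Hypothetical layout: two client directories that both link against
# the library built in libpq.
SUBDIRS = libpq pg_dump psql

.PHONY: all $(SUBDIRS:%=all-%-recurse)
all: $(SUBDIRS:%=all-%-recurse)

# The only place where a sub-make is started for each directory.
$(SUBDIRS:%=all-%-recurse): all-%-recurse:
	$(MAKE) -C $* all

# Ordering lives in the parent: this make process knows that
# all-libpq-recurse is a single target, so it runs it once and
# finishes it before either client directory starts.
all-pg_dump-recurse all-psql-recurse: all-libpq-recurse
```

The trade-off is that running make directly inside a client directory no longer builds the library first, unless that directory keeps its own rule for doing so, which reintroduces the second path.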

#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#27)
Re: [COMMITTERS] pgsql: Improved parallel make support

I tried another experiment, which was "make -j100 all" on my relatively
new Linux box (2 dual-core CPUs). It blew up real good, as per attached
stderr output, which shows evidence of more missing dependencies as well
as some additional cases of concurrent attempts to build the same
target.

It's clear to me that we are very far from having a handle on what it'll
really take to run parallel builds safely, and I am therefore now of the
opinion that we ought to revert the patch. Hypothetical gains in
parallelism are useless if we can't actually use parallel building
reliably. We are currently worse off than before in terms of time to
build the system.

regards, tom lane

/usr/bin/ld: cannot find -lpgport
collect2: ld returned 1 exit status
make[3]: *** [refint.so] Error 1
make[2]: *** [../../../contrib/spi/refint.so] Error 2
make[2]: *** Waiting for unfinished jobs....
path.c: In function 'get_html_path':
path.c:615: error: 'HTMLDIR' undeclared (first use in this function)
path.c:615: error: (Each undeclared identifier is reported only once
path.c:615: error: for each function it appears in.)
path.c: In function 'get_man_path':
path.c:624: error: 'MANDIR' undeclared (first use in this function)
make[3]: *** [path.o] Error 1
make[3]: *** Deleting file `path.o'
make[3]: *** Waiting for unfinished jobs....
/usr/bin/ld: cannot find -lpgport
collect2: ld returned 1 exit status
make[3]: *** [autoinc.so] Error 1
make[2]: *** [../../../contrib/spi/autoinc.so] Error 2
make[2]: *** [submake-libpgport] Error 2
make[2]: *** Waiting for unfinished jobs....
ln: creating symbolic link `libpgtypes.so.3': File exists
make[4]: *** [libpgtypes.so.3.2] Error 1
make[4]: *** Deleting file `libpgtypes.so.3.2'
make[3]: *** [all-pgtypeslib-recurse] Error 2
make[3]: *** Waiting for unfinished jobs....
make[1]: *** [all-test/regress-recurse] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from gram.y:12102:
scan.c: In function 'yy_try_NUL_trans':
scan.c:16246: warning: unused variable 'yyg'
ln: creating symbolic link `libpq.so.5': File exists
make[4]: *** [libpq.so.5.4] Error 1
make[4]: *** Deleting file `libpq.so.5.4'
make[3]: *** [submake-libpq] Error 2
make[2]: *** [all-pg_dump-recurse] Error 2
make[2]: *** Waiting for unfinished jobs....
ln: creating symbolic link `libpq.so.5': File exists
make[6]: *** [libpq.so.5.4] Error 1
make[6]: *** Deleting file `libpq.so.5.4'
make[5]: *** [submake-libpq] Error 2
make[4]: *** [submake-ecpglib] Error 2
make[3]: *** [all-compatlib-recurse] Error 2
/usr/bin/ld: cannot open linker script file ../../../src/interfaces/libpq/libpq.so: No such file or directory
collect2: ld returned 1 exit status
make[3]: *** [psql] Error 1
make[2]: *** [all-psql-recurse] Error 2
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [createdb] Error 1
make[3]: *** Waiting for unfinished jobs....
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [createuser] Error 1
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [dropuser] Error 1
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [vacuumdb] Error 1
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [dropdb] Error 1
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [clusterdb] Error 1
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [reindexdb] Error 1
make[2]: *** [all-scripts-recurse] Error 2
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_reset_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1426: undefined reference to `pthread_sigmask'
../../../src/interfaces/libpq/libpq.a(fe-secure.o): In function `pq_block_sigpipe':
/home/tgl/pgsql/src/interfaces/libpq/fe-secure.c:1363: undefined reference to `pthread_sigmask'
collect2: ld returned 1 exit status
make[3]: *** [pg_ctl] Error 1
make[2]: *** [all-pg_ctl-recurse] Error 2
make[1]: *** [all-bin-recurse] Error 2
make[2]: *** [all-ecpg-recurse] Error 2
make[1]: *** [all-interfaces-recurse] Error 2
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2

#29Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#28)
1 attachment(s)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Mon, 2010-11-15 at 23:34 -0500, Tom Lane wrote:

It's clear to me that we are very far from having a handle on what
it'll really take to run parallel builds safely, and I am therefore
now of the opinion that we ought to revert the patch. Hypothetical
gains in parallelism are useless if we can't actually use parallel
building reliably. We are currently worse off than before in terms of
time to build the system.

We don't have to revert it, we just have to insert .NOTPARALLEL targets
into some places that are not properly "parallelized", thus effectively
restoring the behavior of the old for loop. I have attached a patch
that gets make -j 100+ working for me. Other platforms might need more
things, perhaps.
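
To illustrate the two approaches used in the attached patch, here is a minimal standalone sketch. It is not the actual PostgreSQL recursion macro: the RECURSE_TARGETS variable and the static pattern rule are hypothetical stand-ins for what $(recurse) generates, and only the all-<dir>-recurse target naming is taken from the patch.

```makefile
# Hypothetical standalone sketch, not the real src/Makefile.global logic.
SUBDIRS = libpq ecpg

# One phony target per subdirectory, e.g. all-libpq-recurse.
RECURSE_TARGETS = $(SUBDIRS:%=all-%-recurse)

.PHONY: all $(RECURSE_TARGETS)
all: $(RECURSE_TARGETS)

# Static pattern rule: the stem ($*) is the subdirectory to descend into.
$(RECURSE_TARGETS): all-%-recurse:
	$(MAKE) -C $* all

# Approach 1: state the real ordering constraint, so that make -j can
# still build unrelated directories concurrently.
all-ecpg-recurse: all-libpq-recurse

# Approach 2 (coarser): uncomment to have this makefile run its targets
# one at a time, much like the old for loop. Sub-makes started from here
# may still run their own targets in parallel unless their makefiles
# also declare .NOTPARALLEL.
#.NOTPARALLEL:
```

The explicit prerequisite is the more precise tool where the dependency is known; .NOTPARALLEL is a fallback for directories whose ordering requirements have not been worked out yet.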

Btw., my original notes for this development were labeled "make make -k
work properly". So I would really like to keep that. It just turned
out that parallel make could benefit from the same changes, and it's a
better marketing name. ;-)

Attachments:

parallel-make-fix.patch (text/x-patch; charset=UTF-8)
diff --git i/src/Makefile w/src/Makefile
index 0d4a6ee..a8d4e4b 100644
--- i/src/Makefile
+++ w/src/Makefile
@@ -26,6 +26,8 @@ SUBDIRS = \
 	makefiles \
 	test/regress
 
+.NOTPARALLEL:
+
 $(recurse)
 
 install: install-local
diff --git i/src/interfaces/Makefile w/src/interfaces/Makefile
index 2c034bc..9fe368e 100644
--- i/src/interfaces/Makefile
+++ w/src/interfaces/Makefile
@@ -15,3 +15,5 @@ include $(top_builddir)/src/Makefile.global
 SUBDIRS = libpq ecpg
 
 $(recurse)
+
+all-ecpg-recurse: all-libpq-recurse
diff --git i/src/interfaces/ecpg/Makefile w/src/interfaces/ecpg/Makefile
index d955cee..ca434c8 100644
--- i/src/interfaces/ecpg/Makefile
+++ w/src/interfaces/ecpg/Makefile
@@ -6,7 +6,8 @@ SUBDIRS = include pgtypeslib ecpglib compatlib preproc
 
 $(recurse)
 
-all-compatlib-recursive: all-ecpglib-recursive
+all-compatlib-recurse: all-ecpglib-recurse
+all-ecpglib-recurse: all-pgtypeslib-recurse
 
 clean distclean maintainer-clean:
 	$(MAKE) -C test clean
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#29)
Re: [COMMITTERS] pgsql: Improved parallel make support

Peter Eisentraut <peter_e@gmx.net> writes:

On Mon, 2010-11-15 at 23:34 -0500, Tom Lane wrote:

It's clear to me that we are very far from having a handle on what
it'll really take to run parallel builds safely, and I am therefore
now of the opinion that we ought to revert the patch.

We don't have to revert it, we just have to insert .NOTPARALLEL targets
into some places that are not properly "parallelized", thus effectively
restoring the behavior of the old for loop. I have attached a patch
that gets make -j 100+ working for me. Other platforms might need more
things, perhaps.

If we don't have to revert it entirely, that's of course better. Please
apply what you've got.

regards, tom lane