pgsql: Improved parallel make support

Started by Peter Eisentraut, over 15 years ago, 30 messages, hackers
#1Peter Eisentraut
peter_e@gmx.net

Improved parallel make support

Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.
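
A minimal sketch of the difference the commit message describes, not the committed code. The directory list is hypothetical, and the all-<dir>-recursive naming is an assumption borrowed from target names such as all-src-recursive that show up in make's error output later in this thread. Recipes use the one-line "target: ; command" form so the fragment does not depend on tab characters.

```makefile
# Hypothetical directory list, for illustration only.
SUBDIRS = port timezone backend

# One phony target per directory.
RECURSIVE_TARGETS = $(SUBDIRS:%=all-%-recursive)

.PHONY: all all-loop $(RECURSIVE_TARGETS)

# Dependency style: make schedules the sub-makes itself, so -j can run
# several directories at once and -k can continue past a failing one.
all: $(RECURSIVE_TARGETS)

# Static pattern rule; $* expands to the directory name.
$(RECURSIVE_TARGETS): all-%-recursive: ; $(MAKE) -C $* all

# Old style, kept for comparison: a single recipe, so make sees one job
# and the loop stops at the first failing directory.
all-loop: ; for dir in $(SUBDIRS); do $(MAKE) -C $$dir all || exit 1; done
```

With the dependency style, "make -k all" can keep going in timezone even if port fails, whereas "make -k all-loop" cannot, because the shell loop hides the individual directories from make.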

Branch
------
master

Details
-------
http://git.postgresql.org/gitweb?p=postgresql.git;a=commitdiff;h=19e231bbdaef792dce22100012b504e2fb72f971

Modified Files
--------------
GNUmakefile.in | 56 +++++----------------
contrib/Makefile | 12 +----
contrib/dblink/Makefile | 1 +
doc/src/sgml/installation.sgml | 5 +-
src/Makefile | 47 +++++------------
src/Makefile.global.in | 47 +++++++++++++++++-
src/Makefile.shlib | 11 ++--
src/backend/Makefile | 4 +-
src/backend/common.mk | 6 +--
src/backend/replication/libpqwalreceiver/Makefile | 3 +-
src/backend/utils/mb/conversion_procs/Makefile | 6 +--
src/bin/Makefile | 3 +-
src/bin/initdb/Makefile | 4 +-
src/bin/pg_config/Makefile | 4 +-
src/bin/pg_controldata/Makefile | 4 +-
src/bin/pg_ctl/Makefile | 4 +-
src/bin/pg_dump/Makefile | 8 ++--
src/bin/pg_resetxlog/Makefile | 4 +-
src/bin/psql/Makefile | 4 +-
src/bin/scripts/Makefile | 20 ++++----
src/interfaces/Makefile | 3 +-
src/interfaces/ecpg/Makefile | 16 ++----
src/interfaces/ecpg/compatlib/Makefile | 9 +++
src/interfaces/ecpg/ecpglib/Makefile | 9 ++--
src/interfaces/ecpg/preproc/Makefile | 7 ++-
src/pl/Makefile | 10 +---
src/test/regress/GNUmakefile | 16 +++---
src/timezone/Makefile | 4 +-
src/tools/findoidjoins/Makefile | 4 +-
src/tools/fsync/Makefile | 4 +-
30 files changed, 156 insertions(+), 179 deletions(-)

#2Andrew Dunstan
andrew@dunslane.net
In reply to: Peter Eisentraut (#1)
Re: pgsql: Improved parallel make support

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Replace for loops in makefiles with proper dependencies. Parallel
make can now span across directories. Also, make -k and make -q work
properly.

GNU make 3.80 or newer is now required.

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

cheers

andrew

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: pgsql: Improved parallel make support

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

There are also at least three non-Windows buildfarm members failing like
so:

gmake -C src all
gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake[1]: *** virtual memory exhausted. Stop.
gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake: *** [all-src-recursive] Error 2

I think we may have pushed too far in terms of what actually works
reliably across different make versions.

regards, tom lane

#4Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/12/2010 11:25 PM, Tom Lane wrote:

Andrew Dunstan <andrew@dunslane.net> writes:

On 11/12/2010 03:16 PM, Peter Eisentraut wrote:

Improved parallel make support

Looks like this patch has pretty comprehensively broken the MSVC build
system. I'll see what I can recover from the wreckage.

There are also at least three non-Windows buildfarm members failing like
so:

gmake -C src all
gmake[1]: Entering directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake[1]: *** virtual memory exhausted. Stop.
gmake[1]: Leaving directory `/home/pgbuild/pgbuildfarm/HEAD/pgsql.6736/src'
gmake: *** [all-src-recursive] Error 2

I think we may have pushed too far in terms of what actually works
reliably across different make versions.

Yeah, possibly. And now it looks like this has broken the Solaris
buildfarm members too.

I'm curious to know how much all this buys us. One reason I haven't
enabled parallel make in the buildfarm is that it interleaves the
output, which can be a pain. And build speed isn't really the
buildfarm's foremost concern anyway. I know waiting for a build can be
mildly annoying (ccache can be a big help if you're building
repeatedly). But I don't feel we need to squeeze every last pip out of
the build system.

cheers

andrew

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

Andrew Dunstan <andrew@dunslane.net> writes:

I'm curious to know how much all this buys us.

It *would* be nice if "make -k" worked better. I frequently run into
the fact that (with the pre-existing setup) a compile error in the
backend prevented make from proceeding with builds of interfaces/,
bin/, etc, meaning that that work still remains to be done after I've
finished fixing the backend error.

But having said that, I won't shed many tears if we have to revert this.

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make 3.80
...

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

BTW, there's another problem here: "make -j2" on my Mac blows up with
this on stderr:

ld: file not found: ../../../../../../src/backend/postgres
collect2: ld returned 1 exit status
make[3]: *** [ascii_and_mic.so] Error 1
make[2]: *** [all-ascii_and_mic-recursive] Error 2
make[1]: *** [all-backend/utils/mb/conversion_procs-recursive] Error 2
make[1]: *** Waiting for unfinished jobs....
In file included from gram.y:12101:
scan.c: In function 'yy_try_NUL_trans':
scan.c:16242: warning: unused variable 'yyg'
make: *** [all-src-recursive] Error 2

Consulting stdout shows that indeed it's launched this series of jobs:

make -C backend/utils/mb/conversion_procs all
make -C ascii_and_mic all
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o ascii_and_mic.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing -fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so ascii_and_mic.o -L../../../../../../src/port -Wl,-dead_strip_dylibs -bundle_loader ../../../../../../src/backend/postgres

immediately after completing the src/timezone build, before the backend
build is even well begun let alone finished. So the parallel build
dependency interlocks are basically not working. This machine has gmake
3.81.

regards, tom lane

#7Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/13/2010 11:12 AM, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make 3.80
...

Maybe we need to put back make version logging. Interestingly, narwhal,
the mingw machine that has reported, didn't complain. It's running 3.81.

cheers

andrew

#8Dave Page
dpage@pgadmin.org
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, Nov 13, 2010 at 4:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error.  I wonder whether they are all using make 3.80

Both my Sparc and Intel Solaris critters have 3.80.

--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#9Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#4)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, 2010-11-13 at 11:06 -0500, Andrew Dunstan wrote:

But I don't feel we need to squeeze every last pip out of
the build system.

Probably not on the buildfarm, but when you are developing, saving 20
seconds or 2 minutes per cycle can lead to hours saved.

#10Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#5)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

#11Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#6)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, 2010-11-13 at 11:23 -0500, Tom Lane wrote:

Consulting stdout shows that indeed it's launched this series of jobs:

make -C backend/utils/mb/conversion_procs all
make -C ascii_and_mic all
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -I../../../../../../src/include -c -o ascii_and_mic.o
ascii_and_mic.c
gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels -fno-strict-aliasing
-fwrapv -g -bundle -multiply_defined suppress -o ascii_and_mic.so
ascii_and_mic.o -L../../../../../../src/port -Wl,-dead_strip_dylibs
-bundle_loader ../../../../../../src/backend/postgres

immediately after completing the src/timezone build, before the
backend build is even well begun let alone finished. So the parallel
build dependency interlocks are basically not working.

On some platforms, you need to have backend/postgres built before any
dynamically loadable modules. For those platforms, additional
dependencies will be necessary, I suppose.
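
A hedged sketch of what such an additional dependency could look like, not the fix that was eventually applied. The target names follow the all-<dir>-recursive pattern seen in the error output above, shortened here to all-conversion_procs-recursive for readability. In a real makefile the extra edge would presumably be made conditional on the platform; that conditional is left out because the thread does not say how it would be expressed.

```makefile
.PHONY: all all-backend-recursive all-conversion_procs-recursive

all: all-backend-recursive all-conversion_procs-recursive

# Each directory is built by its own sub-make.
all-backend-recursive: ; $(MAKE) -C backend all
all-conversion_procs-recursive: ; $(MAKE) -C backend/utils/mb/conversion_procs all

# Extra ordering edge: under -j, do not start linking the loadable
# modules until the backend (and so the postgres executable that
# -bundle_loader points at) has been built.
all-conversion_procs-recursive: all-backend-recursive
```

Without the last line, make is free to run both sub-makes at the same time, which matches the failure Tom reports on the Mac.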

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#10)
Re: [COMMITTERS] pgsql: Improved parallel make support

Peter Eisentraut <peter_e@gmx.net> writes:

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

regards, tom lane

#13Erik Rijkers
er@xs4all.nl
In reply to: Peter Eisentraut (#10)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, November 13, 2010 18:15, Peter Eisentraut wrote:

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

Just to mention another effect of the recent changes:

make 3.81, CentOS 5.5

On a dual quad-core system where I used to build with -j 16, it now only succeeds with -j 8.

(I seem to remember that 16 as opposed to 8 shaved a couple of seconds off, although I'm not quite
sure anymore.)

make -j 16 gives:

cc1: error: thread.c: No such file or directory
make[4]: *** [thread.o] Error 1
make[3]: *** [submake-libpq] Error 2
make[2]: *** [all-pg_ctl-recursive] Error 2
make[1]: *** [all-bin-recursive] Error 2
make[1]: *** Waiting for unfinished jobs....
Use of assignment to $[ is deprecated at ./parse.pl line 21.
In file included from gram.y:12101:
scan.c: In function ‘yy_try_NUL_trans’:
scan.c:16242: warning: unused variable ‘yyg’
Use of assignment to $[ is deprecated at ./check_rules.pl line 18.
make: *** [all-src-recursive] Error 2

(I see a similar effect on a dual-core Fedora system (2.6.27.5-117.fc10.i686), where -j 16 always
ran but now needs -j 4 or less; it also has make 3.81.)

Erik Rijkers

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#12)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function. I think
I have an idea how to work around it, but I'll need some time.

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Peter Eisentraut (#14)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote:

On Sat, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error. I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around. 3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"? Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function. I think
I have an idea how to work around it, but I'll need some time.

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#15)
Re: [COMMITTERS] pgsql: Improved parallel make support

Peter Eisentraut <peter_e@gmx.net> writes:

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

[ emerges from some grubbing about in the gmake sources... ]
It looks to me like the bug in 3.80 is only triggered when "eval"
expands to a long enough string to trigger reallocation of the variable
buffer. (Ergo, the reason they didn't find it sooner was they only
tested on relatively short strings.)

I wonder whether the bug could be worked around if you did the iteration
on SUBDIRS in a foreach surrounding the eval call, so that each eval
dealt with only one subdir target. This would result in a bit of
redundancy in the generated rules, but that seems tolerable.
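
A minimal sketch of the shape being suggested here, with the foreach wrapped around the eval so that each eval parses the rule text for only one directory. The directory list and the recurse_one template name are hypothetical. Whether this actually avoids the 3.80 bug is not established at this point in the thread; the sketch only shows the restructuring.

```makefile
# Hypothetical directory list, for illustration only.
SUBDIRS = port timezone backend

.PHONY: all
all:

# Template for ONE directory; $(1) is the directory name. $$(MAKE) is
# doubled so that it survives the expansion done by call.
define recurse_one
.PHONY: all-$(1)-recursive
all: all-$(1)-recursive
all-$(1)-recursive: ; $$(MAKE) -C $(1) all
endef

# foreach on the outside, eval on the inside: each eval sees the short
# rule text for a single directory rather than one long generated string.
$(foreach dir,$(SUBDIRS),$(eval $(call recurse_one,$(dir))))
```

The redundancy mentioned above shows up as one ".PHONY:" line and one "all:" prerequisite line per directory instead of a single combined line for all of them.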

regards, tom lane

#17Dave Page
dpage@pgadmin.org
In reply to: Peter Eisentraut (#15)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

On Sat, 2010-11-13 at 20:07 +0200, Peter Eisentraut wrote:

On Sat, 2010-11-13 at 12:20 -0500, Tom Lane wrote:

Peter Eisentraut <peter_e@gmx.net> writes:

On Sat, 2010-11-13 at 11:12 -0500, Tom Lane wrote:

It looks like all the unhappy critters are getting the same "virtual
memory exhausted" error.  I wonder whether they are all using make
3.80 ...

It turns out that there is an unrelated bug in 3.80 that some Linux
distributions have patched around.  3.81 or 3.82 are OK.

So what do you mean by "unrelated bug"?  Can we work around it?

The information is fuzzy, but the problem has been reported around the
internet, and it appears to be related to the foreach function.  I think
I have an idea how to work around it, but I'll need some time.

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

3.81 might be a problem for Solaris - unless I pay for a support
contract from Oracle, I'm not going to get any updates from them,
which means I'll have to install a custom build. Now that's no biggie
for me, but it does seem to raise the bar somewhat for users that might
want to build from source.

--
Dave Page

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Dave Page (#17)
Re: [COMMITTERS] pgsql: Improved parallel make support

Dave Page <dpage@pgadmin.org> writes:

On Sat, Nov 13, 2010 at 8:13 PM, Peter Eisentraut <peter_e@gmx.net> wrote:

Well, it looks like $(eval) is pretty broken in 3.80, so either we
require 3.81 or we abandon this line of thought.

3.81 might be a problem for Solaris - unless I pay for a support
contract from Oracle, I'm not going to get any updates from them,
which means I'll have to install a custom build. Now that's no biggie
for me, but it does seem to raise the bar somewhat for users that might
want to build from source.

For another data point, I find make 3.80 in OS X 10.4, while 10.5 and
10.6 have 3.81. 10.4 is certainly behind the curve, but Apple still
seem to be releasing security updates for it.

I was about to draw an analogy to flex -- we are now requiring a version
of flex that's roughly contemporaneous with make 3.81. However, we
don't require flex to build from a tarball, so on second thought that
situation isn't very comparable. Moving the goalposts for make would
definitely affect more people.

On the third hand, gmake is very very easy to install: if you're
capable of building Postgres from source, it's hard to believe that
gmake should scare you off. (I've installed multiple versions on my
ancient HPUX dinosaur, and it's never been any harder than ./configure,
make, make check, make install.)

And on the fourth hand, what we're buying here is pretty marginal for
developers and of no interest whatever for users.

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

regards, tom lane

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#18)
Re: [COMMITTERS] pgsql: Improved parallel make support

On Nov 14, 2010, at 10:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

+1. The current master branch fails to build on my (rather new) Mac with make -j2. I could upgrade my toolchain but it seems like more trouble than it's worth, not to mention a possible obstacle to new users and developers.

...Robert

#20Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#18)
Re: [COMMITTERS] pgsql: Improved parallel make support

On 11/14/2010 10:44 AM, Tom Lane wrote:

And on the fourth hand, what we're buying here is pretty marginal for
developers and of no interest whatever for users.

I still think it's worth looking into whether the bug can be dodged
by shortening the eval calls. But if not, I think I'd vote for
reverting. Maybe we could revisit this in a couple of years.

+1

cheers

andrew

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#19)
#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#18)
#23Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#22)
#24Bernd Helmle
mailings@oopsware.de
In reply to: Robert Haas (#19)
#25Peter Eisentraut
peter_e@gmx.net
In reply to: Bernd Helmle (#24)
#26Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#25)
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#27)
#29Peter Eisentraut
peter_e@gmx.net
In reply to: Tom Lane (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#29)