Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

Started by Tom Lane, over 14 years ago (8 messages)
#1 Tom Lane
tgl@sss.pgh.pa.us

I spent a bit of time looking into $SUBJECT. The cause of the failure
is that configure mistakenly decides that setproctitle and some other
functions are available, when they aren't; this eventually leads to link
failures of course.
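
To illustrate why a false positive in configure ends in a link failure, here is a minimal sketch in C. It assumes the conventional `HAVE_SETPROCTITLE` macro that an autoconf function check would define. The names `set_title`, `current_title`, and `title_buf` are hypothetical and are not taken from the PostgreSQL sources.

```c
#include <stdio.h>
#include <string.h>

#ifdef HAVE_SETPROCTITLE
#include <sys/types.h>
#include <unistd.h>   /* declares setproctitle() on FreeBSD, for example */
#endif

/* Hypothetical fallback storage, used when setproctitle() is unavailable. */
static char title_buf[64];

/*
 * Hypothetical helper: returns 1 if the native call was used,
 * 0 if the fallback buffer was used instead.
 */
int set_title(const char *title)
{
#ifdef HAVE_SETPROCTITLE
    /*
     * If configure defined the macro by mistake, this call compiles
     * but later fails at link time with an undefined reference.
     */
    setproctitle("%s", title);
    return 1;
#else
    snprintf(title_buf, sizeof(title_buf), "%s", title);
    return 0;
#endif
}

/* Hypothetical accessor for the fallback buffer. */
const char *current_title(void)
{
    return title_buf;
}
```

Built without `-DHAVE_SETPROCTITLE`, only the fallback branch is compiled. Defining the macro on a platform that lacks the function would reproduce the kind of link error described above.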

Now 8.2 and 8.3 use autoconf 2.59. 8.4 and up, which do not exhibit
this failure, use autoconf 2.61 or later. Sure enough, there is a
difference in the test program generated by the more recent autoconfs:
they actually try to call the function, where the previous ones do
something weird involving a function pointer comparison. I dug in the
autoconf change log and found this:

2005-10-19 Paul Eggert <eggert@cs.ucla.edu>

(AC_LANG_FUNC_LINK_TRY(C)): Call the function rather than simply
comparing its address. Intel's interprocedural optimization was
outsmarting the old heuristic. Problem reported by
Mikulas Patocka.

Since anchovy is using the "gold" linker at -O3, it's not exactly
surprising that it might be carrying out aggressive interprocedural
optimizations that we're not seeing used on other platforms.
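
The difference between the two probe styles can be sketched roughly as follows. This is a simplified illustration, not the literal test program that either autoconf version emits. It uses `getpid` as a stand-in for the probed function so that it links on a POSIX system, and `probe_ptr`, `probe_by_address`, and `probe_by_call` are made-up names. One plausible reading of the ChangeLog entry is that an interprocedural optimizer can prove the comparison is always false and drop the pointer, which leaves no reference to the symbol for the linker to reject.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Older style (simplified): store the function's address in a global
 * and compare it with the function itself. Nothing is ever called.
 */
pid_t (*probe_ptr)(void) = getpid;

int probe_by_address(void)
{
    /* Always 0; a whole-program optimizer may fold this away entirely. */
    return probe_ptr != getpid;
}

/*
 * Newer style (simplified): actually call the function, so the final
 * link must resolve the symbol.
 */
int probe_by_call(void)
{
    /* A process ID is always positive, so this also yields 0. */
    return getpid() > 0 ? 0 : 1;
}
```

Both helpers return 0 when the function exists. The point of the sketch is that only the second one is guaranteed to leave a real reference to the symbol in the object code.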

The bottom line seems to be that autoconf 2.59 is seriously broken on
recent toolchains. Should we try to do something about that, like
migrate the 8.2 and 8.3 releases to a newer autoconf? 8.2 is close
enough to EOL that I don't mind answering "no" for it, but maybe we
should do that in 8.3.

Comments?

regards, tom lane

#2 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#1)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

On 08/28/2011 04:15 PM, Tom Lane wrote:

> I spent a bit of time looking into $SUBJECT. The cause of the failure
> is that configure mistakenly decides that setproctitle and some other
> functions are available, when they aren't; this eventually leads to link
> failures of course.
>
> Now 8.2 and 8.3 use autoconf 2.59. 8.4 and up, which do not exhibit
> this failure, use autoconf 2.61 or later. Sure enough, there is a
> difference in the test program generated by the more recent autoconfs:
> they actually try to call the function, where the previous ones do
> something weird involving a function pointer comparison. I dug in the
> autoconf change log and found this:
>
> 2005-10-19 Paul Eggert <eggert@cs.ucla.edu>
>
> (AC_LANG_FUNC_LINK_TRY(C)): Call the function rather than simply
> comparing its address. Intel's interprocedural optimization was
> outsmarting the old heuristic. Problem reported by
> Mikulas Patocka.
>
> Since anchovy is using the "gold" linker at -O3, it's not exactly
> surprising that it might be carrying out aggressive interprocedural
> optimizations that we're not seeing used on other platforms.
>
> The bottom line seems to be that autoconf 2.59 is seriously broken on
> recent toolchains. Should we try to do something about that, like
> migrate the 8.2 and 8.3 releases to a newer autoconf? 8.2 is close
> enough to EOL that I don't mind answering "no" for it, but maybe we
> should do that in 8.3.

If we're going to do it for 8.3 we might as well for 8.2 at the same
time, ISTM, even if it is close to EOL.

Is -O3 a recommended setting for icc?

cheers

andrew

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

Andrew Dunstan <andrew@dunslane.net> writes:

> On 08/28/2011 04:15 PM, Tom Lane wrote:
>
>> The bottom line seems to be that autoconf 2.59 is seriously broken on
>> recent toolchains. Should we try to do something about that, like
>> migrate the 8.2 and 8.3 releases to a newer autoconf? 8.2 is close
>> enough to EOL that I don't mind answering "no" for it, but maybe we
>> should do that in 8.3.
>
> If we're going to do it for 8.3 we might as well for 8.2 at the same
> time, ISTM, even if it is close to EOL.

Yeah, possibly, if it's not too invasive. I've not yet done any
research about what would need to change.

> Is -O3 a recommended setting for icc?

No idea. But after a bit of man-page-reading I think it's probably not
the -O level that counts, so much as the fact that anchovy is using
-flto (link-time optimization) in CFLAGS. I don't see any indication
that that's being selected by the buildfarm script itself, so it must be
coming from an environment setting of CFLAGS.

regards, tom lane

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#3)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

On 08/28/2011 05:51 PM, Tom Lane wrote:

>> Is -O3 a recommended setting for icc?
>
> No idea. But after a bit of man-page-reading I think it's probably not
> the -O level that counts, so much as the fact that anchovy is using
> -flto (link-time optimization) in CFLAGS. I don't see any indication
> that that's being selected by the buildfarm script itself, so it must be
> coming from an environment setting of CFLAGS.

The buildfarm member is using:

'CFLAGS' => '-O3 -xN -parallel -ip'
'CC' => 'icc'

cheers

andrew

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

Andrew Dunstan <andrew@dunslane.net> writes:

> On 08/28/2011 05:51 PM, Tom Lane wrote:
>
>>> Is -O3 a recommended setting for icc?
>>
>> No idea. But after a bit of man-page-reading I think it's probably not
>> the -O level that counts, so much as the fact that anchovy is using
>> -flto (link-time optimization) in CFLAGS. I don't see any indication
>> that that's being selected by the buildfarm script itself, so it must be
>> coming from an environment setting of CFLAGS.
>
> The buildfarm member is using:
> 'CFLAGS' => '-O3 -xN -parallel -ip'
> 'CC' => 'icc'

Er, anchovy? Where do you see that? The only thing I see it forcing
is

'config_env' => {
    'CC' => 'ccache cc'
},

regards, tom lane

#6 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#5)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

On 08/28/2011 06:21 PM, Tom Lane wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:
>
>> On 08/28/2011 05:51 PM, Tom Lane wrote:
>>
>>>> Is -O3 a recommended setting for icc?
>>>
>>> No idea. But after a bit of man-page-reading I think it's probably not
>>> the -O level that counts, so much as the fact that anchovy is using
>>> -flto (link-time optimization) in CFLAGS. I don't see any indication
>>> that that's being selected by the buildfarm script itself, so it must be
>>> coming from an environment setting of CFLAGS.
>>
>> The buildfarm member is using:
>> 'CFLAGS' => '-O3 -xN -parallel -ip'
>> 'CC' => 'icc'
>
> Er, anchovy? Where do you see that? The only thing I see it forcing
> is
>
> 'config_env' => {
>     'CC' => 'ccache cc'
> },

Sorry, yes, you're right. I was looking at mongoose.

cheers

andrew

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#2)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

Andrew Dunstan <andrew@dunslane.net> writes:

> On 08/28/2011 04:15 PM, Tom Lane wrote:
>
>> The bottom line seems to be that autoconf 2.59 is seriously broken on
>> recent toolchains. Should we try to do something about that, like
>> migrate the 8.2 and 8.3 releases to a newer autoconf? 8.2 is close
>> enough to EOL that I don't mind answering "no" for it, but maybe we
>> should do that in 8.3.
>
> If we're going to do it for 8.3 we might as well for 8.2 at the same
> time, ISTM, even if it is close to EOL.

I looked into this and decided that actually starting to use autoconf
2.61 or later isn't very feasible. The trouble is that in 2.61, there
is a new set of install-directory options, e.g. htmldir exists where it
did not before. When Peter updated us to 2.61 in PG 8.4, he made some
significant user-visible changes in configure's set of installation
directory options, and even in the default locations of some installed
documentation files. I think changing that sort of thing in 8.3.16
(to say nothing of 8.2.22) is out of the question; for example, it's
likely to break packaging scripts. I thought for a little bit about
whether we could hack on 2.61 until it presented the same
installation-directory behavior as 2.59, but that would take a lot more
work and testing than I have any desire to put in.

What *does* seem feasible is to back-port just the single change we
actually need, by copying the two relevant macros into one of our
config/ source files for the configure script. I've tested that in
8.3 and it seems to work --- at least, the generated configure script
changes in the expected way. This also seems like a reasonably sane
thing to back-port to 8.2. So I'll go ahead and commit those things
and see if anchovy likes it.

regards, tom lane

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Why buildfarm member anchovy is failing on 8.2 and 8.3 branches

I wrote:

> What *does* seem feasible is to back-port just the single change we
> actually need, by copying the two relevant macros into one of our
> config/ source files for the configure script. I've tested that in
> 8.3 and it seems to work --- at least, the generated configure script
> changes in the expected way. This also seems like a reasonably sane
> thing to back-port to 8.2. So I'll go ahead and commit those things
> and see if anchovy likes it.

So the upshot is that that fixed the 8.3 build, but anchovy is still
failing on 8.2, with some different errors:

/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:y.tab.c:12777: error: undefined reference to 'filtered_base_yylex'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans0.ltrans.o: in function base_yyparse:gram.y:494: error: undefined reference to 'parsetree'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans7.ltrans.o: in function parseTypeString:parse_type.c:445: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function simplify_function.128434.2836:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function pg_parse_and_rewrite:postgres.c:544: error: undefined reference to 'raw_parser'
/usr/bin/ld.gold: /tmp/ccn7RPJJ.ltrans19.ltrans.o: in function fmgr_sql_validator:postgres.c:544: error: undefined reference to 'raw_parser'
collect2: ld returned 1 exit status

I went so far as to install Arch Linux here, but I cannot duplicate
the above. (Although I wonder whether my machine is really doing
link-time optimization, since it doesn't generate any compiler warning
messages during the link step, as anchovy is doing.)

But these errors seem like they should be impossible anyway,
since there is nothing platform-specific about our uses of any of the
mentioned functions. I wonder if there is something messed up with
anchovy's copy of REL8_2_STABLE. Marti, could I trouble you to
blow away and recreate that machine's 8.2 checkout, as well as any
compiler cache directories?

regards, tom lane