Sun performance - Major discovery!

Started by Jeffover 22 years ago31 messages

threshar@torgo.978.org

over 22 years ago

Well, as you guys know I've been tinkering with sun-vs-linux postgres for
a while trying to come up with reasons for the HUGE performance
differences. We've all had our anecdotal thoughts (fork sucks, ipc sucks,
ufs sucks, etc) and I've had a breakthrough.

Knowing that GCC only produces good code on x86 (and powerpc with apple's
mods, but it is doubtful that is as good as ibm's power compiler) I
decided to try out Sunsoft CC. I'd heard from more than one person/place
that gcc makes abysmal sparc code. Given that the performance profiles
for both the linux and sun boxes showed the same functions taking up most
of the time I thought I'd see what a difference sunsoft could give me.

So - hardware -
Sun E450 4x400mhz ultrasparc IIi, 4GB ram, scsi soemthing disk. (not
raid) solaris 2.6

Linux - 2xP3 500mhz, 2GB, scsi disk of some flavor (not raid) linux 2.2.17
(old I know!)

So here's the results using my load tester (single connection per beater,
repeats the query 1000 times with different input each time (we'll get
~20k rows back), the query is a common query around here.

I discounted the first run of the test as caches populated.

Linux - 1x - 35 seconds, 20x - 180 seconds

Sun - gcc - 1x 60 seconds 20x 245 seconds
Sun - sunsoft defaults - 1x 52 seonds 20x [similar to gcc most likely]
Sun - sunsoft -fast - 1x 28 seconds 20x 164 seconds

As you math guru's can probably deduce - that is a rather large
improvement. And by rather large I mean hugely significant. With results
like this, I think it warrants mentioning in the FAQ_Solaris, and probably
the performance guide.

Connecting will always be a bit slower. But I think most people realize
that connecting to a db is not cheap.

I think update/etc will cause more locking, but I think IO will become the
bottle neck much sooner than lock/unlock will. (This is mostly anecdotal
given how fast solaris can lock/unlock a semaphore and how much IO I know
I have)

Oh yes, with was with 7.3.4 and sunsoft cc Sun WorkShop 6 update 1 C
5.2 2000/09/11 (which is old, perhaps newer ones make even better code?)

I'm not sure of PG's policy of non-gcc things in configure, but perhaps if
we detect sunsoft we toss in the -fast flag and maybe make it the
preferred one on sun? [btw, it compiled with no changes but it did spew
out tons of warnings]

comments?

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Jeff (#1)

Re: Sun performance - Major discovery!

On Wed, Oct 08, 2003 at 08:36:56AM -0400, Jeff wrote:

So here's the results using my load tester (single connection per beater,
repeats the query 1000 times with different input each time (we'll get
~20k rows back), the query is a common query around here.

My worry about this test is that it gives us precious little
knowledge about concurrent connection slowness, which is where I find
the most significant problems. When we tried a Sunsoft cc vs gcc 2.95
on Sol 7 about 1 1/2 years ago, we found more or less no difference
once we added more than 5 connections (and we always have more than 5
connections). It might be worth trying again, though, since we moved
to Sol 8.

Thanks for the result.

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

Neil Conway

neilc@samurai.com

over 22 years ago

In reply to: Jeff (#1)

Re: Sun performance - Major discovery!

On Wed, 2003-10-08 at 08:36, Jeff wrote:

So here's the results using my load tester (single connection per beater,
repeats the query 1000 times with different input each time (we'll get
~20k rows back), the query is a common query around here.

What is the query?

Linux - 1x - 35 seconds, 20x - 180 seconds

"20x" means 20 concurrent testing processes, right?

Sun - gcc - 1x 60 seconds 20x 245 seconds
Sun - sunsoft defaults - 1x 52 seonds 20x [similar to gcc most likely]
Sun - sunsoft -fast - 1x 28 seconds 20x 164 seconds

Interesting (and surprising that the performance differential is that
large, to me at least). Can you tell if the performance gain comes from
an improvement in a particular subsystem? (i.e. could you get a profile
of Sun/gcc and compare it with Sun/sunsoft).

-Neil

Jeff

threshar@torgo.978.org

over 22 years ago

In reply to: Andrew Sullivan (#2)

Re: Sun performance - Major discovery!

On Wed, 8 Oct 2003, Andrew Sullivan wrote:

My worry about this test is that it gives us precious little
knowledge about concurrent connection slowness, which is where I find
the most significant problems. When we tried a Sunsoft cc vs gcc 2.95
on Sol 7 about 1 1/2 years ago, we found more or less no difference
once we added more than 5 connections (and we always have more than 5
connections). It might be worth trying again, though, since we moved
to Sol 8.

The 20x column are the results when I fired up 20 beater concurrently.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

Jeff

threshar@torgo.978.org

over 22 years ago

In reply to: Neil Conway (#3)

Re: Sun performance - Major discovery!

On Wed, 8 Oct 2003, Neil Conway wrote:

What is the query?

It retrieves an index listing for our boards. The boards are flat (not
threaded) and messages are numbered starting at 1 for each board.

If you pass in 0 for the start_from it assumes the latest 60.

And it should be noted - in some cases some boards have nearly 2M posts.
Index on board_name, number.

I cannot give out too too much stuff ;)

create or replace function get_index2(integer, varchar, varchar)
returns setof snippet
as '
DECLARE
p_start alias for $1;
p_board alias for $2;
v_start integer;
v_num integer;
v_body text;
v_sender varchar(35);
v_time timestamptz;
v_finish integer;
v_row record;
v_ret snippet;
BEGIN

v_start := p_start;

if v_start = 0 then
select * into v_start from get_high_msg(p_board);
v_start := v_start - 59;
end if;

v_finish := v_start + 60;

for v_row in
select number, substr(body, 0, 50) as snip, member_handle,
timestamp
from posts
where board_name = p_board and
number >= v_start and
number < v_finish
order by number desc
LOOP
return next v_row;
END LOOP;

return;
END;
' language 'plpgsql';

Interesting (and surprising that the performance differential is that
large, to me at least). Can you tell if the performance gain comes from
an improvement in a particular subsystem? (i.e. could you get a profile
of Sun/gcc and compare it with Sun/sunsoft).

I'll get these later today.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

Jeff

threshar@torgo.978.org

over 22 years ago

In reply to: Neil Conway (#3)

Re: Sun performance - Major discovery!

On Wed, 8 Oct 2003, Neil Conway wrote:

Interesting (and surprising that the performance differential is that
large, to me at least). Can you tell if the performance gain comes from
an improvement in a particular subsystem? (i.e. could you get a profile
of Sun/gcc and compare it with Sun/sunsoft).

Yeah - like I expected it was able to generate much better code for
_bt_checkkeys which was the #1 function in gcc on both sun & linux.

and as you can see, suncc was just able to generate much nicer code. I'd
look at the assembler output but that won't be useful since I am very
unfamiliar with the [ultra]sparc instruction set..

Here's the prof and gprof output for the latest run:
GCC:
% cumulative self self total
time seconds seconds calls ms/call ms/call name
31.52 19.44 19.44 internal_mcount
20.28 31.95 12.51 8199466 0.00 0.00 _bt_checkkeys
5.61 35.41 3.46 8197422 0.00 0.00 _bt_step
5.01 38.50 3.09 24738620 0.00 0.00 FunctionCall2
3.00 40.35 1.85 8194186 0.00 0.00 varchareq
2.61 41.96 1.61 24309 0.07 0.28 _bt_next
2.42 43.45 1.49 1003 1.49 1.51 AtEOXact_Buffers
2.37 44.91 1.46 12642 0.12 0.12 _read
2.33 46.35 1.44 16517771 0.00 0.00 pg_detoast_datum
2.08 47.63 1.28 8193186 0.00 0.00 int4lt
1.35 48.46 0.83 8237204 0.00 0.00 BufferGetBlockNumber
1.35 49.29 0.83 8193888 0.00 0.00 int4ge
1.35 50.12 0.83 _mcount

SunCC -pg -fast.
%Time Seconds Cumsecs #Calls msec/call Name

23.2 4.27 4.27108922056 0.0000 _mcount
20.7 3.82 8.09 8304052 0.0005 _bt_checkkeys
13.7 2.53 10.6225054788 0.0001 FunctionCall2
5.1 0.94 11.56 24002 0.0392 _bt_next
4.4 0.81 12.37 8301867 0.0001 _bt_step
3.4 0.63 13.00 8298219 0.0001 varchareq
2.7 0.50 13.5016726855 0.0000 pg_detoast_datum
2.4 0.45 13.95 8342464 0.0001 BufferGetBlockNumber
2.4 0.44 14.39 8297941 0.0001 int4ge
2.2 0.41 14.80 1003 0.409 AtEOXact_Buffers
2.0 0.37 15.17 4220349 0.0001 lc_collate_is_c
2.0 0.37 15.54 8297219 0.0000 int4lt
1.6 0.29 15.83 26537 0.0109 AllocSetContextCreate
0.9 0.16 15.99 1887 0.085 pglz_decompress
0.7 0.13 16.12 159966 0.0008 nocachegetattr
0.7 0.13 16.25 4220349 0.0000 varstr_cmp
0.6 0.11 16.36 937576 0.0001 MemoryContextAlloc
0.5 0.09 16.45 150453 0.0006 hash_search

-Neil

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

Neil Conway

neilc@samurai.com

over 22 years ago

In reply to: Andrew Sullivan (#2)

Re: Sun performance - Major discovery!

On Wed, 2003-10-08 at 10:48, Andrew Sullivan wrote:

My worry about this test is that it gives us precious little
knowledge about concurrent connection slowness, which is where I find
the most significant problems.

As Jeff points out, the second set of results is for 20 concurrent
connections. Note that the advantage sunsoft cc has over gcc decreases
as the number of connections increases (which makes sense, as the 20x
workload is likely to be more I/O bound).

-Neil

Neil Conway

neilc@samurai.com

over 22 years ago

In reply to: Jeff (#6)

Re: Sun performance - Major discovery!

On Wed, 2003-10-08 at 11:46, Jeff wrote:

Yeah - like I expected it was able to generate much better code for
_bt_checkkeys which was the #1 function in gcc on both sun & linux.

and as you can see, suncc was just able to generate much nicer code.

What CFLAGS does configure pick for gcc? From
src/backend/template/solaris, I'd guess it's not enabling any
optimization. Is that the case? If so, some gcc numbers with -O and -O2
would be useful.

-Neil

Jeff

threshar@torgo.978.org

over 22 years ago

In reply to: Neil Conway (#8)

Re: Sun performance - Major discovery!

On Wed, 8 Oct 2003, Neil Conway wrote:

What CFLAGS does configure pick for gcc? From
src/backend/template/solaris, I'd guess it's not enabling any
optimization. Is that the case? If so, some gcc numbers with -O and -O2
would be useful.

I can't believe I didn't think of this before! heh.
Turns out gcc was getting nothing for flags.

I added -O2 to CFLAGS and my 60 seconds went down to 21. A rather mild
improvment huh?

I did a few more tests and suncc still beats it out - but not by too much
now (Not enought to justify buying a license just for compiling pg)

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
See if they pass the test.

If they do we should consider adding -O2 and -fast to the CFLAGS.

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

#10

Bruce Momjian

pgman@candle.pha.pa.us

over 22 years ago

In reply to: Jeff (#9)

Re: Sun performance - Major discovery!

Jeff wrote:

On Wed, 8 Oct 2003, Neil Conway wrote:

What CFLAGS does configure pick for gcc? From
src/backend/template/solaris, I'd guess it's not enabling any
optimization. Is that the case? If so, some gcc numbers with -O and -O2
would be useful.

I can't believe I didn't think of this before! heh.
Turns out gcc was getting nothing for flags.

I added -O2 to CFLAGS and my 60 seconds went down to 21. A rather mild
improvment huh?

I did a few more tests and suncc still beats it out - but not by too much
now (Not enought to justify buying a license just for compiling pg)

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
See if they pass the test.

If they do we should consider adding -O2 and -fast to the CFLAGS.

[ CC added for hackers.]

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

darwin
dgux
freebsd (non-alpha)
irix5
nextstep
osf (gcc)
qnx4
solaris
sunos4
svr4
ultrix4

I thought we used to have code that did -O for any platforms that set no
cflags, but I don't see that around anywhere. I recommend adding -O2,
or at leaset -O to all these platforms --- we can then use platform
testing to make sure they are working.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#11

Neil Conway

neilc@samurai.com

over 22 years ago

In reply to: Bruce Momjian (#10)

Re: Sun performance - Major discovery!

On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms.

ISTM the most legitimate reason for not enabling compilater
optimizations on a given compiler/OS/architecture combination is might
cause compiler errors / bad code generation.

Can we get these optimizations enabled in time for the next 7.4 beta? It
might also be good to add an item in the release notes about it.

-Neil

#12

Jeff

threshar@torgo.978.org

over 22 years ago

In reply to: Neil Conway (#11)

Re: [HACKERS] Sun performance - Major discovery!

On Wed, 8 Oct 2003, Neil Conway wrote:

ISTM the most legitimate reason for not enabling compilater
optimizations on a given compiler/OS/architecture combination is might
cause compiler errors / bad code generation.

Can we get these optimizations enabled in time for the next 7.4 beta? It
might also be good to add an item in the release notes about it.

-Neil

I just ran make check for sun with gcc -O2 and suncc -fast and both
passed.

We'll need other arguments to suncc to supress some warnings, etc. (-fast
generates a warning for every file compiled telling you it will only
run on ultrasparc machines)

--
Jeff Trout <jeff@jefftrout.com>
http://www.jefftrout.com/
http://www.stuarthamm.net/

#13

Andrew Dunstan

andrew@dunslane.net

over 22 years ago

In reply to: Bruce Momjian (#10)

Re: [PERFORM] Sun performance - Major discovery!

Bruce Momjian wrote:

Jeff wrote:

On Wed, 8 Oct 2003, Neil Conway wrote:

What CFLAGS does configure pick for gcc? From
src/backend/template/solaris, I'd guess it's not enabling any
optimization. Is that the case? If so, some gcc numbers with -O and -O2
would be useful.

I can't believe I didn't think of this before! heh.
Turns out gcc was getting nothing for flags.

I added -O2 to CFLAGS and my 60 seconds went down to 21. A rather mild
improvment huh?

I did a few more tests and suncc still beats it out - but not by too much
now (Not enought to justify buying a license just for compiling pg)

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
See if they pass the test.

If they do we should consider adding -O2 and -fast to the CFLAGS.

[ CC added for hackers.]

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

darwin
dgux
freebsd (non-alpha)
irix5
nextstep
osf (gcc)
qnx4
solaris
sunos4
svr4
ultrix4

I thought we used to have code that did -O for any platforms that set no
cflags, but I don't see that around anywhere. I recommend adding -O2,
or at leaset -O to all these platforms --- we can then use platform
testing to make sure they are working.

Actually, I would not be surprised to see gains on Solaris/SPARC from
-O3 with gcc, which enables inlining and register-renaming, although
this does make debugging pretty much impossible.

worth testing at least (but I no longer have access to a Solaris machine).

cheers

andrew

#14

Noname

pgsql-performance@sysd.com

over 22 years ago

In reply to: Jeff (#9)

Re: Sun performance - Major discovery!

In message <Pine.BSF.4.44.0310081408370.64781-100000@torgo.978.org>, Jeff writes:

I'll go run the regression test suite with my gcc -O2 pg and the suncc pg.
See if they pass the test.

My default set of gcc optimization flags is:

-O3 -funroll-loops -frerun-cse-after-loop -frerun-loop-opt -falign-functions -mcpu=i686 -march=i686

Obviously the last two flags product CPU specific code, so would have
to differ...autoconf is always possible, but so is just lopping them off.

I have found these flags to produce faster code that a simple -O2, but
I understand the exact combination which is best for you is
code-dependent. Of course, if you are getting really excited, you can
use -fbranch-probabilities, but as you will see if you investigate
that requires some profiling information, so is not very easy to
actually practically use.

-Seth Robertson

#15

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Neil Conway (#11)

Re: Sun performance - Major discovery!

Neil Conway <neilc@samurai.com> writes:

On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms.

I believe that autoconf will automatically select -O2 (when CFLAGS isn't
already set) *if* it's chosen gcc. It won't select anything for vendor
ccs.

Can we get these optimizations enabled in time for the next 7.4 beta?

I think it's too late in the beta cycle to add optimization flags except
for platforms we can get specific success results for. (Solaris is
probably okay for instance.) The risk of breaking things seems too
high.

regards, tom lane

#16

Peter Eisentraut

peter_e@gmx.net

over 22 years ago

In reply to: Bruce Momjian (#10)

Re: [HACKERS] Sun performance - Major discovery!

Bruce Momjian writes:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

freebsd (non-alpha)

I'm wondering what that had in mind:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/template/freebsd.diff?r1=1.10&r2=1.11

--
Peter Eisentraut peter_e@gmx.net

#17

Bruce Momjian

pgman@candle.pha.pa.us

over 22 years ago

In reply to: Tom Lane (#15)

Re: Sun performance - Major discovery!

Tom Lane wrote:

Neil Conway <neilc@samurai.com> writes:

On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms.

I believe that autoconf will automatically select -O2 (when CFLAGS isn't
already set) *if* it's chosen gcc. It won't select anything for vendor
ccs.

I think the problem is that template/solaris overrides that with:

CFLAGS=

Can we get these optimizations enabled in time for the next 7.4 beta?

I think it's too late in the beta cycle to add optimization flags except
for platforms we can get specific success results for. (Solaris is
probably okay for instance.) The risk of breaking things seems too
high.

Agreed. Do we set them all to -O2, then remove it from the ones we
don't get successful reports on?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#18

Bruce Momjian

pgman@candle.pha.pa.us

over 22 years ago

In reply to: Peter Eisentraut (#16)

Re: [HACKERS] Sun performance - Major discovery!

Peter Eisentraut wrote:

Bruce Momjian writes:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

freebsd (non-alpha)

I'm wondering what that had in mind:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/template/freebsd.diff?r1=1.10&r2=1.11

I was wondering that myself. I think the idea was that we already do
-O2 in configure if it is gcc, so why do it in the template files. What
is killing us is the CFLAGS= lines in the configuration files.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#19

Christopher Kings-Lynne

chriskl@familyhealth.com.au

over 22 years ago

In reply to: Peter Eisentraut (#16)

Re: [HACKERS] Sun performance - Major discovery!

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

freebsd (non-alpha)

I'm wondering what that had in mind:

http://developer.postgresql.org/cvsweb.cgi/pgsql-server/src/template/freebsd.diff?r1=1.10&r2=1.11

When I used to build pgsql on freebsd/alpha, I would get heaps of GCC
warnings saying 'optimisations for the alpha are broken'. I can't
remember if that meant anything more than just -O or not though.

Chris

#20

Bruce Momjian

pgman@candle.pha.pa.us

over 22 years ago

In reply to: Tom Lane (#15)

1 attachment(s)

Re: [HACKERS] Sun performance - Major discovery!

Tom Lane wrote:

Neil Conway <neilc@samurai.com> writes:

On Wed, 2003-10-08 at 14:31, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms.

I believe that autoconf will automatically select -O2 (when CFLAGS isn't
already set) *if* it's chosen gcc. It won't select anything for vendor
ccs.

Can we get these optimizations enabled in time for the next 7.4 beta?

I think it's too late in the beta cycle to add optimization flags except
for platforms we can get specific success results for. (Solaris is
probably okay for instance.) The risk of breaking things seems too
high.

OK, patch attached and applied. It centralizes the optimization
defaults into configure.in, rather than having CFLAGS= in the template
files.

It used -O2 for gcc (generated automatically by autoconf), and -O for
non-gcc, unless the template overrides it.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Attachments:

/bjm/difftext/plainDownload

Index: configure
===================================================================
RCS file: /cvsroot/pgsql-server/configure,v
retrieving revision 1.302
diff -c -c -r1.302 configure
*** configure	3 Oct 2003 03:08:14 -0000	1.302
--- configure	9 Oct 2003 03:16:44 -0000
***************
*** 2393,2398 ****
--- 2393,2402 ----
  if test "$ac_env_CFLAGS_set" = set; then
    CFLAGS=$ac_env_CFLAGS_value
  fi
+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then
+ 	CFLAGS="-O"
+ fi
  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
  fi
Index: configure.in
===================================================================
RCS file: /cvsroot/pgsql-server/configure.in,v
retrieving revision 1.293
diff -c -c -r1.293 configure.in
*** configure.in	3 Oct 2003 03:08:14 -0000	1.293
--- configure.in	9 Oct 2003 03:16:46 -0000
***************
*** 238,243 ****
--- 238,247 ----
  if test "$ac_env_CFLAGS_set" = set; then
    CFLAGS=$ac_env_CFLAGS_value
  fi
+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then	
+ 	CFLAGS="-O"
+ fi
  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
  fi
Index: src/template/beos
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/beos,v
retrieving revision 1.6
diff -c -c -r1.6 beos
*** src/template/beos	21 Oct 2000 22:36:13 -0000	1.6
--- src/template/beos	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS='-O2'
--- 0 ----
Index: src/template/bsdi
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/bsdi,v
retrieving revision 1.16
diff -c -c -r1.16 bsdi
*** src/template/bsdi	27 Sep 2003 16:24:44 -0000	1.16
--- src/template/bsdi	9 Oct 2003 03:16:51 -0000
***************
*** 5,13 ****
  esac
  
  case $host_os in
!   bsdi2.0 | bsdi2.1 | bsdi3*)
!     CC=gcc2
!     ;;
  esac
  
  THREAD_SUPPORT=yes
--- 5,11 ----
  esac
  
  case $host_os in
!   bsdi2.0 | bsdi2.1 | bsdi3*) CC=gcc2;;
  esac
  
  THREAD_SUPPORT=yes
Index: src/template/cygwin
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/cygwin,v
retrieving revision 1.2
diff -c -c -r1.2 cygwin
*** src/template/cygwin	9 Oct 2003 02:37:09 -0000	1.2
--- src/template/cygwin	9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
- CFLAGS='-O2'
  SRCH_LIB='/usr/local/lib'
--- 1 ----
Index: src/template/dgux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/dgux,v
retrieving revision 1.10
diff -c -c -r1.10 dgux
*** src/template/dgux	21 Oct 2000 22:36:13 -0000	1.10
--- src/template/dgux	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/freebsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/freebsd,v
retrieving revision 1.23
diff -c -c -r1.23 freebsd
*** src/template/freebsd	27 Sep 2003 16:24:44 -0000	1.23
--- src/template/freebsd	9 Oct 2003 03:16:51 -0000
***************
*** 1,17 ****
- CFLAGS='-pipe'
- 
  case $host_cpu in
!   alpha*)   CFLAGS="$CFLAGS -O" ;;
  esac
  
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes
  THREAD_CPPFLAGS="-D_THREAD_SAFE"
  case $host_os in
! 		freebsd2*|freebsd3*|freebsd4*)
! 			THREAD_LIBS="-pthread"
! 			;;
! 		*)
! 			THREAD_LIBS="-lc_r"
! 			;;
  esac
--- 1,11 ----
  case $host_cpu in
!   alpha*)   CFLAGS="-O";;
  esac
  
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes
  THREAD_CPPFLAGS="-D_THREAD_SAFE"
  case $host_os in
! 	freebsd2*|freebsd3*|freebsd4*) THREAD_LIBS="-pthread";;
! 	*) THREAD_LIBS="-lc_r";;
  esac
Index: src/template/hpux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/hpux,v
retrieving revision 1.7
diff -c -c -r1.7 hpux
*** src/template/hpux	2 Apr 2003 00:49:28 -0000	1.7
--- src/template/hpux	9 Oct 2003 03:16:51 -0000
***************
*** 1,8 ****
! if test "$GCC" = yes ; then
!   CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
!   CFLAGS="-O2"
! else
    CC="$CC -Ae"
-   CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
    CFLAGS="+O2"
  fi
--- 1,6 ----
! CPPFLAGS="-D_XOPEN_SOURCE_EXTENDED"
! 
! if test "$GCC" != yes ; then
    CC="$CC -Ae"
    CFLAGS="+O2"
  fi
Index: src/template/irix5
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/irix5,v
retrieving revision 1.9
diff -c -c -r1.9 irix5
*** src/template/irix5	21 Oct 2000 22:36:13 -0000	1.9
--- src/template/irix5	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/linux
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/linux,v
retrieving revision 1.18
diff -c -c -r1.18 linux
*** src/template/linux	27 Sep 2003 22:23:35 -0000	1.18
--- src/template/linux	9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
- CFLAGS=-O2
  # Force _GNU_SOURCE on; plperl is broken with Perl 5.8.0 otherwise
  CPPFLAGS="-D_GNU_SOURCE"
  
--- 1,3 ----
***************
*** 6,9 ****
  NEED_REENTRANT_FUNCS=yes	# Debian kernel 2.2 2003-09-27
  THREAD_CPPFLAGS="-D_REENTRANT -D_THREAD_SAFE -D_POSIX_PTHREAD_SEMANTICS"
  THREAD_LIBS="-lpthread"
- 
--- 5,7 ----
Index: src/template/netbsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/netbsd,v
retrieving revision 1.13
diff -c -c -r1.13 netbsd
*** src/template/netbsd	27 Sep 2003 16:24:44 -0000	1.13
--- src/template/netbsd	9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
- CFLAGS='-O2 -pipe'
- 
  THREAD_SUPPORT=yes
  NEED_REENTRANT_FUNCS=yes	# 1.6 2003-09-14
--- 1,2 ----
Index: src/template/nextstep
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/nextstep,v
retrieving revision 1.7
diff -c -c -r1.7 nextstep
*** src/template/nextstep	15 Jul 2000 15:54:52 -0000	1.7
--- src/template/nextstep	9 Oct 2003 03:16:51 -0000
***************
*** 1,4 ****
  AROPT=rc
- CFLAGS=
  SHARED_LIB=
  DLSUFFIX=.o
--- 1,3 ----
Index: src/template/openbsd
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/openbsd,v
retrieving revision 1.8
diff -c -c -r1.8 openbsd
*** src/template/openbsd	21 Oct 2000 22:36:14 -0000	1.8
--- src/template/openbsd	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS='-O2 -pipe'
--- 0 ----
Index: src/template/osf
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/osf,v
retrieving revision 1.10
diff -c -c -r1.10 osf
*** src/template/osf	27 Sep 2003 16:24:45 -0000	1.10
--- src/template/osf	9 Oct 2003 03:16:51 -0000
***************
*** 1,6 ****
! if test "$GCC" = yes ; then
!   CFLAGS=
! else
    CC="$CC -std"
    CFLAGS='-O4 -Olimit 2000'
  fi
--- 1,4 ----
! if test "$GCC" != yes ; then
    CC="$CC -std"
    CFLAGS='-O4 -Olimit 2000'
  fi
Index: src/template/qnx4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/qnx4,v
retrieving revision 1.4
diff -c -c -r1.4 qnx4
*** src/template/qnx4	24 May 2001 22:33:18 -0000	1.4
--- src/template/qnx4	9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
! CFLAGS=-I/usr/local/include
! LIBS=-lunix
--- 1,2 ----
! CFLAGS="-O2 -I/usr/local/include"
! LIBS="-lunix"
Index: src/template/sco
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/sco,v
retrieving revision 1.10
diff -c -c -r1.10 sco
*** src/template/sco	11 Dec 2002 22:27:26 -0000	1.10
--- src/template/sco	9 Oct 2003 03:16:51 -0000
***************
*** 1,7 ****
- if test "$GCC" = yes; then
-   CFLAGS=-O2
- else
-   CFLAGS=-O
- fi
  CC="$CC -b elf"
  
--- 1,2 ----
Index: src/template/solaris
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/solaris,v
retrieving revision 1.5
diff -c -c -r1.5 solaris
*** src/template/solaris	27 Sep 2003 16:24:45 -0000	1.5
--- src/template/solaris	9 Oct 2003 03:16:51 -0000
***************
*** 1,8 ****
! if test "$GCC" = yes ; then
!   CFLAGS=
! else
    CC="$CC -Xa"			# relaxed ISO C mode
!   CFLAGS=-v			# -v is like gcc -Wall
  fi
  
  THREAD_SUPPORT=yes
--- 1,6 ----
! if test "$GCC" != yes ; then
    CC="$CC -Xa"			# relaxed ISO C mode
!   CFLAGS="-O -v"		# -v is like gcc -Wall
  fi
  
  THREAD_SUPPORT=yes
Index: src/template/sunos4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/sunos4,v
retrieving revision 1.2
diff -c -c -r1.2 sunos4
*** src/template/sunos4	21 Oct 2000 22:36:14 -0000	1.2
--- src/template/sunos4	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/svr4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/svr4,v
retrieving revision 1.10
diff -c -c -r1.10 svr4
*** src/template/svr4	21 Oct 2000 22:36:14 -0000	1.10
--- src/template/svr4	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/ultrix4
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/ultrix4,v
retrieving revision 1.10
diff -c -c -r1.10 ultrix4
*** src/template/ultrix4	21 Oct 2000 22:36:14 -0000	1.10
--- src/template/ultrix4	9 Oct 2003 03:16:51 -0000
***************
*** 1 ****
- CFLAGS=
--- 0 ----
Index: src/template/univel
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/univel,v
retrieving revision 1.13
diff -c -c -r1.13 univel
*** src/template/univel	21 Oct 2000 22:36:14 -0000	1.13
--- src/template/univel	9 Oct 2003 03:16:51 -0000
***************
*** 1,2 ****
  CFLAGS='-v -O -K i486,host,inline,loop_unroll -Dsvr4'
! LIBS=-lc89 
--- 1,2 ----
  CFLAGS='-v -O -K i486,host,inline,loop_unroll -Dsvr4'
! LIBS="-lc89"
Index: src/template/unixware
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/unixware,v
retrieving revision 1.24
diff -c -c -r1.24 unixware
*** src/template/unixware	27 Sep 2003 16:24:45 -0000	1.24
--- src/template/unixware	9 Oct 2003 03:16:51 -0000
***************
*** 1,5 ****
  if test "$GCC" = yes; then
-   CFLAGS=-O2
    THREAD_CPPFLAGS="-pthread"
  else
  # the -Kno_host is temporary for a bug in the compiler.  See -hackers
--- 1,4 ----
Index: src/template/win
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/win,v
retrieving revision 1.5
diff -c -c -r1.5 win
*** src/template/win	8 Oct 2003 18:23:08 -0000	1.5
--- src/template/win	9 Oct 2003 03:16:51 -0000
***************
*** 1,3 ****
- if test "$GCC" = yes; then
-   CFLAGS="-O2"
- fi
--- 0 ----
Index: src/template/win32
===================================================================
RCS file: /cvsroot/pgsql-server/src/template/win32,v
retrieving revision 1.1
diff -c -c -r1.1 win32
*** src/template/win32	15 May 2003 16:35:30 -0000	1.1
--- src/template/win32	9 Oct 2003 03:16:51 -0000
***************
*** 1,3 ****
- if test "$GCC" = yes; then
-   CFLAGS="-O2"
- fi
--- 0 ----

#21

Tom Lane

tgl@sss.pgh.pa.us

over 22 years ago

In reply to: Bruce Momjian (#20)

Re: [HACKERS] Sun performance - Major discovery!

#22

Andrew Sullivan

andrew@libertyrms.info

over 22 years ago

In reply to: Bruce Momjian (#10)

Re: [HACKERS] Sun performance - Major discovery!

On Wed, Oct 08, 2003 at 02:31:29PM -0400, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

I think gcc _used_ to generate bad code on SPARC if you set any
optimisation. We tested it on Sol7 with gcc 2.95 more than a year
ago, and tried various settings. -O2 worked, but other items were
really bad. Some of them would pass regression but cause strange
behaviour, random coredumps, &c. A little digging demonstrated that
anything beyond -O2 just didn't work for gcc at the time.

-- 
----
Andrew Sullivan                         204-4141 Yonge Street
Afilias Canada                        Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110

#23

Neil Conway

neilc@samurai.com

over 22 years ago

In reply to: Bruce Momjian (#17)

Re: Sun performance - Major discovery!

On Wed, 2003-10-08 at 21:44, Bruce Momjian wrote:

Agreed. Do we set them all to -O2, then remove it from the ones we
don't get successful reports on?

I took the time to compile CVS tip with a few different machines from
HP's TestDrive program, to see if there were any regressions using the
new optimization flags:

(1) (my usual dev machine)

$ uname -a
Linux tokyo 2.4.19-xfs #1 Mon Jan 20 19:12:29 EST 2003 i686 GNU/Linux
$ gcc --version
gcc (GCC) 3.3.2 20031005 (Debian prerelease)

'make check' passes

(2)

$ uname -a
Linux spe161 2.4.18-smp #1 SMP Sat Apr 6 21:42:22 EST 2002 alpha unknown
$ gcc --version
gcc (GCC) 3.3.1

'make check' passes

(3)

$ uname -a
Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown
$ gcc --version
3.0.4

'make check' passes

BTW, this platform doesn't have any code written for native spinlocks.

(4)

$ uname -a
Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002
ia64 unknown
$ gcc --version

When you compile PostgreSQL without changing the CFLAGS configure picks,
the initdb required for 'make check' fails with:

[...]
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ERROR: could not identify operator 679

I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above
resulted from an optimization-induced compiler error, but I got the
following error:

$ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
../../../../src/include/storage/s_lock.h: In function `tas':
../../../../src/include/storage/s_lock.h:125: error: inconsistent
operand constraints in an `asm'

Whereas this works fine:

$ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
$

BTW, line 138 of s_lock.h is:

#if defined(__arm__) || defined(__arm__)

That seems a little redundant.

Anyway, I tried running initdb after compiling all of pgsql with "-O0',
except for the files that included s_lock.h, but make check still
failed:

creating information schema... ok
vacuuming database template1...
/house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: line 882: 22035 Segmentation fault (core dumped) "$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF
ANALYZE;
VACUUM FULL FREEZE;
EOF

The core file seems to indicate a stack overflow due to an infinitely
recursive function:

(gdb) bt 25
#0 0x4000000000645dc0 in hash_search ()
#1 0x4000000000616930 in RelationSysNameCacheGetRelation ()
#2 0x4000000000616db0 in RelationSysNameGetRelation ()
#3 0x4000000000082e40 in relation_openr ()
#4 0x4000000000083910 in heap_openr ()
#5 0x400000000060e6b0 in ScanPgRelation ()
#6 0x4000000000611d60 in RelationBuildDesc ()
#7 0x4000000000616e70 in RelationSysNameGetRelation ()
#8 0x4000000000082e40 in relation_openr ()
#9 0x4000000000083910 in heap_openr ()
#10 0x400000000060e6b0 in ScanPgRelation ()
#11 0x4000000000611d60 in RelationBuildDesc ()
#12 0x4000000000616e70 in RelationSysNameGetRelation ()
#13 0x4000000000082e40 in relation_openr ()
#14 0x4000000000083910 in heap_openr ()
#15 0x400000000060e6b0 in ScanPgRelation ()
#16 0x4000000000611d60 in RelationBuildDesc ()
#17 0x4000000000616e70 in RelationSysNameGetRelation ()
#18 0x4000000000082e40 in relation_openr ()
#19 0x4000000000083910 in heap_openr ()
#20 0x400000000060e6b0 in ScanPgRelation ()
#21 0x4000000000611d60 in RelationBuildDesc ()
#22 0x4000000000616e70 in RelationSysNameGetRelation ()
#23 0x4000000000082e40 in relation_openr ()
#24 0x4000000000083910 in heap_openr ()
(More stack frames follow...)

(It also dumps core in the same place during initdb if CFLAGS='-O' is
specified.)

So it looks like the Itanium port is a little broken. Does anyone have
an idea what needs to be done to fix it?

-Neil

#24

Bruce Momjian

pgman@candle.pha.pa.us

over 22 years ago

In reply to: Neil Conway (#23)

Re: Sun performance - Major discovery!

Isn't it great how you have the same directory on every host so you can
download once and run the same tests easily.

Neil Conway wrote:

$ uname -a
Linux spe170 2.4.17-64 #1 Sat Mar 16 17:31:44 MST 2002 parisc64 unknown
$ gcc --version
3.0.4

'make check' passes

I didn't know there was a pa-risc-64 chip.

BTW, this platform doesn't have any code written for native spinlocks.

(4)

$ uname -a
Linux spe156 2.4.18-mckinley-smp #1 SMP Thu Jul 11 12:51:02 MDT 2002
ia64 unknown
$ gcc --version

When you compile PostgreSQL without changing the CFLAGS configure picks,
the initdb required for 'make check' fails with:

[...]
initializing pg_depend... ok
creating system views... ok
loading pg_description... ok
creating conversions... ERROR: could not identify operator 679

I tried to compile PostgreSQL with CFLAGS='-O0' to see if the above
resulted from an optimization-induced compiler error, but I got the
following error:

$ gcc -O0 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
../../../../src/include/storage/s_lock.h: In function `tas':
../../../../src/include/storage/s_lock.h:125: error: inconsistent
operand constraints in an `asm'

Whereas this works fine:

$ gcc -O2 -Wall -Wmissing-prototypes -Wmissing-declarations
-I../../../../src/include -D_GNU_SOURCE -c -o xlog.o xlog.c
$

BTW, line 138 of s_lock.h is:

#if defined(__arm__) || defined(__arm__)

Fix just committed. Thanks.

That seems a little redundant.

Anyway, I tried running initdb after compiling all of pgsql with "-O0',
except for the files that included s_lock.h, but make check still
failed:

creating information schema... ok
vacuuming database template1...
/house/neilc/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/initdb: line 882: 22035 Segmentation fault (core dumped) "$PGPATH"/postgres $PGSQL_OPT template1 >/dev/null <<EOF
ANALYZE;
VACUUM FULL FREEZE;
EOF

The core file seems to indicate a stack overflow due to an infinitely
recursive function:

(gdb) bt 25
#0 0x4000000000645dc0 in hash_search ()
#1 0x4000000000616930 in RelationSysNameCacheGetRelation ()
#2 0x4000000000616db0 in RelationSysNameGetRelation ()
#3 0x4000000000082e40 in relation_openr ()
#4 0x4000000000083910 in heap_openr ()
#5 0x400000000060e6b0 in ScanPgRelation ()
#6 0x4000000000611d60 in RelationBuildDesc ()
#7 0x4000000000616e70 in RelationSysNameGetRelation ()
#8 0x4000000000082e40 in relation_openr ()
#9 0x4000000000083910 in heap_openr ()
#10 0x400000000060e6b0 in ScanPgRelation ()
#11 0x4000000000611d60 in RelationBuildDesc ()
#12 0x4000000000616e70 in RelationSysNameGetRelation ()
#13 0x4000000000082e40 in relation_openr ()
#14 0x4000000000083910 in heap_openr ()
#15 0x400000000060e6b0 in ScanPgRelation ()
#16 0x4000000000611d60 in RelationBuildDesc ()
#17 0x4000000000616e70 in RelationSysNameGetRelation ()
#18 0x4000000000082e40 in relation_openr ()
#19 0x4000000000083910 in heap_openr ()
#20 0x400000000060e6b0 in ScanPgRelation ()
#21 0x4000000000611d60 in RelationBuildDesc ()
#22 0x4000000000616e70 in RelationSysNameGetRelation ()
#23 0x4000000000082e40 in relation_openr ()
#24 0x4000000000083910 in heap_openr ()
(More stack frames follow...)

(It also dumps core in the same place during initdb if CFLAGS='-O' is
specified.)

So it looks like the Itanium port is a little broken. Does anyone have
an idea what needs to be done to fix it?

My guess is that the compiler itself is broken --- what else could it
be?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#25

Marko Karppinen

marko@karppinen.fi

over 22 years ago

In reply to: Bruce Momjian (#10)

Re: [HACKERS] Sun performance - Major discovery!

On 8.10.2003, at 21:31, Bruce Momjian wrote:

Well, this is really embarassing. I can't imagine why we would not set
at least -O on all platforms. Looking at the template files, I see
these have no optimization set:

darwin

Regarding Darwin optimizations, Apple has introduced a "-fast" flag in
their GCC 3.3 version that they recommend when compiling code for their
new G5 systems. Because of this, I foresee a lot of people defining
CFLAGS="-fast" on their systems.

This is problematic for PostgreSQL, however, since the -fast flag is
the equivalent of:

-O3 -falign-loops-max-skip=15 -falign-jumps-max-skip=15
-falign-loops=16 -falign-jumps=16 -falign-functions=16 -malign-natural
-ffast-math -fstrict-aliasing -frelax-aliasing -fgcse-mem-alias
-funroll-loops -floop-transpose -floop-to-memset -finline-floor
-mcpu=G5 -mpowerpc64 -mpowerpc-gpopt -mtune=G5 -fsched-interblock
-fload-after-store --param max-gcse-passes=3 -fno-gcse-sm
-fgcse-loop-depth -funit-at-a-time -fcallgraph-inlining
-fdisable-typechecking-for-spec

At least the --fast-math part causes problems, seeing that PostgreSQL
actually checks for the __FAST_MATH__ macro to make sure that it isn't
turned on. There might be other problems with Apple's flags, but I
think that the __FAST_MATH__ check should be altered.

As you know, setting --fast-math in GCC is the equivalent of setting
-fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math,
-ffinite-math-only and -fno-signaling-nans. What really should be done,
I think, is adding the opposites of these flags (-fmath-errno,
-fno-unsafe-math-optimizations, -ftrapping_math, -fno-finite-math-only
and -fsignaling-nans) to the command line if __FAST_MATH__ is detected.
This would allow people to use CFLAGS="-fast" on their G5s, beat some
Xeon speed records, and not worry about esoteric IEEE math standards.
What do you guys think?

GCC sets __FAST_MATH__ even if you counter a -ffast-math with the
negating flags above. This means that it is not currently possible to
use the -fast flag when compiling PostgreSQL at all. Instead, you have
to go through all the flags Apple is setting and only pass on those
that don't break pg.

#26

Tom Lane

tgl@sss.pgh.pa.us

about 22 years ago

In reply to: Bruce Momjian (#20)

Re: [HACKERS] Sun performance - Major discovery!

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, patch attached and applied. It centralizes the optimization
defaults into configure.in, rather than having CFLAGS= in the template
files.

I think there's a problem here:

+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then
+ 	CFLAGS="-O"
+ fi
if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
CFLAGS="$CFLAGS -g"
fi

since this will cause "configure --enable-debug" to default to selecting
CFLAGS="-O -g" for non-gcc compilers. On a lot of compilers that
combination does not work, and will generate tons of useless warnings.
I think it might be better to do

  if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
    CFLAGS="$CFLAGS -g"
+ else
+   # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+   if test x"$CFLAGS" = x""; then
+ 	CFLAGS="-O"
+   fi
  fi

regards, tom lane

#27

Bruce Momjian

pgman@candle.pha.pa.us

about 22 years ago

In reply to: Tom Lane (#26)

Re: [HACKERS] Sun performance - Major discovery!

Done as you suggested.

---------------------------------------------------------------------------

Tom Lane wrote:

Bruce Momjian <pgman@candle.pha.pa.us> writes:

OK, patch attached and applied. It centralizes the optimization
defaults into configure.in, rather than having CFLAGS= in the template
files.

I think there's a problem here:
+ # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+ if test x"$CFLAGS" = x""; then
+ 	CFLAGS="-O"
+ fi
if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
CFLAGS="$CFLAGS -g"
fi
since this will cause "configure --enable-debug" to default to selecting
CFLAGS="-O -g" for non-gcc compilers. On a lot of compilers that
combination does not work, and will generate tons of useless warnings.
I think it might be better to do
if test "$enable_debug" = yes && test "$ac_cv_prog_cc_g" = yes; then
CFLAGS="$CFLAGS -g"
+ else
+   # configure sets CFLAGS to -O2 for gcc, so this is only for non-gcc
+   if test x"$CFLAGS" = x""; then
+ 	CFLAGS="-O"
+   fi
fi
regards, tom lane

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

#28

Peter Eisentraut

peter_e@gmx.net

about 22 years ago

In reply to: Marko Karppinen (#25)

Re: [HACKERS] Sun performance - Major discovery!

Marko Karppinen writes:

GCC sets __FAST_MATH__ even if you counter a -ffast-math with the
negating flags above. This means that it is not currently possible to
use the -fast flag when compiling PostgreSQL at all. Instead, you have
to go through all the flags Apple is setting and only pass on those
that don't break pg.

That sounds perfectly reasonable to me. Why should we develop elaborate
workarounds for compiler flags that are known to create broken code? I
also want to point out that I'm getting kind of tired of developing more
and more workarounds for sloppy Apple engineering.

--
Peter Eisentraut peter_e@gmx.net

#29

Tom Lane

tgl@sss.pgh.pa.us

about 22 years ago

In reply to: Marko Karppinen (#25)

Re: [HACKERS] Sun performance - Major discovery!

Marko Karppinen <marko@karppinen.fi> writes:

At least the --fast-math part causes problems, seeing that PostgreSQL
actually checks for the __FAST_MATH__ macro to make sure that it isn't
turned on. There might be other problems with Apple's flags, but I
think that the __FAST_MATH__ check should be altered.

Removing the check is not acceptable --- we spent far too much time
fighting bug reports that turned out to trace to -ffast-math.
See for example
http://archives.postgresql.org/pgsql-bugs/2002-09/msg00169.php

As you know, setting --fast-math in GCC is the equivalent of setting
-fno-math-errno, -funsafe-math-optimizations, -fno-trapping-math,
-ffinite-math-only and -fno-signaling-nans.

I suspect that -funsafe-math-optimizations is the only one of those that
really affects the datetime code, but I would be quite worried about the
side-effects of any of them on the float8 arithmetic routines. Also I
think the behavior of -ffast-math has changed over time; in the gcc
2.95.3 manual I see none of the above and only the description

`-ffast-math'
This option allows GCC to violate some ANSI or IEEE rules and/or
specifications in the interest of optimizing code for speed. For
example, it allows the compiler to assume arguments to the `sqrt'
function are non-negative numbers and that no floating-point values
are NaNs.

Since we certainly do use NaNs, it would be very bad to allow -ffast-math
in gcc 2.95.

gcc 3.2 has some but not all of the sub-flags you list above, so
apparently the behavior changed again as of gcc 3.3.

This means that relaxing the check would require (a) finding out which
of the sub-flags break our code and which don't; (b) finding out how the
answer to (a) has varied with gcc release; and (c) finding out how we
can test whether a given sub-flag is set --- are there #defines for each
of them in gcc 3?

This does not sound real practical to me...

This would allow people to use CFLAGS="-fast" on their G5s, beat some
Xeon speed records, and not worry about esoteric IEEE math standards.

In the words of the sage, "I can make this code *arbitrarily* fast ...
if it doesn't have to give the right answer." Those "esoteric"
standards make the difference between printing 5:00:00 and printing
4:59:60.

regards, tom lane

#30

Marko Karppinen

marko@karppinen.fi

about 22 years ago

In reply to: Tom Lane (#29)

Re: [PERFORM] Sun performance - Major discovery!

On 14.10.2003, at 19:52, Tom Lane wrote:

This means that relaxing the check would require (a) finding out which
of the sub-flags break our code and which don't; (b) finding out how
the
answer to (a) has varied with gcc release; and (c) finding out how we
can test whether a given sub-flag is set --- are there #defines for
each
of them in gcc 3?

Okay, I can see how that makes this unpractical to implement. Thanks.

The current error message is "do not put -ffast-math in CFLAGS"; does
someone have an idea for a better text that doesn't imply that you
actually /have/ --ffast-math in CFLAGS? It'd be good to acknowledge
that it can be set implicitly, too.

And on the same subject:

On 14.10.2003, at 18:13, Peter Eisentraut wrote:

That sounds perfectly reasonable to me. Why should we develop
elaborate
workarounds for compiler flags that are known to create broken code? I
also want to point out that I'm getting kind of tired of developing
more
and more workarounds for sloppy Apple engineering.

Peter, you are free to consider your current environment to be the
peak of perfection, but that doesn't mean that the only reason for
differences between your system and others' is the sloppiness of
their engineering.

I'm not aware of any Darwin-specific "workarounds" in the tree
right now; the only thing close to that is the support for Apple's
two-level namespaces feature. And while you can argue the relative
merits of Apple's approach, the reason for its existence isn't
sloppiness and the support for it that was implemented by Tom
most certainly isn't a workaround.

The fact of the matter is that Mac OS X has about ten million active
users, and when one of these people is looking for an RDBMS, he's
gonna go for one that compiles and works great on his system, rather
worrying if his platform is optimal for running PostgreSQL. Supporting
this platform well is absolutely crucial to the overall adoption of pg,
and even if you consider yourself to be above such pedestrian
concerns, many people who have to make the business case for putting
money into PostgreSQL development most definitely think otherwise.

#31

Peter Eisentraut

peter_e@gmx.net

about 22 years ago

In reply to: Marko Karppinen (#30)

Re: [PERFORM] Sun performance - Major discovery!

Marko Karppinen writes:

I'm not aware of any Darwin-specific "workarounds" in the tree
right now; the only thing close to that is the support for Apple's
two-level namespaces feature. And while you can argue the relative
merits of Apple's approach, the reason for its existence isn't
sloppiness and the support for it that was implemented by Tom
most certainly isn't a workaround.

PostgreSQL is only part of the deal; in other projects, people have to
fight with different kinds of problems. Let me just point out the broken
precompiler, the namespace level thing (which might be a fine feature, but
the way it was shoved in was not), using zsh as the default "Bourne"
shell, using different file types for loadable modules and linkable shared
libraries, standard system paths with spaces in them, and there may be
more that I don't remember now. In my experience, the whole system just
has been very unpleasant to develop portable software for since the day it
appeared. You're not at fault for that, but please understand that,
considering all this, the last thing I want to spend time on is improving
the user response mechanics for a "don't do that then" problem.

The fact of the matter is that Mac OS X has about ten million active
users, and when one of these people is looking for an RDBMS, he's
gonna go for one that compiles and works great on his system, rather
worrying if his platform is optimal for running PostgreSQL. Supporting
this platform well is absolutely crucial to the overall adoption of pg,
and even if you consider yourself to be above such pedestrian
concerns, many people who have to make the business case for putting
money into PostgreSQL development most definitely think otherwise.

Everyone shall be happy if they don't use compiler switches that are known
to create broken code.

--
Peter Eisentraut peter_e@gmx.net