horo(r)logy test fail on solaris (again and solved)

Started by Zdenek Kotala over 19 years ago, 17 messages
#1 Zdenek Kotala
Zdenek.Kotala@Sun.COM
2 attachment(s)

I ran the regression tests against the Postgres beta and the horology
test failed. See the attached log. The same failure was reported a few
months ago; see
http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
I used Sun Studio 11 with the -fast flag on the SPARC platform.

I played a little with the cc flags, and the following settings work fine for me:

export CFLAGS="-fast"
export LDFLAGS="-lm -fast"

The -fast switch must also be given to the compiler driver at link time,
because that is what links in the "fast" library.

Could anybody confirm that this works on their machine?

But the question is whether the "-fast" flag is good for Postgres. The
-fast flag turns on "brutal" floating-point optimization, so some
operations may have less precision. Is it possible to verify that
floating-point operations still work correctly?

I read in the Postgres documentation that the implementation of the
floating-point data types is platform specific, so developers must take
care of the discrepancies themselves. But is there any other part of the
Postgres code where the "-fast" switch could produce a computing defect,
that is, a place where the result must be platform independent?

The cc flags are described in
http://docs.sun.com/source/819-3688/cc_ops.app.html.

Zdenek

Attachments:

regression.diffs (text/plain)
*** ./expected/horology.out	Tue Jul 25 05:51:22 2006
--- ./results/horology.out	Tue Sep 26 14:19:10 2006
***************
*** 2466,2472 ****
  SELECT '' AS ten, f1 AS interval, reltime(f1) AS reltime
    FROM INTERVAL_TBL;
   ten |           interval            |            reltime            
! -----+-------------------------------+-------------------------------
       | @ 1 min                       | @ 1 min
       | @ 5 hours                     | @ 5 hours
       | @ 10 days                     | @ 10 days
--- 2466,2472 ----
  SELECT '' AS ten, f1 AS interval, reltime(f1) AS reltime
    FROM INTERVAL_TBL;
   ten |           interval            |             reltime              
! -----+-------------------------------+----------------------------------
       | @ 1 min                       | @ 1 min
       | @ 5 hours                     | @ 5 hours
       | @ 10 days                     | @ 10 days
***************
*** 2474,2480 ****
       | @ 3 mons                      | @ 3 mons
       | @ 14 secs ago                 | @ 14 secs ago
       | @ 1 day 2 hours 3 mins 4 secs | @ 1 day 2 hours 3 mins 4 secs
!      | @ 6 years                     | @ 6 years
       | @ 5 mons                      | @ 5 mons
       | @ 5 mons 12 hours             | @ 5 mons 12 hours
  (10 rows)
--- 2474,2480 ----
       | @ 3 mons                      | @ 3 mons
       | @ 14 secs ago                 | @ 14 secs ago
       | @ 1 day 2 hours 3 mins 4 secs | @ 1 day 2 hours 3 mins 4 secs
!      | @ 6 years                     | @ 5 years 12 mons 5 days 6 hours
       | @ 5 mons                      | @ 5 mons
       | @ 5 mons 12 hours             | @ 5 mons 12 hours
  (10 rows)

======================================================================

regression.out (text/plain)

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#1)
Re: horo(r)logy test fail on solaris (again and solved)

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> But the question is whether the "-fast" flag is good for Postgres. The
> -fast flag turns on "brutal" floating-point optimization, so some
> operations may have less precision. Is it possible to verify that
> floating-point operations still work correctly?

That's a pretty good way to guarantee that you'll break the datetime
code.

It might be acceptable if you use --enable-integer-datetimes.

regards, tom lane
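
A rough illustration of why the integer option is less exposed: with --enable-integer-datetimes, time values are stored as 64-bit integer microseconds instead of double-precision seconds, and integer arithmetic does not depend on floating-point optimization settings. The sketch below is plain Python, not PostgreSQL code, and the variable names are made up for the example.

```python
# Float representation: seconds held in a double. 0.1 s has no exact
# binary representation, so ten additions do not land exactly on 1.0.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

# Integer representation: microseconds held in an integer. Every step is exact.
int_total_usec = 0
for _ in range(10):
    int_total_usec += 100_000

print(float_total == 1.0)           # False
print(int_total_usec == 1_000_000)  # True
```

The float result is already inexact under strict IEEE 754 rules; relaxed optimization only adds further ways for the last bits to differ.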

#3 Bruce Momjian
bruce@momjian.us
In reply to: Zdenek Kotala (#1)
Re: horo(r)logy test fail on solaris (again and

Zdenek Kotala wrote:

> I ran the regression tests against the Postgres beta and the horology
> test failed. See the attached log. The same failure was reported a few
> months ago; see
> http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
> I used Sun Studio 11 with the -fast flag on the SPARC platform.

Are you looking for ways to contort Solaris to make PostgreSQL fail?
That doesn't prove much about PostgreSQL, but rather about Solaris.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#2)
Re: horo(r)logy test fail on solaris (again and solved)

Tom Lane wrote:

> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > But the question is whether the "-fast" flag is good for Postgres. The
> > -fast flag turns on "brutal" floating-point optimization, so some
> > operations may have less precision. Is it possible to verify that
> > floating-point operations still work correctly?

> That's a pretty good way to guarantee that you'll break the datetime
> code.

! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

Doesn't this look odd regardless of what bad results come back from the
FP library?

cheers

andrew

#5 Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#2)
Re: horo(r)logy test fail on solaris (again and solved)

Tom Lane wrote:

> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > But the question is whether the "-fast" flag is good for Postgres. The
> > -fast flag turns on "brutal" floating-point optimization, so some
> > operations may have less precision. Is it possible to verify that
> > floating-point operations still work correctly?

> That's a pretty good way to guarantee that you'll break the datetime
> code.

> It might be acceptable if you use --enable-integer-datetimes.

I suggest removing the mention of the -fast flag from FAQ_Solaris, or
adding a warning about its use.

Josh, do you have any suggestions for cc flags?

regards, Zdenek

#6 Luke Lonergan
LLonergan@greenplum.com
In reply to: Bruce Momjian (#3)
Re: horo(r)logy test fail on solaris (again and

I suspect the '-fast' introduced arithmetic associativity transformations that horology is sensitive to. I've seen this in the past.

The solution I used was to mod the Makefile to exclude the sensitive routines from the aggressive optimizations. As I recall, adt.c was the prime culprit.

- Luke
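
A small illustration of the associativity point, as a Python sketch rather than anything taken from the PostgreSQL sources: IEEE 754 addition is not associative, so an optimizer that is allowed to regroup operands can change a result. The three values are chosen only to make the effect visible.

```python
a, b, c = 1e16, -1e16, 1.0

as_written = (a + b) + c   # a and b cancel first, then 1.0 is added
regrouped = a + (b + c)    # 1.0 is absorbed into -1e16 before the cancellation

print(as_written)  # 1.0
print(regrouped)   # 0.0
```

Both groupings are mathematically equal, which is why a compiler in a relaxed floating-point mode may treat them as interchangeable.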

Msg is shrt cuz m on ma treo

-----Original Message-----
From: Bruce Momjian [mailto:bruce@momjian.us]
Sent: Tuesday, September 26, 2006 11:51 AM Eastern Standard Time
To: Zdenek Kotala
Cc: pgsql-hackers@postgresql.org; Tom Lane; Match.Grun@thomson.com
Subject: Re: [HACKERS] horo(r)logy test fail on solaris (again and

Zdenek Kotala wrote:

I tried regression test with Postgres Beta and horology test field. See
attached log. It appears few month ago - see
http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
I used Sun Studio 11 with -fast flag and SPARC platform.

Are you looking for ways to contort Solaris to make PostgreSQL fail?
That doesn't prove much about PostgreSQL, but rather about Solaris.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#7 Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Bruce Momjian (#3)
Re: horo(r)logy test fail on solaris (again and solved)

Bruce Momjian wrote:

> Zdenek Kotala wrote:

> > I ran the regression tests against the Postgres beta and the horology
> > test failed. See the attached log. The same failure was reported a few
> > months ago; see
> > http://archives.postgresql.org/pgsql-ports/2006-06/msg00004.php
> > I used Sun Studio 11 with the -fast flag on the SPARC platform.

> Are you looking for ways to contort Solaris to make PostgreSQL fail?
> That doesn't prove much about PostgreSQL, but rather about Solaris.

It is not about Solaris; it is about the setting recommended for Sun
Studio in FAQ_Solaris.

regards Zdenek

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#4)
Re: horo(r)logy test fail on solaris (again and solved)

Andrew Dunstan <andrew@dunslane.net> writes:

> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

> Doesn't this look odd regardless of what bad results come back from the
> FP library?

It looks exactly like the sort of platform-dependent rounding issue that
Bruce and Michael Glaesemann spent a lot of time on recently. It might
be interesting to see if CVS HEAD works any better under these
conditions ... but if it doesn't, that doesn't mean I'll be interested
in fixing it. Getting the float datetime code to work is hard enough
without having a compiler that thinks it can take shortcuts.

regards, tom lane
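
The arithmetic behind that diff line can be reproduced with a short Python sketch. It assumes a year of 365.25 days and a month of 30 days when seconds are split back into years, months, and days; those constants and the function name `decompose` are assumptions for illustration, not code copied from PostgreSQL. If the year quotient comes out one ulp below 6.0, whatever produced the error, its floor is 5, and the leftover full year is reported as 12 months, 5 days, and 6 hours.

```python
import math

SECS_PER_DAY = 86400
SECS_PER_MONTH = 30 * SECS_PER_DAY   # assumed 30-day month
SECS_PER_YEAR = 36525 * 864          # assumed 365.25-day year (31,557,600 s)


def decompose(total_secs, year_quotient):
    """Split seconds into (years, months, days, hours) for a given year quotient."""
    years = math.floor(year_quotient)
    rest = total_secs - years * SECS_PER_YEAR
    months, rest = divmod(rest, SECS_PER_MONTH)
    days, rest = divmod(rest, SECS_PER_DAY)
    return years, months, days, rest // 3600


six_years = 6 * SECS_PER_YEAR

# Exact quotient: the division yields 6.0.
exact = decompose(six_years, six_years / SECS_PER_YEAR)

# Quotient one ulp below 6.0, standing in for a slightly inexact division.
one_ulp_low = 6.0 - 2.0 ** -50
sloppy = decompose(six_years, one_ulp_low)

print(exact)   # (6, 0, 0, 0)
print(sloppy)  # (5, 12, 5, 6)
```

The second tuple matches the failing line in the attached diff, which is consistent with a last-bit rounding difference rather than a grossly wrong result.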

#9 Josh Berkus
josh@agliodbs.com
In reply to: Zdenek Kotala (#5)
Re: horo(r)logy test fail on solaris (again and solved)

Zdenek,

> > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > > But the question is whether the "-fast" flag is good for Postgres.
> > > The -fast flag turns on "brutal" floating-point optimization, so
> > > some operations may have less precision. Is it possible to verify
> > > that floating-point operations still work correctly?

> > That's a pretty good way to guarantee that you'll break the datetime
> > code.

> > It might be acceptable if you use --enable-integer-datetimes.

> I suggest removing the mention of the -fast flag from FAQ_Solaris, or
> adding a warning about its use.

> Josh, do you have any suggestions for cc flags?

Using Sun Studio? I'm hardly the expert. Maybe Jignesh?

--Josh Berkus

#10 Luke Lonergan
llonergan@greenplum.com
In reply to: Tom Lane (#8)
Re: horo(r)logy test fail on solaris (again and

Tom,

On 9/26/06 9:15 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

> Andrew Dunstan <andrew@dunslane.net> writes:

> > ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

> > Doesn't this look odd regardless of what bad results come back from the
> > FP library?

> It looks exactly like the sort of platform-dependent rounding issue that
> Bruce and Michael Glaesemann spent a lot of time on recently. It might
> be interesting to see if CVS HEAD works any better under these
> conditions ... but if it doesn't, that doesn't mean I'll be interested
> in fixing it. Getting the float datetime code to work is hard enough
> without having a compiler that thinks it can take shortcuts.

How about fixing the compilation so that the routines in adt that are
sensitive to FP optimizations are isolated from aggressive optimization?

- Luke

#11 Josh Berkus
josh@agliodbs.com
In reply to: Josh Berkus (#9)
Re: horo(r)logy test fail on solaris (again and solved)

Zdenek,

Hmmm ... we're not using the -fast option for the standard PostgreSQL
packages. Where did you start using it?

--
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

#12 Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Josh Berkus (#11)
Re: horo(r)logy test fail on solaris (again and solved)

Josh Berkus wrote:

> Zdenek,

> Hmmm ... we're not using the -fast option for the standard PostgreSQL
> packages. Where did you start using it?

Yes, I know. The -fast option generates architecture-dependent code, so
it cannot be used in common packages. I came across this option when
I analyzed BUG #2651. I ran the regression tests and they failed. I found
that the same problem was described by Match Grun a few months ago, and
that the -fast option is mentioned in FAQ_Solaris for performance tuning.

That is all.

regards Zdenek

#13 Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Andrew Dunstan (#4)
Re: horo(r)logy test fail on solaris (again and solved)

Andrew Dunstan wrote:

> Tom Lane wrote:

> > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > > But the question is whether the "-fast" flag is good for Postgres.
> > > The -fast flag turns on "brutal" floating-point optimization, so
> > > some operations may have less precision. Is it possible to verify
> > > that floating-point operations still work correctly?

> > That's a pretty good way to guarantee that you'll break the datetime
> > code.

> ! | @ 6 years | @ 5 years 12 mons 5 days 6 hours

> Doesn't this look odd regardless of what bad results come back from the
> FP library?

The problem arose because the -fast option was set only for the
compiler and not for the linker, so the linker picked up the wrong
version of the libraries. If -fast is set for both, the horology test
passes, but my question was whether the floating-point optimization
could cause other problems.

regards, Zdenek

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#13)
Re: horo(r)logy test fail on solaris (again and solved)

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> The problem arose because the -fast option was set only for the
> compiler and not for the linker, so the linker picked up the wrong
> version of the libraries. If -fast is set for both, the horology test
> passes, but my question was whether the floating-point optimization
> could cause other problems.

So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
LDFLAGS?

regards, tom lane

#15 Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#14)
Re: horo(r)logy test fail on solaris (again and solved)

Tom Lane wrote:

> Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > The problem arose because the -fast option was set only for the
> > compiler and not for the linker, so the linker picked up the wrong
> > version of the libraries. If -fast is set for both, the horology test
> > passes, but my question was whether the floating-point optimization
> > could cause other problems.

> So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> LDFLAGS?

Exactly, but I want to be sure that the floating-point optimization is
safe and should be applied to Postgres, because -fast breaks the IEEE 754
standard. If it is OK, I will adjust FAQ_Solaris.

Zdenek

#16 Kenneth Marshall
ktm@it.is.rice.edu
In reply to: Zdenek Kotala (#15)
Re: horo(r)logy test fail on solaris (again and solved)

On Wed, Sep 27, 2006 at 04:09:18PM +0200, Zdenek Kotala wrote:

> Tom Lane wrote:

> > Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

> > > The problem arose because the -fast option was set only for the
> > > compiler and not for the linker, so the linker picked up the wrong
> > > version of the libraries. If -fast is set for both, the horology
> > > test passes, but my question was whether the floating-point
> > > optimization could cause other problems.

> > So FAQ_Solaris needs to tell people to put -fast in both CFLAGS and
> > LDFLAGS?

> Exactly, but I want to be sure that the floating-point optimization is
> safe and should be applied to Postgres, because -fast breaks the IEEE 754
> standard. If it is OK, I will adjust FAQ_Solaris.

> Zdenek

Unless the packager understands the floating point usage of every
piece and module included and the effect that the -fast option will
have on them, please do not recommend it for anything but extremely
well tested dedicated use-cases. When it causes problems, it can
be terrible if the problems are not detected immediately. Massive
data corruption could occur.

Given these caveats, in a well tested use-case the -fast option can
squeeze a bit more from the CPU and could be used. I have had to
debug the fallout from the -fast option in other software in the
past. Let's just say, backups are a good thing.

I would vote not to recommend it without very strong cautions, similar
to what Sun includes in the compiler manual pages.

Ken

#17 Bruce Momjian
bruce@momjian.us
In reply to: Kenneth Marshall (#16)
Re: horo(r)logy test fail on solaris (again and

Thanks for the analysis. I have removed mention of the -fast option
from the Solaris FAQ.

--
Bruce Momjian bruce@momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +