Floating point comparison inconsistencies of the geometric types

Started by Emre Hasegelialmost 10 years ago46 messageshackers
Jump to latest
#1Emre Hasegeli
emre@hasegeli.com

There are those macros defined for the built-in geometric types:

#define EPSILON 1.0E-06

#define FPzero(A) (fabs(A) <= EPSILON)
#define FPeq(A,B) (fabs((A) - (B)) <= EPSILON)
#define FPne(A,B) (fabs((A) - (B)) > EPSILON)
#define FPlt(A,B) ((B) - (A) > EPSILON)
#define FPle(A,B) ((A) - (B) <= EPSILON)
#define FPgt(A,B) ((A) - (B) > EPSILON)
#define FPge(A,B) ((B) - (A) <= EPSILON)

with this warning:

* XXX These routines were not written by a numerical analyst.

Most of the geometric operators use those macros for comparison, but
those do not:

* polygon << polygon
* polygon &< polygon
* polygon &> polygon
* polygon >> polygon
* polygon <<| polygon
* polygon &<| polygon
* polygon |&> polygon
* polygon |>> polygon
* box @> point
* point <@ box
* lseg <@ box
* circle @> point
* point <@ circle

This is really a bug that needs to be fixed one way or another. I think
that it is better to fix it by removing the macros all together. I
am not sure how useful they are in practice. I haven't seen anyone
complaining about the above operators not using the macros. Though,
people often complain about the ones using the macros and the problems
caused by them.

The hackers evidently don't like the macros, either. That should be
why they are not used on the new operators. What annoys me most about
this situation is the inconsistency blocks demonstrating our indexes.
Because of this, we had to rip out point type support from BRIN
inclusion operator class which could be useful to PostGIS.

Fixing it has been discussed many times before [1]/messages/by-id/D90A5A6C612A39408103E6ECDD77B8290FD4E3@voyager.corporate.connx.com[2]/messages/by-id/4A7C2C4B.5020508@netspace.net.au[3]/messages/by-id/12549.1346111029@sss.pgh.pa.us[4]/messages/by-id/20150512181307.GJ2523@alvh.no-ip.org with
no result. Here is my plan to fix the situation covering the problems
around it:

1) Reimplement some operators to avoid divisions

The attached patch does it on some of the line operators.

2) Use exact comparison on everywhere except the operators fuzzy
comparison is certainly required

The attach patch changes all operators except some "lseg" operators.
"lseg" stores two points on a line. Most of the operations done on it
are lossy. I don't see a problem treating them differently, those
operators are very unlikely to be index supported.

3) Check numbers for underflow and overflow

I am thinking to use CHECKFLOATVAL on utils/adt/float.c, but it is not
exposed outside at the moment. Would it be okay to create a new header
file utils/float.h to increase code sharing between float and geometric
types?

4) Implement relative fuzzy comparison for the remaining operators

It is better to completely get rid of those macros while we are on it.
I think we can get away by implementing just a single function for fuzzy
equality, not for other comparisons. I am inclined to put it to
utils/adt/float.c. Is this a good idea?

Tom Lane commented on the function posted to the list [1]/messages/by-id/D90A5A6C612A39408103E6ECDD77B8290FD4E3@voyager.corporate.connx.com on 2002:

Not like that. Perhaps use a fraction of the absolute value of the
one with larger absolute value. As-is it's hard to tell how FLT_EPSILON
is measured.

I cannot image how the function would look like. I would appreciate
any guidance.

5) Implement default hash operator class for all geometric types

This would solve most complained problem of those types allowing them
to used with DISTINCT and GROUP BY.

6) Implement default btree operator class at least for the point type

This would let the point type to be used with ORDER BY. It is
relatively straight forward to implement it for the point type. Is it
a good idea to somehow implement it for other types?

7) Add GiST index support for missing cross type operators

Currently only contained by operators are supported by an out-of-range
strategy number. I think we can make the operator classes much nicer
by allowing really cross type operator families.

Comments?

[1]: /messages/by-id/D90A5A6C612A39408103E6ECDD77B8290FD4E3@voyager.corporate.connx.com
[2]: /messages/by-id/4A7C2C4B.5020508@netspace.net.au
[3]: /messages/by-id/12549.1346111029@sss.pgh.pa.us
[4]: /messages/by-id/20150512181307.GJ2523@alvh.no-ip.org

Attachments:

0001-geo-ops-fpcomp-v01.patchtext/x-diff; charset=utf-8; name=0001-geo-ops-fpcomp-v01.patchDownload+193-209
#2Robert Haas
robertmhaas@gmail.com
In reply to: Emre Hasegeli (#1)
Re: Floating point comparison inconsistencies of the geometric types

On Fri, May 27, 2016 at 6:43 AM, Emre Hasegeli <emre@hasegeli.com> wrote:

There are those macros defined for the built-in geometric types:

#define EPSILON 1.0E-06

#define FPzero(A) (fabs(A) <= EPSILON)
#define FPeq(A,B) (fabs((A) - (B)) <= EPSILON)
#define FPne(A,B) (fabs((A) - (B)) > EPSILON)
#define FPlt(A,B) ((B) - (A) > EPSILON)
#define FPle(A,B) ((A) - (B) <= EPSILON)
#define FPgt(A,B) ((A) - (B) > EPSILON)
#define FPge(A,B) ((B) - (A) <= EPSILON)

with this warning:

* XXX These routines were not written by a numerical analyst.

I agree that those macros looks like a pile of suck. It's unclear to
me what purpose they're trying to accomplish, but regardless of what
it is, it's hard for me to believe that they are accomplishing it.
Whether 1.0E-06 is a correct fuzz factor presumably depends greatly on
the scale of the number; e.g. if the values are all like "3" it's
probably fine but if they are like "37142816124856" it's probably not
enough fuzz and if they are all like ".00000004" it's probably way too
much fuzz.

Figuring out what to do about it is harder. Your proposal seems to be
to remove them except where we need the fuzzy behavior, which doesn't
sound unreasonable, but I don't personally understand why we need it
in some places and not others. It would be good if some of the people
who are more numerically inclined than I am (and hate floats less, but
then that's everyone) could jump in here.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Robert Haas (#2)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Jun 1, 2016 at 8:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, May 27, 2016 at 6:43 AM, Emre Hasegeli <emre@hasegeli.com> wrote:

There are those macros defined for the built-in geometric types:

#define EPSILON 1.0E-06

#define FPzero(A) (fabs(A) <= EPSILON)
#define FPeq(A,B) (fabs((A) - (B)) <= EPSILON)
#define FPne(A,B) (fabs((A) - (B)) > EPSILON)
#define FPlt(A,B) ((B) - (A) > EPSILON)
#define FPle(A,B) ((A) - (B) <= EPSILON)
#define FPgt(A,B) ((A) - (B) > EPSILON)
#define FPge(A,B) ((B) - (A) <= EPSILON)

with this warning:

* XXX These routines were not written by a numerical analyst.

I agree that those macros looks like a pile of suck.

+1

It's unclear to
me what purpose they're trying to accomplish, but regardless of what
it is, it's hard for me to believe that they are accomplishing it.
Whether 1.0E-06 is a correct fuzz factor presumably depends greatly on
the scale of the number; e.g. if the values are all like "3" it's
probably fine but if they are like "37142816124856" it's probably not
enough fuzz and if they are all like ".00000004" it's probably way too
much fuzz.

Also, it's a pretty lame heuristic. It doesn't really eliminate
*any* problems; it just aims to make them less frequent by shifting
the edges around. In doing so, it creates whole new classes of
problems:

test=# \set A '''(1.9999996,1.9999996),(1,1)''::box'
test=# \set B '''(2,2),(1,1)''::box'
test=# \set C '''(2.0000004,2.0000004),(1,1)''::box'
test=# select :A = :B, :B = :C, :A = :C;
?column? | ?column? | ?column?
----------+----------+----------
t | t | f
(1 row)

Is the benefit we get from the macros worth destroying the
transitive property of the comparison operators on these types?

Figuring out what to do about it is harder. Your proposal seems to be
to remove them except where we need the fuzzy behavior, which doesn't
sound unreasonable, but I don't personally understand why we need it
in some places and not others.

+1

My first inclination is to remove those macros in version 10, but
it would be good to hear from some people using these types on what
the impact of that would be.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#3)
Re: Floating point comparison inconsistencies of the geometric types

Kevin Grittner <kgrittn@gmail.com> writes:

On Wed, Jun 1, 2016 at 8:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Figuring out what to do about it is harder. Your proposal seems to be
to remove them except where we need the fuzzy behavior, which doesn't
sound unreasonable, but I don't personally understand why we need it
in some places and not others.

+1

My first inclination is to remove those macros in version 10, but
it would be good to hear from some people using these types on what
the impact of that would be.

As I understand it, the key problem is that tests like "is point on line"
would basically never succeed except in the most trivial cases, because of
roundoff error. That's not very nice, and it might cascade to larger
problems like object-containment tests failing unexpectedly. We would
need to go through all the geometric operations and figure out where that
kind of gotcha is significant and what we can do about it. Seems like a
fair amount of work :-(. If somebody's willing to do that kind of
investigation, then sure, but I don't think just blindly removing these
macros is going to lead to anything good.

Also, I suppose this means that Robert promises not to make any of his
usual complaints about breaking compatibility? Because we certainly
would be.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#4)
Re: Floating point comparison inconsistencies of the geometric types

On 6/1/16 9:27 AM, Tom Lane wrote:

Kevin Grittner <kgrittn@gmail.com> writes:

On Wed, Jun 1, 2016 at 8:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:

Figuring out what to do about it is harder. Your proposal seems to be
to remove them except where we need the fuzzy behavior, which doesn't
sound unreasonable, but I don't personally understand why we need it
in some places and not others.

+1
My first inclination is to remove those macros in version 10, but
it would be good to hear from some people using these types on what
the impact of that would be.

As I understand it, the key problem is that tests like "is point on line"
would basically never succeed except in the most trivial cases, because of
roundoff error. That's not very nice, and it might cascade to larger
problems like object-containment tests failing unexpectedly. We would
need to go through all the geometric operations and figure out where that
kind of gotcha is significant and what we can do about it. Seems like a
fair amount of work :-(. If somebody's willing to do that kind of
investigation, then sure, but I don't think just blindly removing these
macros is going to lead to anything good.

I suspect another wrinkle here is that in the GIS world a single point
can be represented it multiple reference/coordinate systems, and it
would have different values in each of them. AIUI the transforms between
those systems can be rather complicated if different projection methods
are involved. I don't know if PostGIS depends on what these macros are
doing or not. If it doesn't, perhaps it would be sufficient to lop of
the last few bits of the significand. ISTM that'd be much better than
what the macros currently do.

BTW, I suspect the macro values were chosen specifically for dealing
with LAT/LONG.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Jim Nasby (#5)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Jun 1, 2016 at 7:52 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

I suspect another wrinkle here is that in the GIS world a single point can
be represented it multiple reference/coordinate systems, and it would have
different values in each of them. AIUI the transforms between those systems
can be rather complicated if different projection methods are involved. I
don't know if PostGIS depends on what these macros are doing or not. If it
doesn't, perhaps it would be sufficient to lop of the last few bits of the
significand. ISTM that'd be much better than what the macros currently do.

We don't depend on these, we have our own :/
The real answer for a GIS system is to have an explicit tolerance
parameter for calculations like distance/touching/containment, but
unfortunately we didn't do that so now we have our own
compatibility/boil the ocean problem if we ever wanted/were funded to
add one.
P.

BTW, I suspect the macro values were chosen specifically for dealing with
LAT/LONG.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Joe Conway
mail@joeconway.com
In reply to: Jim Nasby (#5)
Re: Floating point comparison inconsistencies of the geometric types

On 06/01/2016 07:52 AM, Jim Nasby wrote:

On 6/1/16 9:27 AM, Tom Lane wrote:

Kevin Grittner <kgrittn@gmail.com> writes:

On Wed, Jun 1, 2016 at 8:08 AM, Robert Haas <robertmhaas@gmail.com>

wrote:

Figuring out what to do about it is harder. Your proposal seems

to be

to remove them except where we need the fuzzy behavior, which

doesn't

sound unreasonable, but I don't personally understand why we need it
in some places and not others.

+1
My first inclination is to remove those macros in version 10, but
it would be good to hear from some people using these types on what
the impact of that would be.

As I understand it, the key problem is that tests like "is point on line"
would basically never succeed except in the most trivial cases,
because of
roundoff error. That's not very nice, and it might cascade to larger
problems like object-containment tests failing unexpectedly. We would
need to go through all the geometric operations and figure out where that
kind of gotcha is significant and what we can do about it. Seems like a
fair amount of work :-(. If somebody's willing to do that kind of
investigation, then sure, but I don't think just blindly removing these
macros is going to lead to anything good.

I suspect another wrinkle here is that in the GIS world a single point
can be represented it multiple reference/coordinate systems, and it
would have different values in each of them. AIUI the transforms between
those systems can be rather complicated if different projection methods
are involved. I don't know if PostGIS depends on what these macros are
doing or not. If it doesn't, perhaps it would be sufficient to lop of
the last few bits of the significand. ISTM that'd be much better than
what the macros currently do.

IIRC PostGIS uses a function from libgeos to do things like "point
equals" (ST_Equals). I've never looked at that source, but would be
unsurprised to find that it does something similar to this although
probably also more sophisticated.

(looks) yes -- ST_Equals calls GEOSEquals() after some sanity checking...

BTW, I suspect the macro values were chosen specifically for dealing
with LAT/LONG.

I think that is it exactly.

Joe

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

#8Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Paul Ramsey (#6)
Re: Floating point comparison inconsistencies of the geometric types

On 6/1/16 10:03 AM, Paul Ramsey wrote:

We don't depend on these, we have our own :/
The real answer for a GIS system is to have an explicit tolerance
parameter for calculations like distance/touching/containment, but
unfortunately we didn't do that so now we have our own
compatibility/boil the ocean problem if we ever wanted/were funded to
add one.

Well it sounds like what's currently happening in Postgres is probably
going to change, so how might we structure that to help PostGIS? Would
simply lopping off the last few bits of the significand/mantissa work,
or is that not enough when different GRSes are involved?
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Jim Nasby (#8)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Jun 1, 2016 at 8:59 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

On 6/1/16 10:03 AM, Paul Ramsey wrote:

We don't depend on these, we have our own :/
The real answer for a GIS system is to have an explicit tolerance
parameter for calculations like distance/touching/containment, but
unfortunately we didn't do that so now we have our own
compatibility/boil the ocean problem if we ever wanted/were funded to
add one.

Well it sounds like what's currently happening in Postgres is probably going
to change, so how might we structure that to help PostGIS? Would simply
lopping off the last few bits of the significand/mantissa work, or is that
not enough when different GRSes are involved?

PostGIS doesn't look at all at what the PgSQL geotypes do, so go
forward w/o fear. Tolerance in geo world is more than vertex rounding
though, it's things like saying that when distance(pt,line) < epsilon
then distance(pt,line) == 0, or similarly for shape touching, etc. One
of the things people find annoying about postgis is that

ST_Intersects(ST_Intersection(a, b), a) can come out as false (a
derived point at a crossing of lines may not exactly intersect either
of the input lines), which is a direct result of our use of exact math
for the boolean intersects test.

Anyways, go forth and do whatever makes sense for PgSQL

P

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Paul Ramsey (#9)
Re: Floating point comparison inconsistencies of the geometric types

Paul Ramsey <pramsey@cleverelephant.ca> writes:

One of the things people find annoying about postgis is that
ST_Intersects(ST_Intersection(a, b), a) can come out as false (a
derived point at a crossing of lines may not exactly intersect either
of the input lines), which is a direct result of our use of exact math
for the boolean intersects test.

That's an interesting comment, because it's more or less exactly the
type of failure that we could expect to get if we remove fuzzy comparisons
from the built-in types. How much of a problem is it in practice for
PostGIS users? Do you have any plans to change it?

Anyways, go forth and do whatever makes sense for PgSQL

I think we're trying to figure out what that is ...

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#4)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Jun 1, 2016 at 10:27 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

As I understand it, the key problem is that tests like "is point on line"
would basically never succeed except in the most trivial cases, because of
roundoff error. That's not very nice, and it might cascade to larger
problems like object-containment tests failing unexpectedly. We would
need to go through all the geometric operations and figure out where that
kind of gotcha is significant and what we can do about it. Seems like a
fair amount of work :-(. If somebody's willing to do that kind of
investigation, then sure, but I don't think just blindly removing these
macros is going to lead to anything good.

Yeah, it does seem to need some research.

Also, I suppose this means that Robert promises not to make any of his
usual complaints about breaking compatibility? Because we certainly
would be.

Pot, meet Mr. Kettle!

Obviously, the inconvenience caused by any backward incompatibility
has to be balanced against the fact that the new behavior is
presumably better. But I stridently object to the accusation that of
the two of us I'm the one more concerned with backward-compatibility.
There may be some instances where I've had a more conservative
judgement than you about breaking user-facing stuff, but you've
blocked dozens of changes to the C API that would have enabled
meaningful extension development on the grounds that somebody might
complain when a future release changes the API! I think behavior
changes that users will notice are of vastly greater significance than
those which will only be observed by developers.

In this particular case, I think that the current behavior is pretty
stupid, and that the built-in geometric types are barely used,
possibly because they have stupid behavior. So I would be willing to
bet on a well-thought-out change in this area coming out to a net
positive.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Emre Hasegeli
emre@hasegeli.com
In reply to: Emre Hasegeli (#1)
Re: Floating point comparison inconsistencies of the geometric types

My progress so far is attached as 2 patches. First one introduces an
header file for adt/float.c. Second one refactors the geometric
operations.

I have removed the fuzzy comparison macros all together. It is very
hard to keep some of them and maintain consistency. Many operators
that would benefit from the fuzzy comparison is quite broken at the
moment [1]/messages/by-id/CAE2gYzw_-z=V2kh8QqFjenu=8MJXzOP44wRW=AzzeamrmTT1=Q@mail.gmail.com, anyway. I have also made the comparisons safe against NaN
[2]: /messages/by-id/28685.1468246504@sss.pgh.pa.us

[1]: /messages/by-id/CAE2gYzw_-z=V2kh8QqFjenu=8MJXzOP44wRW=AzzeamrmTT1=Q@mail.gmail.com
[2]: /messages/by-id/28685.1468246504@sss.pgh.pa.us

Attachments:

0001-float-header-v01.patchapplication/octet-stream; name=0001-float-header-v01.patchDownload+246-296
0002-geo-ops-fpcomp-v02.patchapplication/octet-stream; name=0002-geo-ops-fpcomp-v02.patchDownload+944-1124
#13Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Emre Hasegeli (#12)
Re: Floating point comparison inconsistencies of the geometric types

On Mon, Jul 18, 2016 at 3:54 PM, Emre Hasegeli <emre@hasegeli.com> wrote:

My progress so far is attached as 2 patches. First one introduces an
header file for adt/float.c. Second one refactors the geometric
operations.

The first patch fails to apply due to bit-rot. That's easy enough
to correct, but then it runs into warnings on make:

btree_gin.c: In function ‘leftmostvalue_float4’:
btree_gin.c:234:2: error: implicit declaration of function
‘get_float4_infinity’ [-Werror=implicit-function-declaration]
return Float4GetDatum(-get_float4_infinity());
^
btree_gin.c: In function ‘leftmostvalue_float8’:
btree_gin.c:242:2: error: implicit declaration of function
‘get_float8_infinity’ [-Werror=implicit-function-declaration]
return Float8GetDatum(-get_float8_infinity());
^

Please fix.

Something to consider before posting new version -- should we
change some of those macros to static inline (in the .h files) to
avoid double-evaluation hazards? They might perform as well or
even better that way, and remove a subtle programmer foot-gun.

Changing status to "Waiting on Author".

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Emre Hasegeli
emre@hasegeli.com
In reply to: Kevin Grittner (#13)
Re: Floating point comparison inconsistencies of the geometric types

The first patch fails to apply due to bit-rot. That's easy enough
to correct, but then it runs into warnings on make:

Rebased and fixed.

Something to consider before posting new version -- should we
change some of those macros to static inline (in the .h files) to
avoid double-evaluation hazards? They might perform as well or
even better that way, and remove a subtle programmer foot-gun.

One reason to keep them as macros is that they are currently
supporting both float4 and float8. Currently they look like this:

FLOAT_EQ(val1, val2)
FLOAT_PL(result, val1, val2)

I guess if we would use inline functions, they would look like:

FLOAT8_EQ(val1, val2)
result = FLOAT8_PL(val1, val2)

Result would need to be defined again in the function to be checked
for overflow. I think this way would be more developer-friendly. I
have gone some trouble to allocate the result to be able to use the
macros. Do you know if the performance of both would be the same?

I am not much experienced in C. If you think that inline functions
are better suited, I can rework the patches.

Attachments:

0001-float-header-v02.patchapplication/octet-stream; name=0001-float-header-v02.patchDownload+250-300
0002-geo-ops-fpcomp-v02.patchapplication/octet-stream; name=0002-geo-ops-fpcomp-v02.patchDownload+944-1124
#15Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Emre Hasegeli (#14)
Re: Floating point comparison inconsistencies of the geometric types

On Sun, Sep 4, 2016 at 12:42 PM, Emre Hasegeli <emre@hasegeli.com> wrote:

The first patch fails to apply due to bit-rot. That's easy enough
to correct, but then it runs into warnings on make:

Rebased and fixed.

These patches apply and build on top of 5c609a74 with no problems,
but `make check` finds differences per the attached. Please
investigate why the regression tests are failing and what the
appropriate response is.

Something to consider before posting new version -- should we
change some of those macros to static inline (in the .h files) to
avoid double-evaluation hazards? They might perform as well or
even better that way, and remove a subtle programmer foot-gun.

One reason to keep them as macros is that they are currently
supporting both float4 and float8. Currently they look like this:

FLOAT_EQ(val1, val2)
FLOAT_PL(result, val1, val2)

I guess if we would use inline functions, they would look like:

FLOAT8_EQ(val1, val2)
result = FLOAT8_PL(val1, val2)

Result would need to be defined again in the function to be checked
for overflow. I think this way would be more developer-friendly. I
have gone some trouble to allocate the result to be able to use the
macros. Do you know if the performance of both would be the same?

In one case where I was concerned that a static inline definition
would not perform as well as a macro, I went so far as to compare
the compiled code for both at a number of call sites in an
optimized build and found that a reasonably current gcc generated
*identical* code for both. Of course, this will not always be the
case -- in particular for macros which require multiple evaluations
of one or more parameters and they are not simple literals. In
such cases, the static inline often benchmarks faster because the
results from the potentially expensive expression can be stored on
the stack or in a register, and just referenced again.

I am not much experienced in C. If you think that inline functions
are better suited, I can rework the patches.

I suspect that they will be as fast or faster, and they eliminate
the hazard of multiple evaluation, where a programmer might not be
aware of the multiple evaluation or of some side-effect of an
argument.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

regression.diffsapplication/octet-stream; name=regression.diffsDownload+18-18
#16Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#15)
Re: Floating point comparison inconsistencies of the geometric types

On Fri, Sep 9, 2016 at 4:25 PM, Kevin Grittner <kgrittn@gmail.com> wrote:

On Sun, Sep 4, 2016 at 12:42 PM, Emre Hasegeli <emre@hasegeli.com> wrote:

These patches apply and build on top of 5c609a74 with no problems,
but `make check` finds differences per the attached. Please
investigate why the regression tests are failing and what the
appropriate response is.

I am not much experienced in C. If you think that inline functions
are better suited, I can rework the patches.

I suspect that they will be as fast or faster, and they eliminate
the hazard of multiple evaluation, where a programmer might not be
aware of the multiple evaluation or of some side-effect of an
argument.

Emre, are you going to address the above? It would have to be Real
Soon Now.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Emre Hasegeli
emre@hasegeli.com
In reply to: Kevin Grittner (#16)
Re: Floating point comparison inconsistencies of the geometric types

Emre, are you going to address the above? It would have to be Real
Soon Now.

Yes, I am working on it. I found more problems, replaced more
algorithms. That took a lot of time. I will post the new version
really soon. I wouldn't feel bad, if you wouldn't have enough time to
review it in this commitfest.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Emre Hasegeli
emre@hasegeli.com
In reply to: Kevin Grittner (#15)
Re: Floating point comparison inconsistencies of the geometric types

These patches apply and build on top of 5c609a74 with no problems,
but `make check` finds differences per the attached. Please
investigate why the regression tests are failing and what the
appropriate response is.

I fixed the first one and workaround the second with COLLATE "C". I
have how my changes caused this regression.

"select_views" test runs "SELECT name, #thepath FROM iexit ORDER BY 1,
2" and expects to get rows in this order:

I- 580 Ramp | 8
I- 580/I-680 Ramp | 2

With the collation on my laptop, this is actually true:

regression=# select 'I- 580/I-680 Ramp' < 'I- 580 Ramp';
?column?
----------
t
(1 row)

However, on the Linux server, I am testing it is not:

regression=# select 'I- 580 Ramp' < 'I- 580/I-680 Ramp';
?column?
----------
f
(1 row)

Do you know how it is not failing on the master?

I suspect that they will be as fast or faster, and they eliminate
the hazard of multiple evaluation, where a programmer might not be
aware of the multiple evaluation or of some side-effect of an
argument.

I reworked the the patches to use inline functions and fixed the
problems I found. The new versions are attached.

Attachments:

0001-float-header-v03.patchapplication/octet-stream; name=0001-float-header-v03.patchDownload+554-523
0002-geo-ops-fpcomp-v03.patchapplication/octet-stream; name=0002-geo-ops-fpcomp-v03.patchDownload+890-954
#19Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Emre Hasegeli (#18)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Sep 28, 2016 at 12:02 PM, Emre Hasegeli <emre@hasegeli.com> wrote:

`make check` finds differences per the attached. Please
investigate why the regression tests are failing and what the
appropriate response is.

I fixed the first one and workaround the second with COLLATE "C". I
have how my changes caused this regression.

"select_views" test runs "SELECT name, #thepath FROM iexit ORDER BY 1,
2" and expects to get rows in this order:

I- 580 Ramp | 8
I- 580/I-680 Ramp | 2

With the collation on my laptop, this is actually true:

regression=# select 'I- 580/I-680 Ramp' < 'I- 580 Ramp';
?column?
----------
t
(1 row)

However, on the Linux server, I am testing it is not:

regression=# select 'I- 580 Ramp' < 'I- 580/I-680 Ramp';
?column?
----------
f
(1 row)

Do you know how it is not failing on the master?

Well, those two results are not contradictory -- notice that you
switched the order of the values in the comparison. I don't think
you've really found the explanation yet.

[discussing inline static functions compared to macros for min()/max(), etc.]
I suspect that they will be as fast or faster, and they eliminate
the hazard of multiple evaluation, where a programmer might not be
aware of the multiple evaluation or of some side-effect of an
argument.

I reworked the the patches to use inline functions and fixed the
problems I found. The new versions are attached.

Will take a look and post again.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Kevin Grittner (#19)
Re: Floating point comparison inconsistencies of the geometric types

On Wed, Sep 28, 2016 at 2:04 PM, Kevin Grittner <kgrittn@gmail.com> wrote:

Will take a look and post again.

I am moving this patch to the next CF. You'll be hearing from me
sometime after this CF is closed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Emre Hasegeli
emre@hasegeli.com
In reply to: Kevin Grittner (#19)
#22Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#21)
#23Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#22)
#24Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#23)
#25Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#24)
#26Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#24)
#27Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#25)
#28Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#27)
#29Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#28)
#30Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#29)
#31Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#30)
#32Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#31)
#33Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#31)
#34Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#33)
#35Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#34)
#36Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Kyotaro Horiguchi (#35)
#37Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#36)
#38Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#37)
#39Emre Hasegeli
emre@hasegeli.com
In reply to: Kyotaro Horiguchi (#38)
#40Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Emre Hasegeli (#39)
#41Michael Paquier
michael@paquier.xyz
In reply to: Kyotaro Horiguchi (#40)
#42Robert Haas
robertmhaas@gmail.com
In reply to: Emre Hasegeli (#39)
#43Emre Hasegeli
emre@hasegeli.com
In reply to: Robert Haas (#42)
#44Robert Haas
robertmhaas@gmail.com
In reply to: Emre Hasegeli (#43)
#45Emre Hasegeli
emre@hasegeli.com
In reply to: Robert Haas (#44)
#46David Steele
david@pgmasters.net
In reply to: Emre Hasegeli (#45)