Strange behavior with polygon and NaN

Started by Jesse Zhangalmost 6 years ago31 messageshackers

sbjesse@gmail.com

almost 6 years ago

Hi hackers,

While working with Chris Hajas on merging Postgres 12 with Greenplum
Database we stumbled upon the following strange behavior in the geometry
type polygon:

------ >8 --------

CREATE TEMP TABLE foo (p point);
CREATE INDEX ON foo USING gist(p);

INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');

SELECT $q$
SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
$q$ AS qry \gset

BEGIN;
SAVEPOINT yolo;
SET LOCAL enable_seqscan TO off;
:qry;

ROLLBACK TO SAVEPOINT yolo;
SET LOCAL enable_indexscan TO off;
SET LOCAL enable_bitmapscan TO off;
:qry;

------ 8< --------

If you run the above repro SQL in HEAD (and 12, and likely all older
versions), you get the following output:

CREATE TABLE
CREATE INDEX
INSERT 0 3
BEGIN
SAVEPOINT
SET
p
-------
(0,0)
(1,1)
(2 rows)

ROLLBACK
SET
SET
p
-----------
(0,0)
(1,1)
(NaN,NaN)
(3 rows)

At first glance, you'd think this is the gist AM's bad, but on a second
thought, something else is strange here. The following query returns
true:

SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'

The above behavior of the "contained in" operator is surprising, and
it's probably not what the GiST AM is expecting. I took a look at
point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
distnace operator for polygon. It gives the following interesting
output:

SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
FROM (
SELECT circle(point(100 * i, 'NaN'), 50) AS c
FROM generate_series(-2, 4) i
) t(c)
ORDER BY 2;

c | distance
-----------------+----------
<(-200,NaN),50> | 0
<(-100,NaN),50> | 0
<(0,NaN),50> | 0
<(100,NaN),50> | 0
<(200,NaN),50> | NaN
<(300,NaN),50> | NaN
<(400,NaN),50> | NaN
(7 rows)

Should they all be NaN? Am I alone in thinking the index is right but
the operators are wrong? Or should we call the indexes wrong here?

Cheers,
Jesse and Chris

Bruce Momjian

bruce@momjian.us

almost 6 years ago

In reply to: Jesse Zhang (#1)

Re: Strange behavior with polygon and NaN

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

---------------------------------------------------------------------------

On Wed, Jun 24, 2020 at 03:11:03PM -0700, Jesse Zhang wrote:

Hi hackers,

While working with Chris Hajas on merging Postgres 12 with Greenplum
Database we stumbled upon the following strange behavior in the geometry
type polygon:

------ >8 --------

CREATE TEMP TABLE foo (p point);
CREATE INDEX ON foo USING gist(p);

INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');

SELECT $q$
SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
$q$ AS qry \gset

BEGIN;
SAVEPOINT yolo;
SET LOCAL enable_seqscan TO off;
:qry;

ROLLBACK TO SAVEPOINT yolo;
SET LOCAL enable_indexscan TO off;
SET LOCAL enable_bitmapscan TO off;
:qry;

------ 8< --------

If you run the above repro SQL in HEAD (and 12, and likely all older
versions), you get the following output:

CREATE TABLE
CREATE INDEX
INSERT 0 3
BEGIN
SAVEPOINT
SET
p
-------
(0,0)
(1,1)
(2 rows)

ROLLBACK
SET
SET
p
-----------
(0,0)
(1,1)
(NaN,NaN)
(3 rows)

At first glance, you'd think this is the gist AM's bad, but on a second
thought, something else is strange here. The following query returns
true:

SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'

The above behavior of the "contained in" operator is surprising, and
it's probably not what the GiST AM is expecting. I took a look at
point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
distnace operator for polygon. It gives the following interesting
output:

SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
FROM (
SELECT circle(point(100 * i, 'NaN'), 50) AS c
FROM generate_series(-2, 4) i
) t(c)
ORDER BY 2;

c | distance
-----------------+----------
<(-200,NaN),50> | 0
<(-100,NaN),50> | 0
<(0,NaN),50> | 0
<(100,NaN),50> | 0
<(200,NaN),50> | NaN
<(300,NaN),50> | NaN
<(400,NaN),50> | NaN
(7 rows)

Should they all be NaN? Am I alone in thinking the index is right but
the operators are wrong? Or should we call the indexes wrong here?

Cheers,
Jesse and Chris

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EnterpriseDB https://enterprisedb.com

The usefulness of a cup is in its emptiness, Bruce Lee

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 6 years ago

In reply to: Bruce Momjian (#2)

Re: Strange behavior with polygon and NaN

At Tue, 25 Aug 2020 19:03:50 -0400, Bruce Momjian <bruce@momjian.us> wrote in

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

I agree to that the behavior is broken.

---------------------------------------------------------------------------

On Wed, Jun 24, 2020 at 03:11:03PM -0700, Jesse Zhang wrote:

Hi hackers,

While working with Chris Hajas on merging Postgres 12 with Greenplum
Database we stumbled upon the following strange behavior in the geometry
type polygon:

------ >8 --------

CREATE TEMP TABLE foo (p point);
CREATE INDEX ON foo USING gist(p);

INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN');

SELECT $q$
SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'
$q$ AS qry \gset

BEGIN;
SAVEPOINT yolo;
SET LOCAL enable_seqscan TO off;
:qry;

ROLLBACK TO SAVEPOINT yolo;
SET LOCAL enable_indexscan TO off;
SET LOCAL enable_bitmapscan TO off;
:qry;

------ 8< --------

If you run the above repro SQL in HEAD (and 12, and likely all older
versions), you get the following output:

CREATE TABLE
CREATE INDEX
INSERT 0 3
BEGIN
SAVEPOINT
SET
p
-------
(0,0)
(1,1)
(2 rows)

ROLLBACK
SET
SET
p
-----------
(0,0)
(1,1)
(NaN,NaN)
(3 rows)

At first glance, you'd think this is the gist AM's bad, but on a second
thought, something else is strange here. The following query returns
true:

SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)'

The above behavior of the "contained in" operator is surprising, and
it's probably not what the GiST AM is expecting. I took a look at
point_inside() in geo_ops.c, and it doesn't seem well equipped to handle
NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the
distnace operator for polygon. It gives the following interesting
output:

SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance
FROM (
SELECT circle(point(100 * i, 'NaN'), 50) AS c
FROM generate_series(-2, 4) i
) t(c)
ORDER BY 2;

c | distance
-----------------+----------
<(-200,NaN),50> | 0
<(-100,NaN),50> | 0
<(0,NaN),50> | 0
<(100,NaN),50> | 0
<(200,NaN),50> | NaN
<(300,NaN),50> | NaN
<(400,NaN),50> | NaN
(7 rows)

Should they all be NaN? Am I alone in thinking the index is right but
the operators are wrong? Or should we call the indexes wrong here?

There may be other places that NaN is not fully considered. For
example, the following (perpendicular op) returns true.

SELECT lseg '[(0,0),(0,NaN)]' ?-| lseg '[(0,0),(10,0)]';

It is quite hard to fix all of the defect..

For the above cases, it's enough to make sure that point_inside don't
pass NaN's to lseg_crossing(), but it's much of labor to fill all
holes reasonable and principled way..

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Tom Lane

tgl@sss.pgh.pa.us

almost 6 years ago

In reply to: Kyotaro Horiguchi (#3)

Re: Strange behavior with polygon and NaN

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

At Tue, 25 Aug 2020 19:03:50 -0400, Bruce Momjian <bruce@momjian.us> wrote in

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

I agree to that the behavior is broken.

Yeah, but ... what is "non broken" in this case? I'm not convinced
that having point_inside() return zero for any case involving NaN
is going to lead to noticeably saner behavior than today.

regards, tom lane

Kyotaro Horiguchi

horikyota.ntt@gmail.com

almost 6 years ago

In reply to: Tom Lane (#4)

Re: Strange behavior with polygon and NaN

At Wed, 26 Aug 2020 08:18:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

At Tue, 25 Aug 2020 19:03:50 -0400, Bruce Momjian <bruce@momjian.us> wrote in

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

I agree to that the behavior is broken.

Yeah, but ... what is "non broken" in this case? I'm not convinced
that having point_inside() return zero for any case involving NaN
is going to lead to noticeably saner behavior than today.

Yes, just doing that leaves many unfixed behavior come from NaNs, but
at least it seems to me one of sane definition candidates that a point
cannot be inside a polygon when NaN is involved. It's similar to
Fpxx() returns false if NaN is involved. As mentioned, I had't fully
checked and haven't considered this seriously, but I changed my mind
to check all the callers. I started checking all the callers of
point_inside, then finally I had to check all functions in geo_ops.c:(

The attached is the result as of now.

=== Resulting behavior

With the attached:

- All boolean functions return false if NaN is involved.
- All float8 functions return NaN if NaN is involved.
- All geometric arithmetics return NaN as output if NaN is involved.

With some exceptions:
- line_eq: needs to consider that NaNs are equal each other.
- point_eq/ne (point_eq_pint): ditto
- lseg_eq/ne: ditto

The change makes some difference in the regression test.
For example,

<obj containing NaN> <-> <any obj> changed from 0 to NaN. (distance)
<obj containing NaN> <@ <any obj> changed from true to false. (contained)
<obj containing NaN> <-> <any obj> changed from 0 to NaN. (distance)
<obj containing NaN> ?# <any obj> changed from true to false (overlaps)

=== pg_hypot mistake?

I noticed that pg_hypot returns inf for the parameters (NaN, Inf) but
I think NaN is appropriate here since other operators behaves that
way. This change causes a change of distance between point(1e+300,Inf)
and line (1,-1,0) from infinity to NaN, which I think is correct
because the arithmetic generates NaN as an intermediate value.

=== Infinity annoyances

Infinity makes some not-great changes in regresssion results. For example:

- point '(1e+300,Infinity)' <-> path '((10,20))' returns
NaN(previously Infinity), but point '(1e+300,Infinity)' <-> path
'[(1,2),(3,4)]' returns Infinity. The difference of the two
expressions is whether (0 * Inf = NaN) is performed or not. The
former performs it but that was concealed by not propagating NaN to
upper layer without the patch.

- Without the patch, point '(1e+300,Infinity)' ## box '(2,2),(0,0)'
generates '(0,2)', which is utterly wrong. It is because
box_closept_point evaluates float8_lt(Inf, NaN) as true(!) and sets
the wrong point for distance=NaN is set. With the patch, the NaN
makes the result NULL.

- This is not a difference caused by this patch, but for both patched
and unpatched, point '(1e+300,Inf)' <-> line '{3,0,0}' returns NaN,
which should be 1e+300. However, the behavior comes from arithmetic
reasons and wouldn't be a problem.

create_index.out is changed since point(NaN,NaN) <@ polygon changed
from true to false, which seems rather saner.

I haven't checked unchanged results but at least changed results seems
saner to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Georgios Kokolatos

gkokolatos@protonmail.com

over 5 years ago

In reply to: Kyotaro Horiguchi (#5)

Re: Strange behavior with polygon and NaN

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, 27 August 2020 14:24, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Wed, 26 Aug 2020 08:18:49 -0400, Tom Lane tgl@sss.pgh.pa.us wrote in

Kyotaro Horiguchi horikyota.ntt@gmail.com writes:

At Tue, 25 Aug 2020 19:03:50 -0400, Bruce Momjian bruce@momjian.us wrote in

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

I agree to that the behavior is broken.

Yeah, but ... what is "non broken" in this case? I'm not convinced
that having point_inside() return zero for any case involving NaN
is going to lead to noticeably saner behavior than today.

Yes, just doing that leaves many unfixed behavior come from NaNs, but
at least it seems to me one of sane definition candidates that a point
cannot be inside a polygon when NaN is involved. It's similar to
Fpxx() returns false if NaN is involved. As mentioned, I had't fully
checked and haven't considered this seriously, but I changed my mind
to check all the callers. I started checking all the callers of
point_inside, then finally I had to check all functions in geo_ops.c:(

For what is worth, I agree with this definition.

The attached is the result as of now.

=== Resulting behavior

With the attached:

- All boolean functions return false if NaN is involved.
- All float8 functions return NaN if NaN is involved.
- All geometric arithmetics return NaN as output if NaN is involved.

Agreed! As in both this behaviour conforms to the definition above and the patch provides this behaviour with the exceptions below.

With some exceptions:

- line_eq: needs to consider that NaNs are equal each other.
- point_eq/ne (point_eq_pint): ditto
- lseg_eq/ne: ditto

The change makes some difference in the regression test.
For example,

<obj containing NaN> <-> <any obj> changed from 0 to NaN. (distance)

<obj containing NaN> <@ <any obj> changed from true to false. (contained)
<obj containing NaN> <-> <any obj> changed from 0 to NaN. (distance)
<obj containing NaN> ?# <any obj> changed from true to false (overlaps)

=== pg_hypot mistake?

I noticed that pg_hypot returns inf for the parameters (NaN, Inf) but
I think NaN is appropriate here since other operators behaves that
way. This change causes a change of distance between point(1e+300,Inf)
and line (1,-1,0) from infinity to NaN, which I think is correct
because the arithmetic generates NaN as an intermediate value.

=== Infinity annoyances

Infinity makes some not-great changes in regresssion results. For example:

- point '(1e+300,Infinity)' <-> path '((10,20))' returns
NaN(previously Infinity), but point '(1e+300,Infinity)' <-> path
'[(1,2),(3,4)]' returns Infinity. The difference of the two
expressions is whether (0 * Inf = NaN) is performed or not. The
former performs it but that was concealed by not propagating NaN to
upper layer without the patch.

Although I understand the reasoning for this change. I am not certain I agree with the result. I feel that:
point '(1e+300,Infinity)' <-> path '((10,20))'
should return Infinity. Even if I am wrong to think that, the two results should be expected to behave the same. Am I wrong on that too?

- Without the patch, point '(1e+300,Infinity)' ## box '(2,2),(0,0)'
generates '(0,2)', which is utterly wrong. It is because
box_closept_point evaluates float8_lt(Inf, NaN) as true(!) and sets
the wrong point for distance=NaN is set. With the patch, the NaN
makes the result NULL.

Agreed.

- This is not a difference caused by this patch, but for both patched
and unpatched, point '(1e+300,Inf)' <-> line '{3,0,0}' returns NaN,
which should be 1e+300. However, the behavior comes from arithmetic
reasons and wouldn't be a problem.

create_index.out is changed since point(NaN,NaN) <@ polygon changed
from true to false, which seems rather saner.

I haven't checked unchanged results but at least changed results seems
saner to me.

All in all a great patch!

It is clean, well reasoned and carefully crafted.

Do you think that the documentation needs to get updated to the 'new' behaviour?

//Georgios

Show quoted text

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Georgios Kokolatos (#6)

Re: Strange behavior with polygon and NaN

Hello, Georgios.

At Mon, 07 Sep 2020 12:46:50 +0000, gkokolatos@pm.me wrote in

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, 27 August 2020 14:24, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Wed, 26 Aug 2020 08:18:49 -0400, Tom Lane tgl@sss.pgh.pa.us wrote in

Kyotaro Horiguchi horikyota.ntt@gmail.com writes:

At Tue, 25 Aug 2020 19:03:50 -0400, Bruce Momjian bruce@momjian.us wrote in

I can confirm that this two-month old email report still produces
different results with indexes on/off in git master, which I don't think
is ever correct behavior.

I agree to that the behavior is broken.

Yeah, but ... what is "non broken" in this case? I'm not convinced
that having point_inside() return zero for any case involving NaN
is going to lead to noticeably saner behavior than today.

Yes, just doing that leaves many unfixed behavior come from NaNs, but
at least it seems to me one of sane definition candidates that a point
cannot be inside a polygon when NaN is involved. It's similar to
Fpxx() returns false if NaN is involved. As mentioned, I had't fully
checked and haven't considered this seriously, but I changed my mind
to check all the callers. I started checking all the callers of
point_inside, then finally I had to check all functions in geo_ops.c:(

For what is worth, I agree with this definition.

Thanks.

The attached is the result as of now.

=== Resulting behavior

With the attached:

- All boolean functions return false if NaN is involved.
- All float8 functions return NaN if NaN is involved.
- All geometric arithmetics return NaN as output if NaN is involved.

Agreed! As in both this behaviour conforms to the definition above and the patch provides this behaviour with the exceptions below.

With some exceptions:

- line_eq: needs to consider that NaNs are equal each other.
- point_eq/ne (point_eq_pint): ditto
- lseg_eq/ne: ditto

...

=== pg_hypot mistake?

I noticed that pg_hypot returns inf for the parameters (NaN, Inf) but
I think NaN is appropriate here since other operators behaves that
way. This change causes a change of distance between point(1e+300,Inf)
and line (1,-1,0) from infinity to NaN, which I think is correct
because the arithmetic generates NaN as an intermediate value.

=== Infinity annoyances

Infinity makes some not-great changes in regresssion results. For example:

- point '(1e+300,Infinity)' <-> path '((10,20))' returns
NaN(previously Infinity), but point '(1e+300,Infinity)' <-> path
'[(1,2),(3,4)]' returns Infinity. The difference of the two
expressions is whether (0 * Inf = NaN) is performed or not. The
former performs it but that was concealed by not propagating NaN to
upper layer without the patch.

Although I understand the reasoning for this change. I am not certain I agree with the result. I feel that:
point '(1e+300,Infinity)' <-> path '((10,20))'
should return Infinity. Even if I am wrong to think that, the two results should be expected to behave the same. Am I wrong on that too?

No. Actually that's not correct and that just comes from avoiding
special code paths for Infinity. I put more thought on
line_interpt_line and found that that issue is "fixed" by just
simplifying formulas by removing invariants. But one more if-block is
needed to make the function work a symmetrical way, though..

However, still we have a similar issue.

point '(Infinity,1e+300)' <-> line '{-1,0,5}' => Infinity
point '(Infinity,1e+300)' <-> line '{0,-1,5}' => NaN
point '(Infinity,1e+300)' <-> line '{1,1,5}' => NaN

The second should be 1e+300 and the third infinity. This is because
line_closept_point taking the distance between the foot of the
perpendicular line from the point and the point. We can fix the second
case by adding special code paths for vertical and horizontal lines,
but the third needs another special code path explicitly identifying
Infinity. It seems a kind of too-much..

Finally, I gave up fixing that and added doucmentation.

As another issue, (point '(Infinity, 1e+300)' <-> path '((10,20))')
results in NaN. That is "fixed" by adding a special path for "m ==
0.0" case, but I'm not sure it's worth doing..

By the way I found that float8_div(<normal number>, infinity) erros
out as underflow. It is inconsistent with the behavior that float8_mul
doesn't error-out as overflow when Infinity is given. So fixed it.

- This is not a difference caused by this patch, but for both patched
and unpatched, point '(1e+300,Inf)' <-> line '{3,0,0}' returns NaN,
which should be 1e+300. However, the behavior comes from arithmetic
reasons and wouldn't be a problem.

create_index.out is changed since point(NaN,NaN) <@ polygon changed
from true to false, which seems rather saner.

I haven't checked unchanged results but at least changed results seems
saner to me.

All in all a great patch!

It is clean, well reasoned and carefully crafted.

Do you think that the documentation needs to get updated to the 'new' behaviour?

Hmm. I'm not sure we can guarantee the behavior as documented, but I
tried writing in functions-geometry.html.

NaN and Infinity make geometric functions and operators behave
inconsistently. Geometric operators or functions that return a boolean
return false for operands that contain NaNs. Number-returning
functions and operators return the NaN in most cases but sometimes
return a valid value if no NaNs are met while
calculation. Object-returning ones yield an object that contain NaNs
depending to the operation. Likewise the objects containing Infinity
can make geometric operators and functions behave inconsistently. For
example (point '(Infinity,Infinity)' <-> line '{-1,0,5}') is Infinity
but (point '(Infinity,Infinity)' <-> line '{0,-1,5}') is NaN. It can
never be a value other than these, but you should consider it
uncertain how geometric operators behave for objects containing
Infinity.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Georgios Kokolatos

gkokolatos@protonmail.com

over 5 years ago

In reply to: Kyotaro Horiguchi (#7)

Re: Strange behavior with polygon and NaN

Hi,

apologies for the very, very late reply to your fixes.

You have answered/addressed all my questions concerns. The added documentation
reads well, at least to a non native English speaker.

The patch still applies and as far as I can see the tests are passing.

It gets my :+1: and I am changing the status to "Ready for Committer".

For what little is worth, I learned a lot from this patch, thank you.

Cheers,
Georgios

The new status of this patch is: Ready for Committer

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Georgios Kokolatos (#8)

Re: Strange behavior with polygon and NaN

At Mon, 02 Nov 2020 14:43:32 +0000, Georgios Kokolatos <gkokolatos@protonmail.com> wrote in

Hi,

apologies for the very, very late reply to your fixes.

You have answered/addressed all my questions concerns. The added documentation
reads well, at least to a non native English speaker.

The patch still applies and as far as I can see the tests are passing.

It gets my :+1: and I am changing the status to "Ready for Committer".

For what little is worth, I learned a lot from this patch, thank you.

Cheers,
Georgios

The new status of this patch is: Ready for Committer

Oh! Thanks. Since a part of this patch is committed (Thanks to Tom.)
this is a rebased version on the commit.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#10

Georgios Kokolatos

gkokolatos@protonmail.com

over 5 years ago

In reply to: Kyotaro Horiguchi (#9)

Re: Strange behavior with polygon and NaN

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, November 5, 2020 6:07 AM, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:

At Mon, 02 Nov 2020 14:43:32 +0000, Georgios Kokolatos gkokolatos@protonmail.com wrote in

Hi,
apologies for the very, very late reply to your fixes.
You have answered/addressed all my questions concerns. The added documentation
reads well, at least to a non native English speaker.
The patch still applies and as far as I can see the tests are passing.
It gets my :+1: and I am changing the status to "Ready for Committer".
For what little is worth, I learned a lot from this patch, thank you.
Cheers,
Georgios
The new status of this patch is: Ready for Committer

Oh! Thanks. Since a part of this patch is committed (Thanks to Tom.)
this is a rebased version on the commit.

I completely missed that a part got committed.

Thank you for your rebased version of the rest. I went through it
and my initial assessement of '+1' still stands.

The status remains to: Ready for Committer.

//Georgios

Show quoted text

regards.

--------------------------------------------------------------------------------------------------------------------------

Kyotaro Horiguchi
NTT Open Source Software Center

#11

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Georgios Kokolatos (#10)

Re: Strange behavior with polygon and NaN

I spent some time looking this over, and have a few thoughts:

1. I think it's useful to split the test changes into two patches,
as I've done below: first, just add the additional row in point_tbl
and let the fallout from that happen, and then in the second patch
make the code changes. This way, it's much clearer what the actual
behavioral changes are. Some of them don't look right, either.
For instance, in the very first hunk in geometry.out, we have
this:

- (Infinity,1e+300) | {1,0,5}                               |                NaN |                NaN
+ (Infinity,1e+300) | {1,0,5}                               |           Infinity |           Infinity

which seems right, and also this:

- (1e+300,Infinity) | {1,-1,0}                              |           Infinity |           Infinity
- (1e+300,Infinity) | {-0.4,-1,-6}                          |           Infinity |           Infinity
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |           Infinity |           Infinity
+ (1e+300,Infinity) | {1,-1,0}                              |                NaN |                NaN
+ (1e+300,Infinity) | {-0.4,-1,-6}                          |                NaN |                NaN
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |                NaN |                NaN

which does not. Why aren't these distances infinite as well?
For instance, {1,-1,0} is the line "x = y". We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

2. Rather than coding around undesirable behavior of float8_min,
it seems like it's better to add a primitive to float.h that
does what you want, ie "NaN if either input is NaN, else the
smaller input". This is more readable, and possibly more efficient
(depending on whether the compiler is smart enough to optimize
away redundant isnan checks). I did that in the attached.

3. Looking for other calls of float8_min, I wonder why you did not
touch the bounding-box calculations in box_interpt_lseg() or
boxes_bound_box().

4. The line changes feel a bit weird, like there's no clear idea
of what a "valid" or "invalid" line is. For instance the first
hunk in line_construct():

+		/* Avoid creating a valid line from an invalid point */
+		if (unlikely(isnan(pt->y)))
+			result->C = get_float8_nan();

Why's it appropriate to set C and only C to NaN?

5. But actually there's a bigger issue with that particular hunk.
This code branch is dealing with "draw a vertical line through this
point", so why should we care what the point's y coordinate is --- that
is, why is this particular change appropriate at all? The usual rule as
I understand it is that if a function's result is determined by some of
its arguments independently of what another argument's value is, then it
doesn't matter if that one is NaN, you can still return the same result.

6. I'm a bit uncomfortable with the use of "bool isnan" in a couple
of places. I think it's confusing to use that alongside the isnan()
macro. Moreover, it's at least possible that some platforms implement
isnan() in a way that would break this usage. The C spec specifically
says that isnan() is a macro not a function ... but it doesn't commit
to it being a macro-with-arguments. I think "anynan" or something
like that would be a better choice of name.

[ a bit later... ] Indeed, I get a compile failure on gaur:

geo_ops.c: In function 'lseg_closept_lseg':
geo_ops.c:2906:17: error: called object 'isnan' is not a function
geo_ops.c:2906:32: error: called object 'isnan' is not a function
geo_ops.c:2916:16: error: called object 'isnan' is not a function
geo_ops.c:2924:16: error: called object 'isnan' is not a function
geo_ops.c: In function 'box_closept_point':
geo_ops.c:2989:16: error: called object 'isnan' is not a function
geo_ops.c:2992:16: error: called object 'isnan' is not a function
geo_ops.c:3004:16: error: called object 'isnan' is not a function
geo_ops.c:3014:16: error: called object 'isnan' is not a function
make: *** [geo_ops.o] Error 1

So that scenario isn't hypothetical. Please rename the variables.

regards, tom lane

#12

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Tom Lane (#11)

Re: Strange behavior with polygon and NaN

Thank you for the review, Georgios and Tom.

At Tue, 10 Nov 2020 14:30:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

I spent some time looking this over, and have a few thoughts:

1. I think it's useful to split the test changes into two patches,
as I've done below: first, just add the additional row in point_tbl
and let the fallout from that happen, and then in the second patch
make the code changes. This way, it's much clearer what the actual
behavioral changes are. Some of them don't look right, either.
For instance, in the very first hunk in geometry.out, we have
this:
- (Infinity,1e+300) | {1,0,5}                               |                NaN |                NaN
+ (Infinity,1e+300) | {1,0,5}                               |           Infinity |           Infinity
which seems right, and also this:

For example, ('Infinity', 1e300) <-> {1,0,5}, that is:

line "x = -5" <-> point(1e300, Inf)

So sqrt((1e300 - 5)^2 + Inf^2) = Inf, which looks right.

- (1e+300,Infinity) | {1,-1,0}                              |           Infinity |           Infinity
- (1e+300,Infinity) | {-0.4,-1,-6}                          |           Infinity |           Infinity
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |           Infinity |           Infinity
+ (1e+300,Infinity) | {1,-1,0}                              |                NaN |                NaN
+ (1e+300,Infinity) | {-0.4,-1,-6}                          |                NaN |                NaN
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |                NaN |                NaN
which does not. Why aren't these distances infinite as well?

For instance, {1,-1,0} is the line "x = y". We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

Mmm... (swinging my arms to mimic lines..)
dist(x = y, (1e300, Inf)) looks indeterminant to me..

The calcuation is performed in the following steps.

1. construct the perpendicular line for the line.
perpine(1e300, 'Infinity') => {-1, -1, Inf}

2. calculate the cross point.
corsspoint({-1, -1, Inf}, {1,-1,0}) => (Inf, NaN)

3. calculte the distance from the crosspoint to the point.
point_dt((Inf, NaN), (1e300, 'Infinity'))
= HYPOT(Inf - 1e300, NaN - Inf);
= HYPOT(Inf, NaN);

4. HYPOT changed the behavior by the patch

Before: HYPOT(Inf, NaN) = Inf
After : HYPOT(Inf, NaN) = NaN - Result A

So if we will "fix" that, we should fix any, some, or all of 1-3.

1. seems to have no other way than the result.

2. crosspoint (x = - y + Inf, x = y) could be (Inf, Inf)?

3. point_dt((Inf, Inf), (1e300, Inf))
= HYPOT(Inf - 1e300, Inf - Inf)
= HYPOT(Inf, -NaN)
= NaN. - Result B

I'm not sure why Inf - Inf is negative, but |Inf-Inf| = NaN is
reasonable.

That is, we don't get a "reasonable" result this way.

The formula for the distance((x0,y0) - (ax + by + c = 0)) is

|ax0 + by0 + c|/sqrt(a^2 + b^2)

where a = -1, b = -1, c = Inf, x0 = 1e300, y0 = Inf,

abs(-1 * 1e300 + -1 * Inf + Inf) / sqrt(1 + 1)
= abs(-1e300 -Inf + Inf) / C
= NaN. - Result C

All of the Result A - C is NaN. At last NaN looks to be the right
result.

By the way that the formula is far simple than what we are doing
now. Is there any reason to take the above steps for the calculation?

2. Rather than coding around undesirable behavior of float8_min,
it seems like it's better to add a primitive to float.h that
does what you want, ie "NaN if either input is NaN, else the
smaller input". This is more readable, and possibly more efficient
(depending on whether the compiler is smart enough to optimize
away redundant isnan checks). I did that in the attached.

Sounds reasonable. I found that I forgot to do the same thing to y
coordinate.

3. Looking for other calls of float8_min, I wonder why you did not
touch the bounding-box calculations in box_interpt_lseg() or
boxes_bound_box().

While doing that, I didn't make changes just by looking a code locally
since I thought that that might be looked as overdone. Maybe, for
example box_interpt_lseg, even if bounding-box check overlooked NaNs,
I thought that the following calcualaions reflect any involved NaNs to
the result. (But I'm not confident that that is perfect, though..)

4. The line changes feel a bit weird, like there's no clear idea
of what a "valid" or "invalid" line is. For instance the first
hunk in line_construct():
+		/* Avoid creating a valid line from an invalid point */
+		if (unlikely(isnan(pt->y)))
+			result->C = get_float8_nan();
Why's it appropriate to set C and only C to NaN?

Not limited to here, I intended to reduce the patch footprint as much
as possible and it seemed that only set C to NaN is sufficient. (But
I'm not con<snip..>) I don't object to make that change more
comprehensively. Do we go that direction?

5. But actually there's a bigger issue with that particular hunk.
This code branch is dealing with "draw a vertical line through this
point", so why should we care what the point's y coordinate is --- that
is, why is this particular change appropriate at all? The usual rule as

The calculation mess comes from omitting a part of the component
values during calculation. So:

+     <para>
+       NaN and Infinity make geometric functions and operators behave
+       inconsistently. Geometric operators or functions that return a boolean
+       return false for operands that contain NaNs. Number-returning functions
+       and operators return NaN in most cases but sometimes return a valid
+       value if no NaNs are met while actual calculation.  Object-returning one
+       yield an object that contain NaNs depending to the operation.  Likewise

The code is following this policy. A point containing NaN yields an
"invalid" line, that is, a line containg NaN.

I understand it is that if a function's result is determined by some of
its arguments independently of what another argument's value is, then it
doesn't matter if that one is NaN, you can still return the same result.

That's true looking from pure calculation point of view, which caused
some of the messes.

6. I'm a bit uncomfortable with the use of "bool isnan" in a couple
of places. I think it's confusing to use that alongside the isnan()
macro. Moreover, it's at least possible that some platforms implement
isnan() in a way that would break this usage. The C spec specifically
says that isnan() is a macro not a function ... but it doesn't commit
to it being a macro-with-arguments. I think "anynan" or something
like that would be a better choice of name.

Ooo! Rright. I agreed to that. Will fix them.

[ a bit later... ] Indeed, I get a compile failure on gaur:

geo_ops.c: In function 'lseg_closept_lseg':
geo_ops.c:2906:17: error: called object 'isnan' is not a function
geo_ops.c:2906:32: error: called object 'isnan' is not a function
geo_ops.c:2916:16: error: called object 'isnan' is not a function
geo_ops.c:2924:16: error: called object 'isnan' is not a function
geo_ops.c: In function 'box_closept_point':
geo_ops.c:2989:16: error: called object 'isnan' is not a function
geo_ops.c:2992:16: error: called object 'isnan' is not a function
geo_ops.c:3004:16: error: called object 'isnan' is not a function
geo_ops.c:3014:16: error: called object 'isnan' is not a function
make: *** [geo_ops.o] Error 1

So that scenario isn't hypothetical. Please rename the variables.

lol! gaur looks like coal mine canary.

1. Won't fix the dist_pl/lp's changed behavior.
2. (already fixed?) Will find other instances.
3. Will do more comprehensive NaN-detection (as another patch)
4. Ditto.
5. Keep the curent state. Do we revert that?
6. Will fix.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#13

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Kyotaro Horiguchi (#12)

Re: Strange behavior with polygon and NaN

At Fri, 13 Nov 2020 15:35:58 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in

Thank you for the review, Georgios and Tom.

At Tue, 10 Nov 2020 14:30:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in
I spent some time looking this over, and have a few thoughts:

1. I think it's useful to split the test changes into two patches,
as I've done below: first, just add the additional row in point_tbl
and let the fallout from that happen, and then in the second patch
make the code changes. This way, it's much clearer what the actual
behavioral changes are. Some of them don't look right, either.
For instance, in the very first hunk in geometry.out, we have
this:
- (Infinity,1e+300) | {1,0,5}                               |                NaN |                NaN
+ (Infinity,1e+300) | {1,0,5}                               |           Infinity |           Infinity
which seems right, and also this:
For example, ('Infinity', 1e300) <-> {1,0,5}, that is:

line "x = -5" <-> point(1e300, Inf)

So sqrt((1e300 - 5)^2 + Inf^2) = Inf, which looks right.

??! Correction:

It's sqrt((1e300 - 5)^2 + 0^2) = Inf, which looks right.

reagrds.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#14

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Kyotaro Horiguchi (#12)

Re: Strange behavior with polygon and NaN

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

At Tue, 10 Nov 2020 14:30:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

For instance, {1,-1,0} is the line "x = y". We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

Mmm... (swinging my arms to mimic lines..)
dist(x = y, (1e300, Inf)) looks indeterminant to me..

Well, what you're showing is that we get an internal overflow,
essentially, on the way to calculating the result. Which is true,
so it's sort of accidental that we got a sensible result before.
Nonetheless, we *did* get a sensible result, so producing NaN
instead seems like a regression.

We might need to introduce special-case handling to protect the
low-level calculations from ever seeing NaN or Inf in their inputs.
Getting the right answer to "just fall out" of those calculations
might be an unreasonable hope.

For example, for a line with positive slope (A and B of opposite
signs), I think that the right answer for points (Inf,Inf) and
(-Inf,-Inf) should be NaN, on much the same grounds that Inf
minus Inf is NaN not zero. But all other points involving any Inf
coordinates are clearly an infinite distance away from that line.

regards, tom lane

#15

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Tom Lane (#14)

Re: Strange behavior with polygon and NaN

At Fri, 13 Nov 2020 11:26:21 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:

At Tue, 10 Nov 2020 14:30:08 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

For instance, {1,-1,0} is the line "x = y". We could argue about
whether it'd be sensible to return zero for the distance between that
and the point (inf,inf), but surely any point with one inf and one
finite coordinate must be an infinite distance away from that line.
There's nothing ill-defined about that situation.

Mmm... (swinging my arms to mimic lines..)
dist(x = y, (1e300, Inf)) looks indeterminant to me..

Well, what you're showing is that we get an internal overflow,
essentially, on the way to calculating the result. Which is true,
so it's sort of accidental that we got a sensible result before.
Nonetheless, we *did* get a sensible result, so producing NaN
instead seems like a regression.

Independently from the discussion, the following was wrong.

2. calculate the cross point.
corsspoint({-1, -1, Inf}, {1,-1,0}) => (Inf, NaN)

The Corss point must be on the line 2, that is, x equas to y. If we
avoid using x to calcualte y, the result gets right. But that doesn't
"fix" the result.

We might need to introduce special-case handling to protect the
low-level calculations from ever seeing NaN or Inf in their inputs.
Getting the right answer to "just fall out" of those calculations
might be an unreasonable hope.

However, as far as we we calculate the distance between the point and
the foot of the perpendicular line from the point to the line, (inf -
inf) is inevitable and we cannot avoid that "wrong" result.

For example, for a line with positive slope (A and B of opposite
signs), I think that the right answer for points (Inf,Inf) and
(-Inf,-Inf) should be NaN, on much the same grounds that Inf
minus Inf is NaN not zero. But all other points involving any Inf
coordinates are clearly an infinite distance away from that line.

After some checking I noticed that the calculation with the well-known
formula was wrong.

The formula for the distance((x0,y0) - (ax + by + c = 0)) is

|ax0 + by0 + c|/sqrt(a^2 + b^2)

where a = -1, b = -1, c = Inf, x0 = 1e300, y0 = Inf,

a = -1, b = -1, c = "0", x0=1e300, y0=Inf results in Inf. Sorry for
the mistake.

So, we can recalculate the result using the formula if get NaN based
on the perpendicular foot. The reason I left the existing calculation
is the consistency between the returned perpendicular foot and the
distance value, and for the reduced complexity in the major code path.

1. So the attached yeilds "Inf" in that cases.

2. Independently from the point, I noticed that the y-coord of the
perpendicular foot is miscalculated as NaN instead of Inf for the
cases that are discussed here. (line_interpt_line)

3. I fixed line_construct to construct (NaN, NaN, NaN) if the input
containsNaNs.

4. Renamed the variable "isnan" to "anynan" in lseg_closept_lseg() and
box_closept_point().

5. (not in the past comments) line_interpt() needs to check if any of
the coordinates is NaN since line_interpt_line() is defined to return
such a result.

A. I'm not sure how to treat addtion/subtruct/multiply between
points. But thinking that operations as vector calculation returning
such values are valid. So I left them as it is.

-- Add point
SELECT p1.f1, p2.f1, p1.f1 + p2.f1 FROM POINT_TBL p1, POINT_TBL p2;
(NaN,NaN) | (0,0) | (NaN,NaN)

B. @@ lseg (center) returns NaN-containing results. I'm not sure this
is regarded whether as a vector calculation or as a geometric
operation. If it is the former we don't fix it and otherwise we
should reutrn NULL for such input.

=# select @@ lseg('[(NaN,1),(NaN,90)]');
?column?
------------
(NaN,45.5)
(1 row)

== Changes in the result ============

1 and 2 above cause visible diffence in some results at the least
significant digit in mantissa, but that difference doesn't matter.

- (-3,4)            | {-0.000184615384615,-1,15.3846153846} |      11.3851690368 |      11.3851690368
+ (-3,4)            | {-0.000184615384615,-1,15.3846153846} |      11.3851690367 |      11.3851690367

1 restored the previous results.

- (1e+300,Infinity) | {1,-1,0}                              |                NaN |                NaN
- (1e+300,Infinity) | {-0.4,-1,-6}                          |                NaN |                NaN
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |                NaN |                NaN
+ (1e+300,Infinity) | {1,-1,0}                              |           Infinity |           Infinity
+ (1e+300,Infinity) | {-0.4,-1,-6}                          |           Infinity |           Infinity
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} |           Infinity |           Infinity

- (Infinity,1e+300) | [(0,-20),(30,-20)]            |                NaN |                NaN
+ (Infinity,1e+300) | [(0,-20),(30,-20)]            |           Infinity |           Infinity
- (Infinity,1e+300) | [(0,0),(3,0),(4,5),(1,6)] |                NaN |                NaN
+ (Infinity,1e+300) | [(0,0),(3,0),(4,5),(1,6)] |           Infinity |           Infinity

Looks fine.

-- Closest point to line
SELECT p.f1, l.s, p.f1 ## l.s FROM POINT_TBL p, LINE_TBL l;
- (1e+300,Infinity) | {1,-1,0}                              | 
- (1e+300,Infinity) | {-0.4,-1,-6}                          | 
- (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} | 
+ (1e+300,Infinity) | {1,-1,0}                              | (Infinity,Infinity)
+ (1e+300,Infinity) | {-0.4,-1,-6}                          | (-Infinity,Infinity)
+ (1e+300,Infinity) | {-0.000184615384615,-1,15.3846153846} | (-Infinity,Infinity)

-- Distance to line segment
SELECT p.f1, l.s, p.f1 <-> l.s AS dist_ps, l.s <-> p.f1 AS dist_sp FROM POINT_TBL p, LSEG_TBL l;
- (Infinity,1e+300) | [(0,-20),(30,-20)]            | 
+ (Infinity,1e+300) | [(0,-20),(30,-20)]            | (30,-20)

-- Intersection point with line
SELECT l1.s, l2.s, l1.s # l2.s FROM LINE_TBL l1, LINE_TBL l2;
- {-0.000184615384615,-1,15.3846153846} | {0,3,0}                               | (83333.3333333,-1.7763568394e-15)
+ {-0.000184615384615,-1,15.3846153846} | {0,3,0}                               | (83333.3333333,0)

These are fixed by 2.

-- Distance to line
SELECT p.f1, l.s, p.f1 <-> l.s AS dist_pl, l.s <-> p.f1 AS dist_lp FROM POINT_TBL p, LINE_TBL l;
(1e+300,Infinity) | {-1,0,3} | NaN | NaN

This should be 1e+300, not NaN, but 1 nor 2 doesn't fix this. The
reasonis line->B(0) * point->y(Infinity) results in NaN. But from the
meaning of the this sexpression, it should be 0.

I made line_closept_point() to do that but I found a similar issue in
line_interpt_line().

-- Closest point to line
SELECT p.f1, l.s, p.f1 ## l.s FROM POINT_TBL p, LINE_TBL l;
(1e+300,Infinity) | {1,0,5} | (NaN,Infinity)

So, what is needed here is we have special multiplication function
that supercedes 0*Inf = NaN rule by "0"*Inf = 0. I introduced that
function as float8_coef_mul(). The reason that the function is in
geo_ops.c is that it is geo_ops specific and using ZPzere(), which is
not used in float.h. By using the function the results are fixed as:

-- Distance to line
SELECT p.f1, l.s, p.f1 <-> l.s AS dist_pl, l.s <-> p.f1 AS dist_lp FROM POINT_TBL p, LINE_TBL l;
(1e+300,Infinity) | {-1,0,3} | 1e+300 | 1e+300
(Infinity,1e+300) | {0,-1,5} | 1e+300 | 1e+300

-- Closest point to line
SELECT p.f1, l.s, p.f1 ## l.s FROM POINT_TBL p, LINE_TBL l;
(1e+300,Infinity) | {1,0,5} | (-5,Infinity)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#16

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Kyotaro Horiguchi (#15)

Re: Strange behavior with polygon and NaN

I spent some more time looking at this patch.

Some experimentation shows that the changes around bounding box
calculation (ie float8_min_nan() and its call sites) seem to be
completely pointless: removing them doesn't change any of the regression
results. Nor does using float8_min_nan() in the other two bounding-box
calculations I'd asked about. So I think we should just drop that set
of changes and stick with the rule that bounding box upper and lower
values are sorted as per float.h comparison rules. This isn't that hard
to work with: if you want to know whether any NaNs are in the box, test
the upper limit (and *not* the lower limit) for isnan(). Moreover, even
if we wanted a different coding rule, we really can't have it because we
will still need to work with existing on-disk values that have bounding
boxes computed the old way.

I don't much like anything about float8_coef_mul(). In the first place,
FPzero() is an ugly, badly designed condition that we should be trying
to get rid of not add more dependencies on. In the second place, it's
really unclear why allowing 0 times Inf to be something other than NaN
is a good idea, and it's even less clear why allowing small-but-not-zero
times Inf to be zero rather than Inf is a good idea. In the third
place, the asymmetry between the two inputs looks more like a bug than
something we should actually want.

After some time spent staring at the specific case of line_closept_point
and its callers, I decided that the real problems there are twofold.
First, the API, or at least the callers' interpretation of this
undocumented point, is that whether the distance is undefined (NaN) is
equivalent to whether the closest point is undefined. This is not so;
in some cases we know that the distance is infinite even though we can't
calculate a unique closest point. Second, it's not making any attempt
to eliminate undefined cases up front. We can do that pretty easily
by plugging the point's coordinates into the line's equation Ax+By+C
and seeing whether we get a NaN. The attached 0002 is a subset patch
that just fixes these two issues, and I like the results it produces.

I wonder now whether the problems around line_interpt_line() and the
other intersection-ish functions wouldn't be better handled in a similar
way, by making changes to their API specs to be clearer about what
happens with NaNs and trying to eliminate ill-defined cases explicitly.
I've not tried to code that though.

Changing pg_hypot() the way you've done here is right out. See the
comment for the function: what it is doing now is per all the relevant
standards, and your patch breaks that. It's extremely unlikely that
doing it differently from IEEE and POSIX is a good idea.

Attached are the same old 0001 (adding the extra point_tbl entry)
and a small 0002 that fixes just line_closept_point. I've not
tried to figure out just which of the rest of your diffs should be
dropped given that. I did note though that the example you add
to func.sgml doesn't apply to this version of line_closept_point:

regression=# select point '(Infinity,Infinity)' <-> line '{-1,0,5}';
?column?
----------
NaN
(1 row)

regression=# select point '(Infinity,Infinity)' <-> line '{0,-1,5}';
?column?
----------
NaN
(1 row)

regards, tom lane

#17

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Tom Lane (#16)

Re: Strange behavior with polygon and NaN

Further to this ...

I realized after looking at things some more that one of
line_closept_point's issues is really a bug in line_construct:
it fails to draw a horizontal line through a point with x = Inf,
though surely that's not particularly ill-defined. The reason
is that somebody thought they could dispense with a special case
for m == 0, but then we end up doing

result->C = float8_mi(pt->y, float8_mul(m, pt->x));

and if m = 0 and pt->x = Inf, we get NaN.

It also annoyed me that the code was still using DBL_MAX instead of a
true Inf to represent infinite slope. That's sort of okay as long as
it's just a communication mechanism between line_construct and places
like line_sl, but it's not really okay, because in some places you can
get a true infinity from a slope calculation. Thus in HEAD you get
different results from

regression=# select line(point(1,2),point(1,'inf'));
line
----------
{-1,0,1}
(1 row)

regression=# select line(point(1,2),point(4,'inf'));
line
-------------------------
{Infinity,-1,-Infinity}
(1 row)

which is completely silly: we ought to "round off" that infinitesimal
slope to a true vertical, rather than producing a line representation
we can't do anything with.

So I fixed that too, but then I got a weird regression test diff:
the case of
lseg '[(-10,2),(-10,3)]' ?|| lseg '[(-10,2),(-10,3)]'
was no longer returning true. The reason turned out to be that
lseg_parallel does

PG_RETURN_BOOL(FPeq(lseg_sl(l1), lseg_sl(l2)));

and now lseg_sl is returning true infinities for vertical lines, and
FPeq *gets the wrong answer* when asked to compare Inf to Inf. It
should say equal, surely, but internally it computes a NaN and ends up
with false.

So the attached 0003 patch also fixes FPeq() and friends to give
sane answers for Inf-vs-Inf comparisons. That part seems like
a fairly fundamental bug fix, and so I feel like we ought to
go ahead and apply it before we do too much more messing with
the logic in this area.

(Note that the apparently-large results diff in 0003 is mostly
a whitespace change: the first hunk just reflects slopes coming
out as Infinity not DBL_MAX.)

I'm reposting 0001 and 0002 just to keep the cfbot happy,
they're the same as in my previous message.

regards, tom lane

#18

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Tom Lane (#17)

Re: Strange behavior with polygon and NaN

I went ahead and pushed 0001 and 0003 (the latter in two parts), since
they didn't seem particularly controversial to me. Just to keep the
cfbot from whining, here's a rebased version of 0002.

regards, tom lane

#19

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Tom Lane (#18)

Re: Strange behavior with polygon and NaN

At Sat, 21 Nov 2020 17:33:53 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

I went ahead and pushed 0001 and 0003 (the latter in two parts), since
they didn't seem particularly controversial to me. Just to keep the
cfbot from whining, here's a rebased version of 0002.

I didn't noticed that inf == inf sould be true (in IEEE754).

# (inf - inf == 0) => false but (inf == inf + 0) == false is somewhat
# uneasy but, yes, it's the standare we are basing on.

So, I agree that the changes of line_construct() and line_(inv)sl()
looks good to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#20

Kyotaro Horiguchi

horikyota.ntt@gmail.com

over 5 years ago

In reply to: Tom Lane (#16)

Re: Strange behavior with polygon and NaN

At Fri, 20 Nov 2020 15:57:46 -0500, Tom Lane <tgl@sss.pgh.pa.us> wrote in

I spent some more time looking at this patch.

Some experimentation shows that the changes around bounding box
calculation (ie float8_min_nan() and its call sites) seem to be
completely pointless: removing them doesn't change any of the regression
results. Nor does using float8_min_nan() in the other two bounding-box
calculations I'd asked about. So I think we should just drop that set
of changes and stick with the rule that bounding box upper and lower
values are sorted as per float.h comparison rules. This isn't that hard
to work with: if you want to know whether any NaNs are in the box, test
the upper limit (and *not* the lower limit) for isnan(). Moreover, even
if we wanted a different coding rule, we really can't have it because we
will still need to work with existing on-disk values that have bounding
boxes computed the old way.

Actually that changes the result since that code gives a shortcut of
checking NaNs in the object coordinates. I don't think that the it is
pointless to avoid full calculations that are performed only to find
NaNs are involved, if bounding box check is meaningful.

I don't much like anything about float8_coef_mul(). In the first place,
FPzero() is an ugly, badly designed condition that we should be trying
to get rid of not add more dependencies on. In the second place, it's
really unclear why allowing 0 times Inf to be something other than NaN
is a good idea, and it's even less clear why allowing small-but-not-zero
times Inf to be zero rather than Inf is a good idea. In the third
place, the asymmetry between the two inputs looks more like a bug than
something we should actually want.

I have the same feeling on the function, but I concluded that
coefficients and coordinates should be regarded as different things in
the practical standpoint.

For example, consider Ax + By + C == 0, if B is 0.0, we can remove the
second term from the equation, regardless of the value of y, of course
even if it were inf. that is, The function imitates that kind of
removals.

After some time spent staring at the specific case of line_closept_point
and its callers, I decided that the real problems there are twofold.
First, the API, or at least the callers' interpretation of this
undocumented point, is that whether the distance is undefined (NaN) is
equivalent to whether the closest point is undefined. This is not so;
in some cases we know that the distance is infinite even though we can't
calculate a unique closest point. Second, it's not making any attempt
to eliminate undefined cases up front. We can do that pretty easily
by plugging the point's coordinates into the line's equation Ax+By+C
and seeing whether we get a NaN. The attached 0002 is a subset patch
that just fixes these two issues, and I like the results it produces.

Actually the code reacts to some "problem" cases in a "wrong" way:

+	 * If it is unclear whether the point is on the line or not, then the
+	 * results are ill-defined.  This eliminates cases where any of the given
+	 * coordinates are NaN, as well as cases where infinite coordinates give
+	 * rise to Inf - Inf, 0 * Inf, etc.
+	 */
+	if (unlikely(isnan(float8_pl(float8_pl(float8_mul(line->A, point->x),
+										   float8_mul(line->B, point->y)),
+								 line->C))))

| postgres=# select point(1e+300, 'Infinity') <-> line('{1,0,5}');
| ?column?
| ----------
| NaN

Aren't our guts telling that is 1e+300? You might be thinking to put
some special case handling into that path (as mentioned below?), but
otherwise it yeildsa "wrong" result. The reason for the expectation
is that we assume that "completely vertical" lines have a constant x
value regardless of the y coordinate. That is the reason for the
float8_coef_mul() function.

I wonder now whether the problems around line_interpt_line() and the
other intersection-ish functions wouldn't be better handled in a similar
way, by making changes to their API specs to be clearer about what
happens with NaNs and trying to eliminate ill-defined cases explicitly.
I've not tried to code that though.

One of the "ill-defined" cases is the zero-coefficient issue. The
asymmetric multiply function "fixes" it, at least. Of course it could
be open-coded instead of being as a function that looks as if having
some general interpretation.

Changing pg_hypot() the way you've done here is right out. See the
comment for the function: what it is doing now is per all the relevant
standards, and your patch breaks that. It's extremely unlikely that
doing it differently from IEEE and POSIX is a good idea.

Mmm. Ok, I agree to that.

Attached are the same old 0001 (adding the extra point_tbl entry)
and a small 0002 that fixes just line_closept_point. I've not
tried to figure out just which of the rest of your diffs should be
dropped given that. I did note though that the example you add
to func.sgml doesn't apply to this version of line_closept_point:

regression=# select point '(Infinity,Infinity)' <-> line '{-1,0,5}';
?column?
----------
NaN
(1 row)

regression=# select point '(Infinity,Infinity)' <-> line '{0,-1,5}';
?column?
----------
NaN
(1 row)