Add missing operator <->(box, point)

Started by Nikita Glukhovabout 7 years ago10 messageshackers
Jump to latest
#1Nikita Glukhov
n.gluhov@postgrespro.ru

Hi, hackers.

Attached patches add missing distance operator <->(box, point).

We already have reverse operator <->(point, box), but it can't be used for kNN
search in GiST and SP-GiST. GiST and SP-GiST now support kNN searches over more
complex polygons and circles, but do not support more simple boxes, which seems
to be inconsistent.

Description of the patches:
1. Add function dist_pb(box, point) and operator <->.

2. Add <-> to GiST box_ops.
Extracted gist_box_distance_helper() common for gist_box_distance() and
gist_bbox_distance().

3. Add <-> to SP-GiST.
Changed only catalog and tests. Box case is already checked in
spg_box_quad_leaf_consistent():
out->recheckDistances = distfnoid == F_DIST_POLYP;

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0002-Add-box-point-distance-operator-to-GiST-box_ops-v01.patchtext/x-patch; name=0002-Add-box-point-distance-operator-to-GiST-box_ops-v01.patchDownload+140-16
0003-Add-box-point-distance-operator-to-SP-GiST-box_ops-v01.patchtext/x-patch; name=0003-Add-box-point-distance-operator-to-SP-GiST-box_ops-v01.patchDownload+135-25
0001-Add-box-point-distance-operator-v01.patchtext/x-patch; name=0001-Add-box-point-distance-operator-v01.patchDownload+68-50
#2Fabien COELHO
coelho@cri.ensmp.fr
In reply to: Nikita Glukhov (#1)
Re: Add missing operator <->(box, point)

Hello Nikita,

Attached patches add missing distance operator <->(box, point).

Indeed.

We already have reverse operator <->(point, box), but it can't be used
for kNN search in GiST and SP-GiST. GiST and SP-GiST now support kNN
searches over more complex polygons and circles, but do not support more
simple boxes, which seems to be inconsistent.

Description of the patches:
1. Add function dist_pb(box, point) and operator <->.

About this first patch: applies cleanly, compiles, "make check" ok.

No doc changes, but this was expected to work in the first place,
according to the documention.

About the test, I'd suggest to name the result columns, eg "pt to box
dist" and "box to pt dist", otherwise why all is repeated is unclear.

I notice that other distance tests do not test for commutativity. Are they
also not implemented, or just not tested? If not implemented, I'd suggest
to add them in the same batch. If not tested, maybe the patch should do as
others, or maybe given the trivial implementation there should just be one
test per commutted operator for coverage.

ISTM that the committer would need to "bump the catalog revision number"
because it adds new functions & operators.

--
Fabien.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Fabien COELHO (#2)
Re: Add missing operator <->(box, point)

[ warning, drive-by comment ahead ]

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I notice that other distance tests do not test for commutativity. Are they
also not implemented, or just not tested? If not implemented, I'd suggest
to add them in the same batch.

Yeah ... just looking at operators named <->, I see

regression=# select oid, oid::regoperator, oprcom, oprcode from pg_operator where oprname = '<->';
oid | oid | oprcom | oprcode
------+----------------------+--------+---------------------------
517 | <->(point,point) | 517 | point_distance
613 | <->(point,line) | 0 | dist_pl
614 | <->(point,lseg) | 0 | dist_ps
615 | <->(point,box) | 0 | dist_pb
616 | <->(lseg,line) | 0 | dist_sl
617 | <->(lseg,box) | 0 | dist_sb
618 | <->(point,path) | 0 | dist_ppath
706 | <->(box,box) | 706 | box_distance
707 | <->(path,path) | 707 | path_distance
708 | <->(line,line) | 708 | line_distance
709 | <->(lseg,lseg) | 709 | lseg_distance
712 | <->(polygon,polygon) | 712 | poly_distance
1520 | <->(circle,circle) | 1520 | circle_distance
1522 | <->(point,circle) | 3291 | dist_pc
3291 | <->(circle,point) | 1522 | dist_cpoint
3276 | <->(point,polygon) | 3289 | dist_ppoly
3289 | <->(polygon,point) | 3276 | dist_polyp
1523 | <->(circle,polygon) | 0 | dist_cpoly
1524 | <->(line,box) | 0 | dist_lb
5005 | <->(tsquery,tsquery) | 0 | pg_catalog.tsquery_phrase
(20 rows)

It's not clear to me why to be particularly more excited about
<->(box, point) than about the other missing cases here.

regards, tom lane

#4Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Tom Lane (#3)
Re: Add missing operator <->(box, point)

Attached 2nd version of the patches.

On 20.04.2019 16:41, Fabien COELHO wrote:

About the test, I'd suggest to name the result columns, eg "pt to box
dist" and "box to pt dist", otherwise why all is repeated is unclear.

Fixed.

On 02.07.2019 7:01, Tom Lane wrote:

[ warning, drive-by comment ahead ]

Fabien COELHO <coelho@cri.ensmp.fr> writes:

I notice that other distance tests do not test for commutativity. Are they
also not implemented, or just not tested? If not implemented, I'd suggest
to add them in the same batch.

Yeah ... just looking at operators named <->, I see

regression=# select oid, oid::regoperator, oprcom, oprcode from pg_operator where oprname = '<->';
oid | oid | oprcom | oprcode
------+----------------------+--------+---------------------------
517 | <->(point,point) | 517 | point_distance
613 | <->(point,line) | 0 | dist_pl
614 | <->(point,lseg) | 0 | dist_ps
615 | <->(point,box) | 0 | dist_pb
616 | <->(lseg,line) | 0 | dist_sl
617 | <->(lseg,box) | 0 | dist_sb
618 | <->(point,path) | 0 | dist_ppath
706 | <->(box,box) | 706 | box_distance
707 | <->(path,path) | 707 | path_distance
708 | <->(line,line) | 708 | line_distance
709 | <->(lseg,lseg) | 709 | lseg_distance
712 | <->(polygon,polygon) | 712 | poly_distance
1520 | <->(circle,circle) | 1520 | circle_distance
1522 | <->(point,circle) | 3291 | dist_pc
3291 | <->(circle,point) | 1522 | dist_cpoint
3276 | <->(point,polygon) | 3289 | dist_ppoly
3289 | <->(polygon,point) | 3276 | dist_polyp
1523 | <->(circle,polygon) | 0 | dist_cpoly
1524 | <->(line,box) | 0 | dist_lb
5005 | <->(tsquery,tsquery) | 0 | pg_catalog.tsquery_phrase
(20 rows)

It's not clear to me why to be particularly more excited about
<->(box, point) than about the other missing cases here.

regards, tom lane

The original goal was to add support of ordering by distance to point to
all geometric opclasses. As you can see, GiST and SP-GiST box_ops has no
distance operator while more complex circle_ops and poly_ops have it:

SELECT
amname,
opcname,
amopopr::regoperator AS dist_opr
FROM
pg_opclass LEFT JOIN
pg_amop ON amopfamily = opcfamily AND amoppurpose = 'o',
pg_am,
pg_type
WHERE
opcmethod = pg_am.oid AND
opcintype = pg_type.oid AND
typcategory = 'G'
ORDER BY 1, 2;

amname | opcname | dist_opr
--------+-------------------+--------------------
brin | box_inclusion_ops |
gist | box_ops |
gist | circle_ops | <->(circle,point)
gist | point_ops | <->(point,point)
gist | poly_ops | <->(polygon,point)
spgist | box_ops |
spgist | kd_point_ops | <->(point,point)
spgist | poly_ops | <->(polygon,point)
spgist | quad_point_ops | <->(point,point)
(9 rows)

We could use commuted "const <-> var" operators for kNN searches, but the
current implementation requires the existence of "var <-> const" operators, and
order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op()
at /src/backend/optimizer/path/indxpath.c).

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-Add-operator-box-point-v02.patchtext/x-patch; name=0001-Add-operator-box-point-v02.patchDownload+68-50
0002-Add-operator-box-point-to-GiST-box_ops-v02.patchtext/x-patch; name=0002-Add-operator-box-point-to-GiST-box_ops-v02.patchDownload+140-16
0003-Add-operator-box-point-to-SP-GiST-box_ops-v02.patchtext/x-patch; name=0003-Add-operator-box-point-to-SP-GiST-box_ops-v02.patchDownload+135-25
#5Alexander Korotkov
aekorotkov@gmail.com
In reply to: Nikita Glukhov (#4)
Re: Add missing operator <->(box, point)

On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

We could use commuted "const <-> var" operators for kNN searches, but the
current implementation requires the existence of "var <-> const" operators, and
order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op()
at /src/backend/optimizer/path/indxpath.c).

But probably it's still worth to just add commutator for every <->
operator and close this question. Otherwise, it may arise again once
we want to add some more kNN support to opclasses or something. On
the other hand, are we already going to limit oid consumption?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#5)
Re: Add missing operator <->(box, point)

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

We could use commuted "const <-> var" operators for kNN searches, but the
current implementation requires the existence of "var <-> const" operators, and
order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op()
at /src/backend/optimizer/path/indxpath.c).

But probably it's still worth to just add commutator for every <->
operator and close this question.

Yeah, I was just thinking that it was weird not to have the commutator
operators, independently of indexing considerations. Seems like a
usability thing.

regards, tom lane

#7Alexander Korotkov
aekorotkov@gmail.com
In reply to: Nikita Glukhov (#1)
Re: Add missing operator <->(box, point)

On Mon, Mar 11, 2019 at 2:49 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

2. Add <-> to GiST box_ops.
Extracted gist_box_distance_helper() common for gist_box_distance() and
gist_bbox_distance().

For me it doesn't look worth having two distinct functions
gist_box_distance_helper() and gist_bbox_distance(). What about
having just one and leave responsibility for recheck flag to the
caller?

3. Add <-> to SP-GiST.
Changed only catalog and tests. Box case is already checked in
spg_box_quad_leaf_consistent():
out->recheckDistances = distfnoid == F_DIST_POLYP;

So, it seems to be fix of oversight in 2a6368343ff4. But assuming
fixing this requires catalog changes, we shouldn't backpatch this.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#8Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Alexander Korotkov (#7)
Re: Add missing operator <->(box, point)

Attached 3rd version of the patches.

On 02.07.2019 21:55, Alexander Korotkov wrote:

On Tue, Jul 2, 2019 at 9:19 PM Nikita Glukhov<n.gluhov@postgrespro.ru> wrote:

We could use commuted "const <-> var" operators for kNN searches, but the
current implementation requires the existence of "var <-> const" operators, and
order-by-op clauses are rebuilt using them (see match_clause_to_ordering_op()
at /src/backend/optimizer/path/indxpath.c).

But probably it's still worth to just add commutator for every <->
operator and close this question. Otherwise, it may arise again once
we want to add some more kNN support to opclasses or something. On
the other hand, are we already going to limit oid consumption?

All missing distance operators were added to the first patch.

On 08.07.2019 18:22, Alexander Korotkov wrote:

On Mon, Mar 11, 2019 at 2:49 AM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

2. Add <-> to GiST box_ops.
Extracted gist_box_distance_helper() common for gist_box_distance() and
gist_bbox_distance().

For me it doesn't look worth having two distinct functions
gist_box_distance_helper() and gist_bbox_distance(). What about
having just one and leave responsibility for recheck flag to the
caller?

gist_bbox_distance() was removed.

But maybe it would be better to replace two identical functions
gist_circle_distance() and gist_poly_distance() with the single
gist_bbox_distance()?

3. Add <-> to SP-GiST.
Changed only catalog and tests. Box case is already checked in
spg_box_quad_leaf_consistent():
out->recheckDistances = distfnoid == F_DIST_POLYP;

So, it seems to be fix of oversight in 2a6368343ff4. But assuming
fixing this requires catalog changes, we shouldn't backpatch this.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-Add-missing-distance-operators-v03.patchtext/x-patch; name=0001-Add-missing-distance-operators-v03.patchDownload+687-511
0002-Add-operator-box-point-to-GiST-box_ops-v03.patchtext/x-patch; name=0002-Add-operator-box-point-to-GiST-box_ops-v03.patchDownload+134-18
0003-Add-operator-box-point-to-SP-GiST-box_ops-v03.patchtext/x-patch; name=0003-Add-operator-box-point-to-SP-GiST-box_ops-v03.patchDownload+135-25
#9Alexander Korotkov
aekorotkov@gmail.com
In reply to: Nikita Glukhov (#8)
Re: Add missing operator <->(box, point)

On Mon, Jul 8, 2019 at 11:39 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

On 08.07.2019 18:22, Alexander Korotkov wrote:
For me it doesn't look worth having two distinct functions
gist_box_distance_helper() and gist_bbox_distance(). What about
having just one and leave responsibility for recheck flag to the
caller?

gist_bbox_distance() was removed.

OK.

But maybe it would be better to replace two identical functions
gist_circle_distance() and gist_poly_distance() with the single
gist_bbox_distance()?

Sounds reasonable to me.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#10Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#9)
Re: Add missing operator <->(box, point)

On Tue, Jul 9, 2019 at 12:03 AM Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

On Mon, Jul 8, 2019 at 11:39 PM Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

On 08.07.2019 18:22, Alexander Korotkov wrote:
For me it doesn't look worth having two distinct functions
gist_box_distance_helper() and gist_bbox_distance(). What about
having just one and leave responsibility for recheck flag to the
caller?

gist_bbox_distance() was removed.

OK.

But maybe it would be better to replace two identical functions
gist_circle_distance() and gist_poly_distance() with the single
gist_bbox_distance()?

Sounds reasonable to me.

However, gist_poly_distance() and gist_circle_distance() have
different signatures. Having same internal function to be
corresponding to more than one catalog function cause troubles in
sanity checks. So, let's leave it as it is.

Revised patchset is attached. It contains commit messages as well as
minor editorialization.

I'm going to push this if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-Add-missing-distance-operators-v04.patchapplication/octet-stream; name=0001-Add-missing-distance-operators-v04.patchDownload+687-510
0002-Add-operator-box-point-to-GiST-box_ops-v04.patchapplication/octet-stream; name=0002-Add-operator-box-point-to-GiST-box_ops-v04.patchDownload+134-18
0003-Add-operator-box-point-to-SP-GiST-box_ops-v04.patchapplication/octet-stream; name=0003-Add-operator-box-point-to-SP-GiST-box_ops-v04.patchDownload+135-24