KNN-GiST with recheck
Hackers!
This patch was split from thread:
/messages/by-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com
I've split it to separate thead, because it's related to partial sort only
conceptually not technically. Also I renamed it to "knn-gist-recheck" from
"partial-knn" as more appropriate name. In the attached version docs are
updated. Possible weak point of this patch design is that it fetches heap
tuple from GiST scan. However, I didn't receive any notes about its design,
so, I'm going to put it to commitfest.
Here goes a desription of this patch same as in original thread.
KNN-GiST provides ability to get ordered results from index, but this order
is based only on index information. For instance, GiST index contains
bounding rectangles for polygons, and we can't get exact distance to
polygon from index (similar situation is in PostGIS). In attached patch,
GiST distance method can set recheck flag (similar to consistent method).
This flag means that distance method returned lower bound of distance and
we should recheck it from heap.
See an example.
create table test as (select id, polygon(3+(random()*10)::int,
circle(point(random(), random()), 0.0003 + random()*0.001)) as p from
generate_series(1,1000000) id);
create index test_idx on test using gist (p);
We can get results ordered by distance from polygon to point.
postgres=# select id, p <-> point(0.5,0.5) from test order by p <->
point(0.5,0.5) limit 10;
id | ?column?
--------+----------------------
755611 | 0.000405855808916853
807562 | 0.000464123777564343
437778 | 0.000738524708741959
947860 | 0.00076250998760724
389843 | 0.000886362723569568
17586 | 0.000981960100555216
411329 | 0.00145338112316853
894191 | 0.00149399559703506
391907 | 0.0016647896049741
235381 | 0.00167554614889509
(10 rows)
It's fast using just index scan.
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230
rows=10 loops=1)
-> Index Scan using test_idx on test (cost=0.29..157672.29
rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
Order By: (p <-> '(0.5,0.5)'::point)
Total runtime: 0.305 ms
(4 rows)
------
With best regards,
Alexander Korotkov.
Attachments:
On 01/13/2014 07:17 PM, Alexander Korotkov wrote:
Here goes a desription of this patch same as in original thread.
KNN-GiST provides ability to get ordered results from index, but this order
is based only on index information. For instance, GiST index contains
bounding rectangles for polygons, and we can't get exact distance to
polygon from index (similar situation is in PostGIS). In attached patch,
GiST distance method can set recheck flag (similar to consistent method).
This flag means that distance method returned lower bound of distance and
we should recheck it from heap.See an example.
create table test as (select id, polygon(3+(random()*10)::int,
circle(point(random(), random()), 0.0003 + random()*0.001)) as p from
generate_series(1,1000000) id);
create index test_idx on test using gist (p);We can get results ordered by distance from polygon to point.
postgres=# select id, p <-> point(0.5,0.5) from test order by p <->
point(0.5,0.5) limit 10;
id | ?column?
--------+----------------------
755611 | 0.000405855808916853
807562 | 0.000464123777564343
437778 | 0.000738524708741959
947860 | 0.00076250998760724
389843 | 0.000886362723569568
17586 | 0.000981960100555216
411329 | 0.00145338112316853
894191 | 0.00149399559703506
391907 | 0.0016647896049741
235381 | 0.00167554614889509
(10 rows)It's fast using just index scan.
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230
rows=10 loops=1)
-> Index Scan using test_idx on test (cost=0.29..157672.29
rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
Order By: (p <-> '(0.5,0.5)'::point)
Total runtime: 0.305 ms
(4 rows)
Nice! Some thoughts:
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.
2. I wonder how useful it really is to allow mixing exact and non-exact
return values from the distance function. The distance function included
in the patch always returns recheck=true. I have a feeling that all
other distance function will also always return either true or false.
3. A binary heap would be a better data structure to buffer the
rechecked values. A Red-Black tree allows random insertions and
deletions, but in this case you need to insert arbitrary values but only
remove the minimum item. That's exactly what a binary heap excels at. We
have a nice binary heap implementation in the backend that you can use,
see src/backend/lib/binaryheap.c.
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to
do this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, or
perhaps add a new node type for it.
Of course that's exactly what your partial sort patch does :-). I
haven't looked at that in detail, but I don't think the approach the
partial sort patch takes will work here as is. In the KNN-GiST case, the
index is returning tuples roughly in the right order, but a tuple that
it returns might in reality belong somewhere later in the ordering. In
the partial sort patch, the "input stream" of tuples is divided into
non-overlapping groups, so that the tuples within the group are not
sorted, but the groups are. I think the partial sort case is a special
case of the KNN-GiST case, if you consider the lower bound of each tuple
to be the leading keys that you don't need to sort.
BTW, this capability might also be highly useful for the min/max indexes
as well. A min/max index cannot return an exact ordering of tuples, but
it can also give a lower bound for a group of tuples.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 28, 2014 at 5:54 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
On 01/13/2014 07:17 PM, Alexander Korotkov wrote:
Here goes a desription of this patch same as in original thread.
KNN-GiST provides ability to get ordered results from index, but this
order
is based only on index information. For instance, GiST index contains
bounding rectangles for polygons, and we can't get exact distance to
polygon from index (similar situation is in PostGIS). In attached patch,
GiST distance method can set recheck flag (similar to consistent method).
This flag means that distance method returned lower bound of distance and
we should recheck it from heap.See an example.
create table test as (select id, polygon(3+(random()*10)::int,
circle(point(random(), random()), 0.0003 + random()*0.001)) as p from
generate_series(1,1000000) id);
create index test_idx on test using gist (p);We can get results ordered by distance from polygon to point.
postgres=# select id, p <-> point(0.5,0.5) from test order by p <->
point(0.5,0.5) limit 10;
id | ?column?
--------+----------------------
755611 | 0.000405855808916853
807562 | 0.000464123777564343
437778 | 0.000738524708741959
947860 | 0.00076250998760724
389843 | 0.000886362723569568
17586 | 0.000981960100555216
411329 | 0.00145338112316853
894191 | 0.00149399559703506
391907 | 0.0016647896049741
235381 | 0.00167554614889509
(10 rows)It's fast using just index scan.
QUERY PLAN
------------------------------------------------------------
----------------------------------------------------------------------
Limit (cost=0.29..1.86 rows=10 width=36) (actual time=0.180..0.230
rows=10 loops=1)
-> Index Scan using test_idx on test (cost=0.29..157672.29
rows=1000000 width=36) (actual time=0.179..0.228 rows=10 loops=1)
Order By: (p <-> '(0.5,0.5)'::point)
Total runtime: 0.305 ms
(4 rows)Nice! Some thoughts:
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.
Yeah, but exact-knn cant come with no one implementation. But it would
better come in a separate patch.
2. I wonder how useful it really is to allow mixing exact and non-exact
return values from the distance function. The distance function included in
the patch always returns recheck=true. I have a feeling that all other
distance function will also always return either true or false.
For geometrical datatypes recheck variations in consistent methods are also
very rare (I can't remember any). But imagine opclass for arrays where keys
have different representation depending on array length. For such opclass
and knn on similarity recheck flag could be useful.
3. A binary heap would be a better data structure to buffer the rechecked
values. A Red-Black tree allows random insertions and deletions, but in
this case you need to insert arbitrary values but only remove the minimum
item. That's exactly what a binary heap excels at. We have a nice binary
heap implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.
Hmm. For me binary heap would be a better data structure for KNN-GiST at
all :-)
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to do
this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, or perhaps
add a new node type for it.Of course that's exactly what your partial sort patch does :-). I haven't
looked at that in detail, but I don't think the approach the partial sort
patch takes will work here as is. In the KNN-GiST case, the index is
returning tuples roughly in the right order, but a tuple that it returns
might in reality belong somewhere later in the ordering. In the partial
sort patch, the "input stream" of tuples is divided into non-overlapping
groups, so that the tuples within the group are not sorted, but the groups
are. I think the partial sort case is a special case of the KNN-GiST case,
if you consider the lower bound of each tuple to be the leading keys that
you don't need to sort.
Yes. But, for instance btree accesses heap for unique checking. Is really
it so crimilal? :-)
This is not only question of a new node or extending existing node. We need
to teach planner/executor access method can return value of some expression
which is lower bound of another expression. AFICS now access method can
return only original indexed datums and TIDs. So, I afraid that enormous
infrastructure changes are required. And I can hardly imagine what they
should look like.
------
With best regards,
Alexander Korotkov.
On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
On Tue, Jan 28, 2014 at 5:54 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to do
this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, or perhaps
add a new node type for it.Of course that's exactly what your partial sort patch does :-). I haven't
looked at that in detail, but I don't think the approach the partial sort
patch takes will work here as is. In the KNN-GiST case, the index is
returning tuples roughly in the right order, but a tuple that it returns
might in reality belong somewhere later in the ordering. In the partial
sort patch, the "input stream" of tuples is divided into non-overlapping
groups, so that the tuples within the group are not sorted, but the groups
are. I think the partial sort case is a special case of the KNN-GiST case,
if you consider the lower bound of each tuple to be the leading keys that
you don't need to sort.Yes. But, for instance btree accesses heap for unique checking. Is really
it so crimilal? :-)
Well, it is generally considered an ugly hack in b-tree too. I'm not
100% opposed to doing such a hack in GiST, but would very much prefer
not to.
This is not only question of a new node or extending existing node. We need
to teach planner/executor access method can return value of some expression
which is lower bound of another expression. AFICS now access method can
return only original indexed datums and TIDs. So, I afraid that enormous
infrastructure changes are required. And I can hardly imagine what they
should look like.
Yeah, I'm not sure either. Maybe a new field in IndexScanDesc, along
with xs_itup. Or as an attribute of xs_itup itself.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 28, 2014 at 9:32 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
On Tue, Jan 28, 2014 at 5:54 PM, Heikki Linnakangas <
hlinnakangas@vmware.comwrote:
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to do
this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, or
perhaps
add a new node type for it.Of course that's exactly what your partial sort patch does :-). I haven't
looked at that in detail, but I don't think the approach the partial sort
patch takes will work here as is. In the KNN-GiST case, the index is
returning tuples roughly in the right order, but a tuple that it returns
might in reality belong somewhere later in the ordering. In the partial
sort patch, the "input stream" of tuples is divided into non-overlapping
groups, so that the tuples within the group are not sorted, but the
groups
are. I think the partial sort case is a special case of the KNN-GiST
case,
if you consider the lower bound of each tuple to be the leading keys that
you don't need to sort.Yes. But, for instance btree accesses heap for unique checking. Is really
it so crimilal? :-)Well, it is generally considered an ugly hack in b-tree too. I'm not 100%
opposed to doing such a hack in GiST, but would very much prefer not to.This is not only question of a new node or extending existing node. We
need
to teach planner/executor access method can return value of some
expression
which is lower bound of another expression. AFICS now access method can
return only original indexed datums and TIDs. So, I afraid that enormous
infrastructure changes are required. And I can hardly imagine what they
should look like.Yeah, I'm not sure either. Maybe a new field in IndexScanDesc, along with
xs_itup. Or as an attribute of xs_itup itself.
This shouldn't look like a hack too. Otherwise I see no point of it: it's
better to have some isolated hack in access method than hack in
planner/executor. So I see following changes to be needed to implement this
right way:
1) Implement new relation between operators: operator1 is lower bound of
operator2.
2) Extend am interface to let it return values of operators.
3) Implement new node for knn-sorting.
However, it requires a lot of changes in PostgreSQL infrastructure and can
appear to be not enough general too (we don't know until we have another
application).
------
With best regards,
Alexander Korotkov.
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.Yeah, but exact-knn cant come with no one implementation. But it would
better come in a separate patch.
I tried to split them. Separated patches are attached. I changed
the order of the arguments as point <-> polygon, because point was
the first one on all the others. Its commutator was required for
the index, so I added it on the second patch. I also added tests
for the operator. I think it is ready for committer as a separate
patch. We can add it to the open CommitFest.
I have made some cosmetic changes on the patches. I hope they are
useful.
I added support to point <-> circle operator with the same GiST
distance function you added for polygon. I can change it, if it is not
the right way.
2. I wonder how useful it really is to allow mixing exact and non-exact
return values from the distance function. The distance function included in
the patch always returns recheck=true. I have a feeling that all other
distance function will also always return either true or false.For geometrical datatypes recheck variations in consistent methods are also
very rare (I can't remember any). But imagine opclass for arrays where keys
have different representation depending on array length. For such opclass
and knn on similarity recheck flag could be useful.
I also wonder how useful it is. Your example is convincing, but maybe
setting it index-wide will make the decisions on the framework easier.
For example, how hard would it be to decide if further sorting is
required or not on the planner?
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to do
this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, or perhaps
add a new node type for it.Of course that's exactly what your partial sort patch does :-). I haven't
looked at that in detail, but I don't think the approach the partial sort
patch takes will work here as is. In the KNN-GiST case, the index is
returning tuples roughly in the right order, but a tuple that it returns
might in reality belong somewhere later in the ordering. In the partial
sort patch, the "input stream" of tuples is divided into non-overlapping
groups, so that the tuples within the group are not sorted, but the groups
are. I think the partial sort case is a special case of the KNN-GiST case,
if you consider the lower bound of each tuple to be the leading keys that
you don't need to sort.Yes. But, for instance btree accesses heap for unique checking. Is really
it so crimilal? :-)
This is not only question of a new node or extending existing node. We need
to teach planner/executor access method can return value of some expression
which is lower bound of another expression. AFICS now access method can
return only original indexed datums and TIDs. So, I afraid that enormous
infrastructure changes are required. And I can hardly imagine what they
should look like.
Unfortunately, I am not experienced enough to judge your implementation.
As far as I understand the problem is partially sorting rows on
the index scan node. It can lead the planner to choose non-optimal
plans, because of not taking into account the cost of sorting.
Attachments:
polygon-distance-op-1.patchtext/plain; charset=utf-8Download
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index 54391fd..402ea40 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -2408,36 +2408,42 @@ lseg_interpt(PG_FUNCTION_ARGS)
** Routines for position comparisons of differently-typed
** 2D objects.
**
***********************************************************************/
/*---------------------------------------------------------------------
* dist_
* Minimum distance from one object to another.
*-------------------------------------------------------------------*/
+/*
+ * Distance from a point to a line
+ */
Datum
dist_pl(PG_FUNCTION_ARGS)
{
Point *pt = PG_GETARG_POINT_P(0);
LINE *line = PG_GETARG_LINE_P(1);
PG_RETURN_FLOAT8(dist_pl_internal(pt, line));
}
static double
dist_pl_internal(Point *pt, LINE *line)
{
return fabs((line->A * pt->x + line->B * pt->y + line->C) /
HYPOT(line->A, line->B));
}
+/*
+ * Distance from a point to a lseg
+ */
Datum
dist_ps(PG_FUNCTION_ARGS)
{
Point *pt = PG_GETARG_POINT_P(0);
LSEG *lseg = PG_GETARG_LSEG_P(1);
PG_RETURN_FLOAT8(dist_ps_internal(pt, lseg));
}
static double
@@ -2487,21 +2493,21 @@ dist_ps_internal(Point *pt, LSEG *lseg)
result = point_dt(pt, &lseg->p[0]);
tmpdist = point_dt(pt, &lseg->p[1]);
if (tmpdist < result)
result = tmpdist;
}
return result;
}
/*
- ** Distance from a point to a path
+ * Distance from a point to a path
*/
Datum
dist_ppath(PG_FUNCTION_ARGS)
{
Point *pt = PG_GETARG_POINT_P(0);
PATH *path = PG_GETARG_PATH_P(1);
float8 result = 0.0; /* keep compiler quiet */
bool have_min = false;
float8 tmp;
int i;
@@ -2543,37 +2549,42 @@ dist_ppath(PG_FUNCTION_ARGS)
{
result = tmp;
have_min = true;
}
}
break;
}
PG_RETURN_FLOAT8(result);
}
+/*
+ * Distance from a point to a box
+ */
Datum
dist_pb(PG_FUNCTION_ARGS)
{
Point *pt = PG_GETARG_POINT_P(0);
BOX *box = PG_GETARG_BOX_P(1);
float8 result;
Point *near;
near = DatumGetPointP(DirectFunctionCall2(close_pb,
PointPGetDatum(pt),
BoxPGetDatum(box)));
result = point_dt(near, pt);
PG_RETURN_FLOAT8(result);
}
-
+/*
+ * Distance from a lseg to a line
+ */
Datum
dist_sl(PG_FUNCTION_ARGS)
{
LSEG *lseg = PG_GETARG_LSEG_P(0);
LINE *line = PG_GETARG_LINE_P(1);
float8 result,
d2;
if (has_interpt_sl(lseg, line))
result = 0.0;
@@ -2582,57 +2593,63 @@ dist_sl(PG_FUNCTION_ARGS)
result = dist_pl_internal(&lseg->p[0], line);
d2 = dist_pl_internal(&lseg->p[1], line);
/* XXX shouldn't we take the min not max? */
if (d2 > result)
result = d2;
}
PG_RETURN_FLOAT8(result);
}
-
+/*
+ * Distance from a lseg to a box
+ */
Datum
dist_sb(PG_FUNCTION_ARGS)
{
LSEG *lseg = PG_GETARG_LSEG_P(0);
BOX *box = PG_GETARG_BOX_P(1);
Point *tmp;
Datum result;
tmp = DatumGetPointP(DirectFunctionCall2(close_sb,
LsegPGetDatum(lseg),
BoxPGetDatum(box)));
result = DirectFunctionCall2(dist_pb,
PointPGetDatum(tmp),
BoxPGetDatum(box));
PG_RETURN_DATUM(result);
}
-
+/*
+ * Distance from a line to a box
+ */
Datum
dist_lb(PG_FUNCTION_ARGS)
{
#ifdef NOT_USED
LINE *line = PG_GETARG_LINE_P(0);
BOX *box = PG_GETARG_BOX_P(1);
#endif
/* need to think about this one for a while */
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function \"dist_lb\" not implemented")));
PG_RETURN_NULL();
}
-
+/*
+ * Distance from a circle to a polygon
+ */
Datum
dist_cpoly(PG_FUNCTION_ARGS)
{
CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
POLYGON *poly = PG_GETARG_POLYGON_P(1);
float8 result;
float8 d;
int i;
LSEG seg;
@@ -2669,20 +2686,69 @@ dist_cpoly(PG_FUNCTION_ARGS)
result = d;
}
result -= circle->radius;
if (result < 0)
result = 0;
PG_RETURN_FLOAT8(result);
}
+/*
+ * Distance from a point to a polygon
+ */
+Datum
+dist_ppoly(PG_FUNCTION_ARGS)
+{
+ Point *point = PG_GETARG_POINT_P(0);
+ POLYGON *poly = PG_GETARG_POLYGON_P(1);
+ float8 result;
+ float8 distance;
+ int i;
+ LSEG seg;
+
+ if (point_inside(point, poly->npts, poly->p) != 0)
+ {
+#ifdef GEODEBUG
+ printf("dist_ppoly- point inside of polygon\n");
+#endif
+ PG_RETURN_FLOAT8(0.0);
+ }
+
+ /* initialize distance with segment between first and last points */
+ seg.p[0].x = poly->p[0].x;
+ seg.p[0].y = poly->p[0].y;
+ seg.p[1].x = poly->p[poly->npts - 1].x;
+ seg.p[1].y = poly->p[poly->npts - 1].y;
+ result = dist_ps_internal(point, &seg);
+#ifdef GEODEBUG
+ printf("dist_ppoly- segment 0/n distance is %f\n", result);
+#endif
+
+ /* check distances for other segments */
+ for (i = 0; i < poly->npts - 1; i++)
+ {
+ seg.p[0].x = poly->p[i].x;
+ seg.p[0].y = poly->p[i].y;
+ seg.p[1].x = poly->p[i + 1].x;
+ seg.p[1].y = poly->p[i + 1].y;
+ distance = dist_ps_internal(point, &seg);
+#ifdef GEODEBUG
+ printf("dist_ppoly- segment %d distance is %f\n", i + 1, distance);
+#endif
+ if (distance < result)
+ result = distance;
+ }
+
+ PG_RETURN_FLOAT8(result);
+}
+
/*---------------------------------------------------------------------
* interpt_
* Intersection point of objects.
* We choose to ignore the "point" of intersection between
* lines and boxes, since there are typically two.
*-------------------------------------------------------------------*/
/* Get intersection point of lseg and line; returns NULL if no intersection */
static Point *
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index f8b4a65..c31b8a8 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1009,20 +1009,22 @@ DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_
DESCR("multiply");
DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - ));
DESCR("divide");
DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - ));
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
+DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
+DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
/* additional geometric operators - thomas 1997-07-09 */
DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - ));
DESCR("distance between");
DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - ));
DESCR("intersect");
DATA(insert OID = 1526 ( "?||" PGNSP PGUID b f f 601 601 16 1526 0 lseg_parallel - - ));
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0af1248..95f0b74 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -806,20 +806,21 @@ DESCR("set bit");
DATA(insert OID = 749 ( overlay PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 17 "17 17 23 23" _null_ _null_ _null_ _null_ byteaoverlay _null_ _null_ _null_ ));
DESCR("substitute portion of string");
DATA(insert OID = 752 ( overlay PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 17 "17 17 23" _null_ _null_ _null_ _null_ byteaoverlay_no_len _null_ _null_ _null_ ));
DESCR("substitute portion of string");
DATA(insert OID = 725 ( dist_pl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 628" _null_ _null_ _null_ _null_ dist_pl _null_ _null_ _null_ ));
DATA(insert OID = 726 ( dist_lb PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "628 603" _null_ _null_ _null_ _null_ dist_lb _null_ _null_ _null_ ));
DATA(insert OID = 727 ( dist_sl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "601 628" _null_ _null_ _null_ _null_ dist_sl _null_ _null_ _null_ ));
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
+DATA(insert OID = 3590 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
DATA(insert OID = 742 ( text_gt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_gt _null_ _null_ _null_ ));
DATA(insert OID = 743 ( text_ge PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_ge _null_ _null_ _null_ ));
DATA(insert OID = 745 ( current_user PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 19 "" _null_ _null_ _null_ _null_ current_user _null_ _null_ _null_ ));
DESCR("current user name");
DATA(insert OID = 746 ( session_user PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 19 "" _null_ _null_ _null_ _null_ session_user _null_ _null_ _null_ ));
DESCR("session user name");
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
index 60b5d01..91610d8 100644
--- a/src/include/utils/geo_decls.h
+++ b/src/include/utils/geo_decls.h
@@ -388,20 +388,21 @@ extern Datum circle_contain_pt(PG_FUNCTION_ARGS);
extern Datum pt_contained_circle(PG_FUNCTION_ARGS);
extern Datum circle_add_pt(PG_FUNCTION_ARGS);
extern Datum circle_sub_pt(PG_FUNCTION_ARGS);
extern Datum circle_mul_pt(PG_FUNCTION_ARGS);
extern Datum circle_div_pt(PG_FUNCTION_ARGS);
extern Datum circle_diameter(PG_FUNCTION_ARGS);
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
+extern Datum dist_ppoly(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
extern Datum circle_box(PG_FUNCTION_ARGS);
extern Datum poly_circle(PG_FUNCTION_ARGS);
extern Datum circle_poly(PG_FUNCTION_ARGS);
extern Datum circle_area(PG_FUNCTION_ARGS);
/* support routines for the GiST access method (access/gist/gistproc.c) */
extern Datum gist_box_compress(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/polygon.out b/src/test/regress/expected/polygon.out
index b252902..b449e32 100644
--- a/src/test/regress/expected/polygon.out
+++ b/src/test/regress/expected/polygon.out
@@ -271,10 +271,22 @@ SELECT '((1,4),(1,1),(4,1),(4,2),(2,2),(2,4),(1,4))'::polygon && '((3,3),(4,3),(
-------
f
(1 row)
SELECT '((200,800),(800,800),(800,200),(200,200))' && '(1000,1000,0,0)'::polygon AS "true";
true
------
t
(1 row)
+-- distance from a point
+SELECT
+ '(0,0)'::point <-> '((0,0),(1,2),(2,1))'::polygon as on_corner,
+ '(1,1)'::point <-> '((0,0),(2,2),(1,3))'::polygon as on_segment,
+ '(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
+ '(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
+ '(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
+ on_corner | on_segment | inside | near_corner | near_segment
+-----------+------------+--------+-----------------+--------------
+ 0 | 0 | 0 | 1.4142135623731 | 3.2
+(1 row)
+
diff --git a/src/test/regress/sql/polygon.sql b/src/test/regress/sql/polygon.sql
index 2dad566..20a7c36 100644
--- a/src/test/regress/sql/polygon.sql
+++ b/src/test/regress/sql/polygon.sql
@@ -164,10 +164,18 @@ SELECT polygon '(2.0,0.0),(2.0,4.0),(0.0,0.0)' && polygon '(3.0,1.0),(3.0,3.0),(
SELECT '((0,4),(6,4),(1,2),(6,0),(0,0))'::polygon && '((2,1),(2,3),(3,3),(3,1))'::polygon AS "true";
-- +--+ *--*
-- | | | |
-- | | *--*
-- | +----+
-- | |
-- +-------+
SELECT '((1,4),(1,1),(4,1),(4,2),(2,2),(2,4),(1,4))'::polygon && '((3,3),(4,3),(4,4),(3,4),(3,3))'::polygon AS "false";
SELECT '((200,800),(800,800),(800,200),(200,200))' && '(1000,1000,0,0)'::polygon AS "true";
+
+-- distance from a point
+SELECT
+ '(0,0)'::point <-> '((0,0),(1,2),(2,1))'::polygon as on_corner,
+ '(1,1)'::point <-> '((0,0),(2,2),(1,3))'::polygon as on_segment,
+ '(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
+ '(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
+ '(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
knn-gist-recheck-2.patchtext/plain; charset=utf-8Download
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
index 0158b17..2cfe9e8 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -98,20 +98,21 @@
<literal><<|</>
<literal><@</>
<literal>@></>
<literal>@</>
<literal>|&></>
<literal>|>></>
<literal>~</>
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
<entry><literal>inet_ops</></entry>
<entry><type>inet</>, <type>cidr</></entry>
<entry>
<literal>&&</>
<literal>>></>
<literal>>>=</>
<literal>></>
@@ -156,20 +157,21 @@
<literal><<|</>
<literal><@</>
<literal>@></>
<literal>@</>
<literal>|&></>
<literal>|>></>
<literal>~</>
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
<entry><literal>range_ops</></entry>
<entry>any range type</entry>
<entry>
<literal>&&</>
<literal>&></>
<literal>&<</>
<literal>>></>
@@ -200,20 +202,26 @@
<literal>@@</>
</entry>
<entry>
</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
for example
<programlisting>
CREATE INDEX ON my_table USING gist (my_inet_column inet_ops);
</programlisting>
</para>
</sect1>
@@ -759,56 +767,62 @@ my_same(PG_FUNCTION_ARGS)
so the results must be consistent with the operator's semantics.
For a leaf index entry the result just represents the distance to
the index entry; for an internal tree node, the result must be the
smallest distance that any child entry could have.
</para>
<para>
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
-CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
+CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
</programlisting>
And the matching code in the C module could then follow this skeleton:
<programlisting>
Datum my_distance(PG_FUNCTION_ARGS);
PG_FUNCTION_INFO_V1(my_distance);
Datum
my_distance(PG_FUNCTION_ARGS)
{
GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
/*
* determine return value as a function of strategy, key and query.
*/
PG_RETURN_FLOAT8(retval);
}
</programlisting>
The arguments to the <function>distance</> function are identical to
- the arguments of the <function>consistent</> function, except that no
- recheck flag is used. The distance to a leaf index entry must always
- be determined exactly, since there is no way to re-order the tuples
- once they are returned. Some approximation is allowed when determining
- the distance to an internal tree node, so long as the result is never
+ the arguments of the <function>consistent</> function. When
+ <literal>recheck = true</> then value of distance will
+ be rechecked from heap tuple before tuple is returned. If
+ <literal>recheck</> flag isn't set then it's true by default for
+ compatibility reasons. The <literal>recheck</> flag can be used only
+ when ordering operator returns <type>float8</> value comparable with
+ result of <function>distance</> function. Result of distance function
+ should be never greater than result of ordering operator.
+ Same approximation is allowed when determining the distance to an
+ internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..7be050d 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -9,20 +9,21 @@
*
* IDENTIFICATION
* src/backend/access/gist/gistget.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/gist_private.h"
#include "access/relscan.h"
+#include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
/*
* gistindex_keytest() -- does this index tuple satisfy the scan key(s)?
*
@@ -48,38 +49,41 @@ static bool
gistindex_keytest(IndexScanDesc scan,
IndexTuple tuple,
Page page,
OffsetNumber offset,
bool *recheck_p)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
- double *distance_p;
+ GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
/*
* If it's a leftover invalid tuple from pre-9.1, treat it as a match with
* minimum possible distances. This means we'll always follow it to the
* referenced page.
*/
if (GistTupleIsInvalid(tuple))
{
int i;
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
- so->distances[i] = -get_float8_infinity();
+ {
+ so->distances[i].value = -get_float8_infinity();
+ so->distances[i].recheck = false;
+ }
return true;
}
/* Check whether it matches according to the Consistent functions */
while (keySize > 0)
{
Datum datum;
bool isNull;
datum = index_getattr(tuple,
@@ -163,53 +167,56 @@ gistindex_keytest(IndexScanDesc scan,
bool isNull;
datum = index_getattr(tuple,
key->sk_attno,
giststate->tupdesc,
&isNull);
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
- *distance_p = get_float8_infinity();
+ distance_p->value = get_float8_infinity();
+ distance_p->recheck = false;
}
else
{
Datum dist;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
datum, r, page, offset,
FALSE, isNull);
/*
* Call the Distance function to evaluate the distance. The
* arguments are the index datum (as a GISTENTRY*), the comparison
* datum, and the ordering operator's strategy number and subtype
* from pg_amop.
*
* (Presently there's no need to pass the subtype since it'll
* always be zero, but might as well pass it for possible future
* use.)
*
- * Note that Distance functions don't get a recheck argument. We
- * can't tolerate lossy distance calculations on leaf tuples;
- * there is no opportunity to re-sort the tuples afterwards.
+ * Distance function gets a recheck argument as well as consistent
+ * function. Distance will be re-calculated from heap tuple when
+ * needed.
*/
- dist = FunctionCall4Coll(&key->sk_func,
+ distance_p->recheck = false;
+ dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
- ObjectIdGetDatum(key->sk_subtype));
+ ObjectIdGetDatum(key->sk_subtype),
+ PointerGetDatum(&distance_p->recheck));
- *distance_p = DatumGetFloat8(dist);
+ distance_p->value = DatumGetFloat8(dist);
}
key++;
distance_p++;
keySize--;
}
return true;
}
@@ -227,21 +234,21 @@ gistindex_keytest(IndexScanDesc scan,
* tuples should be reported directly into the bitmap. If they are NULL,
* we're doing a plain or ordered indexscan. For a plain indexscan, heap
* tuple TIDs are returned into so->pageData[]. For an ordered indexscan,
* heap tuple TIDs are pushed into individual search queue items.
*
* If we detect that the index page has split since we saw its downlink
* in the parent, we push its new right sibling onto the queue so the
* sibling will be processed next.
*/
static void
-gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
+gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
Buffer buffer;
Page page;
GISTPageOpaque opaque;
OffsetNumber maxoff;
OffsetNumber i;
GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
bool isNew;
@@ -277,21 +284,21 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
/* Create new GISTSearchItem for the right sibling index page */
item = palloc(sizeof(GISTSearchItem));
item->next = NULL;
item->blkno = opaque->rightlink;
item->data.parentlsn = pageItem->data.parentlsn;
/* Insert it into the queue using same distances as for this page */
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
- sizeof(double) * scan->numberOfOrderBys);
+ sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
MemoryContextSwitchTo(oldcxt);
}
so->nPageData = so->curPageData = 0;
/*
* check all tuples on page
@@ -368,63 +375,166 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* only have a shared lock, so we need to get the LSN
* atomically.
*/
item->data.parentlsn = BufferGetLSNAtomic(buffer);
}
/* Insert it into the queue using new distance data */
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
- sizeof(double) * scan->numberOfOrderBys);
+ sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
MemoryContextSwitchTo(oldcxt);
}
}
UnlockReleaseBuffer(buffer);
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+static bool
+searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *item)
+{
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+}
+
+/*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+static void
+searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *treeItem,
+ GISTSearchItem *item)
+{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(&item->data.heap.heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(&item->data.heap.heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ UnlockReleaseBuffer(buffer);
+ pfree(item);
+ return;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(&tup, so->slot, InvalidBuffer, false);
+ FormIndexDatum(so->indexInfo, so->slot, so->estate, values, isnull);
+
+ /* Prepare new tree item and reinsert it */
+ memcpy(tmpItem, treeItem, GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys);
+ tmpItem->head = item;
+ tmpItem->lastHeap = item;
+ item->next = NULL;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (tmpItem->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ tmpItem->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ tmpItem->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ tmpItem->distances[i].value = newDistance;
+
+ }
+ }
+ (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ UnlockReleaseBuffer(buffer);
+}
+
+/*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*
* NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
* contained the result item. Callers can use so->curTreeItem->distances as
* the distances value for the item.
*/
static GISTSearchItem *
-getNextGISTSearchItem(GISTScanOpaque so)
+getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+
for (;;)
{
GISTSearchItem *item;
/* Update curTreeItem if we don't have one */
if (so->curTreeItem == NULL)
{
so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
/* Done when tree is empty */
if (so->curTreeItem == NULL)
break;
}
item = so->curTreeItem->head;
if (item != NULL)
{
/* Delink item from chain */
so->curTreeItem->head = item->next;
if (item == so->curTreeItem->lastHeap)
so->curTreeItem->lastHeap = NULL;
+
+ /* Recheck distance from heap tuple if needed */
+ if (GISTSearchItemIsHeap(*item) &&
+ searchTreeItemNeedDistanceRecheck(scan, so->curTreeItem))
+ {
+ searchTreeItemDistanceRecheck(scan, so->curTreeItem, item);
+ continue;
+ }
/* Return item; caller is responsible to pfree it */
return item;
}
/* curTreeItem is exhausted, so remove it from rbtree */
rb_delete(so->queue, (RBNode *) so->curTreeItem);
so->curTreeItem = NULL;
}
return NULL;
@@ -434,21 +544,21 @@ getNextGISTSearchItem(GISTScanOpaque so)
* Fetch next heap tuple in an ordered search
*/
static bool
getNextNearest(IndexScanDesc scan)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
do
{
- GISTSearchItem *item = getNextGISTSearchItem(so);
+ GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
if (GISTSearchItemIsHeap(*item))
{
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
res = true;
@@ -514,21 +624,21 @@ gistgettuple(PG_FUNCTION_ARGS)
/* continuing to return tuples from a leaf page */
scan->xs_ctup.t_self = so->pageData[so->curPageData].heapPtr;
scan->xs_recheck = so->pageData[so->curPageData].recheck;
so->curPageData++;
PG_RETURN_BOOL(true);
}
/* find and process the next index page */
do
{
- GISTSearchItem *item = getNextGISTSearchItem(so);
+ GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
CHECK_FOR_INTERRUPTS();
/*
* While scanning a leaf page, ItemPointers of matching heap
* tuples are stored in so->pageData. If there are any on
* this page, we fall out of the inner "do" and loop around to
@@ -566,21 +676,21 @@ gistgetbitmap(PG_FUNCTION_ARGS)
fakeItem.blkno = GIST_ROOT_BLKNO;
memset(&fakeItem.data.parentlsn, 0, sizeof(GistNSN));
gistScanPage(scan, &fakeItem, NULL, tbm, &ntids);
/*
* While scanning a leaf page, ItemPointers of matching heap tuples will
* be stored directly into tbm, so we don't need to deal with them here.
*/
for (;;)
{
- GISTSearchItem *item = getNextGISTSearchItem(so);
+ GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
CHECK_FOR_INTERRUPTS();
gistScanPage(scan, item, so->curTreeItem->distances, tbm, &ntids);
pfree(item);
}
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index db0bec6..fd3546a 100644
--- a/src/backend/access/gist/gistproc.c
+++ b/src/backend/access/gist/gistproc.c
@@ -1091,20 +1091,21 @@ gist_poly_consistent(PG_FUNCTION_ARGS)
*/
result = rtree_internal_consistent(DatumGetBoxP(entry->key),
&(query->boundbox), strategy);
/* Avoid memory leak if supplied poly is toasted */
PG_FREE_IF_COPY(query, 1);
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
/*
* GiST compress for circles: represent a circle by its bounding box
*/
Datum
gist_circle_compress(PG_FUNCTION_ARGS)
{
@@ -1452,10 +1453,44 @@ gist_point_distance(PG_FUNCTION_ARGS)
PG_GETARG_POINT_P(1));
break;
default:
elog(ERROR, "unrecognized strategy number: %d", strategy);
distance = 0.0; /* keep compiler quiet */
break;
}
PG_RETURN_FLOAT8(distance);
}
+
+/*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+Datum
+gist_inexact_distance(PG_FUNCTION_ARGS)
+{
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 8360b16..9bb8294 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -10,41 +10,54 @@
* IDENTIFICATION
* src/backend/access/gist/gistscan.c
*
*-------------------------------------------------------------------------
*/
#include "postgres.h"
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+#include "catalog/index.h"
+#include "executor/executor.h"
+#include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
/*
* RBTree support functions for the GISTSearchTreeItem queue
*/
static int
GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
{
const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
int i;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
- if (sa->distances[i] != sb->distances[i])
- return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
+ const GISTSearchTreeItemDistance distance_a = sa->distances[i];
+ const GISTSearchTreeItemDistance distance_b = sb->distances[i];
+
+ if (distance_a.value != distance_b.value)
+ return (distance_a.value > distance_b.value) ? 1 : -1;
+
+ /*
+ * Items without recheck can be immediately returned. So they are
+ * placed first.
+ */
+ if (distance_a.recheck != distance_b.recheck)
+ return distance_a.recheck ? 1 : -1;
}
return 0;
}
static void
GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
{
GISTSearchTreeItem *scurrent = (GISTSearchTreeItem *) existing;
const GISTSearchTreeItem *snew = (const GISTSearchTreeItem *) newrb;
@@ -76,21 +89,21 @@ GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
newitem->next = scurrent->lastHeap->next;
scurrent->lastHeap->next = newitem;
}
}
static RBNode *
GISTSearchTreeItemAllocator(void *arg)
{
IndexScanDesc scan = (IndexScanDesc) arg;
- return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
+ return palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
}
static void
GISTSearchTreeItemDeleter(RBNode *rb, void *arg)
{
pfree(rb);
}
/*
@@ -120,24 +133,36 @@ gistbeginscan(PG_FUNCTION_ARGS)
oldCxt = MemoryContextSwitchTo(giststate->scanCxt);
/* initialize opaque data */
so = (GISTScanOpaque) palloc0(sizeof(GISTScanOpaqueData));
so->giststate = giststate;
giststate->tempCxt = createTempGistContext();
so->queue = NULL;
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
- so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
- so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
+ so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys);
+ so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Prepare data structures for distance recheck from heap tuple */
+
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ so->indexInfo = BuildIndexInfo(scan->indexRelation);
+ so->estate = CreateExecutorState();
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
PG_RETURN_POINTER(scan);
}
Datum
gistrescan(PG_FUNCTION_ARGS)
{
@@ -179,23 +204,30 @@ gistrescan(PG_FUNCTION_ARGS)
ALLOCSET_DEFAULT_MAXSIZE);
first_time = false;
}
else
{
/* third or later time through */
MemoryContextReset(so->queueCxt);
first_time = false;
}
+ if (scan->numberOfOrderBys > 0 && !so->slot)
+ {
+ /* Prepare heap tuple slot for distance recheck */
+ so->slot = MakeSingleTupleTableSlot(RelationGetDescr(scan->heapRelation));
+ }
+
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
- so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
+ so->queue = rb_create(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
GISTSearchTreeItemDeleter,
scan);
MemoryContextSwitchTo(oldCxt);
so->curTreeItem = NULL;
so->firstCall = true;
@@ -282,20 +314,24 @@ gistrescan(PG_FUNCTION_ARGS)
{
ScanKey skey = scan->orderByData + i;
FmgrInfo *finfo = &(so->giststate->distanceFn[skey->sk_attno - 1]);
/* Check we actually have a distance function ... */
if (!OidIsValid(finfo->fn_oid))
elog(ERROR, "missing support function %d for attribute %d of index \"%s\"",
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
if (!first_time)
skey->sk_func.fn_extra = fn_extras[i];
}
if (!first_time)
pfree(fn_extras);
}
@@ -316,18 +352,21 @@ gistrestrpos(PG_FUNCTION_ARGS)
elog(ERROR, "GiST does not support mark/restore");
PG_RETURN_VOID();
}
Datum
gistendscan(PG_FUNCTION_ARGS)
{
IndexScanDesc scan = (IndexScanDesc) PG_GETARG_POINTER(0);
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ if (so->slot)
+ ExecDropSingleTupleTableSlot(so->slot);
+
/*
* freeGISTstate is enough to clean up everything made by gistbeginscan,
* as well as the queueCxt if there is a separate context for it.
*/
freeGISTstate(so->giststate);
PG_RETURN_VOID();
}
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index 402ea40..c9788e4 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -63,20 +63,21 @@ static int pair_encode(float8 x, float8 y, char *str);
static int pair_count(char *s, char delim);
static int path_decode(int opentype, int npts, char *str, int *isopen, char **ss, Point *p);
static char *path_encode(enum path_delim path_delim, int npts, Point *pt);
static void statlseg_construct(LSEG *lseg, Point *pt1, Point *pt2);
static double box_ar(BOX *box);
static void box_cn(Point *center, BOX *box);
static Point *interpt_sl(LSEG *lseg, LINE *line);
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+static float8 dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
/*
* Delimiters for input and output strings.
* LDELIM, RDELIM, and DELIM are left, right, and separator delimiters, respectively.
* LDELIM_EP, RDELIM_EP are left and right delimiters for paths with endpoints.
*/
@@ -2634,20 +2635,52 @@ dist_lb(PG_FUNCTION_ARGS)
/* need to think about this one for a while */
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("function \"dist_lb\" not implemented")));
PG_RETURN_NULL();
}
/*
+ * Distance from a point to a circle
+ */
+Datum
+dist_pc(PG_FUNCTION_ARGS)
+{
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+}
+
+/*
+ * Distance from a circle to a point
+ */
+Datum
+dist_cpoint(PG_FUNCTION_ARGS)
+{
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+}
+
+/*
* Distance from a circle to a polygon
*/
Datum
dist_cpoly(PG_FUNCTION_ARGS)
{
CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
POLYGON *poly = PG_GETARG_POLYGON_P(1);
float8 result;
float8 d;
int i;
@@ -2692,33 +2725,48 @@ dist_cpoly(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(result);
}
/*
* Distance from a point to a polygon
*/
Datum
dist_ppoly(PG_FUNCTION_ARGS)
{
- Point *point = PG_GETARG_POINT_P(0);
- POLYGON *poly = PG_GETARG_POLYGON_P(1);
+ PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(0),
+ PG_GETARG_POLYGON_P(1)));
+}
+
+/*
+ * Distance from a polygon to a point
+ */
+Datum
+dist_polyp(PG_FUNCTION_ARGS)
+{
+ PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
+ PG_GETARG_POLYGON_P(0)));
+}
+
+static float8
+dist_ppoly_internal(Point *point, POLYGON *poly)
+{
float8 result;
float8 distance;
int i;
LSEG seg;
if (point_inside(point, poly->npts, poly->p) != 0)
{
#ifdef GEODEBUG
printf("dist_ppoly- point inside of polygon\n");
#endif
- PG_RETURN_FLOAT8(0.0);
+ return 0.0;
}
/* initialize distance with segment between first and last points */
seg.p[0].x = poly->p[0].x;
seg.p[0].y = poly->p[0].y;
seg.p[1].x = poly->p[poly->npts - 1].x;
seg.p[1].y = poly->p[poly->npts - 1].y;
result = dist_ps_internal(point, &seg);
#ifdef GEODEBUG
printf("dist_ppoly- segment 0/n distance is %f\n", result);
@@ -2732,21 +2780,21 @@ dist_ppoly(PG_FUNCTION_ARGS)
seg.p[1].x = poly->p[i + 1].x;
seg.p[1].y = poly->p[i + 1].y;
distance = dist_ps_internal(point, &seg);
#ifdef GEODEBUG
printf("dist_ppoly- segment %d distance is %f\n", i + 1, distance);
#endif
if (distance < result)
result = distance;
}
- PG_RETURN_FLOAT8(result);
+ return result;
}
/*---------------------------------------------------------------------
* interpt_
* Intersection point of objects.
* We choose to ignore the "point" of intersection between
* lines and boxes, since there are typically two.
*-------------------------------------------------------------------*/
@@ -5091,37 +5139,20 @@ pt_contained_circle(PG_FUNCTION_ARGS)
{
Point *point = PG_GETARG_POINT_P(0);
CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
double d;
d = point_dt(&circle->center, point);
PG_RETURN_BOOL(d <= circle->radius);
}
-/* dist_pc - returns the distance between
- * a point and a circle.
- */
-Datum
-dist_pc(PG_FUNCTION_ARGS)
-{
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
-}
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
circle_center(PG_FUNCTION_ARGS)
{
CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
Point *result;
result = (Point *) palloc(sizeof(Point));
result->x = circle->center.x;
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 03e9903..6f98583 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -9,21 +9,23 @@
*
* src/include/access/gist_private.h
*
*-------------------------------------------------------------------------
*/
#ifndef GIST_PRIVATE_H
#define GIST_PRIVATE_H
#include "access/gist.h"
#include "access/itup.h"
+#include "executor/tuptable.h"
#include "fmgr.h"
+#include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/rbtree.h"
#include "utils/hsearch.h"
/*
* Maximum number of "halves" a page can be split into in one operation.
* Typically a split produces 2 halves, but can be more if keys have very
* different lengths, or when inserting multiple keys in one operation (as
* when inserting downlinks to an internal node). There is no theoretical
@@ -128,55 +130,70 @@ typedef struct GISTSearchItem
{
GistNSN parentlsn; /* parent page's LSN, if index page */
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
/*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+typedef struct GISTSearchTreeItemDistance
+{
+ double value;
+ bool recheck;
+} GISTSearchTreeItemDistance;
+
+/*
* Within a GISTSearchTreeItem's chain, heap items always appear before
* index-page items, since we want to visit heap items first. lastHeap points
* to the last heap item in the chain, or is NULL if there are none.
*/
typedef struct GISTSearchTreeItem
{
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
- double distances[1]; /* array with numberOfOrderBys entries */
+ GISTSearchTreeItemDistance distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
*/
typedef struct GISTScanOpaqueData
{
GISTSTATE *giststate; /* index information, see above */
RBTree *queue; /* queue of unvisited items */
MemoryContext queueCxt; /* context holding the queue */
bool qual_ok; /* false if qual can never be satisfied */
bool firstCall; /* true until first gistgettuple call */
GISTSearchTreeItem *curTreeItem; /* current queue item, if any */
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
- double *distances; /* output area for gistindex_keytest */
+ GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
/* XLog stuff */
#define XLOG_GIST_PAGE_UPDATE 0x00
/* #define XLOG_GIST_NEW_ROOT 0x20 */ /* not used anymore */
#define XLOG_GIST_PAGE_SPLIT 0x30
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 3ef5a49..dd468f6 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -643,39 +643,41 @@ DATA(insert ( 2594 604 604 4 s 487 783 0 ));
DATA(insert ( 2594 604 604 5 s 488 783 0 ));
DATA(insert ( 2594 604 604 6 s 491 783 0 ));
DATA(insert ( 2594 604 604 7 s 490 783 0 ));
DATA(insert ( 2594 604 604 8 s 489 783 0 ));
DATA(insert ( 2594 604 604 9 s 2575 783 0 ));
DATA(insert ( 2594 604 604 10 s 2574 783 0 ));
DATA(insert ( 2594 604 604 11 s 2577 783 0 ));
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*/
DATA(insert ( 2595 718 718 1 s 1506 783 0 ));
DATA(insert ( 2595 718 718 2 s 1507 783 0 ));
DATA(insert ( 2595 718 718 3 s 1513 783 0 ));
DATA(insert ( 2595 718 718 4 s 1508 783 0 ));
DATA(insert ( 2595 718 718 5 s 1509 783 0 ));
DATA(insert ( 2595 718 718 6 s 1512 783 0 ));
DATA(insert ( 2595 718 718 7 s 1511 783 0 ));
DATA(insert ( 2595 718 718 8 s 1510 783 0 ));
DATA(insert ( 2595 718 718 9 s 2589 783 0 ));
DATA(insert ( 2595 718 718 10 s 1515 783 0 ));
DATA(insert ( 2595 718 718 11 s 1514 783 0 ));
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
* of the family)
*/
DATA(insert ( 2745 2277 2277 1 s 2750 2742 0 ));
DATA(insert ( 2745 2277 2277 2 s 2751 2742 0 ));
DATA(insert ( 2745 2277 2277 3 s 2752 2742 0 ));
DATA(insert ( 2745 2277 2277 4 s 1070 2742 0 ));
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index 10a47df..1149923 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -197,27 +197,29 @@ DATA(insert ( 2593 603 603 4 2580 ));
DATA(insert ( 2593 603 603 5 2581 ));
DATA(insert ( 2593 603 603 6 2582 ));
DATA(insert ( 2593 603 603 7 2584 ));
DATA(insert ( 2594 604 604 1 2585 ));
DATA(insert ( 2594 604 604 2 2583 ));
DATA(insert ( 2594 604 604 3 2586 ));
DATA(insert ( 2594 604 604 4 2580 ));
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
DATA(insert ( 2595 718 718 4 2580 ));
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
DATA(insert ( 3655 3614 3614 4 3649 ));
DATA(insert ( 3655 3614 3614 5 3653 ));
DATA(insert ( 3655 3614 3614 6 3650 ));
DATA(insert ( 3655 3614 3614 7 3652 ));
DATA(insert ( 3702 3615 3615 1 3701 ));
DATA(insert ( 3702 3615 3615 2 3698 ));
DATA(insert ( 3702 3615 3615 3 3695 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index c31b8a8..b633665 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1007,23 +1007,27 @@ DATA(insert OID = 1517 ( "-" PGNSP PGUID b f f 718 600 718 0 0 circle_
DESCR("subtract");
DATA(insert OID = 1518 ( "*" PGNSP PGUID b f f 718 600 718 0 0 circle_mul_pt - - ));
DESCR("multiply");
DATA(insert OID = 1519 ( "/" PGNSP PGUID b f f 718 600 718 0 0 circle_div_pt - - ));
DESCR("divide");
DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 circle_distance - - ));
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
-DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
+DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
-DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
+DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
+DESCR("distance between");
+DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
+DESCR("distance between");
+DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3591 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
/* additional geometric operators - thomas 1997-07-09 */
DATA(insert OID = 1524 ( "<->" PGNSP PGUID b f f 628 603 701 0 0 dist_lb - - ));
DESCR("distance between");
DATA(insert OID = 1525 ( "?#" PGNSP PGUID b f f 601 601 16 1525 0 lseg_intersect - - ));
DESCR("intersect");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 95f0b74..1b7664e 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -807,20 +807,22 @@ DATA(insert OID = 749 ( overlay PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 17
DESCR("substitute portion of string");
DATA(insert OID = 752 ( overlay PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 17 "17 17 23" _null_ _null_ _null_ _null_ byteaoverlay_no_len _null_ _null_ _null_ ));
DESCR("substitute portion of string");
DATA(insert OID = 725 ( dist_pl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 628" _null_ _null_ _null_ _null_ dist_pl _null_ _null_ _null_ ));
DATA(insert OID = 726 ( dist_lb PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "628 603" _null_ _null_ _null_ _null_ dist_lb _null_ _null_ _null_ ));
DATA(insert OID = 727 ( dist_sl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "601 628" _null_ _null_ _null_ _null_ dist_sl _null_ _null_ _null_ ));
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3590 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
DATA(insert OID = 742 ( text_gt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_gt _null_ _null_ _null_ ));
DATA(insert OID = 743 ( text_ge PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_ge _null_ _null_ _null_ ));
DATA(insert OID = 745 ( current_user PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 19 "" _null_ _null_ _null_ _null_ current_user _null_ _null_ _null_ ));
DESCR("current user name");
DATA(insert OID = 746 ( session_user PGNSP PGUID 12 1 0 0 0 f f f f t f s 0 0 19 "" _null_ _null_ _null_ _null_ session_user _null_ _null_ _null_ ));
DESCR("session user name");
@@ -4002,20 +4004,22 @@ DESCR("GiST support");
DATA(insert OID = 2582 ( gist_box_picksplit PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 2281 "2281 2281" _null_ _null_ _null_ _null_ gist_box_picksplit _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2583 ( gist_box_union PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 603 "2281 2281" _null_ _null_ _null_ _null_ gist_box_union _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2584 ( gist_box_same PGNSP PGUID 12 1 0 0 0 f f f f t f i 3 0 2281 "603 603 2281" _null_ _null_ _null_ _null_ gist_box_same _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2585 ( gist_poly_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 604 23 26 2281" _null_ _null_ _null_ _null_ gist_poly_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+DATA(insert OID = 3589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 1030 ( gist_point_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_point_compress _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2179 ( gist_point_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 600 23 26 2281" _null_ _null_ _null_ _null_ gist_point_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
index 91610d8..64f63b2 100644
--- a/src/include/utils/geo_decls.h
+++ b/src/include/utils/geo_decls.h
@@ -387,22 +387,24 @@ extern Datum circle_ge(PG_FUNCTION_ARGS);
extern Datum circle_contain_pt(PG_FUNCTION_ARGS);
extern Datum pt_contained_circle(PG_FUNCTION_ARGS);
extern Datum circle_add_pt(PG_FUNCTION_ARGS);
extern Datum circle_sub_pt(PG_FUNCTION_ARGS);
extern Datum circle_mul_pt(PG_FUNCTION_ARGS);
extern Datum circle_div_pt(PG_FUNCTION_ARGS);
extern Datum circle_diameter(PG_FUNCTION_ARGS);
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
extern Datum circle_box(PG_FUNCTION_ARGS);
extern Datum poly_circle(PG_FUNCTION_ARGS);
extern Datum circle_poly(PG_FUNCTION_ARGS);
extern Datum circle_area(PG_FUNCTION_ARGS);
/* support routines for the GiST access method (access/gist/gistproc.c) */
extern Datum gist_box_compress(PG_FUNCTION_ARGS);
@@ -412,20 +414,21 @@ extern Datum gist_box_picksplit(PG_FUNCTION_ARGS);
extern Datum gist_box_consistent(PG_FUNCTION_ARGS);
extern Datum gist_box_penalty(PG_FUNCTION_ARGS);
extern Datum gist_box_same(PG_FUNCTION_ARGS);
extern Datum gist_poly_compress(PG_FUNCTION_ARGS);
extern Datum gist_poly_consistent(PG_FUNCTION_ARGS);
extern Datum gist_circle_compress(PG_FUNCTION_ARGS);
extern Datum gist_circle_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
extern Datum positionsel(PG_FUNCTION_ARGS);
extern Datum positionjoinsel(PG_FUNCTION_ARGS);
extern Datum contsel(PG_FUNCTION_ARGS);
extern Datum contjoinsel(PG_FUNCTION_ARGS);
#endif /* GEO_DECLS_H */
On Sun, Aug 3, 2014 at 5:48 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.Yeah, but exact-knn cant come with no one implementation. But it would
better come in a separate patch.I tried to split them. Separated patches are attached. I changed
the order of the arguments as point <-> polygon, because point was
the first one on all the others. Its commutator was required for
the index, so I added it on the second patch. I also added tests
for the operator. I think it is ready for committer as a separate
patch. We can add it to the open CommitFest.I have made some cosmetic changes on the patches. I hope they are
useful.I added support to point <-> circle operator with the same GiST
distance function you added for polygon. I can change it, if it is not
the right way.
Great, thanks!
2. I wonder how useful it really is to allow mixing exact and non-exact
return values from the distance function. The distance function
included in
the patch always returns recheck=true. I have a feeling that all other
distance function will also always return either true or false.For geometrical datatypes recheck variations in consistent methods are
also
very rare (I can't remember any). But imagine opclass for arrays where
keys
have different representation depending on array length. For such opclass
and knn on similarity recheck flag could be useful.I also wonder how useful it is. Your example is convincing, but maybe
setting it index-wide will make the decisions on the framework easier.
For example, how hard would it be to decide if further sorting is
required or not on the planner?
I think that for most use cases just some operators require further sorting
and some of them not. But it could appear one day that some index gives
part of its knn answers exact and part of them inexact. Same happen to
recheck of regular operators. Initially recheck flag was defined in
opclass. But later recheck became runtime flag.
4. (as you mentioned in the other thread: ) It's a modularity violation
that you peek into the heap tuple from gist. I think the proper way to
do
this would be to extend the IndexScan executor node to perform the
re-shuffling of tuples that come from the index in wrong order, orperhaps
add a new node type for it.
Of course that's exactly what your partial sort patch does :-). I
haven't
looked at that in detail, but I don't think the approach the partial
sort
patch takes will work here as is. In the KNN-GiST case, the index is
returning tuples roughly in the right order, but a tuple that itreturns
might in reality belong somewhere later in the ordering. In the partial
sort patch, the "input stream" of tuples is divided intonon-overlapping
groups, so that the tuples within the group are not sorted, but the
groups
are. I think the partial sort case is a special case of the KNN-GiST
case,
if you consider the lower bound of each tuple to be the leading keys
that
you don't need to sort.
Yes. But, for instance btree accesses heap for unique checking. Is really
it so crimilal? :-)
This is not only question of a new node or extending existing node. Weneed
to teach planner/executor access method can return value of some
expression
which is lower bound of another expression. AFICS now access method can
return only original indexed datums and TIDs. So, I afraid that enormous
infrastructure changes are required. And I can hardly imagine what they
should look like.Unfortunately, I am not experienced enough to judge your implementation.
As far as I understand the problem is partially sorting rows on
the index scan node. It can lead the planner to choose non-optimal
plans, because of not taking into account the cost of sorting.
Cost estimation of GiST is a big problem anyway. It doesn't care (and
can't) about amount of recheck for regular operators. In this patch, same
would be for knn recheck. The problem is that touching heap from access
method breaks incapsulation. One idea about this is to do sorting in
another nodes. However, I wonder if it would be an overengineering and
overhead. In attached patch I propose a different approach: put code
touching heap into separate index_get_heap_values function. Also new
version of patch includes regression tests and some cleanup.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-3.patchapplication/octet-stream; name=knn-gist-recheck-3.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 0158b17..2cfe9e8
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_same(PG_FUNCTION_ARGS)
*** 766,772 ****
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- 774,780 ----
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
*************** my_distance(PG_FUNCTION_ARGS)
*** 785,790 ****
--- 793,799 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 797,807 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
--- 806,821 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function. When
! <literal>recheck = true</> then value of distance will
! be rechecked from heap tuple before tuple is returned. If
! <literal>recheck</> flag isn't set then it's true by default for
! compatibility reasons. The <literal>recheck</> flag can be used only
! when ordering operator returns <type>float8</> value comparable with
! result of <function>distance</> function. Result of distance function
! should be never greater than result of ordering operator.
! Same approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 7a8692b..e454ba2
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "utils/builtins.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 55,61 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 56,62 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 72,78 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 73,82 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 170,176 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 174,181 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 191,208 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 196,215 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance function gets a recheck argument as well as consistent
! * function. Distance will be re-calculated from heap tuple when
! * needed.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 234,240 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 241,247 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 284,290 ****
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 291,297 ----
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 375,381 ****
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 382,388 ----
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 387,392 ****
--- 394,473 ----
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *treeItem,
+ GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get index values from heap */
+ if (!index_get_heap_values(scan, &item->data.heap.heapPtr, values, isnull))
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ pfree(item);
+ return;
+ }
+
+ /* Prepare new tree item and reinsert it */
+ memcpy(tmpItem, treeItem, GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys);
+ tmpItem->head = item;
+ tmpItem->lastHeap = item;
+ item->next = NULL;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (tmpItem->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ tmpItem->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ tmpItem->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ tmpItem->distances[i].value = newDistance;
+
+ }
+ }
+ (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 396,403 ****
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
for (;;)
{
GISTSearchItem *item;
--- 477,486 ----
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+
for (;;)
{
GISTSearchItem *item;
*************** getNextGISTSearchItem(GISTScanOpaque so)
*** 418,423 ****
--- 501,514 ----
so->curTreeItem->head = item->next;
if (item == so->curTreeItem->lastHeap)
so->curTreeItem->lastHeap = NULL;
+
+ /* Recheck distance from heap tuple if needed */
+ if (GISTSearchItemIsHeap(*item) &&
+ searchTreeItemNeedDistanceRecheck(scan, so->curTreeItem))
+ {
+ searchTreeItemDistanceRecheck(scan, so->curTreeItem, item);
+ continue;
+ }
/* Return item; caller is responsible to pfree it */
return item;
}
*************** getNextNearest(IndexScanDesc scan)
*** 441,447 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 532,538 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 521,527 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 612,618 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 573,579 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 664,670 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index db0bec6..fd3546a
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1098,1103 ****
--- 1098,1104 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1459,1461 ****
--- 1460,1496 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_inexact_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 8360b16..5832087
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** GISTSearchTreeItemComparator(const RBNod
*** 36,43 ****
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
}
return 0;
--- 39,56 ----
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! const GISTSearchTreeItemDistance distance_a = sa->distances[i];
! const GISTSearchTreeItemDistance distance_b = sb->distances[i];
!
! if (distance_a.value != distance_b.value)
! return (distance_a.value > distance_b.value) ? 1 : -1;
!
! /*
! * Items without recheck can be immediately returned. So they are
! * placed first.
! */
! if (distance_a.recheck != distance_b.recheck)
! return distance_a.recheck ? 1 : -1;
}
return 0;
*************** GISTSearchTreeItemAllocator(void *arg)
*** 83,89 ****
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
}
static void
--- 96,102 ----
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
}
static void
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 127,136 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
--- 140,158 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Functions for distance recheck from heap tuple */
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
*************** gistrescan(PG_FUNCTION_ARGS)
*** 188,194 ****
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
--- 210,217 ----
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
*************** gistrescan(PG_FUNCTION_ARGS)
*** 289,294 ****
--- 312,321 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
new file mode 100644
index 850008b..18cd20b
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 120,125 ****
--- 120,128 ----
scan->xs_ctup.t_data = NULL;
scan->xs_cbuf = InvalidBuffer;
scan->xs_continue_hot = false;
+ scan->indexInfo = NULL;
+ scan->estate = NULL;
+ scan->slot = NULL;
return scan;
}
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
new file mode 100644
index 53cf96f..5b526c6
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
***************
*** 69,74 ****
--- 69,75 ----
#include "access/transam.h"
#include "access/xlog.h"
+ #include "executor/executor.h"
#include "catalog/index.h"
#include "catalog/catalog.h"
#include "pgstat.h"
*************** index_beginscan(Relation heapRelation,
*** 254,259 ****
--- 255,265 ----
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ /* Prepare data structures for getting original indexed values from heap */
+ scan->indexInfo = BuildIndexInfo(scan->indexRelation);
+ scan->estate = CreateExecutorState();
+ scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
+
return scan;
}
*************** index_endscan(IndexScanDesc scan)
*** 377,382 ****
--- 383,393 ----
scan->xs_cbuf = InvalidBuffer;
}
+ if (scan->slot)
+ ExecDropSingleTupleTableSlot(scan->slot);
+ if (scan->estate)
+ FreeExecutorState(scan->estate);
+
/* End the AM's scan */
FunctionCall1(procedure, PointerGetDatum(scan));
*************** index_fetch_heap(IndexScanDesc scan)
*** 564,569 ****
--- 575,623 ----
}
/* ----------------
+ * index_get_heap_values - get original indexed values from heap
+ *
+ * Fetches heap tuple of heapPtr and calculated original indexed values.
+ * Returns true on success. Returns false when heap tuple wasn't found.
+ * Useful for indexes with lossy representation of keys.
+ * ----------------
+ */
+ bool
+ index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS])
+ {
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /* Tuple not found: it has been deleted from heap. */
+ UnlockReleaseBuffer(buffer);
+ return false;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(heap_copytuple(&tup), scan->slot, InvalidBuffer, true);
+ FormIndexDatum(scan->indexInfo, scan->slot, scan->estate, values, isnull);
+
+ UnlockReleaseBuffer(buffer);
+
+ return true;
+ }
+
+ /* ----------------
* index_getnext - get the next heap tuple from a scan
*
* The result is the next heap tuple satisfying the scan keys and the
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 402ea40..c9788e4
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** static Point *interpt_sl(LSEG *lseg, LIN
*** 70,75 ****
--- 70,76 ----
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+ static float8 dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
*************** dist_lb(PG_FUNCTION_ARGS)
*** 2641,2646 ****
--- 2642,2679 ----
}
/*
+ * Distance from a point to a circle
+ */
+ Datum
+ dist_pc(PG_FUNCTION_ARGS)
+ {
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
* Distance from a circle to a polygon
*/
Datum
*************** dist_cpoly(PG_FUNCTION_ARGS)
*** 2699,2706 ****
Datum
dist_ppoly(PG_FUNCTION_ARGS)
{
! Point *point = PG_GETARG_POINT_P(0);
! POLYGON *poly = PG_GETARG_POLYGON_P(1);
float8 result;
float8 distance;
int i;
--- 2732,2754 ----
Datum
dist_ppoly(PG_FUNCTION_ARGS)
{
! PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(0),
! PG_GETARG_POLYGON_P(1)));
! }
!
! /*
! * Distance from a polygon to a point
! */
! Datum
! dist_polyp(PG_FUNCTION_ARGS)
! {
! PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
! PG_GETARG_POLYGON_P(0)));
! }
!
! static float8
! dist_ppoly_internal(Point *point, POLYGON *poly)
! {
float8 result;
float8 distance;
int i;
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2711,2717 ****
#ifdef GEODEBUG
printf("dist_ppoly- point inside of polygon\n");
#endif
! PG_RETURN_FLOAT8(0.0);
}
/* initialize distance with segment between first and last points */
--- 2759,2765 ----
#ifdef GEODEBUG
printf("dist_ppoly- point inside of polygon\n");
#endif
! return 0.0;
}
/* initialize distance with segment between first and last points */
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2739,2745 ****
result = distance;
}
! PG_RETURN_FLOAT8(result);
}
--- 2787,2793 ----
result = distance;
}
! return result;
}
*************** pt_contained_circle(PG_FUNCTION_ARGS)
*** 5098,5120 ****
}
- /* dist_pc - returns the distance between
- * a point and a circle.
- */
- Datum
- dist_pc(PG_FUNCTION_ARGS)
- {
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
- }
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
--- 5146,5151 ----
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d99158f..170069e
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index 03e9903..f7b7aeb
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 16,22 ****
--- 16,24 ----
#include "access/gist.h"
#include "access/itup.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/rbtree.h"
*************** typedef struct GISTSearchItem
*** 135,140 ****
--- 137,151 ----
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
/*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
+ /*
* Within a GISTSearchTreeItem's chain, heap items always appear before
* index-page items, since we want to visit heap items first. lastHeap points
* to the last heap item in the chain, or is NULL if there are none.
*************** typedef struct GISTSearchTreeItem
*** 144,150 ****
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
--- 155,161 ----
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! GISTSearchTreeItemDistance distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
*************** typedef struct GISTScanOpaqueData
*** 164,175 ****
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
--- 175,189 ----
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 8a57698..5acf228
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
***************
*** 19,24 ****
--- 19,25 ----
#include "access/htup_details.h"
#include "access/itup.h"
#include "access/tupdesc.h"
+ #include "nodes/execnodes.h"
typedef struct HeapScanDescData
*************** typedef struct IndexScanDescData
*** 93,98 ****
--- 94,104 ----
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
+
+ /* Data structures for getting original indexed values from heap */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} IndexScanDescData;
/* Struct for heap-or-index scans of system tables */
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 3ef5a49..dd468f6
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 10a47df..1149923
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 204,209 ****
--- 204,210 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 211,216 ****
--- 212,218 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index c31b8a8..b633665
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3591 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 95f0b74..1b7664e
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 814,819 ****
--- 814,821 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3590 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 4009,4014 ****
--- 4011,4018 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 91610d8..64f63b2
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 394,401 ****
--- 394,403 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 419,424 ****
--- 421,427 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index f6f5516..61082d8
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 366,371 ****
--- 366,401 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1146,1151 ****
--- 1176,1229 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index d4d24ef..d9bce16
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 220,225 ****
--- 220,229 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 433,438 ****
--- 437,450 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
I added the point to polygon distance operator patch to the open
CommitFest as ready for committer and added myself as reviewer to
both of the patches.
I think that for most use cases just some operators require further sorting
and some of them not. But it could appear one day that some index gives
part of its knn answers exact and part of them inexact. Same happen to
recheck of regular operators. Initially recheck flag was defined in
opclass. But later recheck became runtime flag.
I cannot think of an use case, but it makes sense to add the flag to
the distance function just like the consistent function if we will go
with this implementation.
Cost estimation of GiST is a big problem anyway. It doesn't care (and
can't) about amount of recheck for regular operators. In this patch, same
would be for knn recheck. The problem is that touching heap from access
method breaks incapsulation. One idea about this is to do sorting in
another nodes. However, I wonder if it would be an overengineering and
overhead. In attached patch I propose a different approach: put code
touching heap into separate index_get_heap_values function. Also new
version of patch includes regression tests and some cleanup.
While looking it at I found a bug. It returns the second column
in wrong order when both of the distance functions return recheck = true.
Test script attached to run on the regression database. I tried to
fix but could not. searchTreeItemDistanceRecheck function is not
very easy to follow. I think it deserves more comments.
Attachments:
On Sun, Sep 14, 2014 at 10:09 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
I added the point to polygon distance operator patch to the open
CommitFest as ready for committer and added myself as reviewer to
both of the patches.
Thanks.
Cost estimation of GiST is a big problem anyway. It doesn't care (and
can't) about amount of recheck for regular operators. In this patch, same
would be for knn recheck. The problem is that touching heap from access
method breaks incapsulation. One idea about this is to do sorting in
another nodes. However, I wonder if it would be an overengineering and
overhead. In attached patch I propose a different approach: put code
touching heap into separate index_get_heap_values function. Also new
version of patch includes regression tests and some cleanup.While looking it at I found a bug. It returns the second column
in wrong order when both of the distance functions return recheck = true.
Test script attached to run on the regression database. I tried to
fix but could not. searchTreeItemDistanceRecheck function is not
very easy to follow. I think it deserves more comments.
Fixed, thanks. It was logical error in comparison function implementation.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-4.patchapplication/octet-stream; name=knn-gist-recheck-4.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 0158b17..2cfe9e8
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_same(PG_FUNCTION_ARGS)
*** 766,772 ****
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- 774,780 ----
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
*************** my_distance(PG_FUNCTION_ARGS)
*** 785,790 ****
--- 793,799 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 797,807 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
--- 806,821 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function. When
! <literal>recheck = true</> then value of distance will
! be rechecked from heap tuple before tuple is returned. If
! <literal>recheck</> flag isn't set then it's true by default for
! compatibility reasons. The <literal>recheck</> flag can be used only
! when ordering operator returns <type>float8</> value comparable with
! result of <function>distance</> function. Result of distance function
! should be never greater than result of ordering operator.
! Same approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 7a8692b..e454ba2
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "utils/builtins.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 55,61 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 56,62 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 72,78 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 73,82 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 170,176 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 174,181 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 191,208 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 196,215 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance function gets a recheck argument as well as consistent
! * function. Distance will be re-calculated from heap tuple when
! * needed.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 234,240 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 241,247 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 284,290 ****
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 291,297 ----
tmpItem->head = item;
tmpItem->lastHeap = NULL;
memcpy(tmpItem->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 375,381 ****
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
--- 382,388 ----
tmpItem->head = item;
tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
memcpy(tmpItem->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
(void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 387,392 ****
--- 394,473 ----
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchTreeItem *treeItem,
+ GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get index values from heap */
+ if (!index_get_heap_values(scan, &item->data.heap.heapPtr, values, isnull))
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ pfree(item);
+ return;
+ }
+
+ /* Prepare new tree item and reinsert it */
+ memcpy(tmpItem, treeItem, GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
+ scan->numberOfOrderBys);
+ tmpItem->head = item;
+ tmpItem->lastHeap = item;
+ item->next = NULL;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (tmpItem->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ tmpItem->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ tmpItem->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ tmpItem->distances[i].value = newDistance;
+
+ }
+ }
+ (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 396,403 ****
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
for (;;)
{
GISTSearchItem *item;
--- 477,486 ----
* the distances value for the item.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+
for (;;)
{
GISTSearchItem *item;
*************** getNextGISTSearchItem(GISTScanOpaque so)
*** 418,423 ****
--- 501,514 ----
so->curTreeItem->head = item->next;
if (item == so->curTreeItem->lastHeap)
so->curTreeItem->lastHeap = NULL;
+
+ /* Recheck distance from heap tuple if needed */
+ if (GISTSearchItemIsHeap(*item) &&
+ searchTreeItemNeedDistanceRecheck(scan, so->curTreeItem))
+ {
+ searchTreeItemDistanceRecheck(scan, so->curTreeItem, item);
+ continue;
+ }
/* Return item; caller is responsible to pfree it */
return item;
}
*************** getNextNearest(IndexScanDesc scan)
*** 441,447 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 532,538 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 521,527 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 612,618 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 573,579 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 664,670 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index db0bec6..fd3546a
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1098,1103 ****
--- 1098,1104 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1459,1461 ****
--- 1460,1496 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_inexact_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 8360b16..284e8dc
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** GISTSearchTreeItemComparator(const RBNod
*** 31,46 ****
const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
}
! return 0;
}
static void
--- 34,59 ----
const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i, recheckCmp = 0;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! const GISTSearchTreeItemDistance distance_a = sa->distances[i];
! const GISTSearchTreeItemDistance distance_b = sb->distances[i];
!
! if (distance_a.value != distance_b.value)
! return (distance_a.value > distance_b.value) ? 1 : -1;
!
! /*
! * When all distance values are the same, items without recheck
! * can be immediately returned. So they are placed first.
! */
! if (recheckCmp == 0 && distance_a.recheck != distance_b.recheck)
! recheckCmp = distance_a.recheck ? 1 : -1;
}
! return recheckCmp;
}
static void
*************** GISTSearchTreeItemAllocator(void *arg)
*** 83,89 ****
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
}
static void
--- 96,102 ----
{
IndexScanDesc scan = (IndexScanDesc) arg;
! return palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
}
static void
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 127,136 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
--- 140,158 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Functions for distance recheck from heap tuple */
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
*************** gistrescan(PG_FUNCTION_ARGS)
*** 188,194 ****
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
--- 210,217 ----
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
! so->queue = rb_create(GSTIHDRSZ + sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys,
GISTSearchTreeItemComparator,
GISTSearchTreeItemCombiner,
GISTSearchTreeItemAllocator,
*************** gistrescan(PG_FUNCTION_ARGS)
*** 289,294 ****
--- 312,321 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
new file mode 100644
index 850008b..18cd20b
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 120,125 ****
--- 120,128 ----
scan->xs_ctup.t_data = NULL;
scan->xs_cbuf = InvalidBuffer;
scan->xs_continue_hot = false;
+ scan->indexInfo = NULL;
+ scan->estate = NULL;
+ scan->slot = NULL;
return scan;
}
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
new file mode 100644
index 53cf96f..5b526c6
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
***************
*** 69,74 ****
--- 69,75 ----
#include "access/transam.h"
#include "access/xlog.h"
+ #include "executor/executor.h"
#include "catalog/index.h"
#include "catalog/catalog.h"
#include "pgstat.h"
*************** index_beginscan(Relation heapRelation,
*** 254,259 ****
--- 255,265 ----
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ /* Prepare data structures for getting original indexed values from heap */
+ scan->indexInfo = BuildIndexInfo(scan->indexRelation);
+ scan->estate = CreateExecutorState();
+ scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
+
return scan;
}
*************** index_endscan(IndexScanDesc scan)
*** 377,382 ****
--- 383,393 ----
scan->xs_cbuf = InvalidBuffer;
}
+ if (scan->slot)
+ ExecDropSingleTupleTableSlot(scan->slot);
+ if (scan->estate)
+ FreeExecutorState(scan->estate);
+
/* End the AM's scan */
FunctionCall1(procedure, PointerGetDatum(scan));
*************** index_fetch_heap(IndexScanDesc scan)
*** 564,569 ****
--- 575,623 ----
}
/* ----------------
+ * index_get_heap_values - get original indexed values from heap
+ *
+ * Fetches heap tuple of heapPtr and calculated original indexed values.
+ * Returns true on success. Returns false when heap tuple wasn't found.
+ * Useful for indexes with lossy representation of keys.
+ * ----------------
+ */
+ bool
+ index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS])
+ {
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /* Tuple not found: it has been deleted from heap. */
+ UnlockReleaseBuffer(buffer);
+ return false;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(heap_copytuple(&tup), scan->slot, InvalidBuffer, true);
+ FormIndexDatum(scan->indexInfo, scan->slot, scan->estate, values, isnull);
+
+ UnlockReleaseBuffer(buffer);
+
+ return true;
+ }
+
+ /* ----------------
* index_getnext - get the next heap tuple from a scan
*
* The result is the next heap tuple satisfying the scan keys and the
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 402ea40..c9788e4
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** static Point *interpt_sl(LSEG *lseg, LIN
*** 70,75 ****
--- 70,76 ----
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+ static float8 dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
*************** dist_lb(PG_FUNCTION_ARGS)
*** 2641,2646 ****
--- 2642,2679 ----
}
/*
+ * Distance from a point to a circle
+ */
+ Datum
+ dist_pc(PG_FUNCTION_ARGS)
+ {
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
* Distance from a circle to a polygon
*/
Datum
*************** dist_cpoly(PG_FUNCTION_ARGS)
*** 2699,2706 ****
Datum
dist_ppoly(PG_FUNCTION_ARGS)
{
! Point *point = PG_GETARG_POINT_P(0);
! POLYGON *poly = PG_GETARG_POLYGON_P(1);
float8 result;
float8 distance;
int i;
--- 2732,2754 ----
Datum
dist_ppoly(PG_FUNCTION_ARGS)
{
! PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(0),
! PG_GETARG_POLYGON_P(1)));
! }
!
! /*
! * Distance from a polygon to a point
! */
! Datum
! dist_polyp(PG_FUNCTION_ARGS)
! {
! PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
! PG_GETARG_POLYGON_P(0)));
! }
!
! static float8
! dist_ppoly_internal(Point *point, POLYGON *poly)
! {
float8 result;
float8 distance;
int i;
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2711,2717 ****
#ifdef GEODEBUG
printf("dist_ppoly- point inside of polygon\n");
#endif
! PG_RETURN_FLOAT8(0.0);
}
/* initialize distance with segment between first and last points */
--- 2759,2765 ----
#ifdef GEODEBUG
printf("dist_ppoly- point inside of polygon\n");
#endif
! return 0.0;
}
/* initialize distance with segment between first and last points */
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2739,2745 ****
result = distance;
}
! PG_RETURN_FLOAT8(result);
}
--- 2787,2793 ----
result = distance;
}
! return result;
}
*************** pt_contained_circle(PG_FUNCTION_ARGS)
*** 5098,5120 ****
}
- /* dist_pc - returns the distance between
- * a point and a circle.
- */
- Datum
- dist_pc(PG_FUNCTION_ARGS)
- {
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
- }
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
--- 5146,5151 ----
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d99158f..170069e
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index 03e9903..f7b7aeb
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 16,22 ****
--- 16,24 ----
#include "access/gist.h"
#include "access/itup.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/rbtree.h"
*************** typedef struct GISTSearchItem
*** 135,140 ****
--- 137,151 ----
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
/*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
+ /*
* Within a GISTSearchTreeItem's chain, heap items always appear before
* index-page items, since we want to visit heap items first. lastHeap points
* to the last heap item in the chain, or is NULL if there are none.
*************** typedef struct GISTSearchTreeItem
*** 144,150 ****
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
--- 155,161 ----
RBNode rbnode; /* this is an RBTree item */
GISTSearchItem *head; /* first chain member */
GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
! GISTSearchTreeItemDistance distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchTreeItem;
#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
*************** typedef struct GISTScanOpaqueData
*** 164,175 ****
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
--- 175,189 ----
/* pre-allocated workspace arrays */
GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 8a57698..5acf228
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
***************
*** 19,24 ****
--- 19,25 ----
#include "access/htup_details.h"
#include "access/itup.h"
#include "access/tupdesc.h"
+ #include "nodes/execnodes.h"
typedef struct HeapScanDescData
*************** typedef struct IndexScanDescData
*** 93,98 ****
--- 94,104 ----
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
+
+ /* Data structures for getting original indexed values from heap */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} IndexScanDescData;
/* Struct for heap-or-index scans of system tables */
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 3ef5a49..dd468f6
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index a1de336..0f505ae
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index 2275c2c..3cfcbea
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3591 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3591 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 617a83d..a06f925
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 816,821 ****
--- 816,823 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3590 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 4023,4028 ****
--- 4025,4032 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 91610d8..64f63b2
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 394,401 ****
--- 394,403 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 419,424 ****
--- 421,427 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index a2bef7a..81645bc
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 366,371 ****
--- 366,401 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1146,1151 ****
--- 1176,1229 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index d4d24ef..d9bce16
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 220,225 ****
--- 220,229 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 433,438 ****
--- 437,450 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
While looking it at I found a bug. It returns the second column
in wrong order when both of the distance functions return recheck = true.
Test script attached to run on the regression database. I tried to
fix but could not. searchTreeItemDistanceRecheck function is not
very easy to follow. I think it deserves more comments.Fixed, thanks. It was logical error in comparison function implementation.
I managed to break it again by ordering rows only by the second column
of the index. Test script attached.
Attachments:
I managed to break it again by ordering rows only by the second column
of the index. Test script attached.
I was confused. It is undefined behavior. Sorry for the noise.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Sep 17, 2014 at 12:30 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
I managed to break it again by ordering rows only by the second column
of the index. Test script attached.I was confused. It is undefined behavior. Sorry for the noise.
No problem. Thanks a lot for testing.
------
With best regards,
Alexander Korotkov.
Fixed, thanks.
Here are my questions and comments about the code.
doc/src/sgml/gist.sgml:812:
be rechecked from heap tuple before tuple is returned. If
<literal>recheck</> flag isn't set then it's true by default for
compatibility reasons. The <literal>recheck</> flag can be used only
Recheck flag is set to false on gistget.c so I think it should say
"false by default". On the other hand, it is true by default on
the consistent function. It is written as "the safest assumption"
on the code comments. I don't know why the safest is chosen over
the backwards compatible for the consistent function.
src/backend/access/gist/gistget.c:505:
/* Recheck distance from heap tuple if needed */
if (GISTSearchItemIsHeap(*item) &&
searchTreeItemNeedDistanceRecheck(scan, so->curTreeItem))
{
searchTreeItemDistanceRecheck(scan, so->curTreeItem, item);
continue;
}
Why so->curTreeItem is passed to these functions? They can use
scan->opaque->curTreeItem.
src/backend/access/gist/gistscan.c:49:
/*
* When all distance values are the same, items without recheck
* can be immediately returned. So they are placed first.
*/
if (recheckCmp == 0 && distance_a.recheck != distance_b.recheck)
recheckCmp = distance_a.recheck ? 1 : -1;
I don't understand why items without recheck can be immediately
returned. Do you think it will work correctly when there is
an operator class which will return recheck true and false for
the items under the same page?
src/backend/access/index/indexam.c:258:
/* Prepare data structures for getting original indexed values from heap */
scan->indexInfo = BuildIndexInfo(scan->indexRelation);
scan->estate = CreateExecutorState();
scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
With the changes in indexam.c, heap access become legal for all index
access methods. I think it is better than the previous version but
I am leaving the judgement to someone experienced. I will try to
summarize the pros and cons of sorting the rows in the GiST access
method, as far as I understand.
Pros:
* It does not require another queue. It should be effective to sort
the rows inside the queue the GiST access method already has.
* It does not complicate index access method infrastructure.
Cons:
* It could be done without additional heap access.
* Other access methods could make use of the sorting infrastructure
one day.
* It could be more transparent to the users. Sorting information
could be shown on the explain output.
* A more suitable data structure like binary heap could be used
for the queue to sort the rows.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Sep 25, 2014 at 9:00 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
Fixed, thanks.
Here are my questions and comments about the code.
doc/src/sgml/gist.sgml:812:
be rechecked from heap tuple before tuple is returned. If
<literal>recheck</> flag isn't set then it's true by default for
compatibility reasons. The <literal>recheck</> flag can be usedonly
Recheck flag is set to false on gistget.c so I think it should say
"false by default". On the other hand, it is true by default on
the consistent function. It is written as "the safest assumption"
on the code comments. I don't know why the safest is chosen over
the backwards compatible for the consistent function.
Agree. It should be clarified in docs.
src/backend/access/gist/gistget.c:505:
/* Recheck distance from heap tuple if needed */
if (GISTSearchItemIsHeap(*item) &&
searchTreeItemNeedDistanceRecheck(scan,so->curTreeItem))
{
searchTreeItemDistanceRecheck(scan,so->curTreeItem, item);
continue;
}Why so->curTreeItem is passed to these functions? They can use
scan->opaque->curTreeItem.
I didn't get the difference. Few lines before:
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
src/backend/access/gist/gistscan.c:49:
/*
* When all distance values are the same, items withoutrecheck
* can be immediately returned. So they are placed first.
*/
if (recheckCmp == 0 && distance_a.recheck !=distance_b.recheck)
recheckCmp = distance_a.recheck ? 1 : -1;
I don't understand why items without recheck can be immediately
returned. Do you think it will work correctly when there is
an operator class which will return recheck true and false for
the items under the same page?
Yes, I believe so. Item with recheck can't decrease it's distance, it can
only increase it. In the corner case item can have same distance after
recheck as it was before. Then anyway items which distances are the same
can be returned in any order.
src/backend/access/index/indexam.c:258:
/* Prepare data structures for getting original indexed values
from heap */
scan->indexInfo = BuildIndexInfo(scan->indexRelation);
scan->estate = CreateExecutorState();
scan->slot =MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
With the changes in indexam.c, heap access become legal for all index
access methods. I think it is better than the previous version but
I am leaving the judgement to someone experienced. I will try to
summarize the pros and cons of sorting the rows in the GiST access
method, as far as I understand.Pros:
* It does not require another queue. It should be effective to sort
the rows inside the queue the GiST access method already has.
* It does not complicate index access method infrastructure.Cons:
* It could be done without additional heap access.
* Other access methods could make use of the sorting infrastructure
one day.
* It could be more transparent to the users. Sorting information
could be shown on the explain output.
It would be also nice to show some information about KNN itself.
* A more suitable data structure like binary heap could be used
for the queue to sort the rows.
Binary heap seems to be better data structure for whole KNN-GiST. But it's
a subject for a separate patch: replace RB-tree to heap in KNN-GiST. It's
not related to recheck stuff.
------
With best regards,
Alexander Korotkov.
On Sun, Sep 14, 2014 at 11:34:26PM +0400, Alexander Korotkov wrote:
Cost estimation of GiST is a big problem anyway. It doesn't care (and
can't) about amount of recheck for regular operators. In this patch, same
would be for knn recheck. The problem is that touching heap from access
This is very important work. While our existing KNN-GiST index code
works fine for scalar values and point-to-point distance ordering, it
doesn't work well for 2-dimensional objects because they are only
indexed by their bounding boxes (a rectangle around the object). The
indexed bounding box can't produce accurate distances to other objects.
As an example, see this PostGIS blog post showing how to use LIMIT in a
CTE to filter results and then compute the closest object (search for
"LIMIT 50"):
http://shisaa.jp/postset/postgis-postgresqls-spatial-partner-part-3.html
This patch fixes our code for distances from a point to indexed 2-D
objects.
Does this also fix the identical PostGIS problem or is there something
PostGIS needs to do?
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Sep 26, 2014 at 5:18 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Sun, Sep 14, 2014 at 11:34:26PM +0400, Alexander Korotkov wrote:
Cost estimation of GiST is a big problem anyway. It doesn't care
(and
can't) about amount of recheck for regular operators. In this
patch, same
would be for knn recheck. The problem is that touching heap from
access
This is very important work. While our existing KNN-GiST index code
works fine for scalar values and point-to-point distance ordering, it
doesn't work well for 2-dimensional objects because they are only
indexed by their bounding boxes (a rectangle around the object). The
indexed bounding box can't produce accurate distances to other objects.As an example, see this PostGIS blog post showing how to use LIMIT in a
CTE to filter results and then compute the closest object (search for
"LIMIT 50"):http://shisaa.jp/postset/postgis-postgresqls-spatial-partner-part-3.html
This patch fixes our code for distances from a point to indexed 2-D
objects.Does this also fix the identical PostGIS problem or is there something
PostGIS needs to do?
This patch provides general infrastructure for recheck in KNN-GiST. PostGIS
need corresponding change in its GiST opclass. Since PostGIS already define
<-> and <#> operators as distance to bounding box border and bounding box
center, it can't change their behaviour.
it has to support new operator "exact distance" in opclass.
------
With best regards,
Alexander Korotkov.
On Fri, Sep 26, 2014 at 10:49:42AM +0400, Alexander Korotkov wrote:
Does this also fix the identical PostGIS problem or is there something
PostGIS needs to do?This patch provides general infrastructure for recheck in KNN-GiST. PostGIS
need corresponding change in its GiST opclass. Since PostGIS already define <->
and <#> operators as distance to bounding box border and bounding box center,
it can't change their behaviour.
it has to support new operator "exact distance" in opclass.�
Ah, OK, so they just need something that can be used for the recheck. I
think they currently use ST_Distance() for that. Does it have to be an
operator? If they defined an operator for ST_Distance(), would
ST_Distance() work too for KNN-GiST?
In summary, you still create a normal GiST index on the column:
http://shisaa.jp/postset/postgis-postgresqls-spatial-partner-part-3.html
CREATE INDEX planet_osm_line_ref_index ON planet_osm_line(ref);
which indexes by the bounding box. The new code will allow ordered
index hits to be filtered by something like ST_Distance(), rather than
having to a LIMIT 50 in a CTE, then call ST_Distance(), like this:
EXPLAIN ANALYZE WITH distance AS (
SELECT way AS road, ref AS route
FROM planet_osm_line
WHERE highway = 'secondary'
ORDER BY ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913) <#> way
LIMIT 50
)
SELECT ST_Distance(ST_GeomFromText('POLYGON((14239931.42 3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08 3053984.84,14239931.42 3054117.72))', 900913), road) AS true_distance, route
FROM distance
ORDER BY true_distance
LIMIT 1;
Notice the CTE uses <#> (bounding box center), and then the outer query
uses ST_Distance and LIMIT 1 to find the closest item.
Excellent!
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Sep 29, 2014 at 6:16 AM, Bruce Momjian <bruce@momjian.us> wrote:
On Fri, Sep 26, 2014 at 10:49:42AM +0400, Alexander Korotkov wrote:
Does this also fix the identical PostGIS problem or is there
something
PostGIS needs to do?
This patch provides general infrastructure for recheck in KNN-GiST.
PostGIS
need corresponding change in its GiST opclass. Since PostGIS already
define <->
and <#> operators as distance to bounding box border and bounding box
center,
it can't change their behaviour.
it has to support new operator "exact distance" in opclass.Ah, OK, so they just need something that can be used for the recheck. I
think they currently use ST_Distance() for that. Does it have to be an
operator? If they defined an operator for ST_Distance(), would
ST_Distance() work too for KNN-GiST?
Currently, ST_Distance is a function, but it's no problem to make it an
operator with KNN-GiST support.
In summary, you still create a normal GiST index on the column:
http://shisaa.jp/postset/postgis-postgresqls-spatial-partner-part-3.html
CREATE INDEX planet_osm_line_ref_index ON planet_osm_line(ref);
which indexes by the bounding box. The new code will allow ordered
index hits to be filtered by something like ST_Distance(), rather than
having to a LIMIT 50 in a CTE, then call ST_Distance(), like this:EXPLAIN ANALYZE WITH distance AS (
SELECT way AS road, ref AS route
FROM planet_osm_line
WHERE highway = 'secondary'
ORDER BY ST_GeomFromText('POLYGON((14239931.42
3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08
3053984.84,14239931.42 3054117.72))', 900913) <#> way
LIMIT 50
)
SELECT ST_Distance(ST_GeomFromText('POLYGON((14239931.42
3054117.72,14239990.49 3054224.25,14240230.15 3054091.38,14240171.08
3053984.84,14239931.42 3054117.72))', 900913), road) AS true_distance, route
FROM distance
ORDER BY true_distance
LIMIT 1;
Yeah. It this query 50 is pure empirical value. It could be both too low or
too high. Too low value can cause wrong query answers. Too high value can
cause lower performance. With patch simple KNN query will work like this
query with always right value in LIMIT clause.
Notice the CTE uses <#> (bounding box center), and then the outer query
uses ST_Distance and LIMIT 1 to find the closest item.Excellent!
Thanks. The main question now is design of this patch. Currently, it does
all the work inside access method. We already have some discussion of pro
and cons of this method. I would like to clarify alternatives now. I can
see following way:
1. Implement new executor node which performs sorting by priority queue.
Let's call it "Priority queue". I think it should be separate node from
"Sort" node. Despite "Priority queue" and "Sort" are essentially similar
from user view, they would be completely different in implementation.
2. Implement some interface to transfer distance values from access
method to "Priority queue" node.
3. Somehow tell the planner that it could use "Priority queue" in
corresponding cases. I see two ways of doing this:
- Add flag to operator in opclass indicating that index can only
order by lower bound of "col op value", not by "col op value" itself.
- Define new relation between operators. Value of one operator could
be lower bound for value of another operator. So, planner can
put "Priority
queue" node when lower bound ordering is possible from index. Also "ALTER
OPERATOR" command would be reasonable, so extensions could upgrade.
Besides overhead, this way makes significant infrastructural changes. So,
it may be over-engineering. However, it's probably more clean and beautiful
solution.
I would like to get some feedback from people familiar with KNN-GiST like
Heikki or Tom. What do you think about this? Any other ideas?
------
With best regards,
Alexander Korotkov.
Thanks. The main question now is design of this patch. Currently, it does
all the work inside access method. We already have some discussion of pro
and cons of this method. I would like to clarify alternatives now. I can
see following way:1. Implement new executor node which performs sorting by priority queue.
Let's call it "Priority queue". I think it should be separate node from
"Sort" node. Despite "Priority queue" and "Sort" are essentially similar
from user view, they would be completely different in implementation.
2. Implement some interface to transfer distance values from access
method to "Priority queue" node.
If we assume that all of them need recheck, maybe it can be done
without passing distance values.
3. Somehow tell the planner that it could use "Priority queue" in
corresponding cases. I see two ways of doing this:
- Add flag to operator in opclass indicating that index can only
order by lower bound of "col op value", not by "col op value" itself.
- Define new relation between operators. Value of one operator could
be lower bound for value of another operator. So, planner can
put "Priority
queue" node when lower bound ordering is possible from index. Also "ALTER
OPERATOR" command would be reasonable, so extensions could upgrade.
I think, it would be better to make it a property of the operator
class. We can add a column to pg_amop or define another value for
amoppurpose on pg_amop. Syntax can be something like this:
CREATE OPERATOR CLASS circle_ops DEFAULT
FOR TYPE circle USING gist AS
OPERATOR 15 <->(circle, point) FOR ORDER BY pg_catalog.float_ops LOWER BOUND;
While looking at it, I realize that current version of the patch does
not use the sort operator family defined with the operator class. It
assumes that the distance function will return values compatible with
the operator. Operator class definition makes me think that there is
not such an assumption.
Besides overhead, this way makes significant infrastructural changes. So,
it may be over-engineering. However, it's probably more clean and beautiful
solution.
I would like to get some feedback from people familiar with KNN-GiST like
Heikki or Tom. What do you think about this? Any other ideas?
I would be happy to test and review the changes. I think it is nice
to solve the problem in a generalized way improving the access method
infrastructure. Definitely, we should have a consensus on the design
before working on the infrastructure changes.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Jan 13, 2014 at 9:17 AM, Alexander Korotkov
<aekorotkov@gmail.com> wrote:
This patch was split from thread:
/messages/by-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.comI've split it to separate thead, because it's related to partial sort only
conceptually not technically. Also I renamed it to "knn-gist-recheck" from
"partial-knn" as more appropriate name. In the attached version docs are
updated. Possible weak point of this patch design is that it fetches heap
tuple from GiST scan. However, I didn't receive any notes about its design,
so, I'm going to put it to commitfest.
The partial sort thing is not in the current 2014-10 commitfest
(although this patch is). Is that intentional?
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Sat, Nov 22, 2014 at 2:20 AM, Peter Geoghegan <pg@heroku.com> wrote:
On Mon, Jan 13, 2014 at 9:17 AM, Alexander Korotkov
<aekorotkov@gmail.com> wrote:This patch was split from thread:
/messages/by-id/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com
I've split it to separate thead, because it's related to partial sort
only
conceptually not technically. Also I renamed it to "knn-gist-recheck"
from
"partial-knn" as more appropriate name. In the attached version docs are
updated. Possible weak point of this patch design is that it fetches heap
tuple from GiST scan. However, I didn't receive any notes about itsdesign,
so, I'm going to put it to commitfest.
The partial sort thing is not in the current 2014-10 commitfest
(although this patch is). Is that intentional?
It's not. I just didn't revise partial sort yet :(
------
With best regards,
Alexander Korotkov.
On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
3. A binary heap would be a better data structure to buffer the rechecked
values. A Red-Black tree allows random insertions and deletions, but in
this case you need to insert arbitrary values but only remove the minimum
item. That's exactly what a binary heap excels at. We have a nice binary
heap implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.Hmm. For me binary heap would be a better data structure for KNN-GiST at
all :-)
I decided to give this a shot, replacing the red-black tree in GiST with
the binary heap we have in lib/binaryheap.c. It made the GiST code
somewhat simpler, as the binaryheap interface is simpler than the
red-black tree one. Unfortunately, performance was somewhat worse. That
was quite surprising, as insertions and deletions are both O(log N) in
both data structures, but the red-black tree implementation is more
complicated.
I implemented another data structure called a Pairing Heap. It's also a
fairly simple data structure, but insertions are O(1) instead of O(log
N). It also performs fairly well in practice.
With that, I got a small but measurable improvement. To test, I created
a table like this:
create table gisttest (id integer, p point);
insert into gisttest select id, point(random(), random()) from
generate_series(1, 1000000) id;
create index i_gisttest on gisttest using gist (p);
And I ran this query with pgbench:
select id from gisttest order by p <-> '(0,0)' limit 1000;
With unpatched master, I got about 650 TPS, and with the patch 720 TPS.
That's a nice little improvement, but perhaps more importantly, the
pairing heap implementation consumes less memory. To measure that, I put
a MemoryContextStats(so->queueCtx) call into gistendscan. With the above
query, but without the "limit" clause, on master I got:
GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998
chunks); 21296 used
And with the patch:
GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502
chunks); 21072 used
That's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice
to reduce that memory consumption, as there is no hard upper bound on
how much might be needed. If the GiST tree is really disorganized for
some reason, a query might need a lot more.
So all in all, I quite like this patch, even though it doesn't do
anything too phenomenal. It adds a some code, in the form of the new
pairing heap implementation, but it makes the GiST code a little bit
simpler. And it gives a small performance gain, and reduces memory usage
a bit.
- Heikki
Attachments:
knn-gist-pairingheap-1.patchtext/x-diff; name=knn-gist-pairingheap-1.patchDownload
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..52b2c53 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -18,6 +18,7 @@
#include "access/relscan.h"
#include "miscadmin.h"
#include "pgstat.h"
+#include "lib/pairingheap.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
@@ -243,8 +244,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
GISTPageOpaque opaque;
OffsetNumber maxoff;
OffsetNumber i;
- GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
- bool isNew;
MemoryContext oldcxt;
Assert(!GISTSearchItemIsHeap(*pageItem));
@@ -275,18 +274,15 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for the right sibling index page */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
item->blkno = opaque->rightlink;
item->data.parentlsn = pageItem->data.parentlsn;
/* Insert it into the queue using same distances as for this page */
- tmpItem->head = item;
- tmpItem->lastHeap = NULL;
- memcpy(tmpItem->distances, myDistances,
+ memcpy(item->distances, myDistances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->fbNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -348,8 +344,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for this item */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
if (GistPageIsLeaf(page))
{
@@ -372,12 +367,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
}
/* Insert it into the queue using new distance data */
- tmpItem->head = item;
- tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
- memcpy(tmpItem->distances, so->distances,
+ memcpy(item->distances, so->distances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->fbNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -390,44 +383,24 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
- *
- * NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
- * contained the result item. Callers can use so->curTreeItem->distances as
- * the distances value for the item.
*/
static GISTSearchItem *
getNextGISTSearchItem(GISTScanOpaque so)
{
- for (;;)
- {
- GISTSearchItem *item;
-
- /* Update curTreeItem if we don't have one */
- if (so->curTreeItem == NULL)
- {
- so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
- /* Done when tree is empty */
- if (so->curTreeItem == NULL)
- break;
- }
+ GISTSearchItem *item;
- item = so->curTreeItem->head;
- if (item != NULL)
- {
- /* Delink item from chain */
- so->curTreeItem->head = item->next;
- if (item == so->curTreeItem->lastHeap)
- so->curTreeItem->lastHeap = NULL;
- /* Return item; caller is responsible to pfree it */
- return item;
- }
-
- /* curTreeItem is exhausted, so remove it from rbtree */
- rb_delete(so->queue, (RBNode *) so->curTreeItem);
- so->curTreeItem = NULL;
+ if (!pairingheap_empty(so->queue))
+ {
+ item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
+ }
+ else
+ {
+ /* Done when both heaps are empty */
+ item = NULL;
}
- return NULL;
+ /* Return item; caller is responsible to pfree it */
+ return item;
}
/*
@@ -458,7 +431,7 @@ getNextNearest(IndexScanDesc scan)
/* visit an index page, extract its items into queue */
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
}
pfree(item);
@@ -491,7 +464,6 @@ gistgettuple(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
so->firstCall = false;
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -534,7 +506,7 @@ gistgettuple(PG_FUNCTION_ARGS)
* this page, we fall out of the inner "do" and loop around to
* return them.
*/
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
pfree(item);
} while (so->nPageData == 0);
@@ -560,7 +532,6 @@ gistgetbitmap(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
/* Begin the scan by processing the root page */
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -580,7 +551,7 @@ gistgetbitmap(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, tbm, &ntids);
+ gistScanPage(scan, item, item->distances, tbm, &ntids);
pfree(item);
}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 8360b16..52c39c5 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -22,14 +22,13 @@
/*
- * RBTree support functions for the GISTSearchTreeItem queue
+ * Pairing heap comparison function for the GISTSearchItem queue
*/
-
static int
-GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
+pairingheap_GISTSearchItem_cmp(const pairingheap_item *a, const pairingheap_item *b, void *arg)
{
- const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
- const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
+ const GISTSearchItem *sa = (const GISTSearchItem *) a;
+ const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
int i;
@@ -40,56 +39,13 @@ GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
}
- return 0;
-}
-
-static void
-GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
-{
- GISTSearchTreeItem *scurrent = (GISTSearchTreeItem *) existing;
- const GISTSearchTreeItem *snew = (const GISTSearchTreeItem *) newrb;
- GISTSearchItem *newitem = snew->head;
-
- /* snew should have just one item in its chain */
- Assert(newitem && newitem->next == NULL);
-
- /*
- * If new item is heap tuple, it goes to front of chain; otherwise insert
- * it before the first index-page item, so that index pages are visited in
- * LIFO order, ensuring depth-first search of index pages. See comments
- * in gist_private.h.
- */
- if (GISTSearchItemIsHeap(*newitem))
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- if (scurrent->lastHeap == NULL)
- scurrent->lastHeap = newitem;
- }
- else if (scurrent->lastHeap == NULL)
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- }
- else
- {
- newitem->next = scurrent->lastHeap->next;
- scurrent->lastHeap->next = newitem;
- }
-}
-
-static RBNode *
-GISTSearchTreeItemAllocator(void *arg)
-{
- IndexScanDesc scan = (IndexScanDesc) arg;
-
- return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
-}
+ /* Heap items go before inner pages, to ensure a depth-first search */
+ if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
+ return -1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return 1;
-static void
-GISTSearchTreeItemDeleter(RBNode *rb, void *arg)
-{
- pfree(rb);
+ return 0;
}
@@ -127,7 +83,6 @@ gistbeginscan(PG_FUNCTION_ARGS)
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
- so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
@@ -188,15 +143,9 @@ gistrescan(PG_FUNCTION_ARGS)
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
- so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
- GISTSearchTreeItemComparator,
- GISTSearchTreeItemCombiner,
- GISTSearchTreeItemAllocator,
- GISTSearchTreeItemDeleter,
- scan);
+ so->queue = pairingheap_allocate(pairingheap_GISTSearchItem_cmp, scan);
MemoryContextSwitchTo(oldCxt);
- so->curTreeItem = NULL;
so->firstCall = true;
/* Update scan key, if a new one is given */
@@ -327,6 +276,7 @@ gistendscan(PG_FUNCTION_ARGS)
* freeGISTstate is enough to clean up everything made by gistbeginscan,
* as well as the queueCxt if there is a separate context for it.
*/
+ MemoryContextStats(so->queueCxt);
freeGISTstate(so->giststate);
PG_RETURN_VOID();
diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 327a1bc..b24ece6 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/lib
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = ilist.o binaryheap.o stringinfo.o
+OBJS = ilist.o binaryheap.o pairingheap.o stringinfo.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/pairingheap.c b/src/backend/lib/pairingheap.c
new file mode 100644
index 0000000..32eeba6
--- /dev/null
+++ b/src/backend/lib/pairingheap.c
@@ -0,0 +1,179 @@
+/*-------------------------------------------------------------------------
+ *
+ * pairingheap.c
+ * A Pairing Heap implementaion
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/lib/pairingheap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include <math.h>
+
+#include "lib/pairingheap.h"
+
+/*
+ * pairingheap_allocate
+ *
+ * Returns a pointer to a newly-allocated heap, with the heap property
+ * defined by the given comparator function, which will be invoked with the
+ * additional argument specified by 'arg'.
+ */
+pairingheap *
+pairingheap_allocate(pairingheap_comparator compare, void *arg)
+{
+ pairingheap *heap;
+
+ heap = (pairingheap *) palloc(sizeof(pairingheap));
+ heap->ph_compare = compare;
+ heap->ph_arg = arg;
+
+ heap->ph_root = NULL;
+
+ return heap;
+}
+
+/*
+ * pairingheap_free
+ *
+ * Releases memory used by the given pairingheap.
+ *
+ * Note: The items in the heap are not released!
+ */
+void
+pairingheap_free(pairingheap *heap)
+{
+ pfree(heap);
+}
+
+
+/* A helper function to merge two subheaps into one. */
+static pairingheap_item *
+merge(pairingheap *heap, pairingheap_item *a, pairingheap_item *b)
+{
+ if (a == NULL)
+ return b;
+ if (b == NULL)
+ return a;
+
+ /* Put the larger of the items as a child of the smaller one */
+ if (heap->ph_compare(a, b, heap->ph_arg) < 0)
+ {
+ b->next_sibling = a->first_child;
+ a->first_child = b;
+ return a;
+ }
+ else
+ {
+ a->next_sibling = b->first_child;
+ b->first_child = a;
+ return b;
+ }
+}
+
+/*
+ * pairingheap_add
+ *
+ * Adds the given datum to the heap in O(1) time.
+ */
+void
+pairingheap_add(pairingheap *heap, pairingheap_item *d)
+{
+ d->first_child = NULL;
+
+ /* Link the new item as a new tree */
+ heap->ph_root = merge(heap, heap->ph_root, d);
+}
+
+/*
+ * pairingheap_first
+ *
+ * Returns a pointer to the first (root, topmost) node in the heap
+ * without modifying the heap. The caller must ensure that this
+ * routine is not used on an empty heap. Always O(1).
+ */
+pairingheap_item *
+pairingheap_first(pairingheap *heap)
+{
+ Assert(!pairingheap_empty(heap));
+ return heap->ph_root;
+}
+
+/*
+ * pairingheap_remove_first
+ *
+ * Removes the first (root, topmost) node in the heap and returns a
+ * pointer to it after rebalancing the heap. The caller must ensure
+ * that this routine is not used on an empty heap. O(log n) amortized.
+ */
+pairingheap_item *
+pairingheap_remove_first(pairingheap *heap)
+{
+ pairingheap_item *result;
+ pairingheap_item *children;
+ pairingheap_item *item, *next;
+ pairingheap_item *l = NULL;
+ pairingheap_item *newroot;
+
+ Assert(!pairingheap_empty(heap));
+
+ /* Remove the smallest root. */
+ result = heap->ph_root;
+ children = result->first_child;
+
+ /*
+ * In the trivial case that the heap became empty, or the root had only
+ * a single child, we're done.
+ */
+ if (children == NULL || children->next_sibling == NULL)
+ {
+ heap->ph_root = children;
+ return result;
+ }
+
+ /* Walk the remaining subheaps from left to right, merging in pairs */
+ next = children;
+ for (;;)
+ {
+ item = next;
+ if (item == NULL)
+ break;
+ if (item->next_sibling == NULL)
+ {
+ /* last odd item at the end of list */
+ item->next_sibling = l;
+ l = item;
+ break;
+ }
+ else
+ {
+ next = item->next_sibling->next_sibling;
+
+ item = merge(heap, item, item->next_sibling);
+ item->next_sibling = l;
+ l = item;
+ }
+ }
+
+ /*
+ * Ok, 'l' now contains the pairs in reverse order. Now merge them into
+ * a single heap.
+ */
+ newroot = l;
+ next = l->next_sibling;
+ while (next)
+ {
+ item = next;
+ next = item->next_sibling;
+
+ newroot = merge(heap, newroot, item);
+ }
+ heap->ph_root = newroot;
+
+ return result;
+}
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 2cbc918..7cf87bf 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -18,9 +18,9 @@
#include "access/itup.h"
#include "access/xlogreader.h"
#include "fmgr.h"
+#include "lib/pairingheap.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
-#include "utils/rbtree.h"
#include "utils/hsearch.h"
/*
@@ -123,7 +123,7 @@ typedef struct GISTSearchHeapItem
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
- struct GISTSearchItem *next; /* list link */
+ pairingheap_item fbNode;
BlockNumber blkno; /* index page number, or InvalidBlockNumber */
union
{
@@ -131,24 +131,12 @@ typedef struct GISTSearchItem
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
+ double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
-/*
- * Within a GISTSearchTreeItem's chain, heap items always appear before
- * index-page items, since we want to visit heap items first. lastHeap points
- * to the last heap item in the chain, or is NULL if there are none.
- */
-typedef struct GISTSearchTreeItem
-{
- RBNode rbnode; /* this is an RBTree item */
- GISTSearchItem *head; /* first chain member */
- GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
- double distances[1]; /* array with numberOfOrderBys entries */
-} GISTSearchTreeItem;
-
-#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
+#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
@@ -156,15 +144,12 @@ typedef struct GISTSearchTreeItem
typedef struct GISTScanOpaqueData
{
GISTSTATE *giststate; /* index information, see above */
- RBTree *queue; /* queue of unvisited items */
+ pairingheap *queue; /* queue of unvisited items */
MemoryContext queueCxt; /* context holding the queue */
bool qual_ok; /* false if qual can never be satisfied */
bool firstCall; /* true until first gistgettuple call */
- GISTSearchTreeItem *curTreeItem; /* current queue item, if any */
-
/* pre-allocated workspace arrays */
- GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
diff --git a/src/include/lib/pairingheap.h b/src/include/lib/pairingheap.h
new file mode 100644
index 0000000..3038401
--- /dev/null
+++ b/src/include/lib/pairingheap.h
@@ -0,0 +1,49 @@
+/*
+ * pairingheap.h
+ *
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * src/include/lib/pairingheap.h
+ */
+
+#ifndef PAIRINGHEAP_H
+#define PAIRINGHEAP_H
+
+/*
+ * This represents an element stored in the heap. You can embed this in
+ * a larger struct containing the actual data you're storing.
+ */
+typedef struct pairingheap_item
+{
+ struct pairingheap_item *first_child;
+ struct pairingheap_item *next_sibling;
+} pairingheap_item;
+
+/*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+typedef int (*pairingheap_comparator) (const pairingheap_item *a, const pairingheap_item *b, void *arg);
+
+/*
+ * A pairing heap.
+ */
+typedef struct pairingheap
+{
+ pairingheap_comparator ph_compare; /* comparison function */
+ void *ph_arg; /* opaque argument to ph_compare */
+ pairingheap_item *ph_root; /* current root of the heap */
+} pairingheap;
+
+extern pairingheap *pairingheap_allocate(pairingheap_comparator compare,
+ void *arg);
+extern void pairingheap_free(pairingheap *heap);
+extern void pairingheap_add(pairingheap *heap, pairingheap_item *d);
+extern pairingheap_item *pairingheap_first(pairingheap *heap);
+extern pairingheap_item *pairingheap_remove_first(pairingheap *heap);
+
+#define pairingheap_empty(h) ((h)->ph_root == NULL)
+
+#endif /* PAIRINGHEAP_H */
On Wed, Dec 10, 2014 at 1:50 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
3. A binary heap would be a better data structure to buffer the rechecked
values. A Red-Black tree allows random insertions and deletions, but in
this case you need to insert arbitrary values but only remove theminimum
item. That's exactly what a binary heap excels at. We have a nice binary
heap implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.Hmm. For me binary heap would be a better data structure for KNN-GiST at
all :-)I decided to give this a shot, replacing the red-black tree in GiST with
the binary heap we have in lib/binaryheap.c. It made the GiST code somewhat
simpler, as the binaryheap interface is simpler than the red-black tree
one. Unfortunately, performance was somewhat worse. That was quite
surprising, as insertions and deletions are both O(log N) in both data
structures, but the red-black tree implementation is more complicated.I implemented another data structure called a Pairing Heap. It's also a
fairly simple data structure, but insertions are O(1) instead of O(log N).
It also performs fairly well in practice.With that, I got a small but measurable improvement. To test, I created a
table like this:create table gisttest (id integer, p point);
insert into gisttest select id, point(random(), random()) from
generate_series(1, 1000000) id;
create index i_gisttest on gisttest using gist (p);And I ran this query with pgbench:
select id from gisttest order by p <-> '(0,0)' limit 1000;
With unpatched master, I got about 650 TPS, and with the patch 720 TPS.
That's a nice little improvement, but perhaps more importantly, the pairing
heap implementation consumes less memory. To measure that, I put a
MemoryContextStats(so->queueCtx) call into gistendscan. With the above
query, but without the "limit" clause, on master I got:GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998
chunks); 21296 usedAnd with the patch:
GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502 chunks);
21072 usedThat's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice
to reduce that memory consumption, as there is no hard upper bound on how
much might be needed. If the GiST tree is really disorganized for some
reason, a query might need a lot more.So all in all, I quite like this patch, even though it doesn't do anything
too phenomenal. It adds a some code, in the form of the new pairing heap
implementation, but it makes the GiST code a little bit simpler. And it
gives a small performance gain, and reduces memory usage a bit.- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
It may be better to replace the lib/binaryheap altogether as it offers
comparable/better performance.
On 12/10/2014 10:59 PM, Arthur Silva wrote:
It may be better to replace the lib/binaryheap altogether as it offers
comparable/better performance.
It's not always better. A binary heap is more memory-efficient, for
starters.
There are only two uses of lib/binaryheap: reorderbuffer.c and merge
append. Reorderbuffer isn't that performance critical, although a binary
heap may well be better there, because the comparison function is very
cheap. For merge append, it might be a win, especially if the comparison
function is expensive. (That's on the assumption that the overall number
of comparisons needed with a pairing heap is smaller - I'm not sure how
true that is). That would be worth testing.
I'd love to test some other heap implementation in in tuplesort.c. It
has a custom binary heap implementation that's used in the final merge
phase of an external sort, and also when doing a so-called bounded sort,
i.e. "ORDER BY x LIMIT Y". But that would be difficult to replace,
because tuplesort.c collects tuples in an array in memory, and then
turns that into a heap. An array is efficient to turn into a binary
heap, but to switch to another data structure, you'd suddenly need extra
memory. And we do the switch when we run out of work_mem, so allocating
more isn't really an option.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Dec 11, 2014 at 12:50 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
3. A binary heap would be a better data structure to buffer the
rechecked
values. A Red-Black tree allows random insertions and deletions, but in
this case you need to insert arbitrary values but only remove the
minimum
item. That's exactly what a binary heap excels at. We have a nice binary
heap implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.Hmm. For me binary heap would be a better data structure for KNN-GiST at
all :-)I decided to give this a shot, replacing the red-black tree in GiST with the
binary heap we have in lib/binaryheap.c. It made the GiST code somewhat
simpler, as the binaryheap interface is simpler than the red-black tree one.
Unfortunately, performance was somewhat worse. That was quite surprising, as
insertions and deletions are both O(log N) in both data structures, but the
red-black tree implementation is more complicated.I implemented another data structure called a Pairing Heap. It's also a
fairly simple data structure, but insertions are O(1) instead of O(log N).
It also performs fairly well in practice.With that, I got a small but measurable improvement. To test, I created a
table like this:create table gisttest (id integer, p point);
insert into gisttest select id, point(random(), random()) from
generate_series(1, 1000000) id;
create index i_gisttest on gisttest using gist (p);And I ran this query with pgbench:
select id from gisttest order by p <-> '(0,0)' limit 1000;
With unpatched master, I got about 650 TPS, and with the patch 720 TPS.
That's a nice little improvement, but perhaps more importantly, the pairing
heap implementation consumes less memory. To measure that, I put a
MemoryContextStats(so->queueCtx) call into gistendscan. With the above
query, but without the "limit" clause, on master I got:GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998 chunks);
21296 usedAnd with the patch:
GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502 chunks);
21072 usedThat's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice to
reduce that memory consumption, as there is no hard upper bound on how much
might be needed. If the GiST tree is really disorganized for some reason, a
query might need a lot more.So all in all, I quite like this patch, even though it doesn't do anything
too phenomenal. It adds a some code, in the form of the new pairing heap
implementation, but it makes the GiST code a little bit simpler. And it
gives a small performance gain, and reduces memory usage a bit.
Hum. It looks that this patch using binary heap is intended to be a
replacement red-black tree method. Any reason why it isn't added to
the CF to track it?
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Jan 28, 2014 at 10:54 PM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.
This patch is tracked with this entry in the commit fest app and is
marked as "Ready for committer". Hence I am moving this specific part
to 2014-12 to keep track of it:
3. A binary heap would be a better data structure to buffer the rechecked
values. A Red-Black tree allows random insertions and deletions, but in this
case you need to insert arbitrary values but only remove the minimum item.
That's exactly what a binary heap excels at. We have a nice binary heap
implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.
Based on those comments, I am marking this entry as returned with feedback:
https://commitfest.postgresql.org/action/patch_view?id=1367
Heikki has sent as well a new patch to use a binary heap method
instead of the red-black tree here:
/messages/by-id/54886BB8.9040000@vmware.com
IMO this last patch should be added in the CF app, that's not the case now.
Regards,
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/15/2014 03:49 AM, Michael Paquier wrote:
On Thu, Dec 11, 2014 at 12:50 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:On 01/28/2014 04:12 PM, Alexander Korotkov wrote:
3. A binary heap would be a better data structure to buffer the
rechecked
values. A Red-Black tree allows random insertions and deletions, but in
this case you need to insert arbitrary values but only remove the
minimum
item. That's exactly what a binary heap excels at. We have a nice binary
heap implementation in the backend that you can use, see
src/backend/lib/binaryheap.c.Hmm. For me binary heap would be a better data structure for KNN-GiST at
all :-)I decided to give this a shot, replacing the red-black tree in GiST with the
binary heap we have in lib/binaryheap.c. It made the GiST code somewhat
simpler, as the binaryheap interface is simpler than the red-black tree one.
Unfortunately, performance was somewhat worse. That was quite surprising, as
insertions and deletions are both O(log N) in both data structures, but the
red-black tree implementation is more complicated.I implemented another data structure called a Pairing Heap. It's also a
fairly simple data structure, but insertions are O(1) instead of O(log N).
It also performs fairly well in practice.With that, I got a small but measurable improvement. To test, I created a
table like this:create table gisttest (id integer, p point);
insert into gisttest select id, point(random(), random()) from
generate_series(1, 1000000) id;
create index i_gisttest on gisttest using gist (p);And I ran this query with pgbench:
select id from gisttest order by p <-> '(0,0)' limit 1000;
With unpatched master, I got about 650 TPS, and with the patch 720 TPS.
That's a nice little improvement, but perhaps more importantly, the pairing
heap implementation consumes less memory. To measure that, I put a
MemoryContextStats(so->queueCtx) call into gistendscan. With the above
query, but without the "limit" clause, on master I got:GiST scan context: 2109752 total in 10 blocks; 2088456 free (24998 chunks);
21296 usedAnd with the patch:
GiST scan context: 1061160 total in 9 blocks; 1040088 free (12502 chunks);
21072 usedThat's 2MB vs 1MB. While that's not much in absolute terms, it'd be nice to
reduce that memory consumption, as there is no hard upper bound on how much
might be needed. If the GiST tree is really disorganized for some reason, a
query might need a lot more.So all in all, I quite like this patch, even though it doesn't do anything
too phenomenal. It adds a some code, in the form of the new pairing heap
implementation, but it makes the GiST code a little bit simpler. And it
gives a small performance gain, and reduces memory usage a bit.Hum. It looks that this patch using binary heap is intended to be a
replacement red-black tree method.
Right.
Here's a new version of the patch. It now uses the same pairing heap
code that I posted in the other thread ("advance local xmin more
aggressivley",
/messages/by-id/5488ACF0.8050901@vmware.com). The
pairingheap_remove() function is unused in this patch, but it is needed
by that other patch.
Any reason why it isn't added to the CF to track it?
No. Will add.
- Heikki
Attachments:
knn-gist-pairingheap-2.patchtext/x-diff; name=knn-gist-pairingheap-2.patchDownload
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..e5eb6f6 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -18,6 +18,7 @@
#include "access/relscan.h"
#include "miscadmin.h"
#include "pgstat.h"
+#include "lib/pairingheap.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
@@ -243,8 +244,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
GISTPageOpaque opaque;
OffsetNumber maxoff;
OffsetNumber i;
- GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
- bool isNew;
MemoryContext oldcxt;
Assert(!GISTSearchItemIsHeap(*pageItem));
@@ -275,18 +274,15 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for the right sibling index page */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
item->blkno = opaque->rightlink;
item->data.parentlsn = pageItem->data.parentlsn;
/* Insert it into the queue using same distances as for this page */
- tmpItem->head = item;
- tmpItem->lastHeap = NULL;
- memcpy(tmpItem->distances, myDistances,
+ memcpy(item->distances, myDistances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -348,8 +344,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for this item */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
if (GistPageIsLeaf(page))
{
@@ -372,12 +367,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
}
/* Insert it into the queue using new distance data */
- tmpItem->head = item;
- tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
- memcpy(tmpItem->distances, so->distances,
+ memcpy(item->distances, so->distances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -390,44 +383,24 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
- *
- * NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
- * contained the result item. Callers can use so->curTreeItem->distances as
- * the distances value for the item.
*/
static GISTSearchItem *
getNextGISTSearchItem(GISTScanOpaque so)
{
- for (;;)
- {
- GISTSearchItem *item;
-
- /* Update curTreeItem if we don't have one */
- if (so->curTreeItem == NULL)
- {
- so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
- /* Done when tree is empty */
- if (so->curTreeItem == NULL)
- break;
- }
+ GISTSearchItem *item;
- item = so->curTreeItem->head;
- if (item != NULL)
- {
- /* Delink item from chain */
- so->curTreeItem->head = item->next;
- if (item == so->curTreeItem->lastHeap)
- so->curTreeItem->lastHeap = NULL;
- /* Return item; caller is responsible to pfree it */
- return item;
- }
-
- /* curTreeItem is exhausted, so remove it from rbtree */
- rb_delete(so->queue, (RBNode *) so->curTreeItem);
- so->curTreeItem = NULL;
+ if (!pairingheap_is_empty(so->queue))
+ {
+ item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
+ }
+ else
+ {
+ /* Done when both heaps are empty */
+ item = NULL;
}
- return NULL;
+ /* Return item; caller is responsible to pfree it */
+ return item;
}
/*
@@ -458,7 +431,7 @@ getNextNearest(IndexScanDesc scan)
/* visit an index page, extract its items into queue */
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
}
pfree(item);
@@ -491,7 +464,6 @@ gistgettuple(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
so->firstCall = false;
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -534,7 +506,7 @@ gistgettuple(PG_FUNCTION_ARGS)
* this page, we fall out of the inner "do" and loop around to
* return them.
*/
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
pfree(item);
} while (so->nPageData == 0);
@@ -560,7 +532,6 @@ gistgetbitmap(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
/* Begin the scan by processing the root page */
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -580,7 +551,7 @@ gistgetbitmap(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, tbm, &ntids);
+ gistScanPage(scan, item, item->distances, tbm, &ntids);
pfree(item);
}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 8360b16..eff02c4 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -22,14 +22,13 @@
/*
- * RBTree support functions for the GISTSearchTreeItem queue
+ * Pairing heap comparison function for the GISTSearchItem queue
*/
-
static int
-GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
+pairingheap_GISTSearchItem_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
{
- const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
- const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
+ const GISTSearchItem *sa = (const GISTSearchItem *) a;
+ const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
int i;
@@ -37,59 +36,16 @@ GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
for (i = 0; i < scan->numberOfOrderBys; i++)
{
if (sa->distances[i] != sb->distances[i])
- return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
- }
-
- return 0;
-}
-
-static void
-GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
-{
- GISTSearchTreeItem *scurrent = (GISTSearchTreeItem *) existing;
- const GISTSearchTreeItem *snew = (const GISTSearchTreeItem *) newrb;
- GISTSearchItem *newitem = snew->head;
-
- /* snew should have just one item in its chain */
- Assert(newitem && newitem->next == NULL);
-
- /*
- * If new item is heap tuple, it goes to front of chain; otherwise insert
- * it before the first index-page item, so that index pages are visited in
- * LIFO order, ensuring depth-first search of index pages. See comments
- * in gist_private.h.
- */
- if (GISTSearchItemIsHeap(*newitem))
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- if (scurrent->lastHeap == NULL)
- scurrent->lastHeap = newitem;
+ return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
}
- else if (scurrent->lastHeap == NULL)
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- }
- else
- {
- newitem->next = scurrent->lastHeap->next;
- scurrent->lastHeap->next = newitem;
- }
-}
-static RBNode *
-GISTSearchTreeItemAllocator(void *arg)
-{
- IndexScanDesc scan = (IndexScanDesc) arg;
-
- return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
-}
+ /* Heap items go before inner pages, to ensure a depth-first search */
+ if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
+ return -1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return 1;
-static void
-GISTSearchTreeItemDeleter(RBNode *rb, void *arg)
-{
- pfree(rb);
+ return 0;
}
@@ -127,7 +83,6 @@ gistbeginscan(PG_FUNCTION_ARGS)
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
- so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
@@ -188,15 +143,9 @@ gistrescan(PG_FUNCTION_ARGS)
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
- so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
- GISTSearchTreeItemComparator,
- GISTSearchTreeItemCombiner,
- GISTSearchTreeItemAllocator,
- GISTSearchTreeItemDeleter,
- scan);
+ so->queue = pairingheap_allocate(pairingheap_GISTSearchItem_cmp, scan);
MemoryContextSwitchTo(oldCxt);
- so->curTreeItem = NULL;
so->firstCall = true;
/* Update scan key, if a new one is given */
diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 327a1bc..b24ece6 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/lib
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = ilist.o binaryheap.o stringinfo.o
+OBJS = ilist.o binaryheap.o pairingheap.o stringinfo.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/pairingheap.c b/src/backend/lib/pairingheap.c
new file mode 100644
index 0000000..f0db138
--- /dev/null
+++ b/src/backend/lib/pairingheap.c
@@ -0,0 +1,235 @@
+/*-------------------------------------------------------------------------
+ *
+ * pairingheap.c
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/lib/pairingheap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "lib/pairingheap.h"
+
+static pairingheap_node *merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b);
+static pairingheap_node *merge_children(pairingheap *heap, pairingheap_node *children);
+
+/*
+ * pairingheap_allocate
+ *
+ * Returns a pointer to a newly-allocated heap, with the heap property
+ * defined by the given comparator function, which will be invoked with the
+ * additional argument specified by 'arg'.
+ */
+pairingheap *
+pairingheap_allocate(pairingheap_comparator compare, void *arg)
+{
+ pairingheap *heap;
+
+ heap = (pairingheap *) palloc(sizeof(pairingheap));
+ heap->ph_compare = compare;
+ heap->ph_arg = arg;
+
+ heap->ph_root = NULL;
+
+ return heap;
+}
+
+/*
+ * pairingheap_free
+ *
+ * Releases memory used by the given pairingheap.
+ *
+ * Note: The items in the heap are not released!
+ */
+void
+pairingheap_free(pairingheap *heap)
+{
+ pfree(heap);
+}
+
+
+/* A helper function to merge two subheaps into one. */
+static pairingheap_node *
+merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b)
+{
+ if (a == NULL)
+ return b;
+ if (b == NULL)
+ return a;
+
+ /* Put the larger of the items as a child of the smaller one */
+ if (heap->ph_compare(a, b, heap->ph_arg) < 0)
+ {
+ pairingheap_node *tmp;
+
+ tmp = a;
+ a = b;
+ b = tmp;
+ }
+
+ if (a->first_child)
+ a->first_child->prev_or_parent = b;
+ b->prev_or_parent = a;
+ b->next_sibling = a->first_child;
+ a->first_child = b;
+ return a;
+}
+
+/*
+ * pairingheap_add
+ *
+ * Adds the given datum to the heap in O(1) time.
+ */
+void
+pairingheap_add(pairingheap *heap, pairingheap_node *d)
+{
+ d->first_child = NULL;
+
+ /* Link the new item as a new tree */
+ heap->ph_root = merge(heap, heap->ph_root, d);
+}
+
+/*
+ * pairingheap_first
+ *
+ * Returns a pointer to the first (root, topmost) node in the heap
+ * without modifying the heap. The caller must ensure that this
+ * routine is not used on an empty heap. Always O(1).
+ */
+pairingheap_node *
+pairingheap_first(pairingheap *heap)
+{
+ Assert(!pairingheap_empty(heap));
+ return heap->ph_root;
+}
+
+/*
+ * pairingheap_remove_first
+ *
+ * Removes the first (root, topmost) node in the heap and returns a
+ * pointer to it after rebalancing the heap. The caller must ensure
+ * that this routine is not used on an empty heap. O(log n) amortized.
+ */
+pairingheap_node *
+pairingheap_remove_first(pairingheap *heap)
+{
+ pairingheap_node *result;
+ pairingheap_node *children;
+
+ Assert(!pairingheap_empty(heap));
+
+ /* Remove the smallest root. */
+ result = heap->ph_root;
+ children = result->first_child;
+
+ heap->ph_root = merge_children(heap, children);
+
+ return result;
+}
+
+/*
+ * Merge a list of subheaps into a single heap.
+ *
+ * This implements the basic two-pass merging strategy, first forming
+ * pairs from left to right, and then merging the pairs.
+ */
+static pairingheap_node *
+merge_children(pairingheap *heap, pairingheap_node *children)
+{
+ pairingheap_node *item, *next;
+ pairingheap_node *pairs;
+ pairingheap_node *newroot;
+
+ if (children == NULL || children->next_sibling == NULL)
+ return children;
+
+ /* Walk the remaining subheaps from left to right, merging in pairs */
+ next = children;
+ pairs = NULL;
+ for (;;)
+ {
+ item = next;
+ if (item == NULL)
+ break;
+ if (item->next_sibling == NULL)
+ {
+ /* last odd item at the end of list */
+ item->next_sibling = pairs;
+ pairs = item;
+ break;
+ }
+ else
+ {
+ next = item->next_sibling->next_sibling;
+
+ item = merge(heap, item, item->next_sibling);
+ item->next_sibling = pairs;
+ pairs = item;
+ }
+ }
+
+ /*
+ * Form a single (sub)heap from the pairs.
+ */
+ newroot = pairs;
+ next = pairs->next_sibling;
+ while (next)
+ {
+ item = next;
+ next = item->next_sibling;
+
+ newroot = merge(heap, newroot, item);
+ }
+
+ return newroot;
+}
+
+/*
+ * Remove 'item' from the heap. O(log n) amortized.
+ */
+void
+pairingheap_remove(pairingheap *heap, pairingheap_node *item)
+{
+ pairingheap_node *children;
+ pairingheap_node *replacement;
+ pairingheap_node *next_sibling;
+ pairingheap_node **prev_ptr;
+
+ if (item == heap->ph_root)
+ {
+ (void) pairingheap_remove_first(heap);
+ return;
+ }
+
+ children = item->first_child;
+ next_sibling = item->next_sibling;
+
+ if (item->prev_or_parent->first_child == item)
+ prev_ptr = &item->prev_or_parent->first_child;
+ else
+ prev_ptr = &item->prev_or_parent->next_sibling;
+ Assert(*prev_ptr == item);
+
+ /* Form a new heap of the children */
+ replacement = merge_children(heap, children);
+
+ if (replacement == NULL)
+ {
+ *prev_ptr = next_sibling;
+ if (next_sibling)
+ next_sibling->prev_or_parent = item->prev_or_parent;
+ }
+ else
+ {
+ replacement->prev_or_parent = item->prev_or_parent;
+ replacement->next_sibling = item->next_sibling;
+ *prev_ptr = replacement;
+ if (next_sibling)
+ next_sibling->prev_or_parent = replacement;
+ }
+}
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 2cbc918..07bc607 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -18,9 +18,9 @@
#include "access/itup.h"
#include "access/xlogreader.h"
#include "fmgr.h"
+#include "lib/pairingheap.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
-#include "utils/rbtree.h"
#include "utils/hsearch.h"
/*
@@ -123,7 +123,7 @@ typedef struct GISTSearchHeapItem
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
- struct GISTSearchItem *next; /* list link */
+ pairingheap_node phNode;
BlockNumber blkno; /* index page number, or InvalidBlockNumber */
union
{
@@ -131,24 +131,12 @@ typedef struct GISTSearchItem
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
+ double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
-/*
- * Within a GISTSearchTreeItem's chain, heap items always appear before
- * index-page items, since we want to visit heap items first. lastHeap points
- * to the last heap item in the chain, or is NULL if there are none.
- */
-typedef struct GISTSearchTreeItem
-{
- RBNode rbnode; /* this is an RBTree item */
- GISTSearchItem *head; /* first chain member */
- GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
- double distances[1]; /* array with numberOfOrderBys entries */
-} GISTSearchTreeItem;
-
-#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
+#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
@@ -156,15 +144,12 @@ typedef struct GISTSearchTreeItem
typedef struct GISTScanOpaqueData
{
GISTSTATE *giststate; /* index information, see above */
- RBTree *queue; /* queue of unvisited items */
+ pairingheap *queue; /* queue of unvisited items */
MemoryContext queueCxt; /* context holding the queue */
bool qual_ok; /* false if qual can never be satisfied */
bool firstCall; /* true until first gistgettuple call */
- GISTSearchTreeItem *curTreeItem; /* current queue item, if any */
-
/* pre-allocated workspace arrays */
- GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
diff --git a/src/include/lib/pairingheap.h b/src/include/lib/pairingheap.h
new file mode 100644
index 0000000..e78196d
--- /dev/null
+++ b/src/include/lib/pairingheap.h
@@ -0,0 +1,67 @@
+/*
+ * pairingheap.h
+ *
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * src/include/lib/pairingheap.h
+ */
+
+#ifndef PAIRINGHEAP_H
+#define PAIRINGHEAP_H
+
+/*
+ * This represents an element stored in the heap. Embed this in a larger
+ * struct containing the actual data you're storing.
+ */
+typedef struct pairingheap_node
+{
+ struct pairingheap_node *first_child;
+ struct pairingheap_node *next_sibling;
+ struct pairingheap_node *prev_or_parent;
+} pairingheap_node;
+
+/*
+ * Return the containing struct of 'type' where 'membername' is the
+ * pairingheap_node pointed at by 'ptr'.
+ *
+ * This is used to convert a pairingheap_node * back to its containing struct.
+ */
+#define pairingheap_container(type, membername, ptr) \
+ (AssertVariableIsOfTypeMacro(ptr, pairingheap_node *), \
+ AssertVariableIsOfTypeMacro(((type *) NULL)->membername, pairingheap_node), \
+ ((type *) ((char *) (ptr) - offsetof(type, membername))))
+
+/*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+typedef int (*pairingheap_comparator) (const pairingheap_node *a, const pairingheap_node *b, void *arg);
+
+/*
+ * A pairing heap.
+ */
+typedef struct pairingheap
+{
+ pairingheap_comparator ph_compare; /* comparison function */
+ void *ph_arg; /* opaque argument to ph_compare */
+ pairingheap_node *ph_root; /* current root of the heap */
+} pairingheap;
+
+extern pairingheap *pairingheap_allocate(pairingheap_comparator compare,
+ void *arg);
+extern void pairingheap_free(pairingheap *heap);
+extern void pairingheap_add(pairingheap *heap, pairingheap_node *d);
+extern pairingheap_node *pairingheap_first(pairingheap *heap);
+extern pairingheap_node *pairingheap_remove_first(pairingheap *heap);
+extern void pairingheap_remove(pairingheap *heap, pairingheap_node *d);
+
+#define pairingheap_reset(h) ((h)->ph_root = NULL)
+
+#define pairingheap_is_empty(h) ((h)->ph_root == NULL)
+
+/* Returns true if the heap contains exactly one item */
+#define pairingheap_is_singular(h) ((h)->ph_root && (h)->ph_root->first_child == NULL)
+
+#endif /* PAIRINGHEAP_H */
On 2014-12-15 15:08:28 +0200, Heikki Linnakangas wrote:
+/*------------------------------------------------------------------------- + * + * pairingheap.c + * A Pairing Heap implementation + * + * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group + * + * IDENTIFICATION + * src/backend/lib/pairingheap.c + * + *------------------------------------------------------------------------- + */
diff --git a/src/include/lib/pairingheap.h b/src/include/lib/pairingheap.h new file mode 100644 index 0000000..e78196d --- /dev/null +++ b/src/include/lib/pairingheap.h @@ -0,0 +1,67 @@ +/* + * pairingheap.h + * + * A Pairing Heap implementation + * + * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group + * + * src/include/lib/pairingheap.h + */ +
If we add another heap implementation we probably should at least hint
at the different advantages somewhere.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 08/03/2014 04:48 PM, Emre Hasegeli wrote:
1. This patch introduces a new "polygon <-> point" operator. That seems
useful on its own, with or without this patch.Yeah, but exact-knn cant come with no one implementation. But it would
better come in a separate patch.I tried to split them. Separated patches are attached. I changed
the order of the arguments as point <-> polygon, because point was
the first one on all the others. Its commutator was required for
the index, so I added it on the second patch. I also added tests
for the operator. I think it is ready for committer as a separate
patch. We can add it to the open CommitFest.
Ok, committed this part now with minor changes. The implementation was
copy-pasted from circle <-> polygon, so I put the common logic to a
dist_ppoly_internal function, and called that in both dist_cpoly and
dist_ppoly.
I was surprised that there were no documentation changes in the patch,
but looking at the docs, we just list the geometric operators without
explaining what the argument types are. That's not very comprehensive,
might be good to expand the docs on that, but it's not this patch's fault.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Dec 15, 2014 at 5:08 AM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
Here's a new version of the patch. It now uses the same pairing heap code
that I posted in the other thread ("advance local xmin more aggressivley",
/messages/by-id/5488ACF0.8050901@vmware.com). The
pairingheap_remove() function is unused in this patch, but it is needed by
that other patch.
Under enable-cassert, this tries to call pairingheap_empty, but that
function doesn't exist.
I looked in the other patch and didn't find it defined there, either.
cheers,
Jeff
On 12/15/2014 11:59 PM, Jeff Janes wrote:
On Mon, Dec 15, 2014 at 5:08 AM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
Here's a new version of the patch. It now uses the same pairing heap code
that I posted in the other thread ("advance local xmin more aggressivley",
/messages/by-id/5488ACF0.8050901@vmware.com). The
pairingheap_remove() function is unused in this patch, but it is needed by
that other patch.Under enable-cassert, this tries to call pairingheap_empty, but that
function doesn't exist.I looked in the other patch and didn't find it defined there, either.
Ah, I renamed pairingheap_empty to pairingheap_is_empty at the last
minute, and missed the Asserts. Here's a corrected version.
- Heikki
Attachments:
knn-gist-pairingheap-3.patchtext/x-diff; name=knn-gist-pairingheap-3.patchDownload
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..e5eb6f6 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -18,6 +18,7 @@
#include "access/relscan.h"
#include "miscadmin.h"
#include "pgstat.h"
+#include "lib/pairingheap.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
@@ -243,8 +244,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
GISTPageOpaque opaque;
OffsetNumber maxoff;
OffsetNumber i;
- GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
- bool isNew;
MemoryContext oldcxt;
Assert(!GISTSearchItemIsHeap(*pageItem));
@@ -275,18 +274,15 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for the right sibling index page */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
item->blkno = opaque->rightlink;
item->data.parentlsn = pageItem->data.parentlsn;
/* Insert it into the queue using same distances as for this page */
- tmpItem->head = item;
- tmpItem->lastHeap = NULL;
- memcpy(tmpItem->distances, myDistances,
+ memcpy(item->distances, myDistances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -348,8 +344,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for this item */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
if (GistPageIsLeaf(page))
{
@@ -372,12 +367,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
}
/* Insert it into the queue using new distance data */
- tmpItem->head = item;
- tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
- memcpy(tmpItem->distances, so->distances,
+ memcpy(item->distances, so->distances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -390,44 +383,24 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
- *
- * NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
- * contained the result item. Callers can use so->curTreeItem->distances as
- * the distances value for the item.
*/
static GISTSearchItem *
getNextGISTSearchItem(GISTScanOpaque so)
{
- for (;;)
- {
- GISTSearchItem *item;
-
- /* Update curTreeItem if we don't have one */
- if (so->curTreeItem == NULL)
- {
- so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
- /* Done when tree is empty */
- if (so->curTreeItem == NULL)
- break;
- }
+ GISTSearchItem *item;
- item = so->curTreeItem->head;
- if (item != NULL)
- {
- /* Delink item from chain */
- so->curTreeItem->head = item->next;
- if (item == so->curTreeItem->lastHeap)
- so->curTreeItem->lastHeap = NULL;
- /* Return item; caller is responsible to pfree it */
- return item;
- }
-
- /* curTreeItem is exhausted, so remove it from rbtree */
- rb_delete(so->queue, (RBNode *) so->curTreeItem);
- so->curTreeItem = NULL;
+ if (!pairingheap_is_empty(so->queue))
+ {
+ item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
+ }
+ else
+ {
+ /* Done when both heaps are empty */
+ item = NULL;
}
- return NULL;
+ /* Return item; caller is responsible to pfree it */
+ return item;
}
/*
@@ -458,7 +431,7 @@ getNextNearest(IndexScanDesc scan)
/* visit an index page, extract its items into queue */
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
}
pfree(item);
@@ -491,7 +464,6 @@ gistgettuple(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
so->firstCall = false;
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -534,7 +506,7 @@ gistgettuple(PG_FUNCTION_ARGS)
* this page, we fall out of the inner "do" and loop around to
* return them.
*/
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
pfree(item);
} while (so->nPageData == 0);
@@ -560,7 +532,6 @@ gistgetbitmap(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
/* Begin the scan by processing the root page */
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -580,7 +551,7 @@ gistgetbitmap(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, tbm, &ntids);
+ gistScanPage(scan, item, item->distances, tbm, &ntids);
pfree(item);
}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 8360b16..eff02c4 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -22,14 +22,13 @@
/*
- * RBTree support functions for the GISTSearchTreeItem queue
+ * Pairing heap comparison function for the GISTSearchItem queue
*/
-
static int
-GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
+pairingheap_GISTSearchItem_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
{
- const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
- const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
+ const GISTSearchItem *sa = (const GISTSearchItem *) a;
+ const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
int i;
@@ -37,59 +36,16 @@ GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
for (i = 0; i < scan->numberOfOrderBys; i++)
{
if (sa->distances[i] != sb->distances[i])
- return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
- }
-
- return 0;
-}
-
-static void
-GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
-{
- GISTSearchTreeItem *scurrent = (GISTSearchTreeItem *) existing;
- const GISTSearchTreeItem *snew = (const GISTSearchTreeItem *) newrb;
- GISTSearchItem *newitem = snew->head;
-
- /* snew should have just one item in its chain */
- Assert(newitem && newitem->next == NULL);
-
- /*
- * If new item is heap tuple, it goes to front of chain; otherwise insert
- * it before the first index-page item, so that index pages are visited in
- * LIFO order, ensuring depth-first search of index pages. See comments
- * in gist_private.h.
- */
- if (GISTSearchItemIsHeap(*newitem))
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- if (scurrent->lastHeap == NULL)
- scurrent->lastHeap = newitem;
+ return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
}
- else if (scurrent->lastHeap == NULL)
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- }
- else
- {
- newitem->next = scurrent->lastHeap->next;
- scurrent->lastHeap->next = newitem;
- }
-}
-static RBNode *
-GISTSearchTreeItemAllocator(void *arg)
-{
- IndexScanDesc scan = (IndexScanDesc) arg;
-
- return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
-}
+ /* Heap items go before inner pages, to ensure a depth-first search */
+ if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
+ return -1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return 1;
-static void
-GISTSearchTreeItemDeleter(RBNode *rb, void *arg)
-{
- pfree(rb);
+ return 0;
}
@@ -127,7 +83,6 @@ gistbeginscan(PG_FUNCTION_ARGS)
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
- so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
@@ -188,15 +143,9 @@ gistrescan(PG_FUNCTION_ARGS)
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
- so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
- GISTSearchTreeItemComparator,
- GISTSearchTreeItemCombiner,
- GISTSearchTreeItemAllocator,
- GISTSearchTreeItemDeleter,
- scan);
+ so->queue = pairingheap_allocate(pairingheap_GISTSearchItem_cmp, scan);
MemoryContextSwitchTo(oldCxt);
- so->curTreeItem = NULL;
so->firstCall = true;
/* Update scan key, if a new one is given */
diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 327a1bc..b24ece6 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/lib
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = ilist.o binaryheap.o stringinfo.o
+OBJS = ilist.o binaryheap.o pairingheap.o stringinfo.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/pairingheap.c b/src/backend/lib/pairingheap.c
new file mode 100644
index 0000000..a7e8901
--- /dev/null
+++ b/src/backend/lib/pairingheap.c
@@ -0,0 +1,235 @@
+/*-------------------------------------------------------------------------
+ *
+ * pairingheap.c
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/lib/pairingheap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "lib/pairingheap.h"
+
+static pairingheap_node *merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b);
+static pairingheap_node *merge_children(pairingheap *heap, pairingheap_node *children);
+
+/*
+ * pairingheap_allocate
+ *
+ * Returns a pointer to a newly-allocated heap, with the heap property
+ * defined by the given comparator function, which will be invoked with the
+ * additional argument specified by 'arg'.
+ */
+pairingheap *
+pairingheap_allocate(pairingheap_comparator compare, void *arg)
+{
+ pairingheap *heap;
+
+ heap = (pairingheap *) palloc(sizeof(pairingheap));
+ heap->ph_compare = compare;
+ heap->ph_arg = arg;
+
+ heap->ph_root = NULL;
+
+ return heap;
+}
+
+/*
+ * pairingheap_free
+ *
+ * Releases memory used by the given pairingheap.
+ *
+ * Note: The items in the heap are not released!
+ */
+void
+pairingheap_free(pairingheap *heap)
+{
+ pfree(heap);
+}
+
+
+/* A helper function to merge two subheaps into one. */
+static pairingheap_node *
+merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b)
+{
+ if (a == NULL)
+ return b;
+ if (b == NULL)
+ return a;
+
+ /* Put the larger of the items as a child of the smaller one */
+ if (heap->ph_compare(a, b, heap->ph_arg) < 0)
+ {
+ pairingheap_node *tmp;
+
+ tmp = a;
+ a = b;
+ b = tmp;
+ }
+
+ if (a->first_child)
+ a->first_child->prev_or_parent = b;
+ b->prev_or_parent = a;
+ b->next_sibling = a->first_child;
+ a->first_child = b;
+ return a;
+}
+
+/*
+ * pairingheap_add
+ *
+ * Adds the given datum to the heap in O(1) time.
+ */
+void
+pairingheap_add(pairingheap *heap, pairingheap_node *d)
+{
+ d->first_child = NULL;
+
+ /* Link the new item as a new tree */
+ heap->ph_root = merge(heap, heap->ph_root, d);
+}
+
+/*
+ * pairingheap_first
+ *
+ * Returns a pointer to the first (root, topmost) node in the heap
+ * without modifying the heap. The caller must ensure that this
+ * routine is not used on an empty heap. Always O(1).
+ */
+pairingheap_node *
+pairingheap_first(pairingheap *heap)
+{
+ Assert(!pairingheap_is_empty(heap));
+ return heap->ph_root;
+}
+
+/*
+ * pairingheap_remove_first
+ *
+ * Removes the first (root, topmost) node in the heap and returns a
+ * pointer to it after rebalancing the heap. The caller must ensure
+ * that this routine is not used on an empty heap. O(log n) amortized.
+ */
+pairingheap_node *
+pairingheap_remove_first(pairingheap *heap)
+{
+ pairingheap_node *result;
+ pairingheap_node *children;
+
+ Assert(!pairingheap_is_empty(heap));
+
+ /* Remove the smallest root. */
+ result = heap->ph_root;
+ children = result->first_child;
+
+ heap->ph_root = merge_children(heap, children);
+
+ return result;
+}
+
+/*
+ * Merge a list of subheaps into a single heap.
+ *
+ * This implements the basic two-pass merging strategy, first forming
+ * pairs from left to right, and then merging the pairs.
+ */
+static pairingheap_node *
+merge_children(pairingheap *heap, pairingheap_node *children)
+{
+ pairingheap_node *item, *next;
+ pairingheap_node *pairs;
+ pairingheap_node *newroot;
+
+ if (children == NULL || children->next_sibling == NULL)
+ return children;
+
+ /* Walk the remaining subheaps from left to right, merging in pairs */
+ next = children;
+ pairs = NULL;
+ for (;;)
+ {
+ item = next;
+ if (item == NULL)
+ break;
+ if (item->next_sibling == NULL)
+ {
+ /* last odd item at the end of list */
+ item->next_sibling = pairs;
+ pairs = item;
+ break;
+ }
+ else
+ {
+ next = item->next_sibling->next_sibling;
+
+ item = merge(heap, item, item->next_sibling);
+ item->next_sibling = pairs;
+ pairs = item;
+ }
+ }
+
+ /*
+ * Form a single (sub)heap from the pairs.
+ */
+ newroot = pairs;
+ next = pairs->next_sibling;
+ while (next)
+ {
+ item = next;
+ next = item->next_sibling;
+
+ newroot = merge(heap, newroot, item);
+ }
+
+ return newroot;
+}
+
+/*
+ * Remove 'item' from the heap. O(log n) amortized.
+ */
+void
+pairingheap_remove(pairingheap *heap, pairingheap_node *item)
+{
+ pairingheap_node *children;
+ pairingheap_node *replacement;
+ pairingheap_node *next_sibling;
+ pairingheap_node **prev_ptr;
+
+ if (item == heap->ph_root)
+ {
+ (void) pairingheap_remove_first(heap);
+ return;
+ }
+
+ children = item->first_child;
+ next_sibling = item->next_sibling;
+
+ if (item->prev_or_parent->first_child == item)
+ prev_ptr = &item->prev_or_parent->first_child;
+ else
+ prev_ptr = &item->prev_or_parent->next_sibling;
+ Assert(*prev_ptr == item);
+
+ /* Form a new heap of the children */
+ replacement = merge_children(heap, children);
+
+ if (replacement == NULL)
+ {
+ *prev_ptr = next_sibling;
+ if (next_sibling)
+ next_sibling->prev_or_parent = item->prev_or_parent;
+ }
+ else
+ {
+ replacement->prev_or_parent = item->prev_or_parent;
+ replacement->next_sibling = item->next_sibling;
+ *prev_ptr = replacement;
+ if (next_sibling)
+ next_sibling->prev_or_parent = replacement;
+ }
+}
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 2cbc918..07bc607 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -18,9 +18,9 @@
#include "access/itup.h"
#include "access/xlogreader.h"
#include "fmgr.h"
+#include "lib/pairingheap.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
-#include "utils/rbtree.h"
#include "utils/hsearch.h"
/*
@@ -123,7 +123,7 @@ typedef struct GISTSearchHeapItem
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
- struct GISTSearchItem *next; /* list link */
+ pairingheap_node phNode;
BlockNumber blkno; /* index page number, or InvalidBlockNumber */
union
{
@@ -131,24 +131,12 @@ typedef struct GISTSearchItem
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
+ double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
-/*
- * Within a GISTSearchTreeItem's chain, heap items always appear before
- * index-page items, since we want to visit heap items first. lastHeap points
- * to the last heap item in the chain, or is NULL if there are none.
- */
-typedef struct GISTSearchTreeItem
-{
- RBNode rbnode; /* this is an RBTree item */
- GISTSearchItem *head; /* first chain member */
- GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
- double distances[1]; /* array with numberOfOrderBys entries */
-} GISTSearchTreeItem;
-
-#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
+#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
@@ -156,15 +144,12 @@ typedef struct GISTSearchTreeItem
typedef struct GISTScanOpaqueData
{
GISTSTATE *giststate; /* index information, see above */
- RBTree *queue; /* queue of unvisited items */
+ pairingheap *queue; /* queue of unvisited items */
MemoryContext queueCxt; /* context holding the queue */
bool qual_ok; /* false if qual can never be satisfied */
bool firstCall; /* true until first gistgettuple call */
- GISTSearchTreeItem *curTreeItem; /* current queue item, if any */
-
/* pre-allocated workspace arrays */
- GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
diff --git a/src/include/lib/pairingheap.h b/src/include/lib/pairingheap.h
new file mode 100644
index 0000000..e78196d
--- /dev/null
+++ b/src/include/lib/pairingheap.h
@@ -0,0 +1,67 @@
+/*
+ * pairingheap.h
+ *
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * src/include/lib/pairingheap.h
+ */
+
+#ifndef PAIRINGHEAP_H
+#define PAIRINGHEAP_H
+
+/*
+ * This represents an element stored in the heap. Embed this in a larger
+ * struct containing the actual data you're storing.
+ */
+typedef struct pairingheap_node
+{
+ struct pairingheap_node *first_child;
+ struct pairingheap_node *next_sibling;
+ struct pairingheap_node *prev_or_parent;
+} pairingheap_node;
+
+/*
+ * Return the containing struct of 'type' where 'membername' is the
+ * pairingheap_node pointed at by 'ptr'.
+ *
+ * This is used to convert a pairingheap_node * back to its containing struct.
+ */
+#define pairingheap_container(type, membername, ptr) \
+ (AssertVariableIsOfTypeMacro(ptr, pairingheap_node *), \
+ AssertVariableIsOfTypeMacro(((type *) NULL)->membername, pairingheap_node), \
+ ((type *) ((char *) (ptr) - offsetof(type, membername))))
+
+/*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+typedef int (*pairingheap_comparator) (const pairingheap_node *a, const pairingheap_node *b, void *arg);
+
+/*
+ * A pairing heap.
+ */
+typedef struct pairingheap
+{
+ pairingheap_comparator ph_compare; /* comparison function */
+ void *ph_arg; /* opaque argument to ph_compare */
+ pairingheap_node *ph_root; /* current root of the heap */
+} pairingheap;
+
+extern pairingheap *pairingheap_allocate(pairingheap_comparator compare,
+ void *arg);
+extern void pairingheap_free(pairingheap *heap);
+extern void pairingheap_add(pairingheap *heap, pairingheap_node *d);
+extern pairingheap_node *pairingheap_first(pairingheap *heap);
+extern pairingheap_node *pairingheap_remove_first(pairingheap *heap);
+extern void pairingheap_remove(pairingheap *heap, pairingheap_node *d);
+
+#define pairingheap_reset(h) ((h)->ph_root = NULL)
+
+#define pairingheap_is_empty(h) ((h)->ph_root == NULL)
+
+/* Returns true if the heap contains exactly one item */
+#define pairingheap_is_singular(h) ((h)->ph_root && (h)->ph_root->first_child == NULL)
+
+#endif /* PAIRINGHEAP_H */
On 10/06/2014 12:36 PM, Emre Hasegeli wrote:
Thanks. The main question now is design of this patch. Currently, it does
all the work inside access method. We already have some discussion of pro
and cons of this method. I would like to clarify alternatives now. I can
see following way:1. Implement new executor node which performs sorting by priority queue.
Let's call it "Priority queue". I think it should be separate node from
"Sort" node. Despite "Priority queue" and "Sort" are essentially similar
from user view, they would be completely different in implementation.
2. Implement some interface to transfer distance values from access
method to "Priority queue" node.If we assume that all of them need recheck, maybe it can be done
without passing distance values.
No, the executor needs the lower-bound distance value, as calculated by
the indexam, so that it knows which tuples it can return from the queue
already. For example, imagine the following items coming from the index:
tuple # lower bound actual distance
1 1 1
2 2 10
3 30 30
4 40 40
After the executor has fetched tuple 2, and re-checked the distance, it
pushes the tuple to the queue. It then fetches tuple 3, with lower bound
30, and it can now immediately return tuple # 2 from the queue. Because
10 < 30, so there cannot be any more tuples coming from the index that
would need to go before tuple # 2.
The executor needs the lower bound as calculated by the index, as well
as the actual distance it calculates itself, to make those decisions.
3. Somehow tell the planner that it could use "Priority queue" in
corresponding cases. I see two ways of doing this:
- Add flag to operator in opclass indicating that index can only
order by lower bound of "col op value", not by "col op value" itself.
- Define new relation between operators. Value of one operator could
be lower bound for value of another operator. So, planner can
put "Priority
queue" node when lower bound ordering is possible from index. Also "ALTER
OPERATOR" command would be reasonable, so extensions could upgrade.I think, it would be better to make it a property of the operator
class. We can add a column to pg_amop or define another value for
amoppurpose on pg_amop. Syntax can be something like this:CREATE OPERATOR CLASS circle_ops DEFAULT
FOR TYPE circle USING gist AS
OPERATOR 15 <->(circle, point) FOR ORDER BY pg_catalog.float_ops LOWER BOUND;While looking at it, I realize that current version of the patch does
not use the sort operator family defined with the operator class. It
assumes that the distance function will return values compatible with
the operator. Operator class definition makes me think that there is
not such an assumption.
Yeah. I also noticed that the type of the argument passed to the
consistent function varies, and doesn't necessarily match that declared
in pg_proc. Looking at gist_point_consistent, the argument type can be a
point, a polygon, or a circle, depending on the "strategy group". But
it's declared as a point in pg_proc.
Besides overhead, this way makes significant infrastructural changes. So,
it may be over-engineering. However, it's probably more clean and beautiful
solution.
I would like to get some feedback from people familiar with KNN-GiST like
Heikki or Tom. What do you think about this? Any other ideas?I would be happy to test and review the changes. I think it is nice
to solve the problem in a generalized way improving the access method
infrastructure. Definitely, we should have a consensus on the design
before working on the infrastructure changes.
I took a stab on this. I added the reorder queue directly to the Index
Scan node, rather than adding a whole new node type for it. It seems
reasonable, as Index Scan is responsible for rechecking the quals, too,
even though re-ordering the tuples is more complicated than rechecking
quals.
To recap, the idea is that the index can define an ordering op, even if
it cannot return the tuples in exactly the right order. It is enough
that for each tuple, it returns a lower bound of the expression that is
used for sorting. For example, for "ORDER BY key <-> column", it is
enough that it returns a lower bound of "key <-> column" for each tuple.
The index must return the tuples ordered by the lower bounds. The
executor re-checks the expressions, and re-orders the tuples to the
correct order.
Patch attached. It should be applied on top of my pairing heap patch at
/messages/by-id/548FFA2C.7060000@vmware.com. Some
caveats:
* The signature of the distance function is unchanged, it doesn't get a
recheck argument. It is just assumed that if the consistent function
sets the recheck flag, then the distance needs to be rechecked as well.
We might want to add the recheck argument, like you Alexander did in
your patch, but it's not important right now.
* I used the "distance" term in the executor, although the ORDER BY expr
machinery is more general than that. The value returned by the ORDER BY
expression doesn't have to be a distance, although that's the only thing
supported by GiST and the built-in opclasses.
* I short-circuited the planner to assume that the ORDER BY expression
always returns a float. That's true today for knn-GiST, but is obviously
a bogus assumption in general.
This needs some work to get into a committable state, but from a
modularity point of view, this is much better than having the indexam to
peek into the heap.
- Heikki
Attachments:
knn-gist-recheck-distance-in-executor-1.patchtext/x-diff; name=knn-gist-recheck-distance-in-executor-1.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
index 5de282b..0430738 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -105,6 +105,7 @@
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
@@ -163,6 +164,7 @@
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
@@ -207,6 +209,12 @@
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
@@ -798,13 +806,22 @@ my_distance(PG_FUNCTION_ARGS)
The arguments to the <function>distance</> function are identical to
the arguments of the <function>consistent</> function, except that no
- recheck flag is used. The distance to a leaf index entry must always
- be determined exactly, since there is no way to re-order the tuples
- once they are returned. Some approximation is allowed when determining
- the distance to an internal tree node, so long as the result is never
- greater than any child's actual distance. Thus, for example, distance
- to a bounding box is usually sufficient in geometric applications. The
- result value can be any finite <type>float8</> value. (Infinity and
+ recheck flag is used.
+ </para>
+
+ <para>
+ Some approximation is allowed when determining the distance to an
+ internal tree node, so long as the result is never greater than any
+ child's actual distance. Thus, for example, distance
+ to a bounding box is usually sufficient in geometric applications. For
+ leaf nodes, the returned distance must be accurate, if the consistent
+ function returns *recheck == false for the tuple. Otherwise the same
+ approximation is allowed, and the executor will re-order ambiguous cases
+ after recalculating the actual distance.
+ </para>
+
+ <para>
+ The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index e5eb6f6..8569928 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -192,9 +192,8 @@ gistindex_keytest(IndexScanDesc scan,
* always be zero, but might as well pass it for possible future
* use.)
*
- * Note that Distance functions don't get a recheck argument. We
- * can't tolerate lossy distance calculations on leaf tuples;
- * there is no opportunity to re-sort the tuples afterwards.
+ * Note that Distance functions don't get a recheck argument.
+ * Distance is rechecked whenever the quals are.
*/
dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
@@ -411,6 +410,7 @@ getNextNearest(IndexScanDesc scan)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
do
{
@@ -423,7 +423,11 @@ getNextNearest(IndexScanDesc scan)
{
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
- scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
res = true;
}
else
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index db0bec6..22ab256 100644
--- a/src/backend/access/gist/gistproc.c
+++ b/src/backend/access/gist/gistproc.c
@@ -1459,3 +1459,36 @@ gist_point_distance(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(distance);
}
+
+/*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+Datum
+gist_bbox_distance(PG_FUNCTION_ARGS)
+{
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index eff02c4..1837b78 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -85,6 +85,11 @@ gistbeginscan(PG_FUNCTION_ARGS)
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 2b89dc6..9279454 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -28,14 +28,86 @@
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+#include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+/*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+typedef struct
+{
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+} ReorderTuple;
+
+static int
+cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+{
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+}
+
+static int
+reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+{
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+}
+
+static void
+copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+{
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+}
static TupleTableSlot *IndexNext(IndexScanState *node);
+static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
@@ -54,6 +126,8 @@ IndexNext(IndexScanState *node)
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
@@ -72,11 +146,60 @@ IndexNext(IndexScanState *node)
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
- /*
- * ok, now that we have what we need, fetch the next tuple.
- */
- while ((tuple = index_getnext(scandesc, direction)) != NULL)
+ for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
@@ -103,6 +226,71 @@ IndexNext(IndexScanState *node)
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ RecheckOrderBys(node, slot);
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
@@ -114,6 +302,41 @@ IndexNext(IndexScanState *node)
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+static void
+RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+{
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+}
+
+/*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
@@ -465,6 +688,7 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
@@ -501,6 +725,9 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
@@ -581,6 +808,52 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 8f9ae4f..645c6d8 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -22,6 +22,7 @@
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -102,7 +103,7 @@ static void copy_plan_costsize(Plan *dest, Plan *src);
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
- List *indexorderby, List *indexorderbyorig,
+ List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
@@ -1158,6 +1159,7 @@ create_indexscan_plan(PlannerInfo *root,
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
@@ -1266,6 +1268,25 @@ create_indexscan_plan(PlannerInfo *root,
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ for (i = 0; i < numOrderBys; i++)
+ {
+ /*
+ * FIXME: Currently, amcanorderbyop is only supported by GiST, and
+ * this is only used for float8 distances. The correct way would
+ * be to dig this from the path key, like make_sort_from_pathkeys
+ * does.
+ */
+ sortOperators[i] = Float8LessOperator;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
@@ -1285,6 +1306,7 @@ create_indexscan_plan(PlannerInfo *root,
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
@@ -3327,6 +3349,7 @@ make_indexscan(List *qptlist,
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
@@ -3344,6 +3367,7 @@ make_indexscan(List *qptlist,
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index bc56b0a..85abfcf 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -2676,6 +2676,18 @@ dist_ppoly(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(result);
}
+Datum
+dist_polyp(PG_FUNCTION_ARGS)
+{
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+}
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
@@ -5092,6 +5104,21 @@ dist_pc(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(result);
}
+/*
+ * Distance from a circle to a point
+ */
+Datum
+dist_cpoint(PG_FUNCTION_ARGS)
+{
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+}
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index d99158f..170069e 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -147,7 +147,10 @@ extern void index_restrpos(IndexScanDesc scan);
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index f2c7ca1..8cfd0c4 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -91,6 +91,15 @@ typedef struct IndexScanDescData
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 7165f54..a28e67d 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -650,6 +650,7 @@ DATA(insert ( 2594 604 604 11 s 2577 783 0 ));
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
@@ -669,6 +670,7 @@ DATA(insert ( 2595 718 718 11 s 1514 783 0 ));
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index 309aee6..41a8011 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -205,6 +205,7 @@ DATA(insert ( 2594 604 604 4 2580 ));
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
@@ -212,6 +213,7 @@ DATA(insert ( 2595 718 718 4 2580 ));
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index 88c737b..63ae366 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1014,9 +1014,13 @@ DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 ci
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
-DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
+DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
-DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
+DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
+DESCR("distance between");
+DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
+DESCR("distance between");
+DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index eace352..11ac6fa 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -845,6 +845,8 @@ DATA(insert OID = 727 ( dist_sl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 70
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
@@ -4078,6 +4080,8 @@ DATA(insert OID = 2179 ( gist_point_consistent PGNSP PGUID 12 1 0 0 0 f f f f t
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+DATA(insert OID = 3589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 41b13b2..23e278c 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -17,6 +17,7 @@
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+#include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
@@ -1237,6 +1238,7 @@ typedef struct
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
@@ -1247,12 +1249,20 @@ typedef struct
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
@@ -1263,6 +1273,15 @@ typedef struct IndexScanState
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 48203a0..2219ed6 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -302,7 +302,11 @@ typedef Scan SeqScan;
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
- * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
+ * indexorderbyorig is used at run time to recheck the ordering, if the index
+ * does not calculate an accurate ordering. It is also needed for EXPLAIN.
+ *
+ * indexsortops is an array of operators used to sort the ORDER BY expressions,
+ * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
@@ -316,7 +320,8 @@ typedef struct IndexScan
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
- List *indexorderbyorig; /* the same in original form */
+ List *indexorderbyorig; /* the same in original form */
+ Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
index 91610d8..a0a8abe 100644
--- a/src/include/utils/geo_decls.h
+++ b/src/include/utils/geo_decls.h
@@ -394,8 +394,10 @@ extern Datum circle_diameter(PG_FUNCTION_ARGS);
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
@@ -419,6 +421,7 @@ extern Datum gist_circle_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index 5603817..cb18986 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -372,6 +372,36 @@ SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth
48
(1 row)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+-------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+(10 rows)
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+-----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+(10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
@@ -1152,6 +1182,54 @@ SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth
48
(1 row)
+EXPLAIN (COSTS OFF)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+-----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+(3 rows)
+
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+-------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+(10 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+(3 rows)
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+-----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+(10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index f779fa0..5df9008 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -224,6 +224,10 @@ SELECT count(*) FROM radix_text_tbl WHERE t > 'Worth
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
@@ -437,6 +441,14 @@ EXPLAIN (COSTS OFF)
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+EXPLAIN (COSTS OFF)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
On 12/15/2014 03:14 PM, Andres Freund wrote:
If we add another heap implementation we probably should at least hint
at the different advantages somewhere.
How about adding a src/backend/lib/README for that, per attached?
- Heikki
Attachments:
knn-gist-pairingheap-4.patchtext/x-diff; name=knn-gist-pairingheap-4.patchDownload
diff --git a/src/backend/lib/Makefile b/src/backend/lib/Makefile
index 327a1bc..b24ece6 100644
--- a/src/backend/lib/Makefile
+++ b/src/backend/lib/Makefile
@@ -12,6 +12,6 @@ subdir = src/backend/lib
top_builddir = ../../..
include $(top_builddir)/src/Makefile.global
-OBJS = ilist.o binaryheap.o stringinfo.o
+OBJS = ilist.o binaryheap.o pairingheap.o stringinfo.o
include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/lib/README b/src/backend/lib/README
new file mode 100644
index 0000000..49cf99a
--- /dev/null
+++ b/src/backend/lib/README
@@ -0,0 +1,21 @@
+This directory contains a general purpose data structures, for use anywhere
+in the backend:
+
+binaryheap.c - a binary heap
+
+pairingheap.c - a pairing heap
+
+ilist.c - single and double-linked lists.
+
+stringinfo.c - an extensible string type
+
+
+Aside from the inherent characteristics of the data structures, there are a
+few practical differences between the binary heap and the pairing heap. The
+binary heap is fully allocated at creation, and cannot be expanded beyond the
+allocated size. The pairing heap on the other hand has no inherent maximum
+size, but the caller needs to allocate each element being stored in the heap,
+while the binary heap works with plain Datums or pointers.
+
+In addition to these, there is an implementation of a Red-Black tree in
+src/backend/utils/adt/rbtree.c.
diff --git a/src/backend/lib/pairingheap.c b/src/backend/lib/pairingheap.c
new file mode 100644
index 0000000..a7e8901
--- /dev/null
+++ b/src/backend/lib/pairingheap.c
@@ -0,0 +1,235 @@
+/*-------------------------------------------------------------------------
+ *
+ * pairingheap.c
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * src/backend/lib/pairingheap.c
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#include "postgres.h"
+
+#include "lib/pairingheap.h"
+
+static pairingheap_node *merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b);
+static pairingheap_node *merge_children(pairingheap *heap, pairingheap_node *children);
+
+/*
+ * pairingheap_allocate
+ *
+ * Returns a pointer to a newly-allocated heap, with the heap property
+ * defined by the given comparator function, which will be invoked with the
+ * additional argument specified by 'arg'.
+ */
+pairingheap *
+pairingheap_allocate(pairingheap_comparator compare, void *arg)
+{
+ pairingheap *heap;
+
+ heap = (pairingheap *) palloc(sizeof(pairingheap));
+ heap->ph_compare = compare;
+ heap->ph_arg = arg;
+
+ heap->ph_root = NULL;
+
+ return heap;
+}
+
+/*
+ * pairingheap_free
+ *
+ * Releases memory used by the given pairingheap.
+ *
+ * Note: The items in the heap are not released!
+ */
+void
+pairingheap_free(pairingheap *heap)
+{
+ pfree(heap);
+}
+
+
+/* A helper function to merge two subheaps into one. */
+static pairingheap_node *
+merge(pairingheap *heap, pairingheap_node *a, pairingheap_node *b)
+{
+ if (a == NULL)
+ return b;
+ if (b == NULL)
+ return a;
+
+ /* Put the larger of the items as a child of the smaller one */
+ if (heap->ph_compare(a, b, heap->ph_arg) < 0)
+ {
+ pairingheap_node *tmp;
+
+ tmp = a;
+ a = b;
+ b = tmp;
+ }
+
+ if (a->first_child)
+ a->first_child->prev_or_parent = b;
+ b->prev_or_parent = a;
+ b->next_sibling = a->first_child;
+ a->first_child = b;
+ return a;
+}
+
+/*
+ * pairingheap_add
+ *
+ * Adds the given datum to the heap in O(1) time.
+ */
+void
+pairingheap_add(pairingheap *heap, pairingheap_node *d)
+{
+ d->first_child = NULL;
+
+ /* Link the new item as a new tree */
+ heap->ph_root = merge(heap, heap->ph_root, d);
+}
+
+/*
+ * pairingheap_first
+ *
+ * Returns a pointer to the first (root, topmost) node in the heap
+ * without modifying the heap. The caller must ensure that this
+ * routine is not used on an empty heap. Always O(1).
+ */
+pairingheap_node *
+pairingheap_first(pairingheap *heap)
+{
+ Assert(!pairingheap_is_empty(heap));
+ return heap->ph_root;
+}
+
+/*
+ * pairingheap_remove_first
+ *
+ * Removes the first (root, topmost) node in the heap and returns a
+ * pointer to it after rebalancing the heap. The caller must ensure
+ * that this routine is not used on an empty heap. O(log n) amortized.
+ */
+pairingheap_node *
+pairingheap_remove_first(pairingheap *heap)
+{
+ pairingheap_node *result;
+ pairingheap_node *children;
+
+ Assert(!pairingheap_is_empty(heap));
+
+ /* Remove the smallest root. */
+ result = heap->ph_root;
+ children = result->first_child;
+
+ heap->ph_root = merge_children(heap, children);
+
+ return result;
+}
+
+/*
+ * Merge a list of subheaps into a single heap.
+ *
+ * This implements the basic two-pass merging strategy, first forming
+ * pairs from left to right, and then merging the pairs.
+ */
+static pairingheap_node *
+merge_children(pairingheap *heap, pairingheap_node *children)
+{
+ pairingheap_node *item, *next;
+ pairingheap_node *pairs;
+ pairingheap_node *newroot;
+
+ if (children == NULL || children->next_sibling == NULL)
+ return children;
+
+ /* Walk the remaining subheaps from left to right, merging in pairs */
+ next = children;
+ pairs = NULL;
+ for (;;)
+ {
+ item = next;
+ if (item == NULL)
+ break;
+ if (item->next_sibling == NULL)
+ {
+ /* last odd item at the end of list */
+ item->next_sibling = pairs;
+ pairs = item;
+ break;
+ }
+ else
+ {
+ next = item->next_sibling->next_sibling;
+
+ item = merge(heap, item, item->next_sibling);
+ item->next_sibling = pairs;
+ pairs = item;
+ }
+ }
+
+ /*
+ * Form a single (sub)heap from the pairs.
+ */
+ newroot = pairs;
+ next = pairs->next_sibling;
+ while (next)
+ {
+ item = next;
+ next = item->next_sibling;
+
+ newroot = merge(heap, newroot, item);
+ }
+
+ return newroot;
+}
+
+/*
+ * Remove 'item' from the heap. O(log n) amortized.
+ */
+void
+pairingheap_remove(pairingheap *heap, pairingheap_node *item)
+{
+ pairingheap_node *children;
+ pairingheap_node *replacement;
+ pairingheap_node *next_sibling;
+ pairingheap_node **prev_ptr;
+
+ if (item == heap->ph_root)
+ {
+ (void) pairingheap_remove_first(heap);
+ return;
+ }
+
+ children = item->first_child;
+ next_sibling = item->next_sibling;
+
+ if (item->prev_or_parent->first_child == item)
+ prev_ptr = &item->prev_or_parent->first_child;
+ else
+ prev_ptr = &item->prev_or_parent->next_sibling;
+ Assert(*prev_ptr == item);
+
+ /* Form a new heap of the children */
+ replacement = merge_children(heap, children);
+
+ if (replacement == NULL)
+ {
+ *prev_ptr = next_sibling;
+ if (next_sibling)
+ next_sibling->prev_or_parent = item->prev_or_parent;
+ }
+ else
+ {
+ replacement->prev_or_parent = item->prev_or_parent;
+ replacement->next_sibling = item->next_sibling;
+ *prev_ptr = replacement;
+ if (next_sibling)
+ next_sibling->prev_or_parent = replacement;
+ }
+}
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 2cbc918..07bc607 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -18,9 +18,9 @@
#include "access/itup.h"
#include "access/xlogreader.h"
#include "fmgr.h"
+#include "lib/pairingheap.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
-#include "utils/rbtree.h"
#include "utils/hsearch.h"
/*
@@ -123,7 +123,7 @@ typedef struct GISTSearchHeapItem
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
- struct GISTSearchItem *next; /* list link */
+ pairingheap_node phNode;
BlockNumber blkno; /* index page number, or InvalidBlockNumber */
union
{
@@ -131,24 +131,12 @@ typedef struct GISTSearchItem
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
+ double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
-/*
- * Within a GISTSearchTreeItem's chain, heap items always appear before
- * index-page items, since we want to visit heap items first. lastHeap points
- * to the last heap item in the chain, or is NULL if there are none.
- */
-typedef struct GISTSearchTreeItem
-{
- RBNode rbnode; /* this is an RBTree item */
- GISTSearchItem *head; /* first chain member */
- GISTSearchItem *lastHeap; /* last heap-tuple member, if any */
- double distances[1]; /* array with numberOfOrderBys entries */
-} GISTSearchTreeItem;
-
-#define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
+#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
@@ -156,15 +144,12 @@ typedef struct GISTSearchTreeItem
typedef struct GISTScanOpaqueData
{
GISTSTATE *giststate; /* index information, see above */
- RBTree *queue; /* queue of unvisited items */
+ pairingheap *queue; /* queue of unvisited items */
MemoryContext queueCxt; /* context holding the queue */
bool qual_ok; /* false if qual can never be satisfied */
bool firstCall; /* true until first gistgettuple call */
- GISTSearchTreeItem *curTreeItem; /* current queue item, if any */
-
/* pre-allocated workspace arrays */
- GISTSearchTreeItem *tmpTreeItem; /* workspace to pass to rb_insert */
double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
diff --git a/src/include/lib/pairingheap.h b/src/include/lib/pairingheap.h
new file mode 100644
index 0000000..2368c6d
--- /dev/null
+++ b/src/include/lib/pairingheap.h
@@ -0,0 +1,70 @@
+/*
+ * pairingheap.h
+ *
+ * A Pairing Heap implementation
+ *
+ * Portions Copyright (c) 2012-2014, PostgreSQL Global Development Group
+ *
+ * src/include/lib/pairingheap.h
+ */
+
+#ifndef PAIRINGHEAP_H
+#define PAIRINGHEAP_H
+
+/*
+ * This represents an element stored in the heap. Embed this in a larger
+ * struct containing the actual data you're storing.
+ */
+typedef struct pairingheap_node
+{
+ struct pairingheap_node *first_child;
+ struct pairingheap_node *next_sibling;
+ struct pairingheap_node *prev_or_parent;
+} pairingheap_node;
+
+/*
+ * Return the containing struct of 'type' where 'membername' is the
+ * pairingheap_node pointed at by 'ptr'.
+ *
+ * This is used to convert a pairingheap_node * back to its containing struct.
+ */
+#define pairingheap_container(type, membername, ptr) \
+ (AssertVariableIsOfTypeMacro(ptr, pairingheap_node *), \
+ AssertVariableIsOfTypeMacro(((type *) NULL)->membername, pairingheap_node), \
+ ((type *) ((char *) (ptr) - offsetof(type, membername))))
+
+/*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+typedef int (*pairingheap_comparator) (const pairingheap_node *a, const pairingheap_node *b, void *arg);
+
+/*
+ * A pairing heap.
+ */
+typedef struct pairingheap
+{
+ pairingheap_comparator ph_compare; /* comparison function */
+ void *ph_arg; /* opaque argument to ph_compare */
+ pairingheap_node *ph_root; /* current root of the heap */
+} pairingheap;
+
+extern pairingheap *pairingheap_allocate(pairingheap_comparator compare,
+ void *arg);
+extern void pairingheap_free(pairingheap *heap);
+extern void pairingheap_add(pairingheap *heap, pairingheap_node *d);
+extern pairingheap_node *pairingheap_first(pairingheap *heap);
+extern pairingheap_node *pairingheap_remove_first(pairingheap *heap);
+extern void pairingheap_remove(pairingheap *heap, pairingheap_node *d);
+
+/* Resets the heap to be empty. */
+#define pairingheap_reset(h) ((h)->ph_root = NULL)
+
+/* Is the heap empty? */
+#define pairingheap_is_empty(h) ((h)->ph_root == NULL)
+
+/* Is there exactly one item in the heap? */
+#define pairingheap_is_singular(h) \
+ ((h)->ph_root && (h)->ph_root->first_child == NULL)
+
+#endif /* PAIRINGHEAP_H */
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 7a8692b..e5eb6f6 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -18,6 +18,7 @@
#include "access/relscan.h"
#include "miscadmin.h"
#include "pgstat.h"
+#include "lib/pairingheap.h"
#include "utils/builtins.h"
#include "utils/memutils.h"
#include "utils/rel.h"
@@ -243,8 +244,6 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
GISTPageOpaque opaque;
OffsetNumber maxoff;
OffsetNumber i;
- GISTSearchTreeItem *tmpItem = so->tmpTreeItem;
- bool isNew;
MemoryContext oldcxt;
Assert(!GISTSearchItemIsHeap(*pageItem));
@@ -275,18 +274,15 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for the right sibling index page */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
item->blkno = opaque->rightlink;
item->data.parentlsn = pageItem->data.parentlsn;
/* Insert it into the queue using same distances as for this page */
- tmpItem->head = item;
- tmpItem->lastHeap = NULL;
- memcpy(tmpItem->distances, myDistances,
+ memcpy(item->distances, myDistances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -348,8 +344,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
oldcxt = MemoryContextSwitchTo(so->queueCxt);
/* Create new GISTSearchItem for this item */
- item = palloc(sizeof(GISTSearchItem));
- item->next = NULL;
+ item = palloc(SizeOfGISTSearchItem(scan->numberOfOrderBys));
if (GistPageIsLeaf(page))
{
@@ -372,12 +367,10 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
}
/* Insert it into the queue using new distance data */
- tmpItem->head = item;
- tmpItem->lastHeap = GISTSearchItemIsHeap(*item) ? item : NULL;
- memcpy(tmpItem->distances, so->distances,
+ memcpy(item->distances, so->distances,
sizeof(double) * scan->numberOfOrderBys);
- (void) rb_insert(so->queue, (RBNode *) tmpItem, &isNew);
+ pairingheap_add(so->queue, &item->phNode);
MemoryContextSwitchTo(oldcxt);
}
@@ -390,44 +383,24 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
- *
- * NOTE: on successful return, so->curTreeItem is the GISTSearchTreeItem that
- * contained the result item. Callers can use so->curTreeItem->distances as
- * the distances value for the item.
*/
static GISTSearchItem *
getNextGISTSearchItem(GISTScanOpaque so)
{
- for (;;)
- {
- GISTSearchItem *item;
-
- /* Update curTreeItem if we don't have one */
- if (so->curTreeItem == NULL)
- {
- so->curTreeItem = (GISTSearchTreeItem *) rb_leftmost(so->queue);
- /* Done when tree is empty */
- if (so->curTreeItem == NULL)
- break;
- }
+ GISTSearchItem *item;
- item = so->curTreeItem->head;
- if (item != NULL)
- {
- /* Delink item from chain */
- so->curTreeItem->head = item->next;
- if (item == so->curTreeItem->lastHeap)
- so->curTreeItem->lastHeap = NULL;
- /* Return item; caller is responsible to pfree it */
- return item;
- }
-
- /* curTreeItem is exhausted, so remove it from rbtree */
- rb_delete(so->queue, (RBNode *) so->curTreeItem);
- so->curTreeItem = NULL;
+ if (!pairingheap_is_empty(so->queue))
+ {
+ item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
+ }
+ else
+ {
+ /* Done when both heaps are empty */
+ item = NULL;
}
- return NULL;
+ /* Return item; caller is responsible to pfree it */
+ return item;
}
/*
@@ -458,7 +431,7 @@ getNextNearest(IndexScanDesc scan)
/* visit an index page, extract its items into queue */
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
}
pfree(item);
@@ -491,7 +464,6 @@ gistgettuple(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
so->firstCall = false;
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -534,7 +506,7 @@ gistgettuple(PG_FUNCTION_ARGS)
* this page, we fall out of the inner "do" and loop around to
* return them.
*/
- gistScanPage(scan, item, so->curTreeItem->distances, NULL, NULL);
+ gistScanPage(scan, item, item->distances, NULL, NULL);
pfree(item);
} while (so->nPageData == 0);
@@ -560,7 +532,6 @@ gistgetbitmap(PG_FUNCTION_ARGS)
pgstat_count_index_scan(scan->indexRelation);
/* Begin the scan by processing the root page */
- so->curTreeItem = NULL;
so->curPageData = so->nPageData = 0;
fakeItem.blkno = GIST_ROOT_BLKNO;
@@ -580,7 +551,7 @@ gistgetbitmap(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
- gistScanPage(scan, item, so->curTreeItem->distances, tbm, &ntids);
+ gistScanPage(scan, item, item->distances, tbm, &ntids);
pfree(item);
}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 8360b16..eff02c4 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -22,14 +22,13 @@
/*
- * RBTree support functions for the GISTSearchTreeItem queue
+ * Pairing heap comparison function for the GISTSearchItem queue
*/
-
static int
-GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
+pairingheap_GISTSearchItem_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
{
- const GISTSearchTreeItem *sa = (const GISTSearchTreeItem *) a;
- const GISTSearchTreeItem *sb = (const GISTSearchTreeItem *) b;
+ const GISTSearchItem *sa = (const GISTSearchItem *) a;
+ const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
int i;
@@ -37,59 +36,16 @@ GISTSearchTreeItemComparator(const RBNode *a, const RBNode *b, void *arg)
for (i = 0; i < scan->numberOfOrderBys; i++)
{
if (sa->distances[i] != sb->distances[i])
- return (sa->distances[i] > sb->distances[i]) ? 1 : -1;
- }
-
- return 0;
-}
-
-static void
-GISTSearchTreeItemCombiner(RBNode *existing, const RBNode *newrb, void *arg)
-{
- GISTSearchTreeItem *scurrent = (GISTSearchTreeItem *) existing;
- const GISTSearchTreeItem *snew = (const GISTSearchTreeItem *) newrb;
- GISTSearchItem *newitem = snew->head;
-
- /* snew should have just one item in its chain */
- Assert(newitem && newitem->next == NULL);
-
- /*
- * If new item is heap tuple, it goes to front of chain; otherwise insert
- * it before the first index-page item, so that index pages are visited in
- * LIFO order, ensuring depth-first search of index pages. See comments
- * in gist_private.h.
- */
- if (GISTSearchItemIsHeap(*newitem))
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- if (scurrent->lastHeap == NULL)
- scurrent->lastHeap = newitem;
+ return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
}
- else if (scurrent->lastHeap == NULL)
- {
- newitem->next = scurrent->head;
- scurrent->head = newitem;
- }
- else
- {
- newitem->next = scurrent->lastHeap->next;
- scurrent->lastHeap->next = newitem;
- }
-}
-static RBNode *
-GISTSearchTreeItemAllocator(void *arg)
-{
- IndexScanDesc scan = (IndexScanDesc) arg;
-
- return palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
-}
+ /* Heap items go before inner pages, to ensure a depth-first search */
+ if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
+ return -1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return 1;
-static void
-GISTSearchTreeItemDeleter(RBNode *rb, void *arg)
-{
- pfree(rb);
+ return 0;
}
@@ -127,7 +83,6 @@ gistbeginscan(PG_FUNCTION_ARGS)
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
- so->tmpTreeItem = palloc(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys);
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
@@ -188,15 +143,9 @@ gistrescan(PG_FUNCTION_ARGS)
/* create new, empty RBTree for search queue */
oldCxt = MemoryContextSwitchTo(so->queueCxt);
- so->queue = rb_create(GSTIHDRSZ + sizeof(double) * scan->numberOfOrderBys,
- GISTSearchTreeItemComparator,
- GISTSearchTreeItemCombiner,
- GISTSearchTreeItemAllocator,
- GISTSearchTreeItemDeleter,
- scan);
+ so->queue = pairingheap_allocate(pairingheap_GISTSearchItem_cmp, scan);
MemoryContextSwitchTo(oldCxt);
- so->curTreeItem = NULL;
so->firstCall = true;
/* Update scan key, if a new one is given */
On Wed, Dec 17, 2014 at 6:07 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:
How about adding a src/backend/lib/README for that, per attached?
I took a quick look at this. Some observations:
* I like the idea of adding a .../lib README. However, the README
fails to note that ilist.c implements an *integrated* list, unlike the
much more prevalent cell-based List structure. It should note that,
since that's the whole point of ilist.c.
* pairingheap_remove() is technically dead code. It still makes sense
that you'd have it in this patch, but I think there's an argument for
not including it at all on the theory that if you need to use it you
should use a different data structure. After all, the actual
(non-amortized) complexity of that operation is O(n) [1]https://www.cise.ufl.edu/~sahni/dsaac/enrich/c13/pairing.htm -- Peter Geoghegan, and if
remove operations are infrequent as we might expect, that might be the
more important consideration. As long as you are including
pairingheap_remove(), though, why is the local variable "prev_ptr" a
pointer to a pointer to a pairingheap_node, rather than just a pointer
to a pairingheap_node?
* Similarly, the function-like macro pairingheap_reset() doesn't seem
to pull its weight. Why does it exist alongside pairingheap_free()?
I'm not seeing a need to re-use a heap like that.
* "Assert(!pairingheap_is_empty(heap))" appears in a few places.
You're basically asserting that a pointer isn't null, often
immediately before dereferencing the pointer. This seems to be of
questionable value.
* I think that the indentation of code could use some tweaking.
* More comments, please. In particular, comment the struct fields in
pairingheap_node. There are various blocks of code that could use at
least an additional terse comment, too.
* You talked about tuplesort.c integration. In order for that to
happen, I think the comparator logic should know less about min-heaps.
This should formally be a max-heap, with the ability to provide
customizations only encapsulated in the comparator (like inverting the
comparison logic to get a min-heap, or like custom NULLs first/last
behavior). So IMV this comment should be more generic/anticipatory:
+ /*
+ * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b,
+ * and >0 iff a > b. For a min-heap, the conditions are reversed.
+ */
+ typedef int (*pairingheap_comparator) (const pairingheap_node *a,
const pairingheap_node *b, void *arg);
I think the functions should be called pairing_max_heap* for this
reason, too. Although that isn't consistent with binaryheap.c, so I
guess this whole argument is a non-starter.
* We should just move rbtree.c to .../lib. We're not using CVS anymore
-- the history will be magically preserved.
Anyway, to get to the heart of the matter: in general, I think the
argument for the patch is sound. It's not a stellar improvement, but
it's worthwhile. That's all I have for now...
[1]: https://www.cise.ufl.edu/~sahni/dsaac/enrich/c13/pairing.htm -- Peter Geoghegan
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 12/20/2014 10:50 PM, Peter Geoghegan wrote:
On Wed, Dec 17, 2014 at 6:07 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:How about adding a src/backend/lib/README for that, per attached?
I took a quick look at this. Some observations:
* I like the idea of adding a .../lib README. However, the README
fails to note that ilist.c implements an *integrated* list, unlike the
much more prevalent cell-based List structure. It should note that,
since that's the whole point of ilist.c.
Added a sentence on that.
* pairingheap_remove() is technically dead code. It still makes sense
that you'd have it in this patch, but I think there's an argument for
not including it at all on the theory that if you need to use it you
should use a different data structure. After all, the actual
(non-amortized) complexity of that operation is O(n) [1], and if
remove operations are infrequent as we might expect, that might be the
more important consideration. As long as you are including
pairingheap_remove(), though, why is the local variable "prev_ptr" a
pointer to a pointer to a pairingheap_node, rather than just a pointer
to a pairingheap_node?* Similarly, the function-like macro pairingheap_reset() doesn't seem
to pull its weight. Why does it exist alongside pairingheap_free()?
I'm not seeing a need to re-use a heap like that.
pairingheap_remove and pairingheap_reset are both unused in this patch,
but they were needed for the other use case, tracking snapshots to
advance xmin more aggressively, discussed here:
/messages/by-id/5488ACF0.8050901@vmware.com. In
fact, without the pairingheap_remove() operation, the prev_or_parent
pointer wouldn't be necessarily at all. We could've added them as a
separate patch, but that seems like unnecessary code churn.
The prev_ptr variable is used to set the parent's first_child pointer,
or the previous sibling's next_sibling pointer, depending on whether the
removed node is the parent's first child or not. I'll add more comments
in pairingheap_remove to explain that.
* "Assert(!pairingheap_is_empty(heap))" appears in a few places.
You're basically asserting that a pointer isn't null, often
immediately before dereferencing the pointer. This seems to be of
questionable value.
I copied that from binaryheap.c. It has some documentation value; they
make it easy to see that the functions require the heap to not be empty.
It's also explained in comments, but still.
* I think that the indentation of code could use some tweaking.
* More comments, please. In particular, comment the struct fields in
pairingheap_node. There are various blocks of code that could use at
least an additional terse comment, too.
Added some comments, hope it's better now.
* You talked about tuplesort.c integration. In order for that to
happen, I think the comparator logic should know less about min-heaps.
This should formally be a max-heap, with the ability to provide
customizations only encapsulated in the comparator (like inverting the
comparison logic to get a min-heap, or like custom NULLs first/last
behavior). So IMV this comment should be more generic/anticipatory:+ /* + * For a max-heap, the comparator must return <0 iff a < b, 0 iff a == b, + * and >0 iff a > b. For a min-heap, the conditions are reversed. + */ + typedef int (*pairingheap_comparator) (const pairingheap_node *a, const pairingheap_node *b, void *arg);I think the functions should be called pairing_max_heap* for this
reason, too. Although that isn't consistent with binaryheap.c, so I
guess this whole argument is a non-starter.
I don't see what the problem is. The pairingheap.c (and binaryheap.c)
code works the same for min and max-heaps. The comments assume a
max-heap in a few places, but that seems OK to me in the context.
* We should just move rbtree.c to .../lib. We're not using CVS anymore
-- the history will be magically preserved.
Yeah, I tend to agree. Tom Lane has not liked moving things, because it
breaks back-patching. That's true in general, even though git has some
smarts to follow renames. I think it would work in this case, though.
Anyway, let's discuss and do that as a separate patch, so that we don't
get stuck on that.
Anyway, to get to the heart of the matter: in general, I think the
argument for the patch is sound. It's not a stellar improvement, but
it's worthwhile. That's all I have for now...
Ok, thanks for the review! I have committed this, with some cleanup and
more comments added.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Dec 16, 2014 at 4:37 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
Patch attached. It should be applied on top of my pairing heap patch at
/messages/by-id/548FFA2C.7060000@vmware.com. Some
caveats:* The signature of the distance function is unchanged, it doesn't get a
recheck argument. It is just assumed that if the consistent function sets
the recheck flag, then the distance needs to be rechecked as well. We might
want to add the recheck argument, like you Alexander did in your patch, but
it's not important right now.
I didn't get how that expected to work if we have only order by qual
without filter qual. In this case consistent function just isn't called at
all.
* I used the "distance" term in the executor, although the ORDER BY expr
machinery is more general than that. The value returned by the ORDER BY
expression doesn't have to be a distance, although that's the only thing
supported by GiST and the built-in opclasses.* I short-circuited the planner to assume that the ORDER BY expression
always returns a float. That's true today for knn-GiST, but is obviously a
bogus assumption in general.This needs some work to get into a committable state, but from a
modularity point of view, this is much better than having the indexam to
peek into the heap.
Nice idea to put reordering into index scan node. Doesn't look like much of
overengineering. I'm going to bring it to more commitable state.
------
With best regards,
Alexander Korotkov.
On Thu, Jan 8, 2015 at 1:12 AM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
On Tue, Dec 16, 2014 at 4:37 PM, Heikki Linnakangas <
hlinnakangas@vmware.com> wrote:Patch attached. It should be applied on top of my pairing heap patch at
/messages/by-id/548FFA2C.7060000@vmware.com. Some
caveats:* The signature of the distance function is unchanged, it doesn't get a
recheck argument. It is just assumed that if the consistent function sets
the recheck flag, then the distance needs to be rechecked as well. We might
want to add the recheck argument, like you Alexander did in your patch, but
it's not important right now.I didn't get how that expected to work if we have only order by qual
without filter qual. In this case consistent function just isn't called at
all.* I used the "distance" term in the executor, although the ORDER BY expr
machinery is more general than that. The value returned by the ORDER BY
expression doesn't have to be a distance, although that's the only thing
supported by GiST and the built-in opclasses.* I short-circuited the planner to assume that the ORDER BY expression
always returns a float. That's true today for knn-GiST, but is obviously a
bogus assumption in general.This needs some work to get into a committable state, but from a
modularity point of view, this is much better than having the indexam to
peek into the heap.Nice idea to put reordering into index scan node. Doesn't look like much
of overengineering. I'm going to bring it to more commitable state.
Following changes has been made in attached patch:
* Get sort operators from pathkeys.
* Recheck argument of distance function has been reverted.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-5.patchapplication/octet-stream; name=knn-gist-recheck-5.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 31ce279..c354411
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_distance(PG_FUNCTION_ARGS)
*** 779,784 ****
--- 787,793 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 791,804 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
! greater than any child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. The
! result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
--- 800,821 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function.
! </para>
!
! <para>
! Some approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never greater than any
! child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. For
! leaf nodes, the returned distance must be accurate, if the
! <function>distance</> function returns *recheck == false for the tuple.
! Otherwise the same approximation is allowed, and the executor will
! re-order ambiguous cases after recalculating the actual distance.
! </para>
!
! <para>
! The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 717cb85..801e969
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistindex_keytest(IndexScanDesc scan,
*** 176,181 ****
--- 176,182 ----
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,207 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
*distance_p = DatumGetFloat8(dist);
}
--- 193,210 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument.
! * Distance is rechecked whenever the quals are.
*/
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&recheck));
!
! *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
*************** getNextNearest(IndexScanDesc scan)
*** 411,416 ****
--- 414,420 ----
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
do
{
*************** getNextNearest(IndexScanDesc scan)
*** 424,429 ****
--- 428,438 ----
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
res = true;
}
else
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9fab6c8..c1232d5
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1441,1443 ****
--- 1441,1478 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_bbox_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ *recheck = true;
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index cc8d818..066238e
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 85,90 ****
--- 85,95 ----
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
new file mode 100644
index 48fa919..d46cb3d
*** a/src/backend/executor/nodeIndexscan.c
--- b/src/backend/executor/nodeIndexscan.c
***************
*** 28,41 ****
--- 28,113 ----
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+ #include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+ #include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+ /*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+ typedef struct
+ {
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+ } ReorderTuple;
+
+ static int
+ cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+ {
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+ }
+
+ static int
+ reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+ {
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+ }
+
+ static void
+ copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+ {
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+ }
static TupleTableSlot *IndexNext(IndexScanState *node);
+ static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
*************** IndexNext(IndexScanState *node)
*** 54,59 ****
--- 126,133 ----
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
*************** IndexNext(IndexScanState *node)
*** 72,82 ****
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! /*
! * ok, now that we have what we need, fetch the next tuple.
! */
! while ((tuple = index_getnext(scandesc, direction)) != NULL)
{
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
--- 146,205 ----
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
*************** IndexNext(IndexScanState *node)
*** 103,108 ****
--- 226,296 ----
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ RecheckOrderBys(node, slot);
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
*************** IndexNext(IndexScanState *node)
*** 114,119 ****
--- 302,342 ----
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+ static void
+ RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+ {
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+ }
+
+ /*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 465,470 ****
--- 688,694 ----
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 501,506 ****
--- 725,733 ----
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 581,586 ****
--- 808,859 ----
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 655be81..bb71638
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 22,27 ****
--- 22,28 ----
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+ #include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
*************** static void copy_plan_costsize(Plan *des
*** 102,108 ****
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
--- 103,109 ----
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
*************** static Plan *prepare_sort_from_pathkeys(
*** 168,174 ****
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
--- 169,175 ----
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
*************** create_indexscan_plan(PlannerInfo *root,
*** 1158,1163 ****
--- 1159,1165 ----
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
*************** create_indexscan_plan(PlannerInfo *root,
*** 1266,1271 ****
--- 1268,1303 ----
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell, *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ EquivalenceMember *em;
+ pathkey = (PathKey *)lfirst(pathkeyCell);
+ expr = (Expr *)lfirst(exprCell);
+
+ em = find_ec_member_for_tle(pathkey->pk_eclass, expr, NULL);
+
+ sortOperators[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
*************** create_indexscan_plan(PlannerInfo *root,
*** 1285,1290 ****
--- 1317,1323 ----
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
*************** make_indexscan(List *qptlist,
*** 3327,3332 ****
--- 3360,3366 ----
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
*************** make_indexscan(List *qptlist,
*** 3344,3349 ****
--- 3378,3384 ----
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3967,3973 ****
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
--- 4002,4008 ----
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3998,4004 ****
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
--- 4033,4039 ----
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4126,4139 ****
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
--- 4161,4172 ----
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids)
{
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6b6510e..bf8a5da
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2657,2662 ****
--- 2657,2674 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+ }
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
*************** dist_pc(PG_FUNCTION_ARGS)
*** 5073,5078 ****
--- 5085,5105 ----
PG_RETURN_FLOAT8(result);
}
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d1d6247..359d488
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..e1f2031
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 91,96 ****
--- 91,105 ----
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..4a6fa7f
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 49d3d13..43f77ed
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index af991d3..6e0df88
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 9edfdb8..553cd24
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2179 ( gist_point_con
*** 4084,4089 ****
--- 4086,4093 ----
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 41288ed..cc05fae
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 17,22 ****
--- 17,23 ----
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+ #include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
*************** typedef struct
*** 1237,1242 ****
--- 1238,1244 ----
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
*************** typedef struct
*** 1247,1258 ****
--- 1249,1268 ----
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
*************** typedef struct IndexScanState
*** 1263,1268 ****
--- 1273,1287 ----
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 316c9ce..e171333
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef Scan SeqScan;
*** 302,308 ****
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
--- 302,312 ----
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is used at run time to recheck the ordering, if the index
! * does not calculate an accurate ordering. It is also needed for EXPLAIN.
! *
! * indexsortops is an array of operators used to sort the ORDER BY expressions,
! * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
*************** typedef struct IndexScan
*** 316,322 ****
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
--- 320,327 ----
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
! Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 0b6d3c3..9f92968
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 417,422 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index 5603817..cb18986
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
Hi!
On Mon, Dec 22, 2014 at 1:07 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
Ok, thanks for the review! I have committed this, with some cleanup and
more comments added.
ISTM that checks in pairingheap_GISTSearchItem_cmp is incorrect. This
function should perform inverse comparison. Thus, if item a should be
checked first function should return 1. Current behavior doesn't lead to
incorrect query answers, but it could be slower than correct version.
------
With best regards,
Alexander Korotkov.
Attachments:
pairing_heap_cmp_fix.patchapplication/octet-stream; name=pairing_heap_cmp_fix.patchDownload
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index cc8d818..991858f
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** pairingheap_GISTSearchItem_cmp(const pai
*** 41,49 ****
/* Heap items go before inner pages, to ensure a depth-first search */
if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
- return -1;
- if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
return 1;
return 0;
}
--- 41,49 ----
/* Heap items go before inner pages, to ensure a depth-first search */
if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
return 1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return -1;
return 0;
}
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
Following changes has been made in attached patch:
* Get sort operators from pathkeys.
* Recheck argument of distance function has been reverted.
Few comments were added and pairing heap comparison function was fixed in
attached version of patch (knn-gist-recheck-6.patch).
Also I expected that reordering in executor would be slower than reordering
in GiST because of maintaining two heaps instead of one. I've revised
version of patch with reordering in GiST to use pairing heap. I compare two
types of reordering on 10^7 random points and polygons. Results are below.
Test shows that overhead of reordering in executor is insignificant (less
than statistical error).
Reorder in GiST Reorder in executor
points
limit=10 0.10615 0.0880125
limit=100 0.23666875 0.2292375
limit=1000 1.51486875 1.5208375
polygons
limit=10 0.11650625 0.1347
limit=100 0.46279375 0.45294375
limit=1000 3.5170125 3.54868125
Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-6.patchapplication/octet-stream; name=knn-gist-recheck-6.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 31ce279..c354411
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_distance(PG_FUNCTION_ARGS)
*** 779,784 ****
--- 787,793 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 791,804 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
! greater than any child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. The
! result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
--- 800,821 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function.
! </para>
!
! <para>
! Some approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never greater than any
! child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. For
! leaf nodes, the returned distance must be accurate, if the
! <function>distance</> function returns *recheck == false for the tuple.
! Otherwise the same approximation is allowed, and the executor will
! re-order ambiguous cases after recalculating the actual distance.
! </para>
!
! <para>
! The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 717cb85..53c061d
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistindex_keytest(IndexScanDesc scan,
*** 176,181 ****
--- 176,182 ----
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,207 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
*distance_p = DatumGetFloat8(dist);
}
--- 193,213 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance functions get a recheck argument as well. In this
! * case the returned distance is the lower bound of distance
! * and needs to be rechecked. We return single recheck flag
! * which means that both quals and distances are to be
! * rechecked.
*/
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&recheck));
!
! *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
*************** getNextNearest(IndexScanDesc scan)
*** 411,416 ****
--- 417,423 ----
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
do
{
*************** getNextNearest(IndexScanDesc scan)
*** 424,429 ****
--- 431,441 ----
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
res = true;
}
else
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9fab6c8..37bf5d5
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1441,1443 ****
--- 1441,1480 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_bbox_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ /* Bounding box distance is always inexact. */
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index cc8d818..55c98b4
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** pairingheap_GISTSearchItem_cmp(const pai
*** 41,49 ****
/* Heap items go before inner pages, to ensure a depth-first search */
if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
- return -1;
- if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
return 1;
return 0;
}
--- 41,49 ----
/* Heap items go before inner pages, to ensure a depth-first search */
if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
return 1;
+ if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
+ return -1;
return 0;
}
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 85,90 ****
--- 85,95 ----
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
new file mode 100644
index 48fa919..430abd0
*** a/src/backend/executor/nodeIndexscan.c
--- b/src/backend/executor/nodeIndexscan.c
***************
*** 28,41 ****
--- 28,117 ----
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+ #include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+ #include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+ /*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+ typedef struct
+ {
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+ } ReorderTuple;
+
+ static int
+ cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+ {
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+ }
+
+ /*
+ * Pairing heap provides getting topmost (greatest) element while KNN provides
+ * ascending sort. That's why we inverse sort order.
+ */
+ static int
+ reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+ {
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+ }
+
+ static void
+ copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+ {
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+ }
static TupleTableSlot *IndexNext(IndexScanState *node);
+ static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
*************** IndexNext(IndexScanState *node)
*** 54,59 ****
--- 130,137 ----
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
*************** IndexNext(IndexScanState *node)
*** 72,82 ****
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! /*
! * ok, now that we have what we need, fetch the next tuple.
! */
! while ((tuple = index_getnext(scandesc, direction)) != NULL)
{
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
--- 150,209 ----
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
*************** IndexNext(IndexScanState *node)
*** 103,108 ****
--- 230,300 ----
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ RecheckOrderBys(node, slot);
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
*************** IndexNext(IndexScanState *node)
*** 114,119 ****
--- 306,346 ----
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+ static void
+ RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+ {
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+ }
+
+ /*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 465,470 ****
--- 692,698 ----
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 501,506 ****
--- 729,737 ----
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 581,586 ****
--- 812,863 ----
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 655be81..37b5256
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 22,27 ****
--- 22,28 ----
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+ #include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
*************** static void copy_plan_costsize(Plan *des
*** 102,108 ****
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
--- 103,109 ----
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
*************** static Plan *prepare_sort_from_pathkeys(
*** 168,174 ****
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
--- 169,175 ----
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
*************** create_indexscan_plan(PlannerInfo *root,
*** 1158,1163 ****
--- 1159,1165 ----
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
*************** create_indexscan_plan(PlannerInfo *root,
*** 1266,1271 ****
--- 1268,1309 ----
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell,
+ *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey. Pathkey contains pointer
+ * to equivalence class. But it's not enough because we need the
+ * expression datatype to lookup opfamily member. That's why we've
+ * to dig and equivalence member.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ EquivalenceMember *em;
+ pathkey = (PathKey *) lfirst(pathkeyCell);
+ expr = (Expr *) lfirst(exprCell);
+
+ /* Find equivalence member by order by expression */
+ em = find_ec_member_for_tle(pathkey->pk_eclass, expr, NULL);
+
+ /* Get sort operator from opfamily */
+ sortOperators[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
*************** create_indexscan_plan(PlannerInfo *root,
*** 1285,1290 ****
--- 1323,1329 ----
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
*************** make_indexscan(List *qptlist,
*** 3327,3332 ****
--- 3366,3372 ----
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
*************** make_indexscan(List *qptlist,
*** 3344,3349 ****
--- 3384,3390 ----
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3967,3973 ****
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
--- 4008,4014 ----
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3998,4004 ****
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
--- 4039,4045 ----
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4126,4139 ****
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
--- 4167,4178 ----
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids)
{
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6b6510e..bf8a5da
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2657,2662 ****
--- 2657,2674 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+ }
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
*************** dist_pc(PG_FUNCTION_ARGS)
*** 5073,5078 ****
--- 5085,5105 ----
PG_RETURN_FLOAT8(result);
}
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d1d6247..359d488
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..e1f2031
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 91,96 ****
--- 91,105 ----
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..4a6fa7f
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 49d3d13..43f77ed
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index af991d3..6e0df88
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 9edfdb8..553cd24
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2179 ( gist_point_con
*** 4084,4089 ****
--- 4086,4093 ----
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 41288ed..cc05fae
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 17,22 ****
--- 17,23 ----
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+ #include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
*************** typedef struct
*** 1237,1242 ****
--- 1238,1244 ----
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
*************** typedef struct
*** 1247,1258 ****
--- 1249,1268 ----
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
*************** typedef struct IndexScanState
*** 1263,1268 ****
--- 1273,1287 ----
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 316c9ce..e171333
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef Scan SeqScan;
*** 302,308 ****
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
--- 302,312 ----
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is used at run time to recheck the ordering, if the index
! * does not calculate an accurate ordering. It is also needed for EXPLAIN.
! *
! * indexsortops is an array of operators used to sort the ORDER BY expressions,
! * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
*************** typedef struct IndexScan
*** 316,322 ****
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
--- 320,327 ----
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
! Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 0b6d3c3..9f92968
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 417,422 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index 5603817..cb18986
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
knn-gist-recheck-in-gist.patchapplication/octet-stream; name=knn-gist-recheck-in-gist.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 31ce279..b265e9c
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_same(PG_FUNCTION_ARGS)
*** 760,766 ****
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- 768,774 ----
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
*************** my_distance(PG_FUNCTION_ARGS)
*** 779,784 ****
--- 787,793 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 791,801 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
--- 800,815 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function. When
! <literal>recheck = true</> then value of distance will
! be rechecked from heap tuple before tuple is returned. If
! <literal>recheck</> flag isn't set then it's true by default for
! compatibility reasons. The <literal>recheck</> flag can be used only
! when ordering operator returns <type>float8</> value comparable with
! result of <function>distance</> function. Result of distance function
! should be never greater than result of ordering operator.
! Same approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 717cb85..f2dc301
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 56,62 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 57,63 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 73,79 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 74,83 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 171,177 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 175,182 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,209 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 197,216 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance function gets a recheck argument as well as consistent
! * function. Distance will be re-calculated from heap tuple when
! * needed.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 235,241 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 242,248 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 280,286 ****
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 287,293 ----
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 368,374 ****
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 375,381 ----
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 380,406 ****
}
/*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
}
else
{
! /* Done when both heaps are empty */
! item = NULL;
}
-
- /* Return item; caller is responsible to pfree it */
- return item;
}
/*
--- 387,490 ----
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get index values from heap */
+ if (!index_get_heap_values(scan, &item->data.heap.heapPtr, values, isnull))
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ pfree(item);
+ return;
+ }
+
+ /* Prepare new tree item and reinsert it */
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ item->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ item->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ item->distances[i].value = newDistance;
+
+ }
+ }
+
+ pairingheap_add(so->queue, item);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! for (;;)
! {
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
!
! /* Recheck distance from heap tuple if needed */
! if (GISTSearchItemIsHeap(*item) &&
! searchTreeItemNeedDistanceRecheck(scan, item))
! {
! searchTreeItemDistanceRecheck(scan, item);
! continue;
! }
! return item;
! }
}
else
{
! return NULL;
}
}
/*
*************** getNextNearest(IndexScanDesc scan)
*** 414,420 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 498,504 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 493,499 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 577,583 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 544,550 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 628,634 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9fab6c8..c2692c3
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1089,1094 ****
--- 1089,1095 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1441,1443 ****
--- 1442,1478 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_inexact_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index cc8d818..fab7237
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** pairingheap_GISTSearchItem_cmp(const pai
*** 30,53 ****
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
! }
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return -1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return 1;
! return 0;
! }
/*
* Index AM API functions for scanning GiST indexes
--- 33,65 ----
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i, recheckCmp = 0;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! const GISTSearchTreeItemDistance distance_a = sa->distances[i];
! const GISTSearchTreeItemDistance distance_b = sb->distances[i];
! if (distance_a.value != distance_b.value)
! return (distance_a.value < distance_b.value) ? 1 : -1;
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return 1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return -1;
+ /*
+ * When all distance values are the same, items without recheck
+ * can be immediately returned. So they are placed first.
+ */
+ if (recheckCmp == 0 && distance_a.recheck != distance_b.recheck)
+ recheckCmp = distance_b.recheck ? 1 : -1;
+ }
+
+ return recheckCmp;
+ }
/*
* Index AM API functions for scanning GiST indexes
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 83,91 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
--- 95,111 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Functions for distance recheck from heap tuple */
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
*************** gistrescan(PG_FUNCTION_ARGS)
*** 238,243 ****
--- 258,267 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
new file mode 100644
index e6e4d28..90cf088
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 124,129 ****
--- 124,132 ----
scan->xs_ctup.t_data = NULL;
scan->xs_cbuf = InvalidBuffer;
scan->xs_continue_hot = false;
+ scan->indexInfo = NULL;
+ scan->estate = NULL;
+ scan->slot = NULL;
return scan;
}
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
new file mode 100644
index 00c1d69..9c57311
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
***************
*** 69,74 ****
--- 69,75 ----
#include "access/transam.h"
#include "access/xlog.h"
+ #include "executor/executor.h"
#include "catalog/index.h"
#include "catalog/catalog.h"
#include "pgstat.h"
*************** index_beginscan(Relation heapRelation,
*** 254,259 ****
--- 255,265 ----
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ /* Prepare data structures for getting original indexed values from heap */
+ scan->indexInfo = BuildIndexInfo(scan->indexRelation);
+ scan->estate = CreateExecutorState();
+ scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
+
return scan;
}
*************** index_endscan(IndexScanDesc scan)
*** 377,382 ****
--- 383,393 ----
scan->xs_cbuf = InvalidBuffer;
}
+ if (scan->slot)
+ ExecDropSingleTupleTableSlot(scan->slot);
+ if (scan->estate)
+ FreeExecutorState(scan->estate);
+
/* End the AM's scan */
FunctionCall1(procedure, PointerGetDatum(scan));
*************** index_fetch_heap(IndexScanDesc scan)
*** 564,569 ****
--- 575,623 ----
}
/* ----------------
+ * index_get_heap_values - get original indexed values from heap
+ *
+ * Fetches heap tuple of heapPtr and calculated original indexed values.
+ * Returns true on success. Returns false when heap tuple wasn't found.
+ * Useful for indexes with lossy representation of keys.
+ * ----------------
+ */
+ bool
+ index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS])
+ {
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /* Tuple not found: it has been deleted from heap. */
+ UnlockReleaseBuffer(buffer);
+ return false;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(heap_copytuple(&tup), scan->slot, InvalidBuffer, true);
+ FormIndexDatum(scan->indexInfo, scan->slot, scan->estate, values, isnull);
+
+ UnlockReleaseBuffer(buffer);
+
+ return true;
+ }
+
+ /* ----------------
* index_getnext - get the next heap tuple from a scan
*
* The result is the next heap tuple satisfying the scan keys and the
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6b6510e..76cd485
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** static Point *interpt_sl(LSEG *lseg, LIN
*** 70,79 ****
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
- static double dist_ppoly_internal(Point *pt, POLYGON *poly);
/*
--- 70,79 ----
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+ static double dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
/*
*************** dist_lb(PG_FUNCTION_ARGS)
*** 2623,2628 ****
--- 2623,2660 ----
}
/*
+ * Distance from a point to a circle
+ */
+ Datum
+ dist_pc(PG_FUNCTION_ARGS)
+ {
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
* Distance from a circle to a polygon
*/
Datum
*************** dist_ppoly_internal(Point *pt, POLYGON *
*** 2701,2706 ****
--- 2733,2747 ----
return result;
}
+ /*
+ * Distance from a polygon to a point
+ */
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
+ PG_GETARG_POLYGON_P(0)));
+ }
/*---------------------------------------------------------------------
* interpt_
*************** pt_contained_circle(PG_FUNCTION_ARGS)
*** 5057,5079 ****
}
- /* dist_pc - returns the distance between
- * a point and a circle.
- */
- Datum
- dist_pc(PG_FUNCTION_ARGS)
- {
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
- }
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
--- 5098,5103 ----
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d1d6247..359d488
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index 382826e..007e672
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 17,24 ****
--- 17,26 ----
#include "access/gist.h"
#include "access/itup.h"
#include "access/xlogreader.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
#include "lib/pairingheap.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/hsearch.h"
*************** typedef struct GISTSearchHeapItem
*** 120,125 ****
--- 122,136 ----
bool recheck; /* T if quals must be rechecked */
} GISTSearchHeapItem;
+ /*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
*************** typedef struct GISTSearchItem
*** 131,142 ****
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! double distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
--- 142,154 ----
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! GISTSearchTreeItemDistance distances[1]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(GISTSearchTreeItemDistance) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
*************** typedef struct GISTScanOpaqueData
*** 150,161 ****
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
--- 162,176 ----
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..b1be157
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
***************
*** 19,24 ****
--- 19,25 ----
#include "access/htup_details.h"
#include "access/itup.h"
#include "access/tupdesc.h"
+ #include "nodes/execnodes.h"
typedef struct HeapScanDescData
*************** typedef struct IndexScanDescData
*** 93,98 ****
--- 94,104 ----
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
+
+ /* Data structures for getting original indexed values from heap */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} IndexScanDescData;
/* Struct for heap-or-index scans of system tables */
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..4a6fa7f
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 49d3d13..43f77ed
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index af991d3..b375215
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 9edfdb8..3a091da
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 4074,4079 ****
--- 4076,4083 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 0b6d3c3..696214e
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 417,422 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index 5603817..cb18986
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
On 02/17/2015 02:56 PM, Alexander Korotkov wrote:
Hi!
On Mon, Dec 22, 2014 at 1:07 PM, Heikki Linnakangas <hlinnakangas@vmware.com
wrote:
Ok, thanks for the review! I have committed this, with some cleanup and
more comments added.ISTM that checks in pairingheap_GISTSearchItem_cmp is incorrect. This
function should perform inverse comparison. Thus, if item a should be
checked first function should return 1. Current behavior doesn't lead to
incorrect query answers, but it could be slower than correct version.
Good catch. Fixed, thanks.
While testing this, I also noticed a bug in the pairing heap code
itself. Fixed that too.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 17.2.2015 14:21, Alexander Korotkov wrote:
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov
<aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).
I meant to do a bit of testing on this (assuming it's still needed), but
the patches need rebasing - Heikki fixed a few issues, so they don't
apply cleanly.
regards
--
Tomas Vondra http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi!
On Tue, Feb 24, 2015 at 5:39 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com>
wrote:
On 17.2.2015 14:21, Alexander Korotkov wrote:
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov
<aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).I meant to do a bit of testing on this (assuming it's still needed), but
the patches need rebasing - Heikki fixed a few issues, so they don't
apply cleanly.
Both patches are revised.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-in-gist-2.patchapplication/octet-stream; name=knn-gist-recheck-in-gist-2.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 31ce279..b265e9c
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_same(PG_FUNCTION_ARGS)
*** 760,766 ****
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- 768,774 ----
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
*************** my_distance(PG_FUNCTION_ARGS)
*** 779,784 ****
--- 787,793 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 791,801 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
--- 800,815 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function. When
! <literal>recheck = true</> then value of distance will
! be rechecked from heap tuple before tuple is returned. If
! <literal>recheck</> flag isn't set then it's true by default for
! compatibility reasons. The <literal>recheck</> flag can be used only
! when ordering operator returns <type>float8</> value comparable with
! result of <function>distance</> function. Result of distance function
! should be never greater than result of ordering operator.
! Same approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 717cb85..f2dc301
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 56,62 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 57,63 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 73,79 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 74,83 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 171,177 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 175,182 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,209 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 197,216 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance function gets a recheck argument as well as consistent
! * function. Distance will be re-calculated from heap tuple when
! * needed.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 235,241 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 242,248 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 280,286 ****
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 287,293 ----
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 368,374 ****
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 375,381 ----
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 380,406 ****
}
/*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
}
else
{
! /* Done when both heaps are empty */
! item = NULL;
}
-
- /* Return item; caller is responsible to pfree it */
- return item;
}
/*
--- 387,490 ----
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get index values from heap */
+ if (!index_get_heap_values(scan, &item->data.heap.heapPtr, values, isnull))
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ pfree(item);
+ return;
+ }
+
+ /* Prepare new tree item and reinsert it */
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ item->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ item->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ item->distances[i].value = newDistance;
+
+ }
+ }
+
+ pairingheap_add(so->queue, item);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! for (;;)
! {
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
!
! /* Recheck distance from heap tuple if needed */
! if (GISTSearchItemIsHeap(*item) &&
! searchTreeItemNeedDistanceRecheck(scan, item))
! {
! searchTreeItemDistanceRecheck(scan, item);
! continue;
! }
! return item;
! }
}
else
{
! return NULL;
}
}
/*
*************** getNextNearest(IndexScanDesc scan)
*** 414,420 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 498,504 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 493,499 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 577,583 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 544,550 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 628,634 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9fab6c8..c2692c3
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1089,1094 ****
--- 1089,1095 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1441,1443 ****
--- 1442,1478 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_inexact_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 991858f..fab7237
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** pairingheap_GISTSearchItem_cmp(const pai
*** 30,53 ****
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
! }
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return 1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return -1;
! return 0;
! }
/*
* Index AM API functions for scanning GiST indexes
--- 33,65 ----
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i, recheckCmp = 0;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! const GISTSearchTreeItemDistance distance_a = sa->distances[i];
! const GISTSearchTreeItemDistance distance_b = sb->distances[i];
! if (distance_a.value != distance_b.value)
! return (distance_a.value < distance_b.value) ? 1 : -1;
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return 1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return -1;
+ /*
+ * When all distance values are the same, items without recheck
+ * can be immediately returned. So they are placed first.
+ */
+ if (recheckCmp == 0 && distance_a.recheck != distance_b.recheck)
+ recheckCmp = distance_b.recheck ? 1 : -1;
+ }
+
+ return recheckCmp;
+ }
/*
* Index AM API functions for scanning GiST indexes
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 83,91 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
--- 95,111 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Functions for distance recheck from heap tuple */
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ }
+
scan->opaque = so;
MemoryContextSwitchTo(oldCxt);
*************** gistrescan(PG_FUNCTION_ARGS)
*** 238,243 ****
--- 258,267 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
new file mode 100644
index e6e4d28..90cf088
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 124,129 ****
--- 124,132 ----
scan->xs_ctup.t_data = NULL;
scan->xs_cbuf = InvalidBuffer;
scan->xs_continue_hot = false;
+ scan->indexInfo = NULL;
+ scan->estate = NULL;
+ scan->slot = NULL;
return scan;
}
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
new file mode 100644
index 00c1d69..9c57311
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
***************
*** 69,74 ****
--- 69,75 ----
#include "access/transam.h"
#include "access/xlog.h"
+ #include "executor/executor.h"
#include "catalog/index.h"
#include "catalog/catalog.h"
#include "pgstat.h"
*************** index_beginscan(Relation heapRelation,
*** 254,259 ****
--- 255,265 ----
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ /* Prepare data structures for getting original indexed values from heap */
+ scan->indexInfo = BuildIndexInfo(scan->indexRelation);
+ scan->estate = CreateExecutorState();
+ scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
+
return scan;
}
*************** index_endscan(IndexScanDesc scan)
*** 377,382 ****
--- 383,393 ----
scan->xs_cbuf = InvalidBuffer;
}
+ if (scan->slot)
+ ExecDropSingleTupleTableSlot(scan->slot);
+ if (scan->estate)
+ FreeExecutorState(scan->estate);
+
/* End the AM's scan */
FunctionCall1(procedure, PointerGetDatum(scan));
*************** index_fetch_heap(IndexScanDesc scan)
*** 564,569 ****
--- 575,623 ----
}
/* ----------------
+ * index_get_heap_values - get original indexed values from heap
+ *
+ * Fetches heap tuple of heapPtr and calculated original indexed values.
+ * Returns true on success. Returns false when heap tuple wasn't found.
+ * Useful for indexes with lossy representation of keys.
+ * ----------------
+ */
+ bool
+ index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS])
+ {
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /* Tuple not found: it has been deleted from heap. */
+ UnlockReleaseBuffer(buffer);
+ return false;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(heap_copytuple(&tup), scan->slot, InvalidBuffer, true);
+ FormIndexDatum(scan->indexInfo, scan->slot, scan->estate, values, isnull);
+
+ UnlockReleaseBuffer(buffer);
+
+ return true;
+ }
+
+ /* ----------------
* index_getnext - get the next heap tuple from a scan
*
* The result is the next heap tuple satisfying the scan keys and the
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6cb6be5..1b2a511
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** static Point *interpt_sl(LSEG *lseg, LIN
*** 70,79 ****
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
- static double dist_ppoly_internal(Point *pt, POLYGON *poly);
/*
--- 70,79 ----
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+ static double dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
/*
*************** dist_lb(PG_FUNCTION_ARGS)
*** 2623,2628 ****
--- 2623,2660 ----
}
/*
+ * Distance from a point to a circle
+ */
+ Datum
+ dist_pc(PG_FUNCTION_ARGS)
+ {
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
* Distance from a circle to a polygon
*/
Datum
*************** dist_ppoly_internal(Point *pt, POLYGON *
*** 2701,2706 ****
--- 2733,2747 ----
return result;
}
+ /*
+ * Distance from a polygon to a point
+ */
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
+ PG_GETARG_POLYGON_P(0)));
+ }
/*---------------------------------------------------------------------
* interpt_
*************** pt_contained_circle(PG_FUNCTION_ARGS)
*** 5057,5079 ****
}
- /* dist_pc - returns the distance between
- * a point and a circle.
- */
- Datum
- dist_pc(PG_FUNCTION_ARGS)
- {
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
- }
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
--- 5098,5103 ----
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d1d6247..359d488
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index ce83042..27bd81c
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 17,24 ****
--- 17,26 ----
#include "access/gist.h"
#include "access/itup.h"
#include "access/xlogreader.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
#include "lib/pairingheap.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/hsearch.h"
*************** typedef struct GISTSearchHeapItem
*** 120,125 ****
--- 122,136 ----
bool recheck; /* T if quals must be rechecked */
} GISTSearchHeapItem;
+ /*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
*************** typedef struct GISTSearchItem
*** 131,143 ****
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! double distances[FLEXIBLE_ARRAY_MEMBER]; /* numberOfOrderBys
! * entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
--- 142,154 ----
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! GISTSearchTreeItemDistance distances[FLEXIBLE_ARRAY_MEMBER]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(GISTSearchTreeItemDistance) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
*************** typedef struct GISTScanOpaqueData
*** 151,162 ****
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
--- 162,176 ----
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
OffsetNumber nPageData; /* number of valid items in array */
OffsetNumber curPageData; /* next item to return */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..b1be157
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
***************
*** 19,24 ****
--- 19,25 ----
#include "access/htup_details.h"
#include "access/itup.h"
#include "access/tupdesc.h"
+ #include "nodes/execnodes.h"
typedef struct HeapScanDescData
*************** typedef struct IndexScanDescData
*** 93,98 ****
--- 94,104 ----
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
+
+ /* Data structures for getting original indexed values from heap */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} IndexScanDescData;
/* Struct for heap-or-index scans of system tables */
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..4a6fa7f
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 49d3d13..43f77ed
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index af991d3..b375215
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 4268b99..7cbcc70
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 4076,4081 ****
--- 4078,4085 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 8da6c6c..4632f64
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 417,422 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index 5603817..cb18986
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
knn-gist-recheck-7.patchapplication/octet-stream; name=knn-gist-recheck-7.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index 31ce279..c354411
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_distance(PG_FUNCTION_ARGS)
*** 779,784 ****
--- 787,793 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 791,804 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
! greater than any child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. The
! result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
--- 800,821 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function.
! </para>
!
! <para>
! Some approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never greater than any
! child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. For
! leaf nodes, the returned distance must be accurate, if the
! <function>distance</> function returns *recheck == false for the tuple.
! Otherwise the same approximation is allowed, and the executor will
! re-order ambiguous cases after recalculating the actual distance.
! </para>
!
! <para>
! The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index 717cb85..53c061d
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistindex_keytest(IndexScanDesc scan,
*** 176,181 ****
--- 176,182 ----
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,207 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
*distance_p = DatumGetFloat8(dist);
}
--- 193,213 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance functions get a recheck argument as well. In this
! * case the returned distance is the lower bound of distance
! * and needs to be rechecked. We return single recheck flag
! * which means that both quals and distances are to be
! * rechecked.
*/
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&recheck));
!
! *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
*************** getNextNearest(IndexScanDesc scan)
*** 411,416 ****
--- 417,423 ----
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
do
{
*************** getNextNearest(IndexScanDesc scan)
*** 424,429 ****
--- 431,441 ----
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
res = true;
}
else
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9fab6c8..37bf5d5
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1441,1443 ****
--- 1441,1480 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_bbox_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ /* Bounding box distance is always inexact. */
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 991858f..55c98b4
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 85,90 ****
--- 85,95 ----
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
new file mode 100644
index 48fa919..430abd0
*** a/src/backend/executor/nodeIndexscan.c
--- b/src/backend/executor/nodeIndexscan.c
***************
*** 28,41 ****
--- 28,117 ----
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+ #include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+ #include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+ /*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+ typedef struct
+ {
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+ } ReorderTuple;
+
+ static int
+ cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+ {
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+ }
+
+ /*
+ * Pairing heap provides getting topmost (greatest) element while KNN provides
+ * ascending sort. That's why we inverse sort order.
+ */
+ static int
+ reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+ {
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+ }
+
+ static void
+ copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+ {
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+ }
static TupleTableSlot *IndexNext(IndexScanState *node);
+ static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
*************** IndexNext(IndexScanState *node)
*** 54,59 ****
--- 130,137 ----
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
*************** IndexNext(IndexScanState *node)
*** 72,82 ****
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! /*
! * ok, now that we have what we need, fetch the next tuple.
! */
! while ((tuple = index_getnext(scandesc, direction)) != NULL)
{
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
--- 150,209 ----
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
*************** IndexNext(IndexScanState *node)
*** 103,108 ****
--- 230,300 ----
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ RecheckOrderBys(node, slot);
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
*************** IndexNext(IndexScanState *node)
*** 114,119 ****
--- 306,346 ----
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+ static void
+ RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+ {
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+ }
+
+ /*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 465,470 ****
--- 692,698 ----
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 501,506 ****
--- 729,737 ----
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 581,586 ****
--- 812,863 ----
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 76ba1bf..7daaadf
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 22,27 ****
--- 22,28 ----
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+ #include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
*************** static void copy_plan_costsize(Plan *des
*** 102,108 ****
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
--- 103,109 ----
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
*************** static Plan *prepare_sort_from_pathkeys(
*** 168,174 ****
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
--- 169,175 ----
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
*************** create_indexscan_plan(PlannerInfo *root,
*** 1158,1163 ****
--- 1159,1165 ----
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
*************** create_indexscan_plan(PlannerInfo *root,
*** 1266,1271 ****
--- 1268,1309 ----
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell,
+ *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey. Pathkey contains pointer
+ * to equivalence class. But it's not enough because we need the
+ * expression datatype to lookup opfamily member. That's why we've
+ * to dig and equivalence member.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ EquivalenceMember *em;
+ pathkey = (PathKey *) lfirst(pathkeyCell);
+ expr = (Expr *) lfirst(exprCell);
+
+ /* Find equivalence member by order by expression */
+ em = find_ec_member_for_tle(pathkey->pk_eclass, expr, NULL);
+
+ /* Get sort operator from opfamily */
+ sortOperators[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
*************** create_indexscan_plan(PlannerInfo *root,
*** 1285,1290 ****
--- 1323,1329 ----
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
*************** make_indexscan(List *qptlist,
*** 3327,3332 ****
--- 3366,3372 ----
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
*************** make_indexscan(List *qptlist,
*** 3344,3349 ****
--- 3384,3390 ----
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3967,3973 ****
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
--- 4008,4014 ----
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3998,4004 ****
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
--- 4039,4045 ----
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4126,4139 ****
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
--- 4167,4178 ----
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids)
{
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6cb6be5..29a7e75
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2657,2662 ****
--- 2657,2674 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+ }
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
*************** dist_pc(PG_FUNCTION_ARGS)
*** 5073,5078 ****
--- 5085,5105 ----
PG_RETURN_FLOAT8(result);
}
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d1d6247..359d488
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..e1f2031
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 91,96 ****
--- 91,105 ----
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..4a6fa7f
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 3588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 3586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 49d3d13..43f77ed
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 205,210 ****
--- 205,211 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 3589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 212,217 ****
--- 213,219 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 3589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index af991d3..6e0df88
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1014,1022 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1014,1026 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 3586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 3588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 3588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 4268b99..5ecea7c
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 3587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 3585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2179 ( gist_point_con
*** 4086,4091 ****
--- 4088,4095 ----
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 3589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 41288ed..cc05fae
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 17,22 ****
--- 17,23 ----
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+ #include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
*************** typedef struct
*** 1237,1242 ****
--- 1238,1244 ----
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
*************** typedef struct
*** 1247,1258 ****
--- 1249,1268 ----
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
*************** typedef struct IndexScanState
*** 1263,1268 ****
--- 1273,1287 ----
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index f6683f0..2e34b59
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef Scan SeqScan;
*** 303,309 ****
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
--- 303,313 ----
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is used at run time to recheck the ordering, if the index
! * does not calculate an accurate ordering. It is also needed for EXPLAIN.
! *
! * indexsortops is an array of operators used to sort the ORDER BY expressions,
! * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
*************** typedef struct IndexScan
*** 317,323 ****
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
--- 321,328 ----
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
! Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 8da6c6c..b4e8252
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 417,422 ****
--- 419,425 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index 5603817..cb18986
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
On Wed, Feb 25, 2015 at 12:15 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
Hi!
On Tue, Feb 24, 2015 at 5:39 PM, Tomas Vondra <
tomas.vondra@2ndquadrant.com> wrote:On 17.2.2015 14:21, Alexander Korotkov wrote:
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov
<aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).I meant to do a bit of testing on this (assuming it's still needed), but
the patches need rebasing - Heikki fixed a few issues, so they don't
apply cleanly.Both patches are revised.
Both patches are rebased against current master.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-8.patchapplication/octet-stream; name=knn-gist-recheck-8.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index e7d1ff9..a5b2bda
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_distance(PG_FUNCTION_ARGS)
*** 780,785 ****
--- 788,794 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 792,805 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
! greater than any child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. The
! result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
--- 801,822 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function.
! </para>
!
! <para>
! Some approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never greater than any
! child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. For
! leaf nodes, the returned distance must be accurate, if the
! <function>distance</> function returns *recheck == false for the tuple.
! Otherwise the same approximation is allowed, and the executor will
! re-order ambiguous cases after recalculating the actual distance.
! </para>
!
! <para>
! The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index e4c00c2..0a8b88b
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistindex_keytest(IndexScanDesc scan,
*** 176,181 ****
--- 176,182 ----
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,207 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
*distance_p = DatumGetFloat8(dist);
}
--- 193,213 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance functions get a recheck argument as well. In this
! * case the returned distance is the lower bound of distance
! * and needs to be rechecked. We return single recheck flag
! * which means that both quals and distances are to be
! * rechecked.
*/
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&recheck));
!
! *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
*************** getNextNearest(IndexScanDesc scan)
*** 434,439 ****
--- 440,446 ----
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
if (scan->xs_itup)
{
*************** getNextNearest(IndexScanDesc scan)
*** 454,459 ****
--- 461,471 ----
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
/* in an index-only scan, also return the reconstructed tuple. */
if (scan->xs_want_itup)
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9d21e3f..38dad11
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1478,1480 ****
--- 1478,1517 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_bbox_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ /* Bounding box distance is always inexact. */
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 6f65398..0dba2e4
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 85,90 ****
--- 85,95 ----
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
new file mode 100644
index 48fa919..430abd0
*** a/src/backend/executor/nodeIndexscan.c
--- b/src/backend/executor/nodeIndexscan.c
***************
*** 28,41 ****
--- 28,117 ----
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+ #include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+ #include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+ /*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+ typedef struct
+ {
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+ } ReorderTuple;
+
+ static int
+ cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+ {
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+ }
+
+ /*
+ * Pairing heap provides getting topmost (greatest) element while KNN provides
+ * ascending sort. That's why we inverse sort order.
+ */
+ static int
+ reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+ {
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+ }
+
+ static void
+ copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+ {
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+ }
static TupleTableSlot *IndexNext(IndexScanState *node);
+ static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
*************** IndexNext(IndexScanState *node)
*** 54,59 ****
--- 130,137 ----
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
*************** IndexNext(IndexScanState *node)
*** 72,82 ****
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! /*
! * ok, now that we have what we need, fetch the next tuple.
! */
! while ((tuple = index_getnext(scandesc, direction)) != NULL)
{
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
--- 150,209 ----
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
*************** IndexNext(IndexScanState *node)
*** 103,108 ****
--- 230,300 ----
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ RecheckOrderBys(node, slot);
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
*************** IndexNext(IndexScanState *node)
*** 114,119 ****
--- 306,346 ----
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+ static void
+ RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+ {
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+ }
+
+ /*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 465,470 ****
--- 692,698 ----
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 501,506 ****
--- 729,737 ----
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 581,586 ****
--- 812,863 ----
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index cb69c03..b230d90
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 22,27 ****
--- 22,28 ----
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+ #include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
*************** static void copy_plan_costsize(Plan *des
*** 102,108 ****
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
--- 103,109 ----
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
*************** static Plan *prepare_sort_from_pathkeys(
*** 168,174 ****
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
--- 169,175 ----
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
*************** create_indexscan_plan(PlannerInfo *root,
*** 1158,1163 ****
--- 1159,1165 ----
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
*************** create_indexscan_plan(PlannerInfo *root,
*** 1269,1274 ****
--- 1271,1312 ----
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell,
+ *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey. Pathkey contains pointer
+ * to equivalence class. But it's not enough because we need the
+ * expression datatype to lookup opfamily member. That's why we've
+ * to dig and equivalence member.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ EquivalenceMember *em;
+ pathkey = (PathKey *) lfirst(pathkeyCell);
+ expr = (Expr *) lfirst(exprCell);
+
+ /* Find equivalence member by order by expression */
+ em = find_ec_member_for_tle(pathkey->pk_eclass, expr, NULL);
+
+ /* Get sort operator from opfamily */
+ sortOperators[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
*************** create_indexscan_plan(PlannerInfo *root,
*** 1288,1293 ****
--- 1326,1332 ----
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
*************** make_indexscan(List *qptlist,
*** 3330,3335 ****
--- 3369,3375 ----
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
*************** make_indexscan(List *qptlist,
*** 3347,3352 ****
--- 3387,3393 ----
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3970,3976 ****
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
--- 4011,4017 ----
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4001,4007 ****
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
--- 4042,4048 ----
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4129,4142 ****
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
--- 4170,4181 ----
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids)
{
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6cb6be5..29a7e75
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2657,2662 ****
--- 2657,2674 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+ }
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
*************** dist_pc(PG_FUNCTION_ARGS)
*** 5073,5078 ****
--- 5085,5105 ----
PG_RETURN_FLOAT8(result);
}
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d86590a..f129c4b
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..e1f2031
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 91,96 ****
--- 91,105 ----
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..ed44e05
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 4588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 4586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index a54d11f..5f2c3cb
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 208,213 ****
--- 208,214 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 4589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 215,220 ****
--- 216,222 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 4589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index e22eb27..51b6480
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1015,1023 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1015,1027 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 4586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 4586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 4588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 4588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 8469c82..aee95d1
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 4587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 4585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2179 ( gist_point_con
*** 4121,4126 ****
--- 4123,4130 ----
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 4589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index ac75f86..d78d3f9
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 17,22 ****
--- 17,23 ----
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+ #include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
*************** typedef struct
*** 1241,1246 ****
--- 1242,1248 ----
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
*************** typedef struct
*** 1251,1262 ****
--- 1253,1272 ----
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
*************** typedef struct IndexScanState
*** 1267,1272 ****
--- 1277,1291 ----
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 21cbfa8..44ebcd0
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef Scan SeqScan;
*** 303,309 ****
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
--- 303,313 ----
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is used at run time to recheck the ordering, if the index
! * does not calculate an accurate ordering. It is also needed for EXPLAIN.
! *
! * indexsortops is an array of operators used to sort the ORDER BY expressions,
! * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
*************** typedef struct IndexScan
*** 317,323 ****
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
--- 321,328 ----
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
! Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 2a91620..f177302
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 418,426 ****
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
-
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
--- 420,428 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index abe64e5..a95fe29
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
knn-gist-recheck-in-gist-3.patchapplication/octet-stream; name=knn-gist-recheck-in-gist-3.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index e7d1ff9..ea90833
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_same(PG_FUNCTION_ARGS)
*** 761,767 ****
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
--- 769,775 ----
The <acronym>SQL</> declaration of the function must look like this:
<programlisting>
! CREATE OR REPLACE FUNCTION my_distance(internal, data_type, smallint, oid, internal)
RETURNS float8
AS 'MODULE_PATHNAME'
LANGUAGE C STRICT;
*************** my_distance(PG_FUNCTION_ARGS)
*** 780,785 ****
--- 788,794 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 792,802 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
--- 801,816 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function. When
! <literal>recheck = true</> then value of distance will
! be rechecked from heap tuple before tuple is returned. If
! <literal>recheck</> flag isn't set then it's true by default for
! compatibility reasons. The <literal>recheck</> flag can be used only
! when ordering operator returns <type>float8</> value comparable with
! result of <function>distance</> function. Result of distance function
! should be never greater than result of ordering operator.
! Same approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never
greater than any child's actual distance. Thus, for example, distance
to a bounding box is usually sufficient in geometric applications. The
result value can be any finite <type>float8</> value. (Infinity and
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index e4c00c2..029ab7f
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
***************
*** 16,21 ****
--- 16,22 ----
#include "access/gist_private.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
#include "miscadmin.h"
#include "pgstat.h"
#include "lib/pairingheap.h"
*************** gistindex_keytest(IndexScanDesc scan,
*** 56,62 ****
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! double *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
--- 57,63 ----
GISTSTATE *giststate = so->giststate;
ScanKey key = scan->keyData;
int keySize = scan->numberOfKeys;
! GISTSearchTreeItemDistance *distance_p;
Relation r = scan->indexRelation;
*recheck_p = false;
*************** gistindex_keytest(IndexScanDesc scan,
*** 73,79 ****
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! so->distances[i] = -get_float8_infinity();
return true;
}
--- 74,83 ----
if (GistPageIsLeaf(page)) /* shouldn't happen */
elog(ERROR, "invalid GiST tuple found on leaf page");
for (i = 0; i < scan->numberOfOrderBys; i++)
! {
! so->distances[i].value = -get_float8_infinity();
! so->distances[i].recheck = false;
! }
return true;
}
*************** gistindex_keytest(IndexScanDesc scan,
*** 171,177 ****
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! *distance_p = get_float8_infinity();
}
else
{
--- 175,182 ----
if ((key->sk_flags & SK_ISNULL) || isNull)
{
/* Assume distance computes as null and sorts to the end */
! distance_p->value = get_float8_infinity();
! distance_p->recheck = false;
}
else
{
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,209 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
! *distance_p = DatumGetFloat8(dist);
}
key++;
--- 197,216 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance function gets a recheck argument as well as consistent
! * function. Distance will be re-calculated from heap tuple when
! * needed.
*/
! distance_p->recheck = false;
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&distance_p->recheck));
! distance_p->value = DatumGetFloat8(dist);
}
key++;
*************** gistindex_keytest(IndexScanDesc scan,
*** 237,243 ****
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
--- 244,250 ----
* sibling will be processed next.
*/
static void
! gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, GISTSearchTreeItemDistance *myDistances,
TIDBitmap *tbm, int64 *ntids)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 284,290 ****
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 291,297 ----
/* Insert it into the queue using same distances as for this page */
memcpy(item->distances, myDistances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 391,397 ****
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(double) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
--- 398,404 ----
/* Insert it into the queue using new distance data */
memcpy(item->distances, so->distances,
! sizeof(GISTSearchTreeItemDistance) * scan->numberOfOrderBys);
pairingheap_add(so->queue, &item->phNode);
*************** gistScanPage(IndexScanDesc scan, GISTSea
*** 403,429 ****
}
/*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(GISTScanOpaque so)
{
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
}
else
{
! /* Done when both heaps are empty */
! item = NULL;
}
-
- /* Return item; caller is responsible to pfree it */
- return item;
}
/*
--- 410,513 ----
}
/*
+ * Do this tree item distance values needs recheck?
+ */
+ static bool
+ searchTreeItemNeedDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ int i;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ return true;
+ }
+ return false;
+ }
+
+ /*
+ * Recheck distance values of item from heap and reinsert it into RB-tree.
+ */
+ static void
+ searchTreeItemDistanceRecheck(IndexScanDesc scan, GISTSearchItem *item)
+ {
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
+ Datum values[INDEX_MAX_KEYS];
+ bool isnull[INDEX_MAX_KEYS];
+ bool isNew;
+ int i;
+
+ /* Get index values from heap */
+ if (!index_get_heap_values(scan, &item->data.heap.heapPtr, values, isnull))
+ {
+ /*
+ * Tuple not found: it has been deleted from heap. We don't have to
+ * reinsert it into RB-tree.
+ */
+ pfree(item);
+ return;
+ }
+
+ /* Prepare new tree item and reinsert it */
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ if (item->distances[i].recheck)
+ {
+ /* Re-calculate lossy distance */
+ ScanKey key = scan->orderByData + i;
+ float8 newDistance;
+
+ item->distances[i].recheck = false;
+ if (isnull[key->sk_attno - 1])
+ {
+ item->distances[i].value = -get_float8_infinity();
+ continue;
+ }
+
+ newDistance = DatumGetFloat8(
+ FunctionCall2Coll(&so->orderByRechecks[i],
+ key->sk_collation,
+ values[key->sk_attno - 1],
+ key->sk_argument));
+
+ item->distances[i].value = newDistance;
+
+ }
+ }
+
+ pairingheap_add(so->queue, item);
+ }
+
+ /*
* Extract next item (in order) from search queue
*
* Returns a GISTSearchItem or NULL. Caller must pfree item when done with it.
*/
static GISTSearchItem *
! getNextGISTSearchItem(IndexScanDesc scan)
{
+ GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
GISTSearchItem *item;
if (!pairingheap_is_empty(so->queue))
{
! for (;;)
! {
! item = (GISTSearchItem *) pairingheap_remove_first(so->queue);
!
! /* Recheck distance from heap tuple if needed */
! if (GISTSearchItemIsHeap(*item) &&
! searchTreeItemNeedDistanceRecheck(scan, item))
! {
! searchTreeItemDistanceRecheck(scan, item);
! continue;
! }
! return item;
! }
}
else
{
! return NULL;
}
}
/*
*************** getNextNearest(IndexScanDesc scan)
*** 444,450 ****
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 528,534 ----
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
*************** gistgettuple(PG_FUNCTION_ARGS)
*** 536,542 ****
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
PG_RETURN_BOOL(false);
--- 620,626 ----
/* find and process the next index page */
do
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
PG_RETURN_BOOL(false);
*************** gistgetbitmap(PG_FUNCTION_ARGS)
*** 589,595 ****
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(so);
if (!item)
break;
--- 673,679 ----
*/
for (;;)
{
! GISTSearchItem *item = getNextGISTSearchItem(scan);
if (!item)
break;
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9d21e3f..84dea2d
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_poly_consistent(PG_FUNCTION_ARGS)
*** 1099,1104 ****
--- 1099,1105 ----
PG_RETURN_BOOL(result);
}
+
/**************************************************
* Circle ops
**************************************************/
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1478,1480 ****
--- 1479,1515 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_inexact_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 6f65398..e76c904
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
***************
*** 17,22 ****
--- 17,25 ----
#include "access/gist_private.h"
#include "access/gistscan.h"
#include "access/relscan.h"
+ #include "catalog/index.h"
+ #include "executor/executor.h"
+ #include "executor/tuptable.h"
#include "utils/memutils.h"
#include "utils/rel.h"
*************** pairingheap_GISTSearchItem_cmp(const pai
*** 30,53 ****
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! if (sa->distances[i] != sb->distances[i])
! return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
! }
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return 1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return -1;
! return 0;
! }
/*
* Index AM API functions for scanning GiST indexes
--- 33,65 ----
const GISTSearchItem *sa = (const GISTSearchItem *) a;
const GISTSearchItem *sb = (const GISTSearchItem *) b;
IndexScanDesc scan = (IndexScanDesc) arg;
! int i, recheckCmp = 0;
/* Order according to distance comparison */
for (i = 0; i < scan->numberOfOrderBys; i++)
{
! const GISTSearchTreeItemDistance distance_a = sa->distances[i];
! const GISTSearchTreeItemDistance distance_b = sb->distances[i];
! if (distance_a.value != distance_b.value)
! return (distance_a.value < distance_b.value) ? 1 : -1;
! /* Heap items go before inner pages, to ensure a depth-first search */
! if (GISTSearchItemIsHeap(*sa) && !GISTSearchItemIsHeap(*sb))
! return 1;
! if (!GISTSearchItemIsHeap(*sa) && GISTSearchItemIsHeap(*sb))
! return -1;
+ /*
+ * When all distance values are the same, items without recheck
+ * can be immediately returned. So they are placed first.
+ */
+ if (recheckCmp == 0 && distance_a.recheck != distance_b.recheck)
+ recheckCmp = distance_b.recheck ? 1 : -1;
+ }
+
+ return recheckCmp;
+ }
/*
* Index AM API functions for scanning GiST indexes
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 83,91 ****
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
scan->opaque = so;
/*
--- 95,111 ----
so->queueCxt = giststate->scanCxt; /* see gistrescan */
/* workspaces with size dependent on numberOfOrderBys: */
! so->distances = palloc(sizeof(GISTSearchTreeItemDistance) *
! scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ /* Functions for distance recheck from heap tuple */
+ so->orderByRechecks = (FmgrInfo *)palloc(sizeof(FmgrInfo)
+ * scan->numberOfOrderBys);
+ }
+
scan->opaque = so;
/*
*************** gistrescan(PG_FUNCTION_ARGS)
*** 276,281 ****
--- 296,305 ----
GIST_DISTANCE_PROC, skey->sk_attno,
RelationGetRelationName(scan->indexRelation));
+ /* Copy original sk_func for distance recheck from heap tuple */
+ fmgr_info_copy(&so->orderByRechecks[i], &(skey->sk_func),
+ so->giststate->scanCxt);
+
fmgr_info_copy(&(skey->sk_func), finfo, so->giststate->scanCxt);
/* Restore prior fn_extra pointers, if not first time */
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
new file mode 100644
index e6e4d28..90cf088
*** a/src/backend/access/index/genam.c
--- b/src/backend/access/index/genam.c
*************** RelationGetIndexScan(Relation indexRelat
*** 124,129 ****
--- 124,132 ----
scan->xs_ctup.t_data = NULL;
scan->xs_cbuf = InvalidBuffer;
scan->xs_continue_hot = false;
+ scan->indexInfo = NULL;
+ scan->estate = NULL;
+ scan->slot = NULL;
return scan;
}
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
new file mode 100644
index 2b27e73..fb3cf9e
*** a/src/backend/access/index/indexam.c
--- b/src/backend/access/index/indexam.c
***************
*** 69,74 ****
--- 69,75 ----
#include "access/transam.h"
#include "access/xlog.h"
+ #include "executor/executor.h"
#include "catalog/index.h"
#include "catalog/catalog.h"
#include "pgstat.h"
*************** index_beginscan(Relation heapRelation,
*** 254,259 ****
--- 255,265 ----
scan->heapRelation = heapRelation;
scan->xs_snapshot = snapshot;
+ /* Prepare data structures for getting original indexed values from heap */
+ scan->indexInfo = BuildIndexInfo(scan->indexRelation);
+ scan->estate = CreateExecutorState();
+ scan->slot = MakeSingleTupleTableSlot(RelationGetDescr(heapRelation));
+
return scan;
}
*************** index_endscan(IndexScanDesc scan)
*** 377,382 ****
--- 383,393 ----
scan->xs_cbuf = InvalidBuffer;
}
+ if (scan->slot)
+ ExecDropSingleTupleTableSlot(scan->slot);
+ if (scan->estate)
+ FreeExecutorState(scan->estate);
+
/* End the AM's scan */
FunctionCall1(procedure, PointerGetDatum(scan));
*************** index_fetch_heap(IndexScanDesc scan)
*** 564,569 ****
--- 575,623 ----
}
/* ----------------
+ * index_get_heap_values - get original indexed values from heap
+ *
+ * Fetches heap tuple of heapPtr and calculated original indexed values.
+ * Returns true on success. Returns false when heap tuple wasn't found.
+ * Useful for indexes with lossy representation of keys.
+ * ----------------
+ */
+ bool
+ index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS])
+ {
+ Buffer buffer;
+ bool got_heap_tuple, all_dead;
+ HeapTupleData tup;
+
+ /* Get tuple from heap */
+ buffer = ReadBuffer(scan->heapRelation,
+ ItemPointerGetBlockNumber(heapPtr));
+ LockBuffer(buffer, BUFFER_LOCK_SHARE);
+ got_heap_tuple = heap_hot_search_buffer(heapPtr,
+ scan->heapRelation,
+ buffer,
+ scan->xs_snapshot,
+ &tup,
+ &all_dead,
+ true);
+ if (!got_heap_tuple)
+ {
+ /* Tuple not found: it has been deleted from heap. */
+ UnlockReleaseBuffer(buffer);
+ return false;
+ }
+
+ /* Calculate index datums */
+ ExecStoreTuple(heap_copytuple(&tup), scan->slot, InvalidBuffer, true);
+ FormIndexDatum(scan->indexInfo, scan->slot, scan->estate, values, isnull);
+
+ UnlockReleaseBuffer(buffer);
+
+ return true;
+ }
+
+ /* ----------------
* index_getnext - get the next heap tuple from a scan
*
* The result is the next heap tuple satisfying the scan keys and the
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 6cb6be5..1b2a511
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** static Point *interpt_sl(LSEG *lseg, LIN
*** 70,79 ****
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
- static double dist_ppoly_internal(Point *pt, POLYGON *poly);
/*
--- 70,79 ----
static bool has_interpt_sl(LSEG *lseg, LINE *line);
static double dist_pl_internal(Point *pt, LINE *line);
static double dist_ps_internal(Point *pt, LSEG *lseg);
+ static double dist_ppoly_internal(Point *point, POLYGON *poly);
static Point *line_interpt_internal(LINE *l1, LINE *l2);
static bool lseg_inside_poly(Point *a, Point *b, POLYGON *poly, int start);
static Point *lseg_interpt_internal(LSEG *l1, LSEG *l2);
/*
*************** dist_lb(PG_FUNCTION_ARGS)
*** 2623,2628 ****
--- 2623,2660 ----
}
/*
+ * Distance from a point to a circle
+ */
+ Datum
+ dist_pc(PG_FUNCTION_ARGS)
+ {
+ Point *point = PG_GETARG_POINT_P(0);
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
+
+ /*
* Distance from a circle to a polygon
*/
Datum
*************** dist_ppoly_internal(Point *pt, POLYGON *
*** 2701,2706 ****
--- 2733,2747 ----
return result;
}
+ /*
+ * Distance from a polygon to a point
+ */
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ PG_RETURN_FLOAT8(dist_ppoly_internal(PG_GETARG_POINT_P(1),
+ PG_GETARG_POLYGON_P(0)));
+ }
/*---------------------------------------------------------------------
* interpt_
*************** pt_contained_circle(PG_FUNCTION_ARGS)
*** 5057,5079 ****
}
- /* dist_pc - returns the distance between
- * a point and a circle.
- */
- Datum
- dist_pc(PG_FUNCTION_ARGS)
- {
- Point *point = PG_GETARG_POINT_P(0);
- CIRCLE *circle = PG_GETARG_CIRCLE_P(1);
- float8 result;
-
- result = point_dt(point, &circle->center) - circle->radius;
- if (result < 0)
- result = 0;
- PG_RETURN_FLOAT8(result);
- }
-
-
/* circle_center - returns the center point of the circle.
*/
Datum
--- 5098,5103 ----
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d86590a..f129c4b
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
new file mode 100644
index 9d3714d..5184ab1
*** a/src/include/access/gist_private.h
--- b/src/include/access/gist_private.h
***************
*** 17,24 ****
--- 17,26 ----
#include "access/gist.h"
#include "access/itup.h"
#include "access/xlogreader.h"
+ #include "executor/tuptable.h"
#include "fmgr.h"
#include "lib/pairingheap.h"
+ #include "nodes/execnodes.h"
#include "storage/bufmgr.h"
#include "storage/buffile.h"
#include "utils/hsearch.h"
*************** typedef struct GISTSearchHeapItem
*** 125,130 ****
--- 127,141 ----
* index-only scans */
} GISTSearchHeapItem;
+ /*
+ * KNN distance item: distance which can be rechecked from heap tuple.
+ */
+ typedef struct GISTSearchTreeItemDistance
+ {
+ double value;
+ bool recheck;
+ } GISTSearchTreeItemDistance;
+
/* Unvisited item, either index page or heap tuple */
typedef struct GISTSearchItem
{
*************** typedef struct GISTSearchItem
*** 136,148 ****
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! double distances[FLEXIBLE_ARRAY_MEMBER]; /* numberOfOrderBys
! * entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
--- 147,159 ----
/* we must store parentlsn to detect whether a split occurred */
GISTSearchHeapItem heap; /* heap info, if heap tuple */
} data;
! GISTSearchTreeItemDistance distances[FLEXIBLE_ARRAY_MEMBER]; /* array with numberOfOrderBys entries */
} GISTSearchItem;
#define GISTSearchItemIsHeap(item) ((item).blkno == InvalidBlockNumber)
! #define GSTIHDRSZ offsetof(GISTSearchTreeItem, distances)
! #define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(GISTSearchTreeItemDistance) * (n_distances))
/*
* GISTScanOpaqueData: private state for a scan of a GiST index
*************** typedef struct GISTScanOpaqueData
*** 156,162 ****
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! double *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
--- 167,173 ----
bool firstCall; /* true until first gistgettuple call */
/* pre-allocated workspace arrays */
! GISTSearchTreeItemDistance *distances; /* output area for gistindex_keytest */
/* In a non-ordered search, returnable heap items are stored here: */
GISTSearchHeapItem pageData[BLCKSZ / sizeof(IndexTupleData)];
*************** typedef struct GISTScanOpaqueData
*** 164,169 ****
--- 175,183 ----
OffsetNumber curPageData; /* next item to return */
MemoryContext pageDataCxt; /* context holding the fetched tuples, for
index-only scans */
+
+ /* Data structures for performing recheck of lossy knn distance */
+ FmgrInfo *orderByRechecks; /* functions for lossy knn distance recheck */
} GISTScanOpaqueData;
typedef GISTScanOpaqueData *GISTScanOpaque;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..b1be157
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
***************
*** 19,24 ****
--- 19,25 ----
#include "access/htup_details.h"
#include "access/itup.h"
#include "access/tupdesc.h"
+ #include "nodes/execnodes.h"
typedef struct HeapScanDescData
*************** typedef struct IndexScanDescData
*** 93,98 ****
--- 94,104 ----
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
+
+ /* Data structures for getting original indexed values from heap */
+ IndexInfo *indexInfo; /* index info for index tuple calculation */
+ TupleTableSlot *slot; /* heap tuple slot */
+ EState *estate; /* executor state for index tuple calculation */
} IndexScanDescData;
/* Struct for heap-or-index scans of system tables */
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..ed44e05
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 4588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 4586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index a54d11f..5f2c3cb
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 208,213 ****
--- 208,214 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 4589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 215,220 ****
--- 216,222 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 4589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index e22eb27..e4fa5ab
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1015,1023 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1015,1027 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 4586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 4586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 4588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 4588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 8469c82..08396f9
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 845,850 ****
--- 845,852 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 4587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 4585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2585 ( gist_poly_cons
*** 4109,4114 ****
--- 4111,4118 ----
DESCR("GiST support");
DATA(insert OID = 2586 ( gist_poly_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_poly_compress _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 4589 ( gist_inexact_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ gist_inexact_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
DATA(insert OID = 2591 ( gist_circle_consistent PGNSP PGUID 12 1 0 0 0 f f f f t f i 5 0 16 "2281 718 23 26 2281" _null_ _null_ _null_ _null_ gist_circle_consistent _null_ _null_ _null_ ));
DESCR("GiST support");
DATA(insert OID = 2592 ( gist_circle_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 2281 "2281" _null_ _null_ _null_ _null_ gist_circle_compress _null_ _null_ _null_ ));
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 2a91620..668db20
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 392,399 ****
--- 392,401 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 418,426 ****
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
-
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
--- 420,428 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_inexact_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index abe64e5..a95fe29
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
On 04/17/2015 12:05 PM, Alexander Korotkov wrote:
On Wed, Feb 25, 2015 at 12:15 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:Hi!
On Tue, Feb 24, 2015 at 5:39 PM, Tomas Vondra <
tomas.vondra@2ndquadrant.com> wrote:On 17.2.2015 14:21, Alexander Korotkov wrote:
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov
<aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).I meant to do a bit of testing on this (assuming it's still needed), but
the patches need rebasing - Heikki fixed a few issues, so they don't
apply cleanly.Both patches are revised.
Both patches are rebased against current master.
This looks pretty much ready. I'm going to spend some time on this on
Friday, and if all looks good, commit. (Thursday's a public holiday here).
One quick comment:
It would be good to avoid the extra comparisons of the distances, when
the index doesn't return any lossy items. As the patch stands, it adds
one extra copyDistances() call and a cmp_distances() call for each tuple
(in a knn-search), even if there are no lossy tuples.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, May 13, 2015 at 10:16 PM, Heikki Linnakangas <hlinnaka@iki.fi>
wrote:
On 04/17/2015 12:05 PM, Alexander Korotkov wrote:
On Wed, Feb 25, 2015 at 12:15 PM, Alexander Korotkov <
aekorotkov@gmail.com>
wrote:Hi!
On Tue, Feb 24, 2015 at 5:39 PM, Tomas Vondra <
tomas.vondra@2ndquadrant.com> wrote:On 17.2.2015 14:21, Alexander Korotkov wrote:
On Sun, Feb 15, 2015 at 2:08 PM, Alexander Korotkov
<aekorotkov@gmail.com <mailto:aekorotkov@gmail.com>> wrote:Revised patch with reordering in GiST is attached
(knn-gist-recheck-in-gist.patch) as well as testing script (test.py).I meant to do a bit of testing on this (assuming it's still needed), but
the patches need rebasing - Heikki fixed a few issues, so they don't
apply cleanly.Both patches are revised.
Both patches are rebased against current master.
This looks pretty much ready. I'm going to spend some time on this on
Friday, and if all looks good, commit. (Thursday's a public holiday here).
Very good, thanks!
One quick comment:
It would be good to avoid the extra comparisons of the distances, when the
index doesn't return any lossy items. As the patch stands, it adds one
extra copyDistances() call and a cmp_distances() call for each tuple (in a
knn-search), even if there are no lossy tuples.
I will fix it until Friday.
------
With best regards,
Alexander Korotkov.
On Wed, May 13, 2015 at 10:17 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
One quick comment:
It would be good to avoid the extra comparisons of the distances, when
the index doesn't return any lossy items. As the patch stands, it adds one
extra copyDistances() call and a cmp_distances() call for each tuple (in a
knn-search), even if there are no lossy tuples.I will fix it until Friday.
Attached patch is rebased against current master. Extra extra
copyDistances() call and a cmp_distances() call for each tuple are avoided
in the case of no lossy tuples.
------
With best regards,
Alexander Korotkov.
Attachments:
knn-gist-recheck-9.patchapplication/octet-stream; name=knn-gist-recheck-9.patchDownload
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
new file mode 100644
index e7d1ff9..a5b2bda
*** a/doc/src/sgml/gist.sgml
--- b/doc/src/sgml/gist.sgml
***************
*** 105,110 ****
--- 105,111 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 163,168 ****
--- 164,170 ----
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
***************
*** 207,212 ****
--- 209,220 ----
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
*************** my_distance(PG_FUNCTION_ARGS)
*** 780,785 ****
--- 788,794 ----
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
*************** my_distance(PG_FUNCTION_ARGS)
*** 792,805 ****
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function, except that no
! recheck flag is used. The distance to a leaf index entry must always
! be determined exactly, since there is no way to re-order the tuples
! once they are returned. Some approximation is allowed when determining
! the distance to an internal tree node, so long as the result is never
! greater than any child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. The
! result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
--- 801,822 ----
</programlisting>
The arguments to the <function>distance</> function are identical to
! the arguments of the <function>consistent</> function.
! </para>
!
! <para>
! Some approximation is allowed when determining the distance to an
! internal tree node, so long as the result is never greater than any
! child's actual distance. Thus, for example, distance
! to a bounding box is usually sufficient in geometric applications. For
! leaf nodes, the returned distance must be accurate, if the
! <function>distance</> function returns *recheck == false for the tuple.
! Otherwise the same approximation is allowed, and the executor will
! re-order ambiguous cases after recalculating the actual distance.
! </para>
!
! <para>
! The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
new file mode 100644
index e4c00c2..0a8b88b
*** a/src/backend/access/gist/gistget.c
--- b/src/backend/access/gist/gistget.c
*************** gistindex_keytest(IndexScanDesc scan,
*** 176,181 ****
--- 176,182 ----
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
*************** gistindex_keytest(IndexScanDesc scan,
*** 192,207 ****
* always be zero, but might as well pass it for possible future
* use.)
*
! * Note that Distance functions don't get a recheck argument. We
! * can't tolerate lossy distance calculations on leaf tuples;
! * there is no opportunity to re-sort the tuples afterwards.
*/
! dist = FunctionCall4Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype));
*distance_p = DatumGetFloat8(dist);
}
--- 193,213 ----
* always be zero, but might as well pass it for possible future
* use.)
*
! * Distance functions get a recheck argument as well. In this
! * case the returned distance is the lower bound of distance
! * and needs to be rechecked. We return single recheck flag
! * which means that both quals and distances are to be
! * rechecked.
*/
! dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
! ObjectIdGetDatum(key->sk_subtype),
! PointerGetDatum(&recheck));
!
! *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
*************** getNextNearest(IndexScanDesc scan)
*** 434,439 ****
--- 440,446 ----
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
if (scan->xs_itup)
{
*************** getNextNearest(IndexScanDesc scan)
*** 454,459 ****
--- 461,471 ----
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_distances[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_distance_nulls[i] = false;
+ }
/* in an index-only scan, also return the reconstructed tuple. */
if (scan->xs_want_itup)
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index 9d21e3f..38dad11
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_point_distance(PG_FUNCTION_ARGS)
*** 1478,1480 ****
--- 1478,1517 ----
PG_RETURN_FLOAT8(distance);
}
+
+ /*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+ Datum
+ gist_bbox_distance(PG_FUNCTION_ARGS)
+ {
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ /* Bounding box distance is always inexact. */
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+ }
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
new file mode 100644
index 6f65398..0dba2e4
*** a/src/backend/access/gist/gistscan.c
--- b/src/backend/access/gist/gistscan.c
*************** gistbeginscan(PG_FUNCTION_ARGS)
*** 85,90 ****
--- 85,95 ----
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_distances = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_distance_nulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
new file mode 100644
index 48fa919..6ad8a23
*** a/src/backend/executor/nodeIndexscan.c
--- b/src/backend/executor/nodeIndexscan.c
***************
*** 28,41 ****
--- 28,117 ----
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+ #include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+ #include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+ /*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+ typedef struct
+ {
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *distances;
+ bool *distance_nulls;
+ } ReorderTuple;
+
+ static int
+ cmp_distances(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+ {
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+ }
+
+ /*
+ * Pairing heap provides getting topmost (greatest) element while KNN provides
+ * ascending sort. That's why we inverse sort order.
+ */
+ static int
+ reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+ {
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_distances(rta->distances, rta->distance_nulls,
+ rtb->distances, rtb->distance_nulls,
+ node);
+ }
+
+ static void
+ copyDistances(IndexScanState *node, const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+ {
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_DistanceTypByVals[i],
+ node->iss_DistanceTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+ }
static TupleTableSlot *IndexNext(IndexScanState *node);
+ static void RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot);
/* ----------------------------------------------------------------
*************** IndexNext(IndexScanState *node)
*** 54,59 ****
--- 130,137 ----
IndexScanDesc scandesc;
HeapTuple tuple;
TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *reordertuple;
/*
* extract necessary information from index scan node
*************** IndexNext(IndexScanState *node)
*** 72,82 ****
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! /*
! * ok, now that we have what we need, fetch the next tuple.
! */
! while ((tuple = index_getnext(scandesc, direction)) != NULL)
{
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
--- 150,209 ----
econtext = node->ss.ps.ps_ExprContext;
slot = node->ss.ss_ScanTupleSlot;
! for (;;)
{
+ /* Check the reorder queue first */
+ if (node->iss_ReorderQueue)
+ {
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ if (node->iss_ReachedEnd)
+ break;
+ }
+ else
+ {
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ /* Check if we can return this tuple */
+ if (node->iss_ReachedEnd ||
+ cmp_distances(reordertuple->distances,
+ reordertuple->distance_nulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) < 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = reordertuple->htup;
+ pfree(reordertuple);
+
+ /*
+ * Store the buffered tuple in the scan tuple slot of the
+ * scan state.
+ */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ }
+
+ /* Fetch next tuple from the index */
+ tuple = index_getnext(scandesc, direction);
+
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. If we have a reorder queue,
+ * we still need to drain all the remaining tuples in the queue
+ * before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ if (node->iss_ReorderQueue)
+ continue;
+ else
+ break;
+ }
+
/*
* Store the scanned tuple in the scan tuple slot of the scan state.
* Note: we pass 'false' because tuples returned by amgetnext are
*************** IndexNext(IndexScanState *node)
*** 103,108 ****
--- 230,314 ----
}
}
+ /*
+ * Re-check the ordering.
+ */
+ if (node->iss_ReorderQueue)
+ {
+ /*
+ * The index returned the distance, as calculated by the indexam,
+ * in scandesc->xs_distances. If the index was lossy, we have to
+ * recheck the ordering expression too. Otherwise we take the
+ * indexam's values as is.
+ */
+ if (scandesc->xs_recheck)
+ {
+ RecheckOrderBys(node, slot);
+ }
+ else
+ {
+ /*
+ * When both fetched tuple doesn't require recheck and
+ * reorder queue is empty then we're in simple knn case
+ * without any need of reordering. In this case we can
+ * immediately return fetched tuple without wasting cpu
+ * cycles.
+ */
+ if (pairingheap_is_empty(node->iss_ReorderQueue))
+ return slot;
+ else
+ copyDistances(node,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node->iss_Distances,
+ node->iss_DistanceNulls);
+ }
+
+ /*
+ * Can we return this tuple immediately, or does it need to be
+ * pushed to the reorder queue? If this tuple's distance was
+ * inaccurate, we can't return it yet, because the next tuple
+ * from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ reordertuple = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+ else
+ reordertuple = NULL;
+
+ if ((cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ scandesc->xs_distances,
+ scandesc->xs_distance_nulls,
+ node) > 0) ||
+ (reordertuple && cmp_distances(node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls,
+ node) > 0))
+ {
+ /* Need to put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ reordertuple = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ reordertuple->htup = heap_copytuple(tuple);
+ reordertuple->distances = (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ reordertuple->distance_nulls = (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyDistances(node,
+ node->iss_Distances,
+ node->iss_DistanceNulls,
+ reordertuple->distances,
+ reordertuple->distance_nulls);
+
+ pairingheap_add(node->iss_ReorderQueue, &reordertuple->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ }
+
+ /* Ok, got a tuple to return */
return slot;
}
*************** IndexNext(IndexScanState *node)
*** 114,119 ****
--- 320,360 ----
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+ static void
+ RecheckOrderBys(IndexScanState *node, TupleTableSlot *slot)
+ {
+ IndexScanDesc scandesc;
+ ExprContext *econtext;
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ Assert(i < scandesc->numberOfOrderBys);
+
+ node->iss_Distances[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_DistanceNulls[i],
+ NULL);
+ }
+
+ MemoryContextSwitchTo(oldContext);
+ }
+
+ /*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 465,470 ****
--- 706,712 ----
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 501,506 ****
--- 743,751 ----
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
*************** ExecInitIndexScan(IndexScan *node, EStat
*** 581,586 ****
--- 826,877 ----
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_DistanceTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_DistanceTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid distanceType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexsortops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexsortops[i],
+ &opfamily, &distanceType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexsortops[i]);
+ }
+ get_typlenbyval(distanceType,
+ &indexstate->iss_DistanceTypLens[i],
+ &indexstate->iss_DistanceTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_Distances =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_DistanceNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index c809237..b9ac491
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
***************
*** 22,27 ****
--- 22,28 ----
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+ #include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
*************** static void copy_plan_costsize(Plan *des
*** 102,108 ****
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
--- 103,109 ----
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
! List *indexorderby, List *indexorderbyorig, Oid *sortOperators,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
*************** static Plan *prepare_sort_from_pathkeys(
*** 168,174 ****
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids);
static Material *make_material(Plan *lefttree);
--- 169,175 ----
Oid **p_collations,
bool **p_nullsFirst);
static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
*************** create_indexscan_plan(PlannerInfo *root,
*** 1158,1163 ****
--- 1159,1165 ----
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *sortOperators = NULL;
ListCell *l;
/* it should be a base rel... */
*************** create_indexscan_plan(PlannerInfo *root,
*** 1269,1274 ****
--- 1271,1312 ----
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell,
+ *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+
+ sortOperators = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey. Pathkey contains pointer
+ * to equivalence class. But it's not enough because we need the
+ * expression datatype to lookup opfamily member. That's why we've
+ * to dig and equivalence member.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ EquivalenceMember *em;
+ pathkey = (PathKey *) lfirst(pathkeyCell);
+ expr = (Expr *) lfirst(exprCell);
+
+ /* Find equivalence member by order by expression */
+ em = find_ec_member_for_tle(pathkey->pk_eclass, expr, NULL);
+
+ /* Get sort operator from opfamily */
+ sortOperators[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
*************** create_indexscan_plan(PlannerInfo *root,
*** 1288,1293 ****
--- 1326,1332 ----
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ sortOperators,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
*************** make_indexscan(List *qptlist,
*** 3344,3349 ****
--- 3383,3389 ----
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *sortOperators,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
*************** make_indexscan(List *qptlist,
*** 3361,3366 ****
--- 3401,3407 ----
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
node->indexorderdir = indexscandir;
+ node->indexsortops = sortOperators;
return node;
}
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 3990,3996 ****
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr at right place in tlist */
--- 4031,4037 ----
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4021,4027 ****
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle, relids);
if (em)
{
/* found expr already in tlist */
--- 4062,4068 ----
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
! em = find_ec_member_for_tle(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
*************** prepare_sort_from_pathkeys(PlannerInfo *
*** 4149,4162 ****
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! TargetEntry *tle,
Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
--- 4190,4201 ----
*/
static EquivalenceMember *
find_ec_member_for_tle(EquivalenceClass *ec,
! Expr *tlexpr,
Relids relids)
{
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
while (tlexpr && IsA(tlexpr, RelabelType))
tlexpr = ((RelabelType *) tlexpr)->arg;
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 39a7855..a3af08a
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** dist_ppoly(PG_FUNCTION_ARGS)
*** 2657,2662 ****
--- 2657,2674 ----
PG_RETURN_FLOAT8(result);
}
+ Datum
+ dist_polyp(PG_FUNCTION_ARGS)
+ {
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+ }
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
*************** dist_pc(PG_FUNCTION_ARGS)
*** 5112,5117 ****
--- 5124,5144 ----
PG_RETURN_FLOAT8(result);
}
+ /*
+ * Distance from a circle to a point
+ */
+ Datum
+ dist_cpoint(PG_FUNCTION_ARGS)
+ {
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+ }
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
new file mode 100644
index d86590a..f129c4b
*** a/src/include/access/genam.h
--- b/src/include/access/genam.h
*************** extern void index_restrpos(IndexScanDesc
*** 147,153 ****
--- 147,156 ----
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+ extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
new file mode 100644
index 9bb6362..e1f2031
*** a/src/include/access/relscan.h
--- b/src/include/access/relscan.h
*************** typedef struct IndexScanDescData
*** 91,96 ****
--- 91,105 ----
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the "distance" of the last
+ * returned heap tuple according to the index. If xs_recheck is true,
+ * this needs to be rechecked just like the scan keys, and the value
+ * returned here is a lower-bound on the actual distance.
+ */
+ Datum *xs_distances;
+ bool *xs_distance_nulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index 5aab896..ed44e05
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert ( 2594 604 604 11 s 2577 7
*** 650,655 ****
--- 650,656 ----
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+ DATA(insert ( 2594 604 600 15 o 4588 783 1970 ));
/*
* gist circle_ops
*************** DATA(insert ( 2595 718 718 11 s 1514 7
*** 669,674 ****
--- 670,676 ----
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+ DATA(insert ( 2595 718 600 15 o 4586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index e3de3b5..2aca30c
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert ( 2594 604 604 4 2580 ));
*** 208,213 ****
--- 208,214 ----
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+ DATA(insert ( 2594 604 604 8 4589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
*************** DATA(insert ( 2595 718 718 4 2580 ));
*** 215,220 ****
--- 216,222 ----
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+ DATA(insert ( 2595 718 718 8 4589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
new file mode 100644
index 34ebb50..8f08543
*** a/src/include/catalog/pg_operator.h
--- b/src/include/catalog/pg_operator.h
*************** DATA(insert OID = 1520 ( "<->" PGNSP
*** 1015,1023 ****
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
--- 1015,1027 ----
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
! DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 4586 0 dist_pc - - ));
DESCR("distance between");
! DATA(insert OID = 4586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
! DESCR("distance between");
! DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 4588 0 dist_ppoly - - ));
! DESCR("distance between");
! DATA(insert OID = 4588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index a1e2442..f0a8cd7
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DATA(insert OID = 727 ( dist_sl PGN
*** 856,861 ****
--- 856,863 ----
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+ DATA(insert OID = 4587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+ DATA(insert OID = 4585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
*************** DATA(insert OID = 2179 ( gist_point_con
*** 4165,4170 ****
--- 4167,4174 ----
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+ DATA(insert OID = 4589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+ DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
new file mode 100644
index 5ad2cc2..c7d380b
*** a/src/include/nodes/execnodes.h
--- b/src/include/nodes/execnodes.h
***************
*** 17,22 ****
--- 17,23 ----
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+ #include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
*************** typedef struct
*** 1262,1267 ****
--- 1263,1269 ----
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
*************** typedef struct
*** 1272,1283 ****
--- 1274,1293 ----
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue queue of re-check tuples that need reordering
+ * Distances re-checked distances of last fetched tuple
+ * SortSupport for re-ordering ORDER BY exprs
+ * ReachedEnd have we fetched all tuples from index already?
+ * DistanceTypByVals is the datatype of order by expression pass-by-value?
+ * DistanceTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
*************** typedef struct IndexScanState
*** 1288,1293 ****
--- 1298,1312 ----
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ Datum *iss_Distances;
+ bool *iss_DistanceNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_DistanceTypByVals;
+ int16 *iss_DistanceTypLens;
+ bool iss_ReachedEnd;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
new file mode 100644
index 1494b33..e435cec
*** a/src/include/nodes/plannodes.h
--- b/src/include/nodes/plannodes.h
*************** typedef Scan SeqScan;
*** 311,317 ****
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
--- 311,321 ----
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
! * indexorderbyorig is used at run time to recheck the ordering, if the index
! * does not calculate an accurate ordering. It is also needed for EXPLAIN.
! *
! * indexsortops is an array of operators used to sort the ORDER BY expressions,
! * used together with indexorderbyorig to recheck ordering at run time.
* (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
*************** typedef struct IndexScan
*** 325,331 ****
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
--- 329,336 ----
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
! List *indexorderbyorig; /* the same in original form */
! Oid *indexsortops; /* OIDs of operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 4377baa..69e3d7e
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** extern Datum circle_diameter(PG_FUNCTION
*** 394,401 ****
--- 394,403 ----
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+ extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+ extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
*************** extern Datum gist_circle_consistent(PG_F
*** 420,428 ****
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
-
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
--- 422,430 ----
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+ extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
/* geo_selfuncs.c */
extern Datum areasel(PG_FUNCTION_ARGS);
extern Datum areajoinsel(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
new file mode 100644
index abe64e5..a95fe29
*** a/src/test/regress/expected/create_index.out
--- b/src/test/regress/expected/create_index.out
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 372,377 ****
--- 372,407 ----
48
(1 row)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 1152,1157 ****
--- 1182,1235 ----
48
(1 row)
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+ -----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+ (3 rows)
+
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+ -------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+ (10 rows)
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+ ---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+ (3 rows)
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+ -----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+ (10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
new file mode 100644
index f779fa0..5df9008
*** a/src/test/regress/sql/create_index.sql
--- b/src/test/regress/sql/create_index.sql
*************** SELECT count(*) FROM radix_text_tbl WHER
*** 224,229 ****
--- 224,233 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
*************** EXPLAIN (COSTS OFF)
*** 437,442 ****
--- 441,454 ----
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+ EXPLAIN (COSTS OFF)
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
On 05/14/2015 01:43 PM, Alexander Korotkov wrote:
On Wed, May 13, 2015 at 10:17 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:One quick comment:
It would be good to avoid the extra comparisons of the distances, when
the index doesn't return any lossy items. As the patch stands, it adds one
extra copyDistances() call and a cmp_distances() call for each tuple (in a
knn-search), even if there are no lossy tuples.I will fix it until Friday.
Attached patch is rebased against current master. Extra extra
copyDistances() call and a cmp_distances() call for each tuple are avoided
in the case of no lossy tuples.
Thanks! I spent some time cleaning this up:
* fixed a memory leak
* fixed a silly bug in rechecking multi-column scans
* I restructured the changes to IndexNext. I actually created a whole
separate copy of IndexNext, called IndexNextWithReorder, that is used
when there are ORDER BY expressions that might need to be rechecked.
There is now some duplicated code between them, but I think they are
both easier to understand this way. The IndexNext function is now as
simple as before, and the IndexNextWithReorder doesn't need so many
if()-checks on whether the reorder queue exists at all.
* I renamed Distance to OrderByValues in the executor parts. We call the
"ORDER BY x <-> y" construct an ORDER BY expression, so let's continue
using that terminology.
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.
Forgot to attach the latest patch, here you go.
- Heikki
Attachments:
0001-Allow-GiST-distance-function-to-return-merely-a-lowe.patchapplication/x-patch; name=0001-Allow-GiST-distance-function-to-return-merely-a-lowe.patchDownload
From df00d9c972a760e1ed777a7c9b1603dad1d3f134 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Fri, 15 May 2015 00:56:27 +0300
Subject: [PATCH 1/1] Allow GiST distance function to return merely a
lower-bound.
The distance function can now set *recheck = false, like index quals. The
executor will then re-check the ORDER BY expressions, and use a queue to
reorder the results on the fly.
This makes it possible to do kNN-searches on polygons and circles, which
store a bounding box in the index, rather than the exact value.
Alexander Korotkov and me
---
doc/src/sgml/gist.sgml | 35 ++-
src/backend/access/gist/gistget.c | 22 +-
src/backend/access/gist/gistproc.c | 37 +++
src/backend/access/gist/gistscan.c | 5 +
src/backend/executor/nodeIndexscan.c | 351 ++++++++++++++++++++++++++++-
src/backend/optimizer/plan/createplan.c | 69 ++++--
src/backend/utils/adt/geo_ops.c | 27 +++
src/include/access/genam.h | 3 +
src/include/access/relscan.h | 9 +
src/include/catalog/catversion.h | 2 +-
src/include/catalog/pg_amop.h | 2 +
src/include/catalog/pg_amproc.h | 2 +
src/include/catalog/pg_operator.h | 8 +-
src/include/catalog/pg_proc.h | 4 +
src/include/nodes/execnodes.h | 19 ++
src/include/nodes/plannodes.h | 12 +-
src/include/utils/geo_decls.h | 3 +
src/test/regress/expected/create_index.out | 78 +++++++
src/test/regress/sql/create_index.sql | 12 +
19 files changed, 663 insertions(+), 37 deletions(-)
diff --git a/doc/src/sgml/gist.sgml b/doc/src/sgml/gist.sgml
index e7d1ff9..1291f8d 100644
--- a/doc/src/sgml/gist.sgml
+++ b/doc/src/sgml/gist.sgml
@@ -105,6 +105,7 @@
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
@@ -163,6 +164,7 @@
<literal>~=</>
</entry>
<entry>
+ <literal><-></>
</entry>
</row>
<row>
@@ -207,6 +209,12 @@
</table>
<para>
+ Currently, ordering by the distance operator <literal><-></>
+ is supported only with <literal>point</> by the operator classes
+ of the geometric types.
+ </para>
+
+ <para>
For historical reasons, the <literal>inet_ops</> operator class is
not the default class for types <type>inet</> and <type>cidr</>.
To use it, mention the class name in <command>CREATE INDEX</>,
@@ -780,6 +788,7 @@ my_distance(PG_FUNCTION_ARGS)
data_type *query = PG_GETARG_DATA_TYPE_P(1);
StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
/* Oid subtype = PG_GETARG_OID(3); */
+ /* bool *recheck = (bool *) PG_GETARG_POINTER(4); */
data_type *key = DatumGetDataType(entry->key);
double retval;
@@ -792,14 +801,24 @@ my_distance(PG_FUNCTION_ARGS)
</programlisting>
The arguments to the <function>distance</> function are identical to
- the arguments of the <function>consistent</> function, except that no
- recheck flag is used. The distance to a leaf index entry must always
- be determined exactly, since there is no way to re-order the tuples
- once they are returned. Some approximation is allowed when determining
- the distance to an internal tree node, so long as the result is never
- greater than any child's actual distance. Thus, for example, distance
- to a bounding box is usually sufficient in geometric applications. The
- result value can be any finite <type>float8</> value. (Infinity and
+ the arguments of the <function>consistent</> function.
+ </para>
+
+ <para>
+ Some approximation is allowed when determining the distance, as long as
+ the result is never greater than the entry's actual distance. Thus, for
+ example, distance to a bounding box is usually sufficient in geometric
+ applications. For an internal tree node, the distance returned must not
+ be greater than the distance to any of the child nodes. If the returned
+ distance is not accurate, the function must set *recheck to false. (This
+ is not necessary for internal tree nodes; for them, the calculation is
+ always assumed to be inaccurate). The executor will calculate the
+ accurate distance after fetching the tuple from the heap, and reorder
+ the tuples if necessary.
+ </para>
+
+ <para>
+ The result value can be any finite <type>float8</> value. (Infinity and
minus infinity are used internally to handle cases such as nulls, so it
is not recommended that <function>distance</> functions return these
values.)
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index e4c00c2..90cb3e0 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -176,6 +176,7 @@ gistindex_keytest(IndexScanDesc scan,
else
{
Datum dist;
+ bool recheck;
GISTENTRY de;
gistdentryinit(giststate, key->sk_attno - 1, &de,
@@ -192,16 +193,21 @@ gistindex_keytest(IndexScanDesc scan,
* always be zero, but might as well pass it for possible future
* use.)
*
- * Note that Distance functions don't get a recheck argument. We
- * can't tolerate lossy distance calculations on leaf tuples;
- * there is no opportunity to re-sort the tuples afterwards.
+ * Distance functions get a recheck argument as well. In this
+ * case the returned distance is the lower bound of distance
+ * and needs to be rechecked. We return single recheck flag
+ * which means that both quals and distances are to be
+ * rechecked.
*/
- dist = FunctionCall4Coll(&key->sk_func,
+ dist = FunctionCall5Coll(&key->sk_func,
key->sk_collation,
PointerGetDatum(&de),
key->sk_argument,
Int32GetDatum(key->sk_strategy),
- ObjectIdGetDatum(key->sk_subtype));
+ ObjectIdGetDatum(key->sk_subtype),
+ PointerGetDatum(&recheck));
+
+ *recheck_p |= recheck;
*distance_p = DatumGetFloat8(dist);
}
@@ -434,6 +440,7 @@ getNextNearest(IndexScanDesc scan)
{
GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
bool res = false;
+ int i;
if (scan->xs_itup)
{
@@ -454,6 +461,11 @@ getNextNearest(IndexScanDesc scan)
/* found a heap item at currently minimal distance */
scan->xs_ctup.t_self = item->data.heap.heapPtr;
scan->xs_recheck = item->data.heap.recheck;
+ for (i = 0; i < scan->numberOfOrderBys; i++)
+ {
+ scan->xs_orderbyvals[i] = Float8GetDatum(item->distances[i]);
+ scan->xs_orderbynulls[i] = false;
+ }
/* in an index-only scan, also return the reconstructed tuple. */
if (scan->xs_want_itup)
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index 9d21e3f..38dad11 100644
--- a/src/backend/access/gist/gistproc.c
+++ b/src/backend/access/gist/gistproc.c
@@ -1478,3 +1478,40 @@ gist_point_distance(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(distance);
}
+
+/*
+ * The inexact GiST distance method for geometric types that store bounding
+ * boxes.
+ *
+ * Compute lossy distance from point to index entries. The result is inexact
+ * because index entries are bounding boxes, not the exact shapes of the
+ * indexed geometric types. We use distance from point to MBR of index entry.
+ * This is correct lower bound estimate of distance from point to indexed
+ * geometric type.
+ */
+Datum
+gist_bbox_distance(PG_FUNCTION_ARGS)
+{
+ GISTENTRY *entry = (GISTENTRY *) PG_GETARG_POINTER(0);
+ StrategyNumber strategy = (StrategyNumber) PG_GETARG_UINT16(2);
+ bool *recheck = (bool *) PG_GETARG_POINTER(4);
+ double distance;
+ StrategyNumber strategyGroup = strategy / GeoStrategyNumberOffset;
+
+ /* Bounding box distance is always inexact. */
+ *recheck = true;
+
+ switch (strategyGroup)
+ {
+ case PointStrategyNumberGroup:
+ distance = computeDistance(false,
+ DatumGetBoxP(entry->key),
+ PG_GETARG_POINT_P(1));
+ break;
+ default:
+ elog(ERROR, "unknown strategy number: %d", strategy);
+ distance = 0.0; /* keep compiler quiet */
+ }
+
+ PG_RETURN_FLOAT8(distance);
+}
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 6f65398..099849a 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -85,6 +85,11 @@ gistbeginscan(PG_FUNCTION_ARGS)
/* workspaces with size dependent on numberOfOrderBys: */
so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
so->qual_ok = true; /* in case there are zero keys */
+ if (scan->numberOfOrderBys > 0)
+ {
+ scan->xs_orderbyvals = palloc(sizeof(Datum) * scan->numberOfOrderBys);
+ scan->xs_orderbynulls = palloc(sizeof(bool) * scan->numberOfOrderBys);
+ }
scan->opaque = so;
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 48fa919..9386768 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -16,6 +16,7 @@
* INTERFACE ROUTINES
* ExecIndexScan scans a relation using an index
* IndexNext retrieve next tuple using index
+ * IndexNextWithReorder same, but recheck ORDER BY expressions
* ExecInitIndexScan creates and initializes state info.
* ExecReScanIndexScan rescans the indexed relation.
* ExecEndIndexScan releases all storage.
@@ -28,14 +29,92 @@
#include "access/relscan.h"
#include "executor/execdebug.h"
#include "executor/nodeIndexscan.h"
+#include "lib/pairingheap.h"
#include "optimizer/clauses.h"
#include "utils/array.h"
+#include "utils/datum.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/rel.h"
+/*
+ * When an ordering operator is used, tuples fetched from the index that
+ * need to be reordered are queued in a pairing heap, as ReorderTuples.
+ */
+typedef struct
+{
+ pairingheap_node ph_node;
+ HeapTuple htup;
+ Datum *orderbyvals;
+ bool *orderbynulls;
+} ReorderTuple;
+
+static int
+cmp_orderbyvals(const Datum *adist, const bool *anulls,
+ const Datum *bdist, const bool *bnulls,
+ IndexScanState *node)
+{
+ int i;
+ int result;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ SortSupport ssup = &node->iss_SortSupport[i];
+
+ /* Handle nulls. We only support NULLS LAST. */
+ if (anulls[i] && !bnulls[i])
+ return 1;
+ else if (!anulls[i] && bnulls[i])
+ return -1;
+ else if (anulls[i] && bnulls[i])
+ return 0;
+
+ result = ssup->comparator(adist[i], bdist[i], ssup);
+ if (result != 0)
+ return result;
+ }
+
+ return 0;
+}
+
+/*
+ * Pairing heap provides getting topmost (greatest) element while KNN provides
+ * ascending sort. That's why we inverse the sort order.
+ */
+static int
+reorderbuffer_cmp(const pairingheap_node *a, const pairingheap_node *b, void *arg)
+{
+ ReorderTuple *rta = (ReorderTuple *) a;
+ ReorderTuple *rtb = (ReorderTuple *) b;
+ IndexScanState *node = (IndexScanState *) arg;
+
+ return -cmp_orderbyvals(rta->orderbyvals, rta->orderbynulls,
+ rtb->orderbyvals, rtb->orderbynulls,
+ node);
+}
+
+static void
+copyOrderByVals(IndexScanState *node,
+ const Datum *src_datums, const bool *src_nulls,
+ Datum *dst_datums, bool *dst_nulls)
+{
+ int i;
+
+ for (i = 0; i < node->iss_NumOrderByKeys; i++)
+ {
+ if (!src_nulls[i])
+ dst_datums[i] = datumCopy(src_datums[i],
+ node->iss_OrderByTypByVals[i],
+ node->iss_OrderByTypLens[i]);
+ else
+ dst_datums[i] = (Datum) 0;
+ dst_nulls[i] = src_nulls[i];
+ }
+}
static TupleTableSlot *IndexNext(IndexScanState *node);
+static TupleTableSlot *IndexNextWithReorder(IndexScanState *node);
+static void EvalOrderByExpressions(IndexScanState *node, ExprContext *econtext);
/* ----------------------------------------------------------------
@@ -110,10 +189,221 @@ IndexNext(IndexScanState *node)
* if we get here it means the index scan failed so we are at the end of
* the scan..
*/
+ node->iss_ReachedEnd = true;
+ return ExecClearTuple(slot);
+}
+
+/* ----------------------------------------------------------------
+ * IndexNextWithReorder
+ *
+ * Like IndexNext, but his version can also re-check any
+ * ORDER BY expressions, and reorder the tuples as necessary.
+ * ----------------------------------------------------------------
+ */
+static TupleTableSlot *
+IndexNextWithReorder(IndexScanState *node)
+{
+ EState *estate;
+ ExprContext *econtext;
+ IndexScanDesc scandesc;
+ HeapTuple tuple;
+ TupleTableSlot *slot;
+ MemoryContext oldContext;
+ ReorderTuple *topmost = NULL;
+ ReorderTuple *queued;
+ bool was_exact;
+ Datum *lastfetched_vals;
+ bool *lastfetched_nulls;
+ int cmp;
+
+ estate = node->ss.ps.state;
+ /* only forward scan is supported with reordering. */
+ Assert (!ScanDirectionIsBackward(((IndexScan *) node->ss.ps.plan)->indexorderdir));
+ Assert (ScanDirectionIsForward(estate->es_direction));
+ scandesc = node->iss_ScanDesc;
+ econtext = node->ss.ps.ps_ExprContext;
+ slot = node->ss.ss_ScanTupleSlot;
+
+ for (;;)
+ {
+ /*
+ * Check the reorder queue first. If the topmost tuple in the queue
+ * has an ORDER BY value smaller than (or equal to) the value last
+ * returned by the index, we can return it now.
+ */
+ if (!pairingheap_is_empty(node->iss_ReorderQueue))
+ {
+ topmost = (ReorderTuple *) pairingheap_first(node->iss_ReorderQueue);
+
+ if (node->iss_ReachedEnd ||
+ cmp_orderbyvals(topmost->orderbyvals,
+ topmost->orderbynulls,
+ scandesc->xs_orderbyvals,
+ scandesc->xs_orderbynulls,
+ node) <= 0)
+ {
+ (void) pairingheap_remove_first(node->iss_ReorderQueue);
+
+ tuple = topmost->htup;
+ pfree(topmost->orderbyvals);
+ pfree(topmost->orderbynulls);
+ pfree(topmost);
+
+ /* Pass 'true', as the tuple in the queue is a palloc'd copy */
+ ExecStoreTuple(tuple, slot, InvalidBuffer, true);
+ return slot;
+ }
+ }
+ else if (node->iss_ReachedEnd)
+ {
+ /* Queue is empty, and no more tuples from index. We're done. */
+ return ExecClearTuple(slot);
+ }
+
+ /*
+ * Fetch next tuple from the index.
+ */
+ next_indextuple:
+ tuple = index_getnext(scandesc, ForwardScanDirection);
+ if (!tuple)
+ {
+ /*
+ * No more tuples from the index. But we still need to drain any
+ * remaining tuples from the queue before we're done.
+ */
+ node->iss_ReachedEnd = true;
+ continue;
+ }
+
+ /*
+ * Store the scanned tuple in the scan tuple slot of the scan state.
+ * Note: we pass 'false' because tuples returned by amgetnext are
+ * pointers onto disk pages and must not be pfree()'d.
+ */
+ ExecStoreTuple(tuple, /* tuple to store */
+ slot, /* slot to store in */
+ scandesc->xs_cbuf, /* buffer containing tuple */
+ false); /* don't pfree */
+
+ /*
+ * If the index was lossy, we have to recheck the index quals and
+ * ORDER BY expressions using the fetched tuple.
+ */
+ if (scandesc->xs_recheck)
+ {
+ econtext->ecxt_scantuple = slot;
+ ResetExprContext(econtext);
+ if (!ExecQual(node->indexqualorig, econtext, false))
+ {
+ /* Fails recheck, so drop it and loop back for another */
+ InstrCountFiltered2(node, 1);
+ goto next_indextuple;
+ }
+
+ EvalOrderByExpressions(node, econtext);
+
+ /*
+ * Was the ORDER BY value returned by the index accurate? The
+ * recheck flag means that the index can return inaccurate
+ * values, but then again, the value returned for any particular
+ * tuple could also be exactly correct. Compare the value returned
+ * by the index with the recalculated value. (If the value happened
+ * to be accurate, we can often avoid pushing the tuple to the
+ * queue, just to pop it back out again.)
+ */
+ cmp = cmp_orderbyvals(node->iss_OrderByValues,
+ node->iss_OrderByNulls,
+ scandesc->xs_orderbyvals,
+ scandesc->xs_orderbynulls,
+ node);
+ if (cmp < 0)
+ elog(ERROR, "index returned tuples in wrong order");
+ else if (cmp == 0)
+ was_exact = true;
+ else
+ was_exact = false;
+ lastfetched_vals = node->iss_OrderByValues;
+ lastfetched_nulls = node->iss_OrderByNulls;
+ }
+ else
+ {
+ was_exact = true;
+ lastfetched_vals = scandesc->xs_orderbyvals;
+ lastfetched_nulls = scandesc->xs_orderbynulls;
+ }
+
+ /*
+ * Can we return this tuple immediately, or does it need to be pushed
+ * to the reorder queue? If the ORDER BY expression values returned
+ * by the index were inaccurate, we can't return it yet, because the
+ * next tuple from the index might need to come before this one. Also,
+ * we can't return it yet if there are any smaller tuples in the
+ * queue already.
+ */
+ if (!was_exact || (topmost && cmp_orderbyvals(lastfetched_vals,
+ lastfetched_nulls,
+ topmost->orderbyvals,
+ topmost->orderbynulls,
+ node) > 0))
+ {
+ /* Put this to the queue */
+ oldContext = MemoryContextSwitchTo(estate->es_query_cxt);
+ queued = (ReorderTuple *) palloc(sizeof(ReorderTuple));
+ queued->htup = heap_copytuple(tuple);
+ queued->orderbyvals =
+ (Datum *) palloc(sizeof(Datum) * scandesc->numberOfOrderBys);
+ queued->orderbynulls =
+ (bool *) palloc(sizeof(bool) * scandesc->numberOfOrderBys);
+ copyOrderByVals(node, lastfetched_vals, lastfetched_nulls,
+ queued->orderbyvals, queued->orderbynulls);
+ pairingheap_add(node->iss_ReorderQueue, &queued->ph_node);
+
+ MemoryContextSwitchTo(oldContext);
+
+ continue;
+ }
+ else
+ {
+ /* Can return this tuple immediately. */
+ return slot;
+ }
+ }
+
+ /*
+ * if we get here it means the index scan failed so we are at the end of
+ * the scan..
+ */
return ExecClearTuple(slot);
}
/*
+ * Calculate the expressions in the ORDER BY clause, based on the heap tuple.
+ */
+static void
+EvalOrderByExpressions(IndexScanState *node, ExprContext *econtext)
+{
+ int i;
+ ListCell *l;
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+
+ i = 0;
+ foreach(l, node->indexorderbyorig)
+ {
+ ExprState *orderby = (ExprState *) lfirst(l);
+
+ node->iss_OrderByValues[i] = ExecEvalExpr(orderby,
+ econtext,
+ &node->iss_OrderByNulls[i],
+ NULL);
+ i++;
+ }
+
+ MemoryContextSwitchTo(oldContext);
+}
+
+/*
* IndexRecheck -- access method routine to recheck a tuple in EvalPlanQual
*/
static bool
@@ -147,9 +437,14 @@ ExecIndexScan(IndexScanState *node)
if (node->iss_NumRuntimeKeys != 0 && !node->iss_RuntimeKeysReady)
ExecReScan((PlanState *) node);
- return ExecScan(&node->ss,
- (ExecScanAccessMtd) IndexNext,
- (ExecScanRecheckMtd) IndexRecheck);
+ if (node->iss_NumOrderByKeys > 0)
+ return ExecScan(&node->ss,
+ (ExecScanAccessMtd) IndexNextWithReorder,
+ (ExecScanRecheckMtd) IndexRecheck);
+ else
+ return ExecScan(&node->ss,
+ (ExecScanAccessMtd) IndexNext,
+ (ExecScanRecheckMtd) IndexRecheck);
}
/* ----------------------------------------------------------------
@@ -465,6 +760,7 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
IndexScanState *indexstate;
Relation currentRelation;
bool relistarget;
+ int i;
/*
* create state structure
@@ -501,6 +797,9 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
indexstate->indexqualorig = (List *)
ExecInitExpr((Expr *) node->indexqualorig,
(PlanState *) indexstate);
+ indexstate->indexorderbyorig = (List *)
+ ExecInitExpr((Expr *) node->indexorderbyorig,
+ (PlanState *) indexstate);
/*
* tuple table initialization
@@ -581,6 +880,52 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
NULL, /* no ArrayKeys */
NULL);
+ /* Initialize sort support, if we need to re-check ORDER BY exprs */
+ if (indexstate->iss_NumOrderByKeys > 0)
+ {
+ int numOrderByKeys = indexstate->iss_NumOrderByKeys;
+
+ /*
+ * Prepare sort support, and look up the distance type for each
+ * ORDER BY expression.
+ */
+ indexstate->iss_SortSupport =
+ palloc0(numOrderByKeys * sizeof(SortSupportData));
+ indexstate->iss_OrderByTypByVals =
+ palloc(numOrderByKeys * sizeof(bool));
+ indexstate->iss_OrderByTypLens =
+ palloc(numOrderByKeys * sizeof(int16));
+ for (i = 0; i < indexstate->iss_NumOrderByKeys; i++)
+ {
+ Oid orderbyType;
+ Oid opfamily;
+ int16 strategy;
+
+ PrepareSortSupportFromOrderingOp(node->indexorderbyops[i],
+ &indexstate->iss_SortSupport[i]);
+
+ if (!get_ordering_op_properties(node->indexorderbyops[i],
+ &opfamily, &orderbyType, &strategy))
+ {
+ elog(LOG, "operator %u is not a valid ordering operator",
+ node->indexorderbyops[i]);
+ }
+ get_typlenbyval(orderbyType,
+ &indexstate->iss_OrderByTypLens[i],
+ &indexstate->iss_OrderByTypByVals[i]);
+ }
+
+ /* allocate arrays to hold the re-calculated distances */
+ indexstate->iss_OrderByValues =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(Datum));
+ indexstate->iss_OrderByNulls =
+ palloc(indexstate->iss_NumOrderByKeys * sizeof(bool));
+
+ /* and initialize the reorder queue */
+ indexstate->iss_ReorderQueue = pairingheap_allocate(reorderbuffer_cmp,
+ indexstate);
+ }
+
/*
* If we have runtime keys, we need an ExprContext to evaluate them. The
* node's standard context won't do because we want to reset that context
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index c809237..6403a7b 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -22,6 +22,7 @@
#include "access/skey.h"
#include "access/sysattr.h"
#include "catalog/pg_class.h"
+#include "catalog/pg_operator.h"
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "nodes/makefuncs.h"
@@ -102,7 +103,7 @@ static void copy_plan_costsize(Plan *dest, Plan *src);
static SeqScan *make_seqscan(List *qptlist, List *qpqual, Index scanrelid);
static IndexScan *make_indexscan(List *qptlist, List *qpqual, Index scanrelid,
Oid indexid, List *indexqual, List *indexqualorig,
- List *indexorderby, List *indexorderbyorig,
+ List *indexorderby, List *indexorderbyorig, Oid *indexorderbyops,
ScanDirection indexscandir);
static IndexOnlyScan *make_indexonlyscan(List *qptlist, List *qpqual,
Index scanrelid, Oid indexid,
@@ -167,8 +168,8 @@ static Plan *prepare_sort_from_pathkeys(PlannerInfo *root,
Oid **p_sortOperators,
Oid **p_collations,
bool **p_nullsFirst);
-static EquivalenceMember *find_ec_member_for_tle(EquivalenceClass *ec,
- TargetEntry *tle,
+static EquivalenceMember *find_ec_member_for_expr(EquivalenceClass *ec,
+ Expr *tlexpr,
Relids relids);
static Material *make_material(Plan *lefttree);
@@ -1158,6 +1159,7 @@ create_indexscan_plan(PlannerInfo *root,
List *stripped_indexquals;
List *fixed_indexquals;
List *fixed_indexorderbys;
+ Oid *indexorderbyops = NULL;
ListCell *l;
/* it should be a base rel... */
@@ -1269,6 +1271,42 @@ create_indexscan_plan(PlannerInfo *root,
replace_nestloop_params(root, (Node *) indexorderbys);
}
+ if (best_path->path.pathkeys && indexorderbys)
+ {
+ int numOrderBys = list_length(indexorderbys);
+ int i;
+ ListCell *pathkeyCell,
+ *exprCell;
+ PathKey *pathkey;
+ Expr *expr;
+ EquivalenceMember *em;
+
+ indexorderbyops = (Oid *) palloc(numOrderBys * sizeof(Oid));
+
+ /*
+ * Get ordering operator for each pathkey. Pathkey contains pointer
+ * to equivalence class. But it's not enough because we need the
+ * expression datatype to lookup opfamily member. That's why we've
+ * to dig and equivalence member.
+ */
+ i = 0;
+ forboth (pathkeyCell, best_path->path.pathkeys, exprCell, indexorderbys)
+ {
+ pathkey = (PathKey *) lfirst(pathkeyCell);
+ expr = (Expr *) lfirst(exprCell);
+
+ /* Find equivalence member for the order by expression */
+ em = find_ec_member_for_expr(pathkey->pk_eclass, expr, NULL);
+
+ /* Get sort operator from opfamily */
+ indexorderbyops[i] = get_opfamily_member(pathkey->pk_opfamily,
+ em->em_datatype,
+ em->em_datatype,
+ pathkey->pk_strategy);
+ i++;
+ }
+ }
+
/* Finally ready to build the plan node */
if (indexonly)
scan_plan = (Scan *) make_indexonlyscan(tlist,
@@ -1288,6 +1326,7 @@ create_indexscan_plan(PlannerInfo *root,
stripped_indexquals,
fixed_indexorderbys,
indexorderbys,
+ indexorderbyops,
best_path->indexscandir);
copy_path_costsize(&scan_plan->plan, &best_path->path);
@@ -3344,6 +3383,7 @@ make_indexscan(List *qptlist,
List *indexqualorig,
List *indexorderby,
List *indexorderbyorig,
+ Oid *indexorderbyops,
ScanDirection indexscandir)
{
IndexScan *node = makeNode(IndexScan);
@@ -3360,6 +3400,7 @@ make_indexscan(List *qptlist,
node->indexqualorig = indexqualorig;
node->indexorderby = indexorderby;
node->indexorderbyorig = indexorderbyorig;
+ node->indexorderbyops = indexorderbyops;
node->indexorderdir = indexscandir;
return node;
@@ -3990,7 +4031,7 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
tle = get_tle_by_resno(tlist, reqColIdx[numsortkeys]);
if (tle)
{
- em = find_ec_member_for_tle(ec, tle, relids);
+ em = find_ec_member_for_expr(ec, tle->expr, relids);
if (em)
{
/* found expr at right place in tlist */
@@ -4021,7 +4062,7 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
foreach(j, tlist)
{
tle = (TargetEntry *) lfirst(j);
- em = find_ec_member_for_tle(ec, tle, relids);
+ em = find_ec_member_for_expr(ec, tle->expr, relids);
if (em)
{
/* found expr already in tlist */
@@ -4142,23 +4183,21 @@ prepare_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys,
}
/*
- * find_ec_member_for_tle
- * Locate an EquivalenceClass member matching the given TLE, if any
+ * find_ec_member_for_expr
+ * Locate an EquivalenceClass member matching the given expression, if any
*
* Child EC members are ignored unless they match 'relids'.
*/
static EquivalenceMember *
-find_ec_member_for_tle(EquivalenceClass *ec,
- TargetEntry *tle,
- Relids relids)
+find_ec_member_for_expr(EquivalenceClass *ec,
+ Expr *expr,
+ Relids relids)
{
- Expr *tlexpr;
ListCell *lc;
/* We ignore binary-compatible relabeling on both ends */
- tlexpr = tle->expr;
- while (tlexpr && IsA(tlexpr, RelabelType))
- tlexpr = ((RelabelType *) tlexpr)->arg;
+ while (expr && IsA(expr, RelabelType))
+ expr = ((RelabelType *) expr)->arg;
foreach(lc, ec->ec_members)
{
@@ -4184,7 +4223,7 @@ find_ec_member_for_tle(EquivalenceClass *ec,
while (emexpr && IsA(emexpr, RelabelType))
emexpr = ((RelabelType *) emexpr)->arg;
- if (equal(emexpr, tlexpr))
+ if (equal(emexpr, expr))
return em;
}
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index 39a7855..a3af08a 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -2657,6 +2657,18 @@ dist_ppoly(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(result);
}
+Datum
+dist_polyp(PG_FUNCTION_ARGS)
+{
+ POLYGON *poly = PG_GETARG_POLYGON_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = dist_ppoly_internal(point, poly);
+
+ PG_RETURN_FLOAT8(result);
+}
+
static double
dist_ppoly_internal(Point *pt, POLYGON *poly)
{
@@ -5112,6 +5124,21 @@ dist_pc(PG_FUNCTION_ARGS)
PG_RETURN_FLOAT8(result);
}
+/*
+ * Distance from a circle to a point
+ */
+Datum
+dist_cpoint(PG_FUNCTION_ARGS)
+{
+ CIRCLE *circle = PG_GETARG_CIRCLE_P(0);
+ Point *point = PG_GETARG_POINT_P(1);
+ float8 result;
+
+ result = point_dt(point, &circle->center) - circle->radius;
+ if (result < 0)
+ result = 0;
+ PG_RETURN_FLOAT8(result);
+}
/* circle_center - returns the center point of the circle.
*/
diff --git a/src/include/access/genam.h b/src/include/access/genam.h
index d86590a..f129c4b 100644
--- a/src/include/access/genam.h
+++ b/src/include/access/genam.h
@@ -147,7 +147,10 @@ extern void index_restrpos(IndexScanDesc scan);
extern ItemPointer index_getnext_tid(IndexScanDesc scan,
ScanDirection direction);
extern HeapTuple index_fetch_heap(IndexScanDesc scan);
+extern bool index_get_heap_values(IndexScanDesc scan, ItemPointer heapPtr,
+ Datum values[INDEX_MAX_KEYS], bool isnull[INDEX_MAX_KEYS]);
extern HeapTuple index_getnext(IndexScanDesc scan, ScanDirection direction);
+
extern int64 index_getbitmap(IndexScanDesc scan, TIDBitmap *bitmap);
extern IndexBulkDeleteResult *index_bulk_delete(IndexVacuumInfo *info,
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index 9bb6362..7ee0206 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -91,6 +91,15 @@ typedef struct IndexScanDescData
/* NB: if xs_cbuf is not InvalidBuffer, we hold a pin on that buffer */
bool xs_recheck; /* T means scan keys must be rechecked */
+ /*
+ * If fetching with an ordering operator, the values of the ORDER BY
+ * expression of the last returned heap tuple according to the index. If
+ * xs_recheck is true, this needs to be rechecked just like the scan keys,
+ * and the value returned here is a lower-bound on the actual value.
+ */
+ Datum *xs_orderbyvals;
+ bool *xs_orderbynulls;
+
/* state data for traversing HOT chains in index_getnext */
bool xs_continue_hot; /* T if must keep walking HOT chain */
} IndexScanDescData;
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index a350832..b6a6da9 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -53,6 +53,6 @@
*/
/* yyyymmddN */
-#define CATALOG_VERSION_NO 201505121
+#define CATALOG_VERSION_NO 201505151
#endif
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 5aab896..ed44e05 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -650,6 +650,7 @@ DATA(insert ( 2594 604 604 11 s 2577 783 0 ));
DATA(insert ( 2594 604 604 12 s 2576 783 0 ));
DATA(insert ( 2594 604 604 13 s 2861 783 0 ));
DATA(insert ( 2594 604 604 14 s 2860 783 0 ));
+DATA(insert ( 2594 604 600 15 o 4588 783 1970 ));
/*
* gist circle_ops
@@ -669,6 +670,7 @@ DATA(insert ( 2595 718 718 11 s 1514 783 0 ));
DATA(insert ( 2595 718 718 12 s 2590 783 0 ));
DATA(insert ( 2595 718 718 13 s 2865 783 0 ));
DATA(insert ( 2595 718 718 14 s 2864 783 0 ));
+DATA(insert ( 2595 718 600 15 o 4586 783 1970 ));
/*
* gin array_ops (these anyarray operators are used with all the opclasses
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index e3de3b5..2aca30c 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -208,6 +208,7 @@ DATA(insert ( 2594 604 604 4 2580 ));
DATA(insert ( 2594 604 604 5 2581 ));
DATA(insert ( 2594 604 604 6 2582 ));
DATA(insert ( 2594 604 604 7 2584 ));
+DATA(insert ( 2594 604 604 8 4589 ));
DATA(insert ( 2595 718 718 1 2591 ));
DATA(insert ( 2595 718 718 2 2583 ));
DATA(insert ( 2595 718 718 3 2592 ));
@@ -215,6 +216,7 @@ DATA(insert ( 2595 718 718 4 2580 ));
DATA(insert ( 2595 718 718 5 2581 ));
DATA(insert ( 2595 718 718 6 2582 ));
DATA(insert ( 2595 718 718 7 2584 ));
+DATA(insert ( 2595 718 718 8 4589 ));
DATA(insert ( 3655 3614 3614 1 3654 ));
DATA(insert ( 3655 3614 3614 2 3651 ));
DATA(insert ( 3655 3614 3614 3 3648 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index 34ebb50..8f08543 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -1015,9 +1015,13 @@ DATA(insert OID = 1520 ( "<->" PGNSP PGUID b f f 718 718 701 1520 0 ci
DESCR("distance between");
DATA(insert OID = 1521 ( "#" PGNSP PGUID l f f 0 604 23 0 0 poly_npoints - - ));
DESCR("number of points");
-DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 0 0 dist_pc - - ));
+DATA(insert OID = 1522 ( "<->" PGNSP PGUID b f f 600 718 701 4586 0 dist_pc - - ));
DESCR("distance between");
-DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 0 0 dist_ppoly - - ));
+DATA(insert OID = 4586 ( "<->" PGNSP PGUID b f f 718 600 701 1522 0 dist_cpoint - - ));
+DESCR("distance between");
+DATA(insert OID = 3276 ( "<->" PGNSP PGUID b f f 600 604 701 4588 0 dist_ppoly - - ));
+DESCR("distance between");
+DATA(insert OID = 4588 ( "<->" PGNSP PGUID b f f 604 600 701 3276 0 dist_polyp - - ));
DESCR("distance between");
DATA(insert OID = 1523 ( "<->" PGNSP PGUID b f f 718 604 701 0 0 dist_cpoly - - ));
DESCR("distance between");
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index a1e2442..f0a8cd7 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -856,6 +856,8 @@ DATA(insert OID = 727 ( dist_sl PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 70
DATA(insert OID = 728 ( dist_cpoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 604" _null_ _null_ _null_ _null_ _null_ dist_cpoly _null_ _null_ _null_ ));
DATA(insert OID = 729 ( poly_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 604" _null_ _null_ _null_ _null_ _null_ poly_distance _null_ _null_ _null_ ));
DATA(insert OID = 3275 ( dist_ppoly PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "600 604" _null_ _null_ _null_ _null_ _null_ dist_ppoly _null_ _null_ _null_ ));
+DATA(insert OID = 4587 ( dist_polyp PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "604 600" _null_ _null_ _null_ _null_ _null_ dist_polyp _null_ _null_ _null_ ));
+DATA(insert OID = 4585 ( dist_cpoint PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 701 "718 600" _null_ _null_ _null_ _null_ _null_ dist_cpoint _null_ _null_ _null_ ));
DATA(insert OID = 740 ( text_lt PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ text_lt _null_ _null_ _null_ ));
DATA(insert OID = 741 ( text_le PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 16 "25 25" _null_ _null_ _null_ _null_ _null_ text_le _null_ _null_ _null_ ));
@@ -4165,6 +4167,8 @@ DATA(insert OID = 2179 ( gist_point_consistent PGNSP PGUID 12 1 0 0 0 f f f f t
DESCR("GiST support");
DATA(insert OID = 3064 ( gist_point_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ _null_ gist_point_distance _null_ _null_ _null_ ));
DESCR("GiST support");
+DATA(insert OID = 4589 ( gist_bbox_distance PGNSP PGUID 12 1 0 0 0 f f f f t f i 4 0 701 "2281 600 23 26" _null_ _null_ _null_ _null_ _null_ gist_bbox_distance _null_ _null_ _null_ ));
+DESCR("GiST support");
/* GIN */
DATA(insert OID = 2731 ( gingetbitmap PGNSP PGUID 12 1 0 0 0 f f f f t f v 2 0 20 "2281 2281" _null_ _null_ _null_ _null_ _null_ gingetbitmap _null_ _null_ _null_ ));
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5ad2cc2..95e8a7f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -17,6 +17,7 @@
#include "access/genam.h"
#include "access/heapam.h"
#include "executor/instrument.h"
+#include "lib/pairingheap.h"
#include "nodes/params.h"
#include "nodes/plannodes.h"
#include "utils/reltrigger.h"
@@ -1262,6 +1263,7 @@ typedef struct
* IndexScanState information
*
* indexqualorig execution state for indexqualorig expressions
+ * indexorderbyorig execution state for indexorderbyorig expressions
* ScanKeys Skey structures for index quals
* NumScanKeys number of ScanKeys
* OrderByKeys Skey structures for index ordering operators
@@ -1272,12 +1274,20 @@ typedef struct
* RuntimeContext expr context for evaling runtime Skeys
* RelationDesc index relation descriptor
* ScanDesc index scan descriptor
+ *
+ * ReorderQueue tuples that need reordering due to re-check
+ * ReachedEnd have we fetched all tuples from index already?
+ * OrderByValues values of ORDER BY exprs of last fetched tuple
+ * SortSupport for reordering ORDER BY exprs
+ * OrderByTypByVals is the datatype of order by expression pass-by-value?
+ * OrderByTypLens typlens of the datatypes of order by expressions
* ----------------
*/
typedef struct IndexScanState
{
ScanState ss; /* its first field is NodeTag */
List *indexqualorig;
+ List *indexorderbyorig;
ScanKey iss_ScanKeys;
int iss_NumScanKeys;
ScanKey iss_OrderByKeys;
@@ -1288,6 +1298,15 @@ typedef struct IndexScanState
ExprContext *iss_RuntimeContext;
Relation iss_RelationDesc;
IndexScanDesc iss_ScanDesc;
+
+ /* These are needed for re-checking ORDER BY expr ordering */
+ pairingheap *iss_ReorderQueue;
+ bool iss_ReachedEnd;
+ Datum *iss_OrderByValues;
+ bool *iss_OrderByNulls;
+ SortSupport iss_SortSupport;
+ bool *iss_OrderByTypByVals;
+ int16 *iss_OrderByTypLens;
} IndexScanState;
/* ----------------
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 1494b33..9c02e20 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -311,8 +311,13 @@ typedef Scan SeqScan;
* index column order. Only the expressions are provided, not the auxiliary
* sort-order information from the ORDER BY SortGroupClauses; it's assumed
* that the sort ordering is fully determinable from the top-level operators.
- * indexorderbyorig is unused at run time, but is needed for EXPLAIN.
- * (Note these fields are used for amcanorderbyop cases, not amcanorder cases.)
+ * indexorderbyorig is used at runtime to recheck the ordering, if the index
+ * cannot calculate an accurate ordering. It is also needed for EXPLAIN.
+ *
+ * indexorderbyops is an array of operators used to sort the ORDER BY
+ * expressions, used together with indexorderbyorig to recheck ordering at run
+ * time. (Note these fields are used for amcanorderbyop cases, not amcanorder
+ * cases.)
*
* indexorderdir specifies the scan ordering, for indexscans on amcanorder
* indexes (for other indexes it should be "don't care").
@@ -325,7 +330,8 @@ typedef struct IndexScan
List *indexqual; /* list of index quals (usually OpExprs) */
List *indexqualorig; /* the same in original form */
List *indexorderby; /* list of index ORDER BY exprs */
- List *indexorderbyorig; /* the same in original form */
+ List *indexorderbyorig; /* the same in original form */
+ Oid *indexorderbyops; /* operators to sort ORDER BY exprs */
ScanDirection indexorderdir; /* forward or backward or don't care */
} IndexScan;
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
index 4377baa..2311d35 100644
--- a/src/include/utils/geo_decls.h
+++ b/src/include/utils/geo_decls.h
@@ -394,8 +394,10 @@ extern Datum circle_diameter(PG_FUNCTION_ARGS);
extern Datum circle_radius(PG_FUNCTION_ARGS);
extern Datum circle_distance(PG_FUNCTION_ARGS);
extern Datum dist_pc(PG_FUNCTION_ARGS);
+extern Datum dist_cpoint(PG_FUNCTION_ARGS);
extern Datum dist_cpoly(PG_FUNCTION_ARGS);
extern Datum dist_ppoly(PG_FUNCTION_ARGS);
+extern Datum dist_polyp(PG_FUNCTION_ARGS);
extern Datum circle_center(PG_FUNCTION_ARGS);
extern Datum cr_circle(PG_FUNCTION_ARGS);
extern Datum box_circle(PG_FUNCTION_ARGS);
@@ -420,6 +422,7 @@ extern Datum gist_circle_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_compress(PG_FUNCTION_ARGS);
extern Datum gist_point_consistent(PG_FUNCTION_ARGS);
extern Datum gist_point_distance(PG_FUNCTION_ARGS);
+extern Datum gist_bbox_distance(PG_FUNCTION_ARGS);
extern Datum gist_point_fetch(PG_FUNCTION_ARGS);
diff --git a/src/test/regress/expected/create_index.out b/src/test/regress/expected/create_index.out
index abe64e5..a95fe29 100644
--- a/src/test/regress/expected/create_index.out
+++ b/src/test/regress/expected/create_index.out
@@ -372,6 +372,36 @@ SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth
48
(1 row)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+-------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+(10 rows)
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+-----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+(10 rows)
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
@@ -1152,6 +1182,54 @@ SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth
48
(1 row)
+EXPLAIN (COSTS OFF)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ QUERY PLAN
+-----------------------------------------------------
+ Limit
+ -> Index Scan using ggpolygonind on gpolygon_tbl
+ Order By: (f1 <-> '(0,0)'::point)
+(3 rows)
+
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+ f1
+-------------------------------------------------
+ ((240,359),(240,455),(337,455),(337,359))
+ ((662,163),(662,187),(759,187),(759,163))
+ ((1000,0),(0,1000))
+ ((0,1000),(1000,1000))
+ ((1346,344),(1346,403),(1444,403),(1444,344))
+ ((278,1409),(278,1457),(369,1457),(369,1409))
+ ((907,1156),(907,1201),(948,1201),(948,1156))
+ ((1517,971),(1517,1043),(1594,1043),(1594,971))
+ ((175,1820),(175,1850),(259,1850),(259,1820))
+ ((2424,81),(2424,160),(2424,160),(2424,81))
+(10 rows)
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ QUERY PLAN
+---------------------------------------------------
+ Limit
+ -> Index Scan using ggcircleind on gcircle_tbl
+ Order By: (f1 <-> '(200,300)'::point)
+(3 rows)
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+ f1
+-----------------------------------
+ <(288.5,407),68.2367203197809>
+ <(710.5,175),49.9624859269432>
+ <(323.5,1433),51.4417145903983>
+ <(927.5,1178.5),30.4384625104489>
+ <(1395,373.5),57.1948424248201>
+ <(1555.5,1007),52.7091073724456>
+ <(217,1835),44.5982062419555>
+ <(489,2421.5),22.3886131772381>
+ <(2424,120.5),39.5>
+ <(751.5,2655),20.4022057631032>
+(10 rows)
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
diff --git a/src/test/regress/sql/create_index.sql b/src/test/regress/sql/create_index.sql
index f779fa0..5df9008 100644
--- a/src/test/regress/sql/create_index.sql
+++ b/src/test/regress/sql/create_index.sql
@@ -224,6 +224,10 @@ SELECT count(*) FROM radix_text_tbl WHERE t > 'Worth
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from plain indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = ON;
@@ -437,6 +441,14 @@ EXPLAIN (COSTS OFF)
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
SELECT count(*) FROM radix_text_tbl WHERE t ~>~ 'Worth St ';
+EXPLAIN (COSTS OFF)
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+SELECT * FROM gpolygon_tbl ORDER BY f1 <-> '(0,0)'::point LIMIT 10;
+
+EXPLAIN (COSTS OFF)
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+SELECT * FROM gcircle_tbl ORDER BY f1 <-> '(200,300)'::point LIMIT 10;
+
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
--
2.1.4
On Fri, May 15, 2015 at 2:30 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.Forgot to attach the latest patch, here you go.
Looks good for me.
------
With best regards,
Alexander Korotkov.
On 05/15/2015 11:31 AM, Alexander Korotkov wrote:
On Fri, May 15, 2015 at 2:30 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.Forgot to attach the latest patch, here you go.
Looks good for me.
Ok, pushed after some further minor cleanup.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 15, 2015 at 2:48 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/15/2015 11:31 AM, Alexander Korotkov wrote:
On Fri, May 15, 2015 at 2:30 AM, Heikki Linnakangas <hlinnaka@iki.fi>
wrote:On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.Forgot to attach the latest patch, here you go.
Looks good for me.
Ok, pushed after some further minor cleanup.
Great! Thank you!
------
With best regards,
Alexander Korotkov.
On Fri, May 15, 2015 at 02:48:29PM +0300, Heikki Linnakangas wrote:
On 05/15/2015 11:31 AM, Alexander Korotkov wrote:
On Fri, May 15, 2015 at 2:30 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.Forgot to attach the latest patch, here you go.
Looks good for me.
Ok, pushed after some further minor cleanup.
Great! That PostGIS workaround they had to use for accurate distances
with CTEs and LIMIT 100 was an ugly hack.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 5/14/15 6:30 PM, Heikki Linnakangas wrote:
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.
If anyone feels motivated to fix, there's a typo in the comment for
IndexNextWithReorder (s/his/this/):
+ * Like IndexNext, but his version can also re-check any
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, May 15, 2015 at 2:49 PM, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
On Fri, May 15, 2015 at 2:48 PM, Heikki Linnakangas <hlinnaka@iki.fi>
wrote:On 05/15/2015 11:31 AM, Alexander Korotkov wrote:
On Fri, May 15, 2015 at 2:30 AM, Heikki Linnakangas <hlinnaka@iki.fi>
wrote:On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.Forgot to attach the latest patch, here you go.
Looks good for me.
Ok, pushed after some further minor cleanup.
Great! Thank you!
BTW, I found that now IndexScan node lackof copy and output support for
indexorderbyops.
Attached patch fixes that. Copy and output functions assume that
indexorderbyops has the same length as indexorderby. In order to make this
more evident I move check for best_path->path.pathkeys in create_plan from
"if" into assertion. AFAICS, pathkeys should always present where there are
indexorderby.
------
With best regards,
Alexander Korotkov.
Attachments:
fix-indexscan-node.patchapplication/octet-stream; name=fix-indexscan-node.patchDownload
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
new file mode 100644
index 25839ee..522d8e1
*** a/src/backend/nodes/copyfuncs.c
--- b/src/backend/nodes/copyfuncs.c
*************** static IndexScan *
*** 367,372 ****
--- 367,373 ----
_copyIndexScan(const IndexScan *from)
{
IndexScan *newnode = makeNode(IndexScan);
+ int numOrderbys;
/*
* copy node superclass fields
*************** _copyIndexScan(const IndexScan *from)
*** 381,386 ****
--- 382,390 ----
COPY_NODE_FIELD(indexqualorig);
COPY_NODE_FIELD(indexorderby);
COPY_NODE_FIELD(indexorderbyorig);
+ numOrderbys = list_length(from->indexorderby);
+ if (numOrderbys > 0)
+ COPY_POINTER_FIELD(indexorderbyops, numOrderbys * sizeof(Oid));
COPY_SCALAR_FIELD(indexorderdir);
return newnode;
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
new file mode 100644
index fe868b8..9b71d2b
*** a/src/backend/nodes/outfuncs.c
--- b/src/backend/nodes/outfuncs.c
*************** _outSeqScan(StringInfo str, const SeqSca
*** 448,453 ****
--- 448,455 ----
static void
_outIndexScan(StringInfo str, const IndexScan *node)
{
+ int numOrderbys, i;
+
WRITE_NODE_TYPE("INDEXSCAN");
_outScanInfo(str, (const Scan *) node);
*************** _outIndexScan(StringInfo str, const Inde
*** 457,462 ****
--- 459,471 ----
WRITE_NODE_FIELD(indexqualorig);
WRITE_NODE_FIELD(indexorderby);
WRITE_NODE_FIELD(indexorderbyorig);
+
+ numOrderbys = list_length(node->indexorderby);
+ appendStringInfoString(str, " :indexorderbyops");
+ for (i = 0; i < numOrderbys; i++)
+ appendStringInfo(str, " %u", node->indexorderbyops[i]);
+
+ WRITE_NODE_FIELD(indexorderby);
WRITE_ENUM_FIELD(indexorderdir, ScanDirection);
}
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
new file mode 100644
index 783e34b..1b28b4b
*** a/src/backend/optimizer/plan/createplan.c
--- b/src/backend/optimizer/plan/createplan.c
*************** create_indexscan_plan(PlannerInfo *root,
*** 1275,1281 ****
* If there are ORDER BY expressions, look up the sort operators for
* their datatypes.
*/
! if (best_path->path.pathkeys && indexorderbys)
{
int numOrderBys = list_length(indexorderbys);
int i;
--- 1275,1281 ----
* If there are ORDER BY expressions, look up the sort operators for
* their datatypes.
*/
! if (indexorderbys)
{
int numOrderBys = list_length(indexorderbys);
int i;
*************** create_indexscan_plan(PlannerInfo *root,
*** 1285,1290 ****
--- 1285,1293 ----
Expr *expr;
EquivalenceMember *em;
+ /* indexorderbys should present only simultaneously with pathkeys */
+ Assert(best_path->path.pathkeys);
+
indexorderbyops = (Oid *) palloc(numOrderBys * sizeof(Oid));
/*
On 05/16/2015 12:42 AM, Jim Nasby wrote:
On 5/14/15 6:30 PM, Heikki Linnakangas wrote:
On 05/15/2015 02:28 AM, Heikki Linnakangas wrote:
I think this is now ready for committing, but I'm pretty tired now so
I'll read through this one more time in the morning, so that I won't
wake up to a red buildfarm.If anyone feels motivated to fix, there's a typo in the comment for
IndexNextWithReorder (s/his/this/):
+ * Like IndexNext, but his version can also re-check any
Fixed, thanks.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers