Re: compress method for spgist - 2

Started by Teodor Sigaev, about 11 years ago (39 messages)
#1Teodor Sigaev
teodor@sigaev.ru
1 attachment(s)

For some datatypes, the compress method might be useful even if the leaf
type is the same as the column type. For example, you could allow
indexing text datums larger than the page size, with a compress function
that just truncates the input.

Agreed, and the patch allows using the compress method in this case; see the
beginning of spgdoinsert().
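
As a rough illustration of the idea discussed above, here is a minimal, self-contained sketch of forming a leaf value with an optional compress step that simply truncates. It does not use PostgreSQL's real Datum or fmgr machinery: `MAX_LEAF_LEN`, `compress_fn`, `truncate_compress` and `prepare_leaf` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical limit standing in for "what fits on an index page". */
#define MAX_LEAF_LEN 8

/* Hypothetical compress callback: maps an input key to a leaf key. */
typedef size_t (*compress_fn) (const char *in, size_t in_len,
                               char *out, size_t out_cap);

/* A compress method that just truncates the input to the leaf capacity. */
static size_t
truncate_compress(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t      n = in_len < out_cap ? in_len : out_cap;

    assert(in != NULL && out != NULL);
    memcpy(out, in, n);
    return n;
}

/*
 * Form the leaf value: use the compress method when one is provided,
 * otherwise store the input unchanged (it must then already fit).
 */
static size_t
prepare_leaf(compress_fn compress, const char *in, size_t in_len,
             char *out, size_t out_cap)
{
    if (compress != NULL)
        return compress(in, in_len, out, out_cap);

    assert(in_len <= out_cap);
    memcpy(out, in, in_len);
    return in_len;
}
```

Since truncation discards information, an opclass built this way presumably could not reconstruct the original value from the index and would have to treat matches as inexact.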

Could you find some use for this in one of the built-in or contrib
types? Just to have something that exercises it as part of the
regression suite. How about creating an opclass for the built-in polygon
type that stores the bounding box, like the PostGIS guys are doing?

I will try, but I don't have a good idea yet. A polygon opclass will have awful
performance until the PostGIS guys show the tree structure.

The documentation needs to be updated.

Added.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

spgist_compress_method-3.patch.gz (application/x-gzip)
#2Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Teodor Sigaev (#1)
1 attachment(s)

On 12/16/2014 07:48 PM, Teodor Sigaev wrote:

/*
* This struct is what we actually keep in index->rd_amcache. It includes
* static configuration information as well as the lastUsedPages cache.
*/
typedef struct SpGistCache
{
spgConfigOut config; /* filled in by opclass config method */

SpGistTypeDesc attType; /* type of input data and leaf values */
SpGistTypeDesc attPrefixType; /* type of inner-tuple prefix values */
SpGistTypeDesc attLabelType; /* type of node label values */

SpGistLUPCache lastUsedPages; /* local storage of last-used info */
} SpGistCache;

Now that the input data type and leaf data type can be different, which
one is "attType"? It's the leaf data type, as the patch stands. I
renamed that to attLeafType, and went fixing all the references to it.
In most places it's just a matter of search & replace, but what about
the reconstructed datum? In freeScanStackEntry, we assume that
att[Leaf]Type is the datatype for reconstructedValue, but I believe we
assume elsewhere that reconstructedValue is of the same data type as the
input. At least if the opclass supports index-only scans.

I think we'll need a separate SpGistTypeDesc for the input type. Or
perhaps a separate SpGistTypeDesc for the reconstructed value and an
optional decompress method to turn the reconstructedValue back into an
actual reconstructed input datum. Or something like that.
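
To make the concern concrete, here is a minimal sketch of keeping two separate descriptors so that each value is copied and freed according to its own type. It is only an illustration: `TypeDesc` loosely mirrors the `attlen`/`attbyval` fields used in the patch, while `State`, `copy_value`, `copy_reconstructed`, `copy_leaf` and `release_value` are hypothetical names, and only fixed-length types are handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified type descriptor (fixed-length types only). */
typedef struct TypeDesc
{
    int         len;            /* size of a value in bytes */
    bool        byval;          /* passed by value, so nothing to free */
} TypeDesc;

/* Separate descriptors for the input type and the leaf type. */
typedef struct State
{
    TypeDesc    attType;        /* input and reconstructed values */
    TypeDesc    attLeafType;    /* values stored in leaf tuples */
} State;

/* Copy a value using the size recorded in the given descriptor. */
static void *
copy_value(const TypeDesc *desc, const void *src)
{
    void       *dst;

    assert(desc != NULL && desc->len > 0 && src != NULL);
    dst = malloc((size_t) desc->len);
    if (dst != NULL)
        memcpy(dst, src, (size_t) desc->len);
    return dst;
}

/* A reconstructed value has the input type, so attType applies. */
static void *
copy_reconstructed(const State *st, const void *value)
{
    return copy_value(&st->attType, value);
}

/* A leaf value has the leaf type, so attLeafType applies. */
static void *
copy_leaf(const State *st, const void *value)
{
    return copy_value(&st->attLeafType, value);
}

/* Free a copied value only if its type is passed by reference. */
static void
release_value(const TypeDesc *desc, void *value)
{
    if (!desc->byval && value != NULL)
        free(value);
}
```

Using the leaf descriptor for a reconstructed value in this sketch would copy the wrong number of bytes, which is the kind of mismatch the FIXME comments in the attached patch point at.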

Attached is a patch with the kibitzing I did so far.

- Heikki

Attachments:

spgist_compress_method-4-heikki.patch (text/x-diff)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 56827e5..de158c3 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -201,20 +201,21 @@
 
  <para>
   There are five user-defined methods that an index operator class for
-  <acronym>SP-GiST</acronym> must provide.  All five follow the convention
-  of accepting two <type>internal</> arguments, the first of which is a
-  pointer to a C struct containing input values for the support method,
-  while the second argument is a pointer to a C struct where output values
-  must be placed.  Four of the methods just return <type>void</>, since
-  all their results appear in the output struct; but
+  <acronym>SP-GiST</acronym> must provide and one optional. All five mandatory
+  methods follow the convention of accepting two <type>internal</> arguments,
+  the first of which is a pointer to a C struct containing input values for 
+  the support method, while the second argument is a pointer to a C struct 
+  where output values must be placed.  Four of the methods just return 
+  <type>void</>, since all their results appear in the output struct; but
   <function>leaf_consistent</> additionally returns a <type>boolean</> result.
   The methods must not modify any fields of their input structs.  In all
   cases, the output struct is initialized to zeroes before calling the
-  user-defined method.
+  user-defined method. Optional method <function>compress</> accepts
+  datum to be indexed and returns values which actually will be indexed.  
  </para>
 
  <para>
-  The five user-defined methods are:
+  The five mandatory user-defined methods are:
  </para>
 
  <variablelist>
@@ -244,6 +245,7 @@ typedef struct spgConfigOut
 {
     Oid         prefixType;     /* Data type of inner-tuple prefixes */
     Oid         labelType;      /* Data type of inner-tuple node labels */
+    Oid         leafType;       /* Data type of leaf */
     bool        canReturnData;  /* Opclass can reconstruct original data */
     bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
 } spgConfigOut;
@@ -264,7 +266,15 @@ typedef struct spgConfigOut
       <structfield>longValuesOK</> should be set true only when the
       <structfield>attType</> is of variable length and the operator
       class is capable of segmenting long values by repeated suffixing
-      (see <xref linkend="spgist-limits">).
+      (see <xref linkend="spgist-limits">). <structfield>leafType</>
+      usually has the same value as <structfield>attType</> but if
+      it's different then optional method  <function>compress</>
+      should be provided. Method  <function>compress</> is responsible
+      for transformation from <structfield>attType</> to 
+      <structfield>leafType</>. In this case all other function should
+      accept <structfield>leafType</> values. Note: both consistent
+      functions will get <structfield>scankeys</> unchanged, without
+      <function>compress</> transformation.
      </para>
      </listitem>
     </varlistentry>
@@ -690,6 +700,24 @@ typedef struct spgLeafConsistentOut
     </varlistentry>
    </variablelist>
 
+ <para>
+  The optional user-defined method is:
+ </para>
+
+ <variablelist>
+    <varlistentry>
+     <term><function>Datum compress(Datum in)</></term>
+     <listitem>
+      <para>
+       Converts the data item into a format suitable for physical storage in 
+       an index page. It accepts <structname>spgConfigIn</>.<structfield>attType</>
+       value and return <structname>spgConfigOut</>.<structfield>leafType</>
+       value. Output value should not be toasted.
+      </para>
+     </listitem>
+    </varlistentry>
+  </variablelist>
+
   <para>
    All the SP-GiST support methods are normally called in a short-lived
    memory context; that is, <varname>CurrentMemoryContext</> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index 1a17cc4..06c0680 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1854,21 +1854,36 @@ spgdoinsert(Relation index, SpGistState *state,
 	FmgrInfo   *procinfo = NULL;
 
 	/*
-	 * Look up FmgrInfo of the user-defined choose function once, to save
-	 * cycles in the loop below.
+	 * Prepare the leaf datum to insert.
+	 *
+	 * If there is an optional "compress" method, call it to form the leaf
+	 * datum from the input datum. Otherwise we will store the input datum as
+	 * is. (We have to detoast it, though. We assume the "compress" method to
+	 * return an untoasted value.)
 	 */
 	if (!isnull)
-		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
+	{
+		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+		{
+			procinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
+			leafDatum = FunctionCall1Coll(procinfo,
+										  index->rd_indcollation[0],
+										  datum);
+		}
+		else if (state->attLeafType.attlen == -1)
+			leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+		else
+			leafDatum = datum;
+	}
+	else
+		leafDatum = (Datum) 0;
 
 	/*
-	 * Since we don't use index_form_tuple in this AM, we have to make sure
-	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
-	 * that.
+	 * Look up FmgrInfo of the user-defined choose function once, to save
+	 * cycles in the loop below.
 	 */
-	if (!isnull && state->attType.attlen == -1)
-		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
-
-	leafDatum = datum;
+	if (!isnull)
+		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
 
 	/*
 	 * Compute space needed for a leaf tuple containing the given datum.
@@ -1878,7 +1893,7 @@ spgdoinsert(Relation index, SpGistState *state,
 	 */
 	if (!isnull)
 		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-			SpGistGetTypeSize(&state->attType, leafDatum);
+			SpGistGetTypeSize(&state->attLeafType, leafDatum);
 	else
 		leafSize = SGDTSIZE + sizeof(ItemIdData);
 
@@ -2093,7 +2108,7 @@ spgdoinsert(Relation index, SpGistState *state,
 					{
 						leafDatum = out.result.matchNode.restDatum;
 						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-							SpGistGetTypeSize(&state->attType, leafDatum);
+							SpGistGetTypeSize(&state->attLeafType, leafDatum);
 					}
 
 					/*
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 35cc41b..a4c5592 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -39,7 +39,8 @@ typedef struct ScanStackEntry
 static void
 freeScanStackEntry(SpGistScanOpaque so, ScanStackEntry *stackEntry)
 {
-	if (!so->state.attType.attbyval &&
+	/* FIXME: Is attLeafType correct for reconstructedValue? */
+	if (!so->state.attLeafType.attbyval &&
 		DatumGetPointer(stackEntry->reconstructedValue) != NULL)
 		pfree(DatumGetPointer(stackEntry->reconstructedValue));
 	pfree(stackEntry);
@@ -539,11 +540,14 @@ redirect:
 					else
 						newEntry->level = stackEntry->level;
 					/* Must copy value out of temp context */
+					/*
+					 * FIXME: this assumes that the leaf data type is the same
+					 * as the reconstructedValues datatype */
 					if (out.reconstructedValues)
 						newEntry->reconstructedValue =
 							datumCopy(out.reconstructedValues[i],
-									  so->state.attType.attbyval,
-									  so->state.attType.attlen);
+									  so->state.attLeafType.attbyval,
+									  so->state.attLeafType.attlen);
 					else
 						newEntry->reconstructedValue = (Datum) 0;
 
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 1a224ef..dcbb30c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -74,7 +74,21 @@ spgGetCache(Relation index)
 						  PointerGetDatum(&cache->config));
 
 		/* Get the information we need about each relevant datatype */
-		fillTypeDesc(&cache->attType, atttype);
+		if (OidIsValid(cache->config.leafType) &&
+			cache->config.leafType != atttype)
+		{
+			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("compress method must be defined when leaf type is different from input type")));
+
+			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
+		}
+		else
+		{
+			fillTypeDesc(&cache->attLeafType, atttype);
+		}
+
 		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
 		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
 
@@ -113,7 +127,7 @@ initSpGistState(SpGistState *state, Relation index)
 	cache = spgGetCache(index);
 
 	state->config = cache->config;
-	state->attType = cache->attType;
+	state->attLeafType = cache->attLeafType;
 	state->attPrefixType = cache->attPrefixType;
 	state->attLabelType = cache->attLabelType;
 
@@ -556,7 +570,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	/* compute space needed (note result is already maxaligned) */
 	size = SGLTHDRSZ;
 	if (!isnull)
-		size += SpGistGetTypeSize(&state->attType, datum);
+		size += SpGistGetTypeSize(&state->attLeafType, datum);
 
 	/*
 	 * Ensure that we can replace the tuple with a dead tuple later.  This
@@ -572,7 +586,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	tup->nextOffset = InvalidOffsetNumber;
 	tup->heapPtr = *heapPtr;
 	if (!isnull)
-		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
+		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
 
 	return tup;
 }
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
index 3aa96bd..bbf6d89 100644
--- a/src/include/access/spgist.h
+++ b/src/include/access/spgist.h
@@ -30,7 +30,8 @@
 #define SPGIST_PICKSPLIT_PROC			3
 #define SPGIST_INNER_CONSISTENT_PROC	4
 #define SPGIST_LEAF_CONSISTENT_PROC		5
-#define SPGISTNProc						5
+#define SPGIST_COMPRESS_PROC			6
+#define SPGISTNProc						6
 
 /*
  * Argument structs for spg_config method
@@ -44,6 +45,7 @@ typedef struct spgConfigOut
 {
 	Oid			prefixType;		/* Data type of inner-tuple prefixes */
 	Oid			labelType;		/* Data type of inner-tuple node labels */
+	Oid			leafType;		/* Data type of leaf (type of SPGIST_COMPRESS_PROC output) */
 	bool		canReturnData;	/* Opclass can reconstruct original data */
 	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
 } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 4b6fdee..8ae7e75 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -115,13 +115,13 @@ typedef struct SpGistTypeDesc
 
 typedef struct SpGistState
 {
-	spgConfigOut config;		/* filled in by opclass config method */
+	spgConfigOut config;			/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
-	SpGistTypeDesc attPrefixType;		/* type of inner-tuple prefix values */
+	SpGistTypeDesc attLeafType;		/* type of leaf values */
+	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
-	char	   *deadTupleStorage;		/* workspace for spgFormDeadTuple */
+	char	   *deadTupleStorage;	/* workspace for spgFormDeadTuple */
 
 	TransactionId myXid;		/* XID to use when creating a redirect tuple */
 	bool		isBuild;		/* true if doing index build */
@@ -174,13 +174,13 @@ typedef SpGistScanOpaqueData *SpGistScanOpaque;
  */
 typedef struct SpGistCache
 {
-	spgConfigOut config;		/* filled in by opclass config method */
+	spgConfigOut config;			/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
-	SpGistTypeDesc attPrefixType;		/* type of inner-tuple prefix values */
+	SpGistTypeDesc attLeafType;		/* type of leaf values */
+	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
-	SpGistLUPCache lastUsedPages;		/* local storage of last-used info */
+	SpGistLUPCache lastUsedPages;	/* local storage of last-used info */
 } SpGistCache;
 
 
@@ -298,7 +298,7 @@ typedef SpGistLeafTupleData *SpGistLeafTuple;
 
 #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
 #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
-#define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
+#define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
 							 *(Datum *) SGLTDATAPTR(x) : \
 							 PointerGetDatum(SGLTDATAPTR(x)))
 
diff --git a/src/include/catalog/pg_am.h b/src/include/catalog/pg_am.h
index 67b57cd..6427552 100644
--- a/src/include/catalog/pg_am.h
+++ b/src/include/catalog/pg_am.h
@@ -129,7 +129,7 @@ DESCR("GiST index access method");
 DATA(insert OID = 2742 (  gin		0 6 f f f f t t f f t f f 0 gininsert ginbeginscan - gingetbitmap ginrescan ginendscan ginmarkpos ginrestrpos ginbuild ginbuildempty ginbulkdelete ginvacuumcleanup - gincostestimate ginoptions ));
 DESCR("GIN index access method");
 #define GIN_AM_OID 2742
-DATA(insert OID = 4000 (  spgist	0 5 f f f f f t f t f f f 0 spginsert spgbeginscan spggettuple spggetbitmap spgrescan spgendscan spgmarkpos spgrestrpos spgbuild spgbuildempty spgbulkdelete spgvacuumcleanup spgcanreturn spgcostestimate spgoptions ));
+DATA(insert OID = 4000 (  spgist	0 6 f f f f f t f t f f f 0 spginsert spgbeginscan spggettuple spggetbitmap spgrescan spgendscan spgmarkpos spgrestrpos spgbuild spgbuildempty spgbulkdelete spgvacuumcleanup spgcanreturn spgcostestimate spgoptions ));
 DESCR("SP-GiST index access method");
 #define SPGIST_AM_OID 4000
 DATA(insert OID = 3580 (  brin	5 14 f f f f t t f t t f f 0 brininsert brinbeginscan - bringetbitmap brinrescan brinendscan brinmarkpos brinrestrpos brinbuild brinbuildempty brinbulkdelete brinvacuumcleanup - brincostestimate brinoptions ));
#3Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#2)

Now that the input data type and leaf data type can be different, which one is
"attType"? It's the leaf data type, as the patch stands. I renamed that to
attLeafType, and went fixing all the references to it. In most places it's just
a matter of search & replace, but what about the reconstructed datum? In
freeScanStackEntry, we assume that att[Leaf]Type is the datatype for
reconstructedValue, but I believe we assume elsewhere that reconstructedValue is of
the same data type as the input. At least if the opclass supports index-only scans.

I agree with the rename. I doubt that there is a real-world example of a datatype
which can be a) effectively compressed and b) restored to its original form. If
there were, why not store it in the compressed state in the database? In GiST, all
compress methods use one-way compression. In the PostGIS example, polygons are
"compressed" into bounding boxes and, obviously, cannot be restored.
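
A minimal sketch of such a one-way compress, assuming simple hypothetical `Point` and `Box` structs rather than the real PostgreSQL or PostGIS geometry types:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical geometry types for illustration only. */
typedef struct Point
{
    double      x;
    double      y;
} Point;

typedef struct Box
{
    Point       low;            /* lower-left corner */
    Point       high;           /* upper-right corner */
} Box;

/* "Compress" a polygon, given as a vertex array, into its bounding box. */
static Box
polygon_bbox(const Point *pts, size_t npts)
{
    Box         b;
    size_t      i;

    assert(pts != NULL && npts > 0);
    b.low = pts[0];
    b.high = pts[0];
    for (i = 1; i < npts; i++)
    {
        if (pts[i].x < b.low.x)
            b.low.x = pts[i].x;
        if (pts[i].y < b.low.y)
            b.low.y = pts[i].y;
        if (pts[i].x > b.high.x)
            b.high.x = pts[i].x;
        if (pts[i].y > b.high.y)
            b.high.y = pts[i].y;
    }
    return b;
}
```

A triangle with vertices (0,0), (4,0), (0,3) and the rectangle spanning (0,0) to (4,3) yield the same box, so the polygon cannot be recovered from the stored value.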

I think we'll need a separate SpGistTypeDesc for the input type. Or perhaps a
separate SpGistTypeDesc for the reconstructed value and an optional decompress
method to turn the reconstructedValue back into an actual reconstructed input
datum. Or something like that.

I suppose that compress and reconstruct are mutually exclusive options.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Teodor Sigaev (#3)

On 12/23/2014 03:02 PM, Teodor Sigaev wrote:

I think we'll need a separate SpGistTypeDesc for the input type. Or perhaps a
separate SpGistTypeDesc for the reconstructed value and an optional decompress
method to turn the reconstructedValue back into an actual reconstructed input
datum. Or something like that.

I suppose that compress and reconstruct are mutually exclusive options.

I would rather not assume that. You might well want to store something
in the leaf nodes that's different from the original Datum, but
nevertheless contains enough information to reconstruct the original Datum.
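
One hypothetical case of this kind, sketched under simplifying assumptions: a circle stored as its bounding box, from which the center and radius can be computed back. The `Circle` and `Box` structs and both functions are invented for this example and are not taken from any patch in this thread.

```c
#include <assert.h>

/* Hypothetical geometry types for illustration only. */
typedef struct Circle
{
    double      cx;             /* center x */
    double      cy;             /* center y */
    double      r;              /* radius */
} Circle;

typedef struct Box
{
    double      lowx;
    double      lowy;
    double      highx;
    double      highy;
} Box;

/* Leaf representation: the circle's bounding box. */
static Box
circle_compress(Circle c)
{
    Box         b;

    assert(c.r >= 0.0);
    b.lowx = c.cx - c.r;
    b.lowy = c.cy - c.r;
    b.highx = c.cx + c.r;
    b.highy = c.cy + c.r;
    return b;
}

/* Reconstruct the circle from the box produced by circle_compress(). */
static Circle
circle_decompress(Box b)
{
    Circle      c;

    c.cx = (b.lowx + b.highx) / 2.0;
    c.cy = (b.lowy + b.highy) / 2.0;
    c.r = (b.highx - b.lowx) / 2.0;
    return c;
}
```

With floating-point arithmetic the round trip is exact only for some inputs, so a real opclass would have to decide whether that counts as reconstructing the original datum.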

- Heikki


#5Michael Paquier
michael.paquier@gmail.com
In reply to: Heikki Linnakangas (#4)

Marking this patch as returned with feedback because it has been waiting for
input from the author for a couple of weeks now. Heikki, the
refactoring patch has some value; are you planning to push it?
--
Michael


#6Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Michael Paquier (#5)

On 01/15/2015 09:28 AM, Michael Paquier wrote:

Marking this patch as returned with feedback because it has been waiting for
input from the author for a couple of weeks now. Heikki, the
refactoring patch has some value; are you planning to push it?

I think you're mixing this up with the other thread, "btree_gin and ranges".
I already pushed the refactoring patch I posted to that thread
(/messages/by-id/54983CF2.80605@vmware.com).
I haven't proposed any refactoring related to spgist.

- Heikki


#7Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#2)
1 attachment(s)

Now that the input data type and leaf data type can be different, which one is
"attType"? It's the leaf data type, as the patch stands. I renamed that to
attLeafType, and went fixing all the references to it. In most places it's just
a matter of search & replace, but what about the reconstructed datum? In

Done. Now there are separate attType and attLeafType descriptors, which describe
the input/output and leaf types respectively.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

spgist_compress_method-5.patch.gz (application/x-gzip)
#8Heikki Linnakangas
hlinnakangas@vmware.com
In reply to: Teodor Sigaev (#7)

On 02/13/2015 06:17 PM, Teodor Sigaev wrote:

Now that the input data type and leaf data type can be different, which one is
"attType"? It's the leaf data type, as the patch stands. I renamed that to
attLeafType, and went fixing all the references to it. In most places it's just
a matter of search & replace, but what about the reconstructed datum? In

Done. Now there are separate attType and attLeafType descriptors, which describe
the input/output and leaf types respectively.

Thanks.

Did you try finding a use case for this patch in one of the built-in or
contrib datatypes? That would allow writing a regression test for this.

In the original post on this, you mentioned that the PostGIS guys
planned to use this to store polygons, as bounding boxes
(/messages/by-id/5447B3FF.2080406@sigaev.ru). Any
idea how that would work?

- Heikki


#9Paul Ramsey
pramsey@cleverelephant.ca
In reply to: Heikki Linnakangas (#8)

On Wed, Feb 25, 2015 at 6:13 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

In the original post on this, you mentioned that the PostGIS guys planned to
use this to store polygons, as bounding boxes
(/messages/by-id/5447B3FF.2080406@sigaev.ru). Any idea
how that would work?

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was
sufficiently rare compared to the overall volume of data, to not be an
issue. Oddly enough, this big hammer has worked in other
implementations at least passably well
(https://github.com/mapserver/mapserver/blob/branch-7-0/maptree.c#L261)

P.


#10Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Paul Ramsey (#9)

On 03/04/2015 06:58 PM, Paul Ramsey wrote:

On Wed, Feb 25, 2015 at 6:13 AM, Heikki Linnakangas
<hlinnakangas@vmware.com> wrote:

In the original post on this, you mentioned that the PostGIS guys planned to
use this to store polygons, as bounding boxes
(/messages/by-id/5447B3FF.2080406@sigaev.ru). Any idea
how that would work?

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was
sufficiently rare compared to the overall volume of data, to not be an
issue. Oddly enough, this big hammer has worked in other
implementations at least passably well
(https://github.com/mapserver/mapserver/blob/branch-7-0/maptree.c#L261)

Ok, I see, but that's not really what I was wondering. My question is
this: SP-GiST partitions the space into non-overlapping sections. How
can you store polygons - which can overlap - in an SP-GiST index? And
how does the compress method help with that?

- Heikki


#11Teodor Sigaev
teodor@sigaev.ru
In reply to: Heikki Linnakangas (#10)

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was

Ok, I see, but that's not really what I was wondering. My question is this:
SP-GiST partitions the space into non-overlapping sections. How can you store
polygons - which can overlap - in an SP-GiST index? And how does the compress
method help with that?

I believe that once we find a way to index boxes, we will need a compress method
to build an index over polygons.

BTW, we are investigating an index structure for boxes where a 2d-box is
treated as a 4d-point.
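
A minimal sketch of what treating a 2d-box as a 4d-point could look like. This is an illustration under assumptions, not necessarily the structure being investigated: the `Box` and `Point4D` types, the coordinate order (lowx, lowy, highx, highy) and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical 2d box and its 4d-point image. */
typedef struct Box
{
    double      lowx;
    double      lowy;
    double      highx;
    double      highy;
} Box;

typedef struct Point4D
{
    double      c[4];           /* lowx, lowy, highx, highy */
} Point4D;

/* Map a box to a point in 4d space; distinct boxes give distinct points. */
static Point4D
box_to_point4d(Box b)
{
    Point4D     p;

    assert(b.lowx <= b.highx && b.lowy <= b.highy);
    p.c[0] = b.lowx;
    p.c[1] = b.lowy;
    p.c[2] = b.highx;
    p.c[3] = b.highy;
    return p;
}

/* Pick one of 16 non-overlapping regions around a centroid, one bit per axis. */
static unsigned
quadrant4d(Point4D centroid, Point4D p)
{
    unsigned    q = 0;
    int         i;

    for (i = 0; i < 4; i++)
    {
        if (p.c[i] > centroid.c[i])
            q |= 1u << i;
    }
    return q;
}

/* 2d overlap of two boxes, expressed as a test on their 4d points. */
static bool
overlap4d(Point4D a, Point4D b)
{
    return a.c[0] <= b.c[2] && a.c[2] >= b.c[0] &&
        a.c[1] <= b.c[3] && a.c[3] >= b.c[1];
}
```

In this sketch, boxes that overlap in 2d still map to separate points, so the 4d space can be split into non-overlapping regions while overlap queries become range conditions on the four coordinates.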

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/


#12Michael Paquier
michael.paquier@gmail.com
In reply to: Teodor Sigaev (#11)

On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was

Ok, I see, but that's not really what I was wondering. My question is this:
SP-GiST partitions the space into non-overlapping sections. How can you store
polygons - which can overlap - in an SP-GiST index? And how does the compress
method help with that?

I believe that once we find a way to index boxes, we will need a compress
method to build an index over polygons.

BTW, we are investigating an index structure for boxes where a 2d-box
is treated as a 4d-point.

There has been no activity on this patch for some time now, and a new
patch version has not been submitted, so I am marking it as returned
with feedback.
--
Michael


#13Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Michael Paquier (#12)

On Tue, Aug 25, 2015 at 4:05 PM, Michael Paquier <michael.paquier@gmail.com>
wrote:

On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was

Ok, I see, but that's not really what I was wondering. My question is this:
SP-GiST partitions the space into non-overlapping sections. How can you store
polygons - which can overlap - in an SP-GiST index? And how does the compress
method help with that?

I believe that once we find a way to index boxes, we will need a compress
method to build an index over polygons.

BTW, we are investigating an index structure for boxes where a 2d-box
is treated as a 4d-point.

There has been no activity on this patch for some time now, and a new
patch version has not been submitted, so I am marking it as returned
with feedback.

There is interest in this patch from PostGIS users. It would be nice to
pick this patch up again.
AFAICS, progress on this patch was suspended because we had no example
of an SP-GiST compress method in core/contrib.
However, we now have acdf2a8b committed, with 2d to 4d indexing of boxes
using SP-GiST. So, extending this 2d to 4d approach to polygons would be
a good example of an SP-GiST compress method in core. Would anyone
volunteer to write such a patch?
If nobody does, then I could do it....

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#14Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Alexander Korotkov (#13)
2 attachment(s)

On Mon, Sep 18, 2017 at 6:21 PM, Alexander Korotkov <
a.korotkov@postgrespro.ru> wrote:

On Tue, Aug 25, 2015 at 4:05 PM, Michael Paquier <
michael.paquier@gmail.com> wrote:

On Thu, Jul 23, 2015 at 6:18 PM, Teodor Sigaev <teodor@sigaev.ru> wrote:

Poorly, by hanging boxes that straddled dividing lines off the parent
node in a big linear list. The hope would be that the case was

Ok, I see, but that's not really what I was wondering. My question is this:
SP-GiST partitions the space into non-overlapping sections. How can you store
polygons - which can overlap - in an SP-GiST index? And how does the compress
method help with that?

I believe that once we find a way to index boxes, we will need a compress
method to build an index over polygons.

BTW, we are investigating an index structure for boxes where a 2d-box
is treated as a 4d-point.

There has been no activity on this patch for some time now, and a new
patch version has not been submitted, so I am marking it as returned
with feedback.

There is interest in this patch from PostGIS users. It would be nice to
pick this patch up again.
AFAICS, progress on this patch was suspended because we had no example
of an SP-GiST compress method in core/contrib.
However, we now have acdf2a8b committed, with 2d to 4d indexing of boxes
using SP-GiST. So, extending this 2d to 4d approach to polygons would be
a good example of an SP-GiST compress method in core. Would anyone
volunteer to write such a patch?
If nobody does, then I could do it....

Nobody has answered yet, so I decided to nail down this long-standing issue.
Please find the following patches attached.

0001-spgist-compress-method-6.patch

Patch with the SP-GiST compress method, rebased on current master. The index AM
interface has changed since that time. I've added validation for the compress
method: it validates the input and output types of the compress method. That
requires calling the config method first, which is a controversial solution. In
particular, no collation is provided in the config method call. It would be
weird if collation could affect the data types in the SP-GiST config method
output, but anyway...
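
The validation described above can be sketched roughly as follows. This is not the code from the attached patch: `TypeId`, `OpclassInfo` and `validate_compress` are hypothetical stand-ins for the catalog lookups, and a `leafType` of zero is assumed to mean "same as the input type".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a type OID; 0 means "not set". */
typedef unsigned int TypeId;
#define INVALID_TYPE 0u

/* Hypothetical summary of what the validator knows about an opclass. */
typedef struct OpclassInfo
{
    TypeId      attType;            /* indexed column (input) type */
    TypeId      leafType;           /* from the config method; 0 = attType */
    bool        hasCompress;        /* is a compress method registered? */
    TypeId      compressArgType;    /* declared argument type of compress */
    TypeId      compressResultType; /* declared result type of compress */
} OpclassInfo;

/* Check that compress, if present, maps attType to the leaf type. */
static bool
validate_compress(const OpclassInfo *oc)
{
    TypeId      leaf;

    assert(oc != NULL && oc->attType != INVALID_TYPE);
    leaf = (oc->leafType != INVALID_TYPE) ? oc->leafType : oc->attType;

    /* Without compress, the leaf type has to equal the input type. */
    if (!oc->hasCompress)
        return leaf == oc->attType;

    return oc->compressArgType == oc->attType &&
        oc->compressResultType == leaf;
}
```

The leaf type is only known after the config method has run, which is why the validator in this sketch needs `leafType` as an input.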

0002-spgist-circle-polygon-6.patch

This patch provides an example of SP-GiST compress method usage. It adds
SP-GiST indexing for circles and polygons by mapping their bounding
boxes to 4d. This patch is based on prior work by Nikita Glukhov for
SP-GiST KNN.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-6.patch (application/octet-stream)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
new file mode 100644
index cd4a8d0..dcdc297
*** a/doc/src/sgml/spgist.sgml
--- b/doc/src/sgml/spgist.sgml
***************
*** 240,259 ****
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide.  All five follow the convention
!   of accepting two <type>internal</> arguments, the first of which is a
!   pointer to a C struct containing input values for the support method,
!   while the second argument is a pointer to a C struct where output values
!   must be placed.  Four of the methods just return <type>void</>, since
!   all their results appear in the output struct; but
    <function>leaf_consistent</> additionally returns a <type>boolean</> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method.
   </para>
  
   <para>
!   The five user-defined methods are:
   </para>
  
   <variablelist>
--- 240,260 ----
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide and one optional. All five mandatory
!   methods follow the convention of accepting two <type>internal</> arguments,
!   the first of which is a pointer to a C struct containing input values for 
!   the support method, while the second argument is a pointer to a C struct 
!   where output values must be placed.  Four of the methods just return 
!   <type>void</>, since all their results appear in the output struct; but
    <function>leaf_consistent</> additionally returns a <type>boolean</> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method.  The optional method <function>compress</> accepts the
!   datum to be indexed and returns the value that will actually be stored.
   </para>
  
   <para>
!   The five mandatory user-defined methods are:
   </para>
  
   <variablelist>
*************** typedef struct spgConfigOut
*** 283,288 ****
--- 284,290 ----
  {
      Oid         prefixType;     /* Data type of inner-tuple prefixes */
      Oid         labelType;      /* Data type of inner-tuple node labels */
+     Oid         leafType;       /* Data type of leaf */
      bool        canReturnData;  /* Opclass can reconstruct original data */
      bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
  } spgConfigOut;
*************** typedef struct spgConfigOut
*** 303,309 ****
        <structfield>longValuesOK</> should be set true only when the
        <structfield>attType</> is of variable length and the operator
        class is capable of segmenting long values by repeated suffixing
!       (see <xref linkend="spgist-limits">).
       </para>
       </listitem>
      </varlistentry>
--- 305,319 ----
        <structfield>longValuesOK</> should be set true only when the
        <structfield>attType</> is of variable length and the operator
        class is capable of segmenting long values by repeated suffixing
!       (see <xref linkend="spgist-limits">).  <structfield>leafType</>
!       usually has the same value as <structfield>attType</>, but if
!       it is different then the optional method <function>compress</>
!       must be provided.  The <function>compress</> method is responsible
!       for transforming values from <structfield>attType</> to
!       <structfield>leafType</>.  In this case all other functions should
!       accept <structfield>leafType</> values.  Note: both consistent
!       functions will get <structfield>scankeys</> unchanged, without
!       the <function>compress</> transformation.
       </para>
       </listitem>
      </varlistentry>
*************** typedef struct spgInnerConsistentOut
*** 624,630 ****
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
--- 634,641 ----
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</> should always be of
!        <structname>spgConfigOut</>.<structfield>leafType</> type.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgLeafConsistentOut
*** 730,736 ****
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
--- 741,748 ----
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</> should always be of
!        <structname>spgConfigOut</>.<structfield>leafType</> type.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgLeafConsistentOut
*** 757,762 ****
--- 769,792 ----
      </varlistentry>
     </variablelist>
  
+  <para>
+   The optional user-defined method is:
+  </para>
+ 
+  <variablelist>
+     <varlistentry>
+      <term><function>Datum compress(Datum in)</></term>
+      <listitem>
+       <para>
+        Converts the data item into a format suitable for physical storage in
+        an index page.  It accepts a <structname>spgConfigIn</>.<structfield>attType</>
+        value and returns a <structname>spgConfigOut</>.<structfield>leafType</>
+        value.  The output value should not be toasted.
+       </para>
+      </listitem>
+     </varlistentry>
+   </variablelist>
+ 
    <para>
     All the SP-GiST support methods are normally called in a short-lived
     memory context; that is, <varname>CurrentMemoryContext</> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
new file mode 100644
index b0702a7..68c3f45
*** a/src/backend/access/spgist/spgdoinsert.c
--- b/src/backend/access/spgist/spgdoinsert.c
*************** spgdoinsert(Relation index, SpGistState 
*** 1899,1919 ****
  	FmgrInfo   *procinfo = NULL;
  
  	/*
! 	 * Look up FmgrInfo of the user-defined choose function once, to save
! 	 * cycles in the loop below.
  	 */
  	if (!isnull)
! 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
! 	 * Since we don't use index_form_tuple in this AM, we have to make sure
! 	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
! 	 * that.
  	 */
! 	if (!isnull && state->attType.attlen == -1)
! 		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
! 
! 	leafDatum = datum;
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
--- 1899,1939 ----
  	FmgrInfo   *procinfo = NULL;
  
  	/*
! 	 * Prepare the leaf datum to insert.
! 	 *
! 	 * If there is an optional "compress" method, call it to form the leaf
! 	 * datum from the input datum. Otherwise we will store the input datum as
! 	 * is. (We have to detoast it, though. We assume the "compress" method to
! 	 * return an untoasted value.)
  	 */
  	if (!isnull)
! 	{
! 		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 		{
! 			procinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
! 			leafDatum = FunctionCall1Coll(procinfo,
! 										  index->rd_indcollation[0],
! 										  datum);
! 		}
! 		else
! 		{
! 			Assert(state->attLeafType.type == state->attType.type);
! 
! 			if (state->attType.attlen == -1)
! 				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
! 			else
! 				leafDatum = datum;
! 		}
! 	}
! 	else
! 		leafDatum = (Datum) 0;
  
  	/*
! 	 * Look up FmgrInfo of the user-defined choose function once, to save
! 	 * cycles in the loop below.
  	 */
! 	if (!isnull)
! 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
*************** spgdoinsert(Relation index, SpGistState 
*** 1923,1929 ****
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
--- 1943,1949 ----
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attLeafType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
*************** spgdoinsert(Relation index, SpGistState 
*** 2138,2144 ****
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attType, leafDatum);
  					}
  
  					/*
--- 2158,2164 ----
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attLeafType, leafDatum);
  					}
  
  					/*
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
new file mode 100644
index 22f64b0..8a1311d
*** a/src/backend/access/spgist/spgutils.c
--- b/src/backend/access/spgist/spgutils.c
*************** spgGetCache(Relation index)
*** 124,130 ****
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		fillTypeDesc(&cache->attType, atttype);
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
--- 124,146 ----
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		if (OidIsValid(cache->config.leafType) &&
! 			cache->config.leafType != atttype)
! 		{
! 			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 				ereport(ERROR,
! 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 						 errmsg("compress method must be defined when leaf type is different from input type")));
! 
! 			fillTypeDesc(&cache->attType, atttype);
! 			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
! 		}
! 		else
! 		{
! 			fillTypeDesc(&cache->attType, atttype);
! 			cache->attLeafType = cache->attType;
! 		}
! 
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
*************** initSpGistState(SpGistState *state, Rela
*** 164,169 ****
--- 180,186 ----
  
  	state->config = cache->config;
  	state->attType = cache->attType;
+ 	state->attLeafType = cache->attLeafType;
  	state->attPrefixType = cache->attPrefixType;
  	state->attLabelType = cache->attLabelType;
  
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 598,604 ****
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
--- 615,621 ----
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attLeafType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 614,620 ****
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
  
  	return tup;
  }
--- 631,637 ----
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
  
  	return tup;
  }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
new file mode 100644
index 157cf2a..514da47
*** a/src/backend/access/spgist/spgvalidate.c
--- b/src/backend/access/spgist/spgvalidate.c
*************** spgvalidate(Oid opclassoid)
*** 52,57 ****
--- 52,61 ----
  	OpFamilyOpFuncGroup *opclassgroup;
  	int			i;
  	ListCell   *lc;
+ 	spgConfigIn	configIn;
+ 	spgConfigOut configOut;
+ 	Oid			configOutLefttype = InvalidOid;
+ 	Oid			configOutRighttype = InvalidOid;
  
  	/* Fetch opclass information */
  	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
*************** spgvalidate(Oid opclassoid)
*** 100,105 ****
--- 104,118 ----
  		switch (procform->amprocnum)
  		{
  			case SPGIST_CONFIG_PROC:
+ 				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+ 											2, 2, INTERNALOID, INTERNALOID);
+ 				configIn.attType = procform->amproclefttype;
+ 				OidFunctionCall2(procform->amproc,
+ 								 PointerGetDatum(&configIn),
+ 								 PointerGetDatum(&configOut));
+ 				configOutLefttype = procform->amproclefttype;
+ 				configOutRighttype = procform->amprocrighttype;
+ 				break;
  			case SPGIST_CHOOSE_PROC:
  			case SPGIST_PICKSPLIT_PROC:
  			case SPGIST_INNER_CONSISTENT_PROC:
*************** spgvalidate(Oid opclassoid)
*** 110,115 ****
--- 123,137 ----
  				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
  											2, 2, INTERNALOID, INTERNALOID);
  				break;
+ 			case SPGIST_COMPRESS_PROC:
+ 				if (configOutLefttype != procform->amproclefttype ||
+ 					configOutRighttype != procform->amprocrighttype)
+ 					ok = false;
+ 				else
+ 					ok = check_amproc_signature(procform->amproc,
+ 												configOut.leafType, true,
+ 												1, 1, procform->amproclefttype);
+ 				break;
  			default:
  				ereport(INFO,
  						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
*************** spgvalidate(Oid opclassoid)
*** 212,218 ****
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
--- 234,240 ----
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNRequiredProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
new file mode 100644
index d1bc396..a477278
*** a/src/include/access/spgist.h
--- b/src/include/access/spgist.h
***************
*** 30,36 ****
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGISTNProc						5
  
  /*
   * Argument structs for spg_config method
--- 30,38 ----
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGIST_COMPRESS_PROC			6
! #define SPGISTNRequiredProc				5
! #define SPGISTNProc						6
  
  /*
   * Argument structs for spg_config method
*************** typedef struct spgConfigOut
*** 44,49 ****
--- 46,52 ----
  {
  	Oid			prefixType;		/* Data type of inner-tuple prefixes */
  	Oid			labelType;		/* Data type of inner-tuple node labels */
+ 	Oid			leafType;		/* Data type of leaf (type of SPGIST_COMPRESS_PROC output) */
  	bool		canReturnData;	/* Opclass can reconstruct original data */
  	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
  } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
new file mode 100644
index 1c4b321..69dc2ba
*** a/src/include/access/spgist_private.h
--- b/src/include/access/spgist_private.h
*************** typedef struct SpGistState
*** 119,125 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 119,126 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef struct SpGistCache
*** 178,184 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 179,186 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef SpGistLeafTupleData *SpGistLeafT
*** 300,306 ****
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
--- 302,308 ----
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
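
For readers who want to see the compress contract from the patch above in isolation, here is a minimal standalone sketch. It assumes hypothetical DemoPoint, DemoCircle and DemoBox types and demo_* helpers (none of these are PostgreSQL names) and only models the idea: the indexed value (attType, a circle) is mapped to the stored value (leafType, its bounding box), and the stored form is lossy for an overlap test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the geometric types; not the PostgreSQL structs. */
typedef struct { double x, y; } DemoPoint;
typedef struct { DemoPoint center; double radius; } DemoCircle;
typedef struct { DemoPoint high, low; } DemoBox;

/*
 * Model of a "compress" step: convert the value being indexed (a circle)
 * into the value kept in the leaf tuple (its bounding box).
 */
static DemoBox
demo_circle_compress(const DemoCircle *c)
{
	DemoBox		b;

	assert(c != NULL);
	b.high.x = c->center.x + c->radius;
	b.low.x = c->center.x - c->radius;
	b.high.y = c->center.y + c->radius;
	b.low.y = c->center.y - c->radius;
	return b;
}

/* Overlap test on the compressed (leaf) representation. */
static int
demo_box_overlap(const DemoBox *a, const DemoBox *b)
{
	return a->low.x <= b->high.x && b->low.x <= a->high.x &&
		a->low.y <= b->high.y && b->low.y <= a->high.y;
}

/* Exact overlap test on the original circles (squared distances, no libm). */
static int
demo_circle_overlap(const DemoCircle *a, const DemoCircle *b)
{
	double		dx = a->center.x - b->center.x;
	double		dy = a->center.y - b->center.y;
	double		r = a->radius + b->radius;

	return dx * dx + dy * dy <= r * r;
}
```

Two unit circles centred at (0,0) and (1.9,1.9) have overlapping bounding boxes but do not overlap themselves, so a consistent function that only sees the compressed value may need to ask for a recheck against the heap tuple.
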
0002-spgist-circle-polygon-6.patch (application/octet-stream)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
new file mode 100644
index dcdc297..bac9979
*** a/doc/src/sgml/spgist.sgml
--- b/doc/src/sgml/spgist.sgml
***************
*** 131,136 ****
--- 131,172 ----
        </entry>
       </row>
       <row>
+       <entry><literal>circle_ops</></entry>
+       <entry><type>circle</></entry>
+       <entry>
+        <literal>&lt;&lt;</>
+        <literal>&amp;&lt;</>
+        <literal>&amp;&amp;</>
+        <literal>&amp;&gt;</>
+        <literal>&gt;&gt;</>
+        <literal>~=</>
+        <literal>@&gt;</>
+        <literal>&lt;@</>
+        <literal>&amp;&lt;|</>
+        <literal>&lt;&lt;|</>
+        <literal>|&gt;&gt;</>
+        <literal>|&amp;&gt;</>
+       </entry>
+      </row>
+      <row>
+       <entry><literal>poly_ops</></entry>
+       <entry><type>polygon</></entry>
+       <entry>
+        <literal>&lt;&lt;</>
+        <literal>&amp;&lt;</>
+        <literal>&amp;&amp;</>
+        <literal>&amp;&gt;</>
+        <literal>&gt;&gt;</>
+        <literal>~=</>
+        <literal>@&gt;</>
+        <literal>&lt;@</>
+        <literal>&amp;&lt;|</>
+        <literal>&lt;&lt;|</>
+        <literal>|&gt;&gt;</>
+        <literal>|&amp;&gt;</>
+       </entry>
+      </row>
+      <row>
        <entry><literal>text_ops</></entry>
        <entry><type>text</></entry>
        <entry>
diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
new file mode 100644
index d1919fc..1c11f1f
*** a/src/backend/access/gist/gistproc.c
--- b/src/backend/access/gist/gistproc.c
*************** gist_circle_compress(PG_FUNCTION_ARGS)
*** 1115,1126 ****
  		CIRCLE	   *in = DatumGetCircleP(entry->key);
  		BOX		   *r;
  
! 		r = (BOX *) palloc(sizeof(BOX));
! 		r->high.x = in->center.x + in->radius;
! 		r->low.x = in->center.x - in->radius;
! 		r->high.y = in->center.y + in->radius;
! 		r->low.y = in->center.y - in->radius;
! 
  		retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
  		gistentryinit(*retval, PointerGetDatum(r),
  					  entry->rel, entry->page,
--- 1115,1121 ----
  		CIRCLE	   *in = DatumGetCircleP(entry->key);
  		BOX		   *r;
  
! 		r = circle_bbox(in);
  		retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
  		gistentryinit(*retval, PointerGetDatum(r),
  					  entry->rel, entry->page,
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 0348855..fce9f73
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** enum path_delim
*** 41,47 ****
  static int	point_inside(Point *p, int npts, Point *plist);
  static int	lseg_crossing(double x, double y, double px, double py);
  static BOX *box_construct(double x1, double x2, double y1, double y2);
- static BOX *box_copy(BOX *box);
  static BOX *box_fill(BOX *result, double x1, double x2, double y1, double y2);
  static bool box_ov(BOX *box1, BOX *box2);
  static double box_ht(BOX *box);
--- 41,46 ----
*************** box_fill(BOX *result, double x1, double 
*** 482,488 ****
  
  /*		box_copy		-		copy a box
   */
! static BOX *
  box_copy(BOX *box)
  {
  	BOX		   *result = (BOX *) palloc(sizeof(BOX));
--- 481,487 ----
  
  /*		box_copy		-		copy a box
   */
! BOX *
  box_copy(BOX *box)
  {
  	BOX		   *result = (BOX *) palloc(sizeof(BOX));
*************** circle_ar(CIRCLE *circle)
*** 5089,5094 ****
--- 5088,5122 ----
  	return M_PI * (circle->radius * circle->radius);
  }
  
+ /*		circle_bbox		-		returns bounding box of the circle.
+  */
+ BOX *
+ circle_bbox(CIRCLE *circle)
+ {
+ 	BOX		   *bbox = (BOX *) palloc(sizeof(BOX));
+ 
+ 	bbox->high.x = circle->center.x + circle->radius;
+ 	bbox->low.x = circle->center.x - circle->radius;
+ 	bbox->high.y = circle->center.y + circle->radius;
+ 	bbox->low.y = circle->center.y - circle->radius;
+ 
+ 	if (isnan(bbox->low.x))
+ 	{
+ 		double tmp = bbox->low.x;
+ 		bbox->low.x = bbox->high.x;
+ 		bbox->high.x = tmp;
+ 	}
+ 
+ 	if (isnan(bbox->low.y))
+ 	{
+ 		double tmp = bbox->low.y;
+ 		bbox->low.y = bbox->high.y;
+ 		bbox->high.y = tmp;
+ 	}
+ 
+ 	return bbox;
+ }
+ 
  
  /*----------------------------------------------------------
   *	Conversion operators.
diff --git a/src/backend/utils/adt/geo_spgist.c b/src/backend/utils/adt/geo_spgist.c
new file mode 100644
index f6334ba..1237bd2
*** a/src/backend/utils/adt/geo_spgist.c
--- b/src/backend/utils/adt/geo_spgist.c
*************** spg_box_quad_choose(PG_FUNCTION_ARGS)
*** 391,397 ****
  	spgChooseIn *in = (spgChooseIn *) PG_GETARG_POINTER(0);
  	spgChooseOut *out = (spgChooseOut *) PG_GETARG_POINTER(1);
  	BOX		   *centroid = DatumGetBoxP(in->prefixDatum),
! 			   *box = DatumGetBoxP(in->datum);
  
  	out->resultType = spgMatchNode;
  	out->result.matchNode.restDatum = BoxPGetDatum(box);
--- 391,397 ----
  	spgChooseIn *in = (spgChooseIn *) PG_GETARG_POINTER(0);
  	spgChooseOut *out = (spgChooseOut *) PG_GETARG_POINTER(1);
  	BOX		   *centroid = DatumGetBoxP(in->prefixDatum),
! 			   *box = DatumGetBoxP(in->leafDatum);
  
  	out->resultType = spgMatchNode;
  	out->result.matchNode.restDatum = BoxPGetDatum(box);
*************** spg_box_quad_picksplit(PG_FUNCTION_ARGS)
*** 474,479 ****
--- 474,529 ----
  }
  
  /*
+  * Check if result of consistent method based on bounding box is exact.
+  */
+ static bool
+ is_bounding_box_test_exact(StrategyNumber strategy)
+ {
+ 	switch (strategy)
+ 	{
+ 		case RTLeftStrategyNumber:
+ 		case RTOverLeftStrategyNumber:
+ 		case RTOverRightStrategyNumber:
+ 		case RTRightStrategyNumber:
+ 		case RTOverBelowStrategyNumber:
+ 		case RTBelowStrategyNumber:
+ 		case RTAboveStrategyNumber:
+ 		case RTOverAboveStrategyNumber:
+ 			return true;
+ 
+ 		default:
+ 			return false;
+ 	}
+ }
+ 
+ /*
+  * Get bounding box for ScanKey.
+  */
+ static BOX *
+ spg_box_quad_get_scankey_bbox(ScanKey sk, bool *recheck)
+ {
+ 	switch (sk->sk_subtype)
+ 	{
+ 		case BOXOID:
+ 			return DatumGetBoxP(sk->sk_argument);
+ 
+ 		case CIRCLEOID:
+ 			if (recheck && !is_bounding_box_test_exact(sk->sk_strategy))
+ 				*recheck = true;
+ 			return circle_bbox(DatumGetCircleP(sk->sk_argument));
+ 
+ 		case POLYGONOID:
+ 			if (recheck && !is_bounding_box_test_exact(sk->sk_strategy))
+ 				*recheck = true;
+ 			return &DatumGetPolygonP(sk->sk_argument)->boundbox;
+ 
+ 		default:
+ 			elog(ERROR, "unrecognized scankey subtype: %d", sk->sk_subtype);
+ 			return NULL;
+ 	}
+ }
+ 
+ /*
   * SP-GiST inner consistent function
   */
  Datum
*************** spg_box_quad_inner_consistent(PG_FUNCTIO
*** 515,521 ****
  	centroid = getRangeBox(DatumGetBoxP(in->prefixDatum));
  	queries = (RangeBox **) palloc(in->nkeys * sizeof(RangeBox *));
  	for (i = 0; i < in->nkeys; i++)
! 		queries[i] = getRangeBox(DatumGetBoxP(in->scankeys[i].sk_argument));
  
  	/* Allocate enough memory for nodes */
  	out->nNodes = 0;
--- 565,575 ----
  	centroid = getRangeBox(DatumGetBoxP(in->prefixDatum));
  	queries = (RangeBox **) palloc(in->nkeys * sizeof(RangeBox *));
  	for (i = 0; i < in->nkeys; i++)
! 	{
! 		BOX		   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i], NULL);
! 
! 		queries[i] = getRangeBox(box);
! 	}
  
  	/* Allocate enough memory for nodes */
  	out->nNodes = 0;
*************** spg_box_quad_leaf_consistent(PG_FUNCTION
*** 637,644 ****
  	/* Perform the required comparison(s) */
  	for (i = 0; i < in->nkeys; i++)
  	{
! 		StrategyNumber strategy = in->scankeys[i].sk_strategy;
! 		Datum		query = in->scankeys[i].sk_argument;
  
  		switch (strategy)
  		{
--- 691,700 ----
  	/* Perform the required comparison(s) */
  	for (i = 0; i < in->nkeys; i++)
  	{
! 		StrategyNumber	strategy = in->scankeys[i].sk_strategy;
! 		BOX			   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i],
! 															&out->recheck);
! 		Datum			query = BoxPGetDatum(box);
  
  		switch (strategy)
  		{
*************** spg_box_quad_leaf_consistent(PG_FUNCTION
*** 713,715 ****
--- 769,818 ----
  
  	PG_RETURN_BOOL(flag);
  }
+ 
+ 
+ /*
+  * SP-GiST config function for 2-D types that are lossily represented by their
+  * bounding boxes
+  */
+ Datum
+ spg_bbox_quad_config(PG_FUNCTION_ARGS)
+ {
+ 	spgConfigOut *cfg = (spgConfigOut *) PG_GETARG_POINTER(1);
+ 
+ 	cfg->prefixType = BOXOID;	/* A type represented by its bounding box */
+ 	cfg->labelType = VOIDOID;	/* We don't need node labels. */
+ 	cfg->leafType = BOXOID;
+ 	cfg->canReturnData = false;
+ 	cfg->longValuesOK = false;
+ 
+ 	PG_RETURN_VOID();
+ }
+ 
+ /*
+  * SP-GiST compress function for circles
+  */
+ Datum
+ spg_circle_quad_compress(PG_FUNCTION_ARGS)
+ {
+ 	CIRCLE	   *circle = PG_GETARG_CIRCLE_P(0);
+ 	BOX		   *box;
+ 
+ 	box = circle_bbox(circle);
+ 
+ 	PG_RETURN_BOX_P(box);
+ }
+ 
+ /*
+  * SP-GiST compress function for polygons
+  */
+ Datum
+ spg_poly_quad_compress(PG_FUNCTION_ARGS)
+ {
+ 	POLYGON	   *polygon = PG_GETARG_POLYGON_P(0);
+ 	BOX		   *box;
+ 
+ 	box = box_copy(&polygon->boundbox);
+ 
+ 	PG_RETURN_BOX_P(box);
+ }
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index f850be4..d1527db
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert (	5000	603  603 11 s	2573	40
*** 858,863 ****
--- 858,895 ----
  DATA(insert (	5000	603  603 12 s	2572	4000 0 ));
  
  /*
+  * SP-GiST circle_ops
+  */
+ DATA(insert (	5007	718  718  1 s	1506	4000 0 ));
+ DATA(insert (	5007	718  718  2 s	1507	4000 0 ));
+ DATA(insert (	5007	718  718  3 s	1513	4000 0 ));
+ DATA(insert (	5007	718  718  4 s	1508	4000 0 ));
+ DATA(insert (	5007	718  718  5 s	1509	4000 0 ));
+ DATA(insert (	5007	718  718  6 s	1512	4000 0 ));
+ DATA(insert (	5007	718  718  7 s	1511	4000 0 ));
+ DATA(insert (	5007	718  718  8 s	1510	4000 0 ));
+ DATA(insert (	5007	718  718  9 s	2589	4000 0 ));
+ DATA(insert (	5007	718  718 10 s	1515	4000 0 ));
+ DATA(insert (	5007	718  718 11 s	1514	4000 0 ));
+ DATA(insert (	5007	718  718 12 s	2590	4000 0 ));
+ 
+ /*
+  * SP-GiST poly_ops (supports polygons)
+  */
+ DATA(insert (	5008   604	604  1 s	 485	4000 0 ));
+ DATA(insert (	5008   604	604  2 s	 486	4000 0 ));
+ DATA(insert (	5008   604	604  3 s	 492	4000 0 ));
+ DATA(insert (	5008   604	604  4 s	 487	4000 0 ));
+ DATA(insert (	5008   604	604  5 s	 488	4000 0 ));
+ DATA(insert (	5008   604	604  6 s	 491	4000 0 ));
+ DATA(insert (	5008   604	604  7 s	 490	4000 0 ));
+ DATA(insert (	5008   604	604  8 s	 489	4000 0 ));
+ DATA(insert (	5008   604	604  9 s	2575	4000 0 ));
+ DATA(insert (	5008   604	604 10 s	2574	4000 0 ));
+ DATA(insert (	5008   604	604 11 s	2577	4000 0 ));
+ DATA(insert (	5008   604	604 12 s	2576	4000 0 ));
+ 
+ /*
   * GiST inet_ops
   */
  DATA(insert (	3550	869 869 3 s		3552 783 0 ));
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 1c95846..e21bc8a
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert (	5000   603 603 2 5013 ));
*** 334,339 ****
--- 334,351 ----
  DATA(insert (	5000   603 603 3 5014 ));
  DATA(insert (	5000   603 603 4 5015 ));
  DATA(insert (	5000   603 603 5 5016 ));
+ DATA(insert (	5007   718 718 1 5009 ));
+ DATA(insert (	5007   718 718 2 5013 ));
+ DATA(insert (	5007   718 718 3 5014 ));
+ DATA(insert (	5007   718 718 4 5015 ));
+ DATA(insert (	5007   718 718 5 5016 ));
+ DATA(insert (	5007   718 718 6 5010 ));
+ DATA(insert (	5008   604 604 1 5009 ));
+ DATA(insert (	5008   604 604 2 5013 ));
+ DATA(insert (	5008   604 604 3 5014 ));
+ DATA(insert (	5008   604 604 4 5015 ));
+ DATA(insert (	5008   604 604 5 5016 ));
+ DATA(insert (	5008   604 604 6 5011 ));
  
  /* BRIN opclasses */
  /* minmax bytea */
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
new file mode 100644
index 28dbc74..ec803d1
*** a/src/include/catalog/pg_opclass.h
--- b/src/include/catalog/pg_opclass.h
*************** DATA(insert (	4000	box_ops				PGNSP PGUI
*** 205,210 ****
--- 205,212 ----
  DATA(insert (	4000	quad_point_ops		PGNSP PGUID 4015  600 t 0 ));
  DATA(insert (	4000	kd_point_ops		PGNSP PGUID 4016  600 f 0 ));
  DATA(insert (	4000	text_ops			PGNSP PGUID 4017  25 t 0 ));
+ DATA(insert (	4000	circle_ops			PGNSP PGUID 5007  718 t 603 ));
+ DATA(insert (	4000	poly_ops			PGNSP PGUID 5008  604 t 603 ));
  DATA(insert (	403		jsonb_ops			PGNSP PGUID 4033  3802 t 0 ));
  DATA(insert (	405		jsonb_ops			PGNSP PGUID 4034  3802 t 0 ));
  DATA(insert (	2742	jsonb_ops			PGNSP PGUID 4036  3802 t 25 ));
diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h
new file mode 100644
index 0d0ba7c..5a6bc1d
*** a/src/include/catalog/pg_opfamily.h
--- b/src/include/catalog/pg_opfamily.h
*************** DATA(insert OID = 4103 (	3580	range_incl
*** 186,190 ****
--- 186,192 ----
  DATA(insert OID = 4082 (	3580	pg_lsn_minmax_ops		PGNSP PGUID ));
  DATA(insert OID = 4104 (	3580	box_inclusion_ops		PGNSP PGUID ));
  DATA(insert OID = 5000 (	4000	box_ops		PGNSP PGUID ));
+ DATA(insert OID = 5007 (	4000	circle_ops				PGNSP PGUID ));
+ DATA(insert OID = 5008 (	4000	poly_ops				PGNSP PGUID ));
  
  #endif							/* PG_OPFAMILY_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 93c031a..95e2f34
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DESCR("SP-GiST support for quad tree ove
*** 5335,5340 ****
--- 5335,5347 ----
  DATA(insert OID = 5016 (  spg_box_quad_leaf_consistent	PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_box_quad_leaf_consistent _null_ _null_ _null_ ));
  DESCR("SP-GiST support for quad tree over box");
  
+ DATA(insert OID = 5009 (  spg_bbox_quad_config PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2278 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_bbox_quad_config _null_ _null_ _null_ ));
+ DESCR("SP-GiST support for quad tree over 2-D types represented by their bounding boxes");
+ DATA(insert OID = 5010 (  spg_circle_quad_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 603 "718" _null_ _null_ _null_ _null_  _null_ spg_circle_quad_compress _null_ _null_ _null_ ));
+ DESCR("SP-GiST support for quad tree over circle");
+ DATA(insert OID = 5011 (  spg_poly_quad_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 603 "604" _null_ _null_ _null_ _null_  _null_ spg_poly_quad_compress _null_ _null_ _null_ ));
+ DESCR("SP-GiST support for quad tree over polygons");
+ 
  /* replication slots */
  DATA(insert OID = 3779 (  pg_create_physical_replication_slot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 3 0 2249 "19 16 16" "{19,16,16,19,3220}" "{i,i,i,o,o}" "{slot_name,immediately_reserve,temporary,slot_name,lsn}" _null_ _null_ pg_create_physical_replication_slot _null_ _null_ _null_ ));
  DESCR("create a physical replication slot");
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 44c6381..1804643
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** typedef struct
*** 178,186 ****
   * in geo_ops.c
   */
  
! /* private point routines */
  extern double point_dt(Point *pt1, Point *pt2);
  extern double point_sl(Point *pt1, Point *pt2);
  extern double pg_hypot(double x, double y);
  
  #endif							/* GEO_DECLS_H */
--- 178,188 ----
   * in geo_ops.c
   */
  
! /* private routines */
  extern double point_dt(Point *pt1, Point *pt2);
  extern double point_sl(Point *pt1, Point *pt2);
  extern double pg_hypot(double x, double y);
+ extern BOX *box_copy(BOX *box);
+ extern BOX *circle_bbox(CIRCLE *circle);
  
  #endif							/* GEO_DECLS_H */
diff --git a/src/test/regress/expected/circle.out b/src/test/regress/expected/circle.out
new file mode 100644
index 9ba4a04..7e87b56
*** a/src/test/regress/expected/circle.out
--- b/src/test/regress/expected/circle.out
*************** SELECT '' as five, c1.f1 AS one, c2.f1 A
*** 97,99 ****
--- 97,339 ----
        | <(1,2),3>      | <(100,200),10> | 208.370729772479
  (5 rows)
  
+ --
+ -- Test the SP-GiST index
+ --
+ CREATE TEMPORARY TABLE quad_circle_tbl (id int, c circle);
+ INSERT INTO quad_circle_tbl
+ 	SELECT (x - 1) * 100 + y, circle(point(x * 10, y * 10), 1 + (x + y) % 10)
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ INSERT INTO quad_circle_tbl
+ 	SELECT i, '<(200, 300), 5>'
+ 	FROM generate_series(10001, 11000) AS i;
+ INSERT INTO quad_circle_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, '<(0,100), infinity>'),
+ 		(11004, '<(-infinity,0),1000>'),
+ 		(11005, '<(infinity,-infinity),infinity>');
+ CREATE INDEX quad_circle_tbl_idx ON quad_circle_tbl USING spgist(c);
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ CREATE TEMP TABLE quad_circle_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+ FROM quad_circle_tbl;
+ CREATE TEMP TABLE quad_circle_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+ FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c << circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c << '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c << '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c << circle '<(300,400),200>';
+  count 
+ -------
+    891
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &< circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c &< '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c &< '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c &< circle '<(300,400),200>';
+  count 
+ -------
+   5901
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c && circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c && '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c && '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c && circle '<(300,400),200>';
+  count 
+ -------
+   2334
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &> circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c &> '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c &> '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c &> circle '<(300,400),200>';
+  count 
+ -------
+  10000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c >> circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c >> '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c >> '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c >> circle '<(300,400),200>';
+  count 
+ -------
+   4990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c <<| circle '<(300,400),200>';
+                          QUERY PLAN                          
+ -------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c <<| '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c <<| '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c <<| circle '<(300,400),200>';
+  count 
+ -------
+   1890
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &<| circle '<(300,400),200>';
+                          QUERY PLAN                          
+ -------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c &<| '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c &<| '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c &<| circle '<(300,400),200>';
+  count 
+ -------
+   6900
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c |&> circle '<(300,400),200>';
+                          QUERY PLAN                          
+ -------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c |&> '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c |&> '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c |&> circle '<(300,400),200>';
+  count 
+ -------
+   9000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c |>> circle '<(300,400),200>';
+                          QUERY PLAN                          
+ -------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c |>> '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c |>> '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c |>> circle '<(300,400),200>';
+  count 
+ -------
+   3990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c @> circle '<(300,400),1>';
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c @> '<(300,400),1>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c @> '<(300,400),1>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c @> circle '<(300,400),1>';
+  count 
+ -------
+      2
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+                          QUERY PLAN                         
+ ------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c <@ '<(300,400),200>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c <@ '<(300,400),200>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+  count 
+ -------
+   2181
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c ~= circle '<(300,400),1>';
+                         QUERY PLAN                        
+ ----------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_circle_tbl
+          Recheck Cond: (c ~= '<(300,400),1>'::circle)
+          ->  Bitmap Index Scan on quad_circle_tbl_idx
+                Index Cond: (c ~= '<(300,400),1>'::circle)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_circle_tbl WHERE c ~= circle '<(300,400),1>';
+  count 
+ -------
+      1
+ (1 row)
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
diff --git a/src/test/regress/expected/polygon.out b/src/test/regress/expected/polygon.out
new file mode 100644
index 2361274..a9e7752
*** a/src/test/regress/expected/polygon.out
--- b/src/test/regress/expected/polygon.out
*************** SELECT	'(0,0)'::point <-> '((0,0),(1,2),
*** 227,229 ****
--- 227,467 ----
           0 |          0 |      0 | 1.4142135623731 |          3.2
  (1 row)
  
+ --
+ -- Test the SP-GiST index
+ --
+ CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+ INSERT INTO quad_poly_tbl
+ 	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ INSERT INTO quad_poly_tbl
+ 	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+ 	FROM generate_series(10001, 11000) AS i;
+ INSERT INTO quad_poly_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, NULL);
+ CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl;
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   3890
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   7900
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+    977
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   7000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   2990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   1890
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   6900
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   9000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   3990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+    831
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+                                  QUERY PLAN                                  
+ -----------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+  count 
+ -------
+      1
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+                                  QUERY PLAN                                  
+ -----------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+  count 
+ -------
+   1000
+ (1 row)
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
diff --git a/src/test/regress/sql/circle.sql b/src/test/regress/sql/circle.sql
new file mode 100644
index c0284b2..63da10c
*** a/src/test/regress/sql/circle.sql
--- b/src/test/regress/sql/circle.sql
*************** SELECT '' as five, c1.f1 AS one, c2.f1 A
*** 43,45 ****
--- 43,140 ----
    FROM CIRCLE_TBL c1, CIRCLE_TBL c2
    WHERE (c1.f1 < c2.f1) AND ((c1.f1 <-> c2.f1) > 0)
    ORDER BY distance, area(c1.f1), area(c2.f1);
+ 
+ --
+ -- Test the SP-GiST index
+ --
+ 
+ CREATE TEMPORARY TABLE quad_circle_tbl (id int, c circle);
+ 
+ INSERT INTO quad_circle_tbl
+ 	SELECT (x - 1) * 100 + y, circle(point(x * 10, y * 10), 1 + (x + y) % 10)
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ 
+ INSERT INTO quad_circle_tbl
+ 	SELECT i, '<(200, 300), 5>'
+ 	FROM generate_series(10001, 11000) AS i;
+ 
+ INSERT INTO quad_circle_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, '<(0,100), infinity>'),
+ 		(11004, '<(-infinity,0),1000>'),
+ 		(11005, '<(infinity,-infinity),infinity>');
+ 
+ CREATE INDEX quad_circle_tbl_idx ON quad_circle_tbl USING spgist(c);
+ 
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ 
+ CREATE TEMP TABLE quad_circle_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+ FROM quad_circle_tbl;
+ 
+ CREATE TEMP TABLE quad_circle_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+ FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+ 
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c << circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c << circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &< circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c &< circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c && circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c && circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &> circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c &> circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c >> circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c >> circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c <<| circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c <<| circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c &<| circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c &<| circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c |&> circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c |&> circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c |>> circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c |>> circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c @> circle '<(300,400),1>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c @> circle '<(300,400),1>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c <@ circle '<(300,400),200>';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_circle_tbl WHERE c ~= circle '<(300,400),1>';
+ SELECT count(*) FROM quad_circle_tbl WHERE c ~= circle '<(300,400),1>';
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
diff --git a/src/test/regress/sql/polygon.sql b/src/test/regress/sql/polygon.sql
new file mode 100644
index 7ac8079..c58277b
*** a/src/test/regress/sql/polygon.sql
--- b/src/test/regress/sql/polygon.sql
*************** SELECT	'(0,0)'::point <-> '((0,0),(1,2),
*** 116,118 ****
--- 116,211 ----
  	'(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
  	'(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
  	'(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
+ 
+ --
+ -- Test the SP-GiST index
+ --
+ 
+ CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+ 
+ INSERT INTO quad_poly_tbl
+ 	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ 
+ INSERT INTO quad_poly_tbl
+ 	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+ 	FROM generate_series(10001, 11000) AS i;
+ 
+ INSERT INTO quad_poly_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, NULL);
+ 
+ CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+ 
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ 
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl;
+ 
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
#15Darafei Praliaskouski
me@komzpa.net
In reply to: Alexander Korotkov (#14)

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

Hi,

I like the SP-GiST part of the patch. Looking forward to it, so PostGIS can benefit from SP-GiST infrastructure.

I have some questions about the circles example though.

* What is the reason for isnan check and swap of box ordinates for circle? It wasn't in the code previously.
* There are tests for infinities in circles, but checks are for NaNs.
* It seems to me that circle can be implemented without recheck, by making direct exact calculations.
How about removing circle from the scope of this patch, so it is smaller and cleaner?
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Alexander Korotkov
aekorotkov@gmail.com
In reply to: Darafei Praliaskouski (#15)

On Wed, Sep 20, 2017 at 10:00 PM, Darafei Praliaskouski <me@komzpa.net>
wrote:

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

Hi,

I like the SP-GiST part of the patch. Looking forward to it, so PostGIS
can benefit from SP-GiST infrastructure.

I have some questions about the circles example though.

* What is the reason for isnan check and swap of box ordinates for
circle? It wasn't in the code previously.

* There are tests for infinities in circles, but checks are for NaNs.

This code was migrated from the original patch by Nikita. I assume he
meant that NaN should be treated as the greatest possible floating point
value (like float4_cmp_internal() does). However, our current implementation
of geometrical datatypes doesn't correctly handle all the combinations of
infs and NaNs. Most of the code was written without taking infs and NaNs
into account. Also, I'm not sure if this code fixes all possible issues with
infs and NaNs in the SP-GiST opclass for circles. This is why I'm going to
remove the NaN handling from this place.
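
To make the inf/NaN combinations concrete, here is a minimal standalone sketch of the bounding-box arithmetic from the patch's circle code, applied to plain doubles. The Sketch* types and sketch_circle_bbox() are stand-ins invented for this illustration, not the definitions from geo_decls.h, and the sketch only relies on IEEE 754 rules (inf + inf = inf, inf - inf = NaN, anything + NaN = NaN).

```c
#include <math.h>

/* Stand-in types for this sketch only; not the ones from geo_decls.h. */
typedef struct { double x, y; } SketchPoint;
typedef struct { SketchPoint center; double radius; } SketchCircle;
typedef struct { SketchPoint high, low; } SketchBox;

/* Same four assignments as the circle bounding-box code under review. */
static SketchBox
sketch_circle_bbox(SketchCircle c)
{
	SketchBox	b;

	b.high.x = c.center.x + c.radius;
	b.low.x = c.center.x - c.radius;
	b.high.y = c.center.y + c.radius;
	b.low.y = c.center.y - c.radius;
	return b;
}
```

With a NaN center or radius, both bounds of the affected axis come out as NaN. With the regression-test row '<(infinity,-infinity),infinity>' the inputs contain no NaN, yet low.x becomes inf - inf = NaN while high.x stays inf, and high.y becomes -inf + inf = NaN while low.y stays -inf. So whatever is done about NaNs would have to cover both bounds on both axes.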

* It seems to me that circle can be implemented without recheck, by
making direct exact calculations.

Right. Holding circles in the leaves instead of bounding boxes would both
allow exact calculations and take less space.

How about removing circle from the scope of this patch, so it is smaller
and cleaner?

Good point. Polygons are enough as an example of the compress function.
An opclass for circles could be submitted as a separate patch.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Darafei Praliaskouski (#15)

Darafei Praliaskouski <me@komzpa.net> writes:

I have some questions about the circles example though.

* What is the reason for isnan check and swap of box ordinates for circle? It wasn't in the code previously.

I hadn't paid any attention to this patch previously, but this comment
excited my curiosity, so I went and looked:

+ 	bbox->high.x = circle->center.x + circle->radius;
+ 	bbox->low.x = circle->center.x - circle->radius;
+ 	bbox->high.y = circle->center.y + circle->radius;
+ 	bbox->low.y = circle->center.y - circle->radius;
+ 
+ 	if (isnan(bbox->low.x))
+ 	{
+ 		double tmp = bbox->low.x;
+ 		bbox->low.x = bbox->high.x;
+ 		bbox->high.x = tmp;
+ 	}

Maybe I'm missing something, but it appears to me that it's impossible for
bbox->low.x to be NaN unless circle->center.x and/or circle->radius is a
NaN, in which case bbox->high.x would also have been computed as a NaN,
making the swap entirely useless. Likewise for the Y case. There may be
something useful to do about NaNs here, but this doesn't seem like it.

How about removing circle from the scope of this patch, so it is smaller and cleaner?

Neither of those patches would be particularly large, and since they'd need
to touch adjacent code in some places, the diffs wouldn't be independent.
I think splitting them is just make-work.

regards, tom lane

#18Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#17)
2 attachment(s)

On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Darafei Praliaskouski <me@komzpa.net> writes:

I have some questions about the circles example though.

* What is the reason for isnan check and swap of box ordinates for
circle? It wasn't in the code previously.

I hadn't paid any attention to this patch previously, but this comment
excited my curiosity, so I went and looked:

+       bbox->high.x = circle->center.x + circle->radius;
+       bbox->low.x = circle->center.x - circle->radius;
+       bbox->high.y = circle->center.y + circle->radius;
+       bbox->low.y = circle->center.y - circle->radius;
+
+       if (isnan(bbox->low.x))
+       {
+               double tmp = bbox->low.x;
+               bbox->low.x = bbox->high.x;
+               bbox->high.x = tmp;
+       }

Maybe I'm missing something, but it appears to me that it's impossible for
bbox->low.x to be NaN unless circle->center.x and/or circle->radius is a
NaN, in which case bbox->high.x would also have been computed as a NaN,
making the swap entirely useless. Likewise for the Y case. There may be
something useful to do about NaNs here, but this doesn't seem like it.

Yeah, +1.

How about removing circle from the scope of this patch, so it is smaller
and cleaner?

Neither of those patches would be particularly large, and since they'd need
to touch adjacent code in some places, the diffs wouldn't be independent.
I think splitting them is just make-work.

I've extracted the polygon opclass into a separate patch (attached). I'll
rework and resubmit the circle patch later.
I'm not entirely sure that polygon.sql is a good place for testing the
SP-GiST opclass for polygons, but we've already done the same for box.sql.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-7.patch (application/octet-stream)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
new file mode 100644
index cd4a8d0..dcdc297
*** a/doc/src/sgml/spgist.sgml
--- b/doc/src/sgml/spgist.sgml
***************
*** 240,259 ****
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide.  All five follow the convention
!   of accepting two <type>internal</> arguments, the first of which is a
!   pointer to a C struct containing input values for the support method,
!   while the second argument is a pointer to a C struct where output values
!   must be placed.  Four of the methods just return <type>void</>, since
!   all their results appear in the output struct; but
    <function>leaf_consistent</> additionally returns a <type>boolean</> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method.
   </para>
  
   <para>
!   The five user-defined methods are:
   </para>
  
   <variablelist>
--- 240,260 ----
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide, and one is optional. All five mandatory
!   methods follow the convention of accepting two <type>internal</> arguments,
!   the first of which is a pointer to a C struct containing input values for 
!   the support method, while the second argument is a pointer to a C struct 
!   where output values must be placed.  Four of the methods just return 
!   <type>void</>, since all their results appear in the output struct; but
    <function>leaf_consistent</> additionally returns a <type>boolean</> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method. The optional method <function>compress</> accepts the
!   datum to be indexed and returns the value that will actually be indexed.
   </para>
  
   <para>
!   The five mandatory user-defined methods are:
   </para>
  
   <variablelist>
*************** typedef struct spgConfigOut
*** 283,288 ****
--- 284,290 ----
  {
      Oid         prefixType;     /* Data type of inner-tuple prefixes */
      Oid         labelType;      /* Data type of inner-tuple node labels */
+     Oid         leafType;       /* Data type of leaf */
      bool        canReturnData;  /* Opclass can reconstruct original data */
      bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
  } spgConfigOut;
*************** typedef struct spgConfigOut
*** 303,309 ****
        <structfield>longValuesOK</> should be set true only when the
        <structfield>attType</> is of variable length and the operator
        class is capable of segmenting long values by repeated suffixing
!       (see <xref linkend="spgist-limits">).
       </para>
       </listitem>
      </varlistentry>
--- 305,319 ----
        <structfield>longValuesOK</> should be set true only when the
        <structfield>attType</> is of variable length and the operator
        class is capable of segmenting long values by repeated suffixing
!       (see <xref linkend="spgist-limits">). <structfield>leafType</>
!       usually has the same value as <structfield>attType</>, but if
!       it is different then the optional method <function>compress</>
!       must be provided. The <function>compress</> method is responsible
!       for the transformation from <structfield>attType</> to
!       <structfield>leafType</>. In this case all other functions should
!       accept <structfield>leafType</> values. Note: both consistent
!       functions will get <structfield>scankeys</> unchanged, without
!       the <function>compress</> transformation.
       </para>
       </listitem>
      </varlistentry>
*************** typedef struct spgInnerConsistentOut
*** 624,630 ****
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
--- 634,641 ----
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</> should always be of
!        type <structname>spgConfigOut</>.<structfield>leafType</>.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgLeafConsistentOut
*** 730,736 ****
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
--- 741,748 ----
         <structfield>reconstructedValue</> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</> at the root level or if the
         <function>inner_consistent</> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</> should always be of
!        type <structname>spgConfigOut</>.<structfield>leafType</>.
         <structfield>traversalValue</> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgLeafConsistentOut
*** 757,762 ****
--- 769,792 ----
      </varlistentry>
     </variablelist>
  
+  <para>
+   The optional user-defined method is:
+  </para>
+ 
+  <variablelist>
+     <varlistentry>
+      <term><function>Datum compress(Datum in)</></term>
+      <listitem>
+       <para>
+        Converts the data item into a format suitable for physical storage in
+        an index page. It accepts a <structname>spgConfigIn</>.<structfield>attType</>
+        value and returns a <structname>spgConfigOut</>.<structfield>leafType</>
+        value. The output value should not be toasted.
+       </para>
+      </listitem>
+     </varlistentry>
+   </variablelist>
+ 
    <para>
     All the SP-GiST support methods are normally called in a short-lived
     memory context; that is, <varname>CurrentMemoryContext</> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
new file mode 100644
index b0702a7..68c3f45
*** a/src/backend/access/spgist/spgdoinsert.c
--- b/src/backend/access/spgist/spgdoinsert.c
*************** spgdoinsert(Relation index, SpGistState 
*** 1899,1919 ****
  	FmgrInfo   *procinfo = NULL;
  
  	/*
! 	 * Look up FmgrInfo of the user-defined choose function once, to save
! 	 * cycles in the loop below.
  	 */
  	if (!isnull)
! 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
! 	 * Since we don't use index_form_tuple in this AM, we have to make sure
! 	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
! 	 * that.
  	 */
! 	if (!isnull && state->attType.attlen == -1)
! 		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
! 
! 	leafDatum = datum;
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
--- 1899,1939 ----
  	FmgrInfo   *procinfo = NULL;
  
  	/*
! 	 * Prepare the leaf datum to insert.
! 	 *
! 	 * If there is an optional "compress" method, call it to form the leaf
! 	 * datum from the input datum. Otherwise we will store the input datum as
! 	 * is. (We have to detoast it, though. We assume the "compress" method to
! 	 * return an untoasted value.)
  	 */
  	if (!isnull)
! 	{
! 		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 		{
! 			procinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
! 			leafDatum = FunctionCall1Coll(procinfo,
! 										  index->rd_indcollation[0],
! 										  datum);
! 		}
! 		else
! 		{
! 			Assert(state->attLeafType.type == state->attType.type);
! 
! 			if (state->attType.attlen == -1)
! 				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
! 			else
! 				leafDatum = datum;
! 		}
! 	}
! 	else
! 		leafDatum = (Datum) 0;
  
  	/*
! 	 * Look up FmgrInfo of the user-defined choose function once, to save
! 	 * cycles in the loop below.
  	 */
! 	if (!isnull)
! 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
*************** spgdoinsert(Relation index, SpGistState 
*** 1923,1929 ****
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
--- 1943,1949 ----
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attLeafType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
*************** spgdoinsert(Relation index, SpGistState 
*** 2138,2144 ****
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attType, leafDatum);
  					}
  
  					/*
--- 2158,2164 ----
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attLeafType, leafDatum);
  					}
  
  					/*
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
new file mode 100644
index 22f64b0..8a1311d
*** a/src/backend/access/spgist/spgutils.c
--- b/src/backend/access/spgist/spgutils.c
*************** spgGetCache(Relation index)
*** 124,130 ****
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		fillTypeDesc(&cache->attType, atttype);
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
--- 124,146 ----
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		if (OidIsValid(cache->config.leafType) &&
! 			cache->config.leafType != atttype)
! 		{
! 			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 				ereport(ERROR,
! 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 						 errmsg("compress method must be defined when leaf type is different from input type")));
! 
! 			fillTypeDesc(&cache->attType, atttype);
! 			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
! 		}
! 		else
! 		{
! 			fillTypeDesc(&cache->attType, atttype);
! 			cache->attLeafType = cache->attType;
! 		}
! 
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
*************** initSpGistState(SpGistState *state, Rela
*** 164,169 ****
--- 180,186 ----
  
  	state->config = cache->config;
  	state->attType = cache->attType;
+ 	state->attLeafType = cache->attLeafType;
  	state->attPrefixType = cache->attPrefixType;
  	state->attLabelType = cache->attLabelType;
  
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 598,604 ****
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
--- 615,621 ----
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attLeafType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 614,620 ****
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
  
  	return tup;
  }
--- 631,637 ----
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
  
  	return tup;
  }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
new file mode 100644
index 157cf2a..514da47
*** a/src/backend/access/spgist/spgvalidate.c
--- b/src/backend/access/spgist/spgvalidate.c
*************** spgvalidate(Oid opclassoid)
*** 52,57 ****
--- 52,61 ----
  	OpFamilyOpFuncGroup *opclassgroup;
  	int			i;
  	ListCell   *lc;
+ 	spgConfigIn	configIn;
+ 	spgConfigOut configOut;
+ 	Oid			configOutLefttype = InvalidOid;
+ 	Oid			configOutRighttype = InvalidOid;
  
  	/* Fetch opclass information */
  	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
*************** spgvalidate(Oid opclassoid)
*** 100,105 ****
--- 104,118 ----
  		switch (procform->amprocnum)
  		{
  			case SPGIST_CONFIG_PROC:
+ 				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+ 											2, 2, INTERNALOID, INTERNALOID);
+ 				configIn.attType = procform->amproclefttype;
+ 				OidFunctionCall2(procform->amproc,
+ 								 PointerGetDatum(&configIn),
+ 								 PointerGetDatum(&configOut));
+ 				configOutLefttype = procform->amproclefttype;
+ 				configOutRighttype = procform->amprocrighttype;
+ 				break;
  			case SPGIST_CHOOSE_PROC:
  			case SPGIST_PICKSPLIT_PROC:
  			case SPGIST_INNER_CONSISTENT_PROC:
*************** spgvalidate(Oid opclassoid)
*** 110,115 ****
--- 123,137 ----
  				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
  											2, 2, INTERNALOID, INTERNALOID);
  				break;
+ 			case SPGIST_COMPRESS_PROC:
+ 				if (configOutLefttype != procform->amproclefttype ||
+ 					configOutRighttype != procform->amprocrighttype)
+ 					ok = false;
+ 				else
+ 					ok = check_amproc_signature(procform->amproc,
+ 												configOut.leafType, true,
+ 												1, 1, procform->amproclefttype);
+ 				break;
  			default:
  				ereport(INFO,
  						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
*************** spgvalidate(Oid opclassoid)
*** 212,218 ****
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
--- 234,240 ----
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNRequiredProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
new file mode 100644
index d1bc396..a477278
*** a/src/include/access/spgist.h
--- b/src/include/access/spgist.h
***************
*** 30,36 ****
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGISTNProc						5
  
  /*
   * Argument structs for spg_config method
--- 30,38 ----
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGIST_COMPRESS_PROC			6
! #define SPGISTNRequiredProc				5
! #define SPGISTNProc						6
  
  /*
   * Argument structs for spg_config method
*************** typedef struct spgConfigOut
*** 44,49 ****
--- 46,52 ----
  {
  	Oid			prefixType;		/* Data type of inner-tuple prefixes */
  	Oid			labelType;		/* Data type of inner-tuple node labels */
+ 	Oid			leafType;		/* Data type of leaf (type of SPGIST_COMPRESS_PROC output) */
  	bool		canReturnData;	/* Opclass can reconstruct original data */
  	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
  } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
new file mode 100644
index 1c4b321..69dc2ba
*** a/src/include/access/spgist_private.h
--- b/src/include/access/spgist_private.h
*************** typedef struct SpGistState
*** 119,125 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 119,126 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef struct SpGistCache
*** 178,184 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 179,186 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef SpGistLeafTupleData *SpGistLeafT
*** 300,306 ****
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
--- 302,308 ----
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
0002-spgist-polygon-7.patch (application/octet-stream)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
new file mode 100644
index dcdc297..bac9979
*** a/doc/src/sgml/spgist.sgml
--- b/doc/src/sgml/spgist.sgml
***************
*** 131,136 ****
--- 131,172 ----
        </entry>
       </row>
       <row>
+       <entry><literal>circle_ops</></entry>
+       <entry><type>circle</></entry>
+       <entry>
+        <literal>&lt;&lt;</>
+        <literal>&amp;&lt;</>
+        <literal>&amp;&amp;</>
+        <literal>&amp;&gt;</>
+        <literal>&gt;&gt;</>
+        <literal>~=</>
+        <literal>@&gt;</>
+        <literal>&lt;@</>
+        <literal>&amp;&lt;|</>
+        <literal>&lt;&lt;|</>
+        <literal>|&gt;&gt;</>
+        <literal>|&amp;&gt;</>
+       </entry>
+      </row>
+      <row>
+       <entry><literal>poly_ops</></entry>
+       <entry><type>polygon</></entry>
+       <entry>
+        <literal>&lt;&lt;</>
+        <literal>&amp;&lt;</>
+        <literal>&amp;&amp;</>
+        <literal>&amp;&gt;</>
+        <literal>&gt;&gt;</>
+        <literal>~=</>
+        <literal>@&gt;</>
+        <literal>&lt;@</>
+        <literal>&amp;&lt;|</>
+        <literal>&lt;&lt;|</>
+        <literal>|&gt;&gt;</>
+        <literal>|&amp;&gt;</>
+       </entry>
+      </row>
+      <row>
        <entry><literal>text_ops</></entry>
        <entry><type>text</></entry>
        <entry>
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
new file mode 100644
index 0348855..4b93b88
*** a/src/backend/utils/adt/geo_ops.c
--- b/src/backend/utils/adt/geo_ops.c
*************** enum path_delim
*** 41,47 ****
  static int	point_inside(Point *p, int npts, Point *plist);
  static int	lseg_crossing(double x, double y, double px, double py);
  static BOX *box_construct(double x1, double x2, double y1, double y2);
- static BOX *box_copy(BOX *box);
  static BOX *box_fill(BOX *result, double x1, double x2, double y1, double y2);
  static bool box_ov(BOX *box1, BOX *box2);
  static double box_ht(BOX *box);
--- 41,46 ----
*************** box_fill(BOX *result, double x1, double 
*** 482,488 ****
  
  /*		box_copy		-		copy a box
   */
! static BOX *
  box_copy(BOX *box)
  {
  	BOX		   *result = (BOX *) palloc(sizeof(BOX));
--- 481,487 ----
  
  /*		box_copy		-		copy a box
   */
! BOX *
  box_copy(BOX *box)
  {
  	BOX		   *result = (BOX *) palloc(sizeof(BOX));
diff --git a/src/backend/utils/adt/geo_spgist.c b/src/backend/utils/adt/geo_spgist.c
new file mode 100644
index f6334ba..a105436
*** a/src/backend/utils/adt/geo_spgist.c
--- b/src/backend/utils/adt/geo_spgist.c
*************** spg_box_quad_choose(PG_FUNCTION_ARGS)
*** 391,397 ****
  	spgChooseIn *in = (spgChooseIn *) PG_GETARG_POINTER(0);
  	spgChooseOut *out = (spgChooseOut *) PG_GETARG_POINTER(1);
  	BOX		   *centroid = DatumGetBoxP(in->prefixDatum),
! 			   *box = DatumGetBoxP(in->datum);
  
  	out->resultType = spgMatchNode;
  	out->result.matchNode.restDatum = BoxPGetDatum(box);
--- 391,397 ----
  	spgChooseIn *in = (spgChooseIn *) PG_GETARG_POINTER(0);
  	spgChooseOut *out = (spgChooseOut *) PG_GETARG_POINTER(1);
  	BOX		   *centroid = DatumGetBoxP(in->prefixDatum),
! 			   *box = DatumGetBoxP(in->leafDatum);
  
  	out->resultType = spgMatchNode;
  	out->result.matchNode.restDatum = BoxPGetDatum(box);
*************** spg_box_quad_picksplit(PG_FUNCTION_ARGS)
*** 474,479 ****
--- 474,524 ----
  }
  
  /*
+  * Check if result of consistent method based on bounding box is exact.
+  */
+ static bool
+ is_bounding_box_test_exact(StrategyNumber strategy)
+ {
+ 	switch (strategy)
+ 	{
+ 		case RTLeftStrategyNumber:
+ 		case RTOverLeftStrategyNumber:
+ 		case RTOverRightStrategyNumber:
+ 		case RTRightStrategyNumber:
+ 		case RTOverBelowStrategyNumber:
+ 		case RTBelowStrategyNumber:
+ 		case RTAboveStrategyNumber:
+ 		case RTOverAboveStrategyNumber:
+ 			return true;
+ 
+ 		default:
+ 			return false;
+ 	}
+ }
+ 
+ /*
+  * Get bounding box for ScanKey.
+  */
+ static BOX *
+ spg_box_quad_get_scankey_bbox(ScanKey sk, bool *recheck)
+ {
+ 	switch (sk->sk_subtype)
+ 	{
+ 		case BOXOID:
+ 			return DatumGetBoxP(sk->sk_argument);
+ 
+ 		case POLYGONOID:
+ 			if (recheck && !is_bounding_box_test_exact(sk->sk_strategy))
+ 				*recheck = true;
+ 			return &DatumGetPolygonP(sk->sk_argument)->boundbox;
+ 
+ 		default:
+ 			elog(ERROR, "unrecognized scankey subtype: %d", sk->sk_subtype);
+ 			return NULL;
+ 	}
+ }
+ 
+ /*
   * SP-GiST inner consistent function
   */
  Datum
*************** spg_box_quad_inner_consistent(PG_FUNCTIO
*** 515,521 ****
  	centroid = getRangeBox(DatumGetBoxP(in->prefixDatum));
  	queries = (RangeBox **) palloc(in->nkeys * sizeof(RangeBox *));
  	for (i = 0; i < in->nkeys; i++)
! 		queries[i] = getRangeBox(DatumGetBoxP(in->scankeys[i].sk_argument));
  
  	/* Allocate enough memory for nodes */
  	out->nNodes = 0;
--- 560,570 ----
  	centroid = getRangeBox(DatumGetBoxP(in->prefixDatum));
  	queries = (RangeBox **) palloc(in->nkeys * sizeof(RangeBox *));
  	for (i = 0; i < in->nkeys; i++)
! 	{
! 		BOX		   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i], NULL);
! 
! 		queries[i] = getRangeBox(box);
! 	}
  
  	/* Allocate enough memory for nodes */
  	out->nNodes = 0;
*************** spg_box_quad_leaf_consistent(PG_FUNCTION
*** 637,644 ****
  	/* Perform the required comparison(s) */
  	for (i = 0; i < in->nkeys; i++)
  	{
! 		StrategyNumber strategy = in->scankeys[i].sk_strategy;
! 		Datum		query = in->scankeys[i].sk_argument;
  
  		switch (strategy)
  		{
--- 686,695 ----
  	/* Perform the required comparison(s) */
  	for (i = 0; i < in->nkeys; i++)
  	{
! 		StrategyNumber	strategy = in->scankeys[i].sk_strategy;
! 		BOX			   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i],
! 															&out->recheck);
! 		Datum			query = BoxPGetDatum(box);
  
  		switch (strategy)
  		{
*************** spg_box_quad_leaf_consistent(PG_FUNCTION
*** 713,715 ****
--- 764,799 ----
  
  	PG_RETURN_BOOL(flag);
  }
+ 
+ 
+ /*
+  * SP-GiST config function for 2-D types that are represented lossily by
+  * their bounding boxes
+  */
+ Datum
+ spg_bbox_quad_config(PG_FUNCTION_ARGS)
+ {
+ 	spgConfigOut *cfg = (spgConfigOut *) PG_GETARG_POINTER(1);
+ 
+ 	cfg->prefixType = BOXOID;	/* A type represented by its bounding box */
+ 	cfg->labelType = VOIDOID;	/* We don't need node labels. */
+ 	cfg->leafType = BOXOID;
+ 	cfg->canReturnData = false;
+ 	cfg->longValuesOK = false;
+ 
+ 	PG_RETURN_VOID();
+ }
+ 
+ /*
+  * SP-GiST compress function for polygons
+  */
+ Datum
+ spg_poly_quad_compress(PG_FUNCTION_ARGS)
+ {
+ 	POLYGON	   *polygon = PG_GETARG_POLYGON_P(0);
+ 	BOX		   *box;
+ 
+ 	box = box_copy(&polygon->boundbox);
+ 
+ 	PG_RETURN_BOX_P(box);
+ }
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
new file mode 100644
index f850be4..d877079
*** a/src/include/catalog/pg_amop.h
--- b/src/include/catalog/pg_amop.h
*************** DATA(insert (	5000	603  603 11 s	2573	40
*** 858,863 ****
--- 858,879 ----
  DATA(insert (	5000	603  603 12 s	2572	4000 0 ));
  
  /*
+  * SP-GiST poly_ops (supports polygons)
+  */
+ DATA(insert (	5008   604	604  1 s	 485	4000 0 ));
+ DATA(insert (	5008   604	604  2 s	 486	4000 0 ));
+ DATA(insert (	5008   604	604  3 s	 492	4000 0 ));
+ DATA(insert (	5008   604	604  4 s	 487	4000 0 ));
+ DATA(insert (	5008   604	604  5 s	 488	4000 0 ));
+ DATA(insert (	5008   604	604  6 s	 491	4000 0 ));
+ DATA(insert (	5008   604	604  7 s	 490	4000 0 ));
+ DATA(insert (	5008   604	604  8 s	 489	4000 0 ));
+ DATA(insert (	5008   604	604  9 s	2575	4000 0 ));
+ DATA(insert (	5008   604	604 10 s	2574	4000 0 ));
+ DATA(insert (	5008   604	604 11 s	2577	4000 0 ));
+ DATA(insert (	5008   604	604 12 s	2576	4000 0 ));
+ 
+ /*
   * GiST inet_ops
   */
  DATA(insert (	3550	869 869 3 s		3552 783 0 ));
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
new file mode 100644
index 1c95846..8b0c26b
*** a/src/include/catalog/pg_amproc.h
--- b/src/include/catalog/pg_amproc.h
*************** DATA(insert (	5000   603 603 2 5013 ));
*** 334,339 ****
--- 334,345 ----
  DATA(insert (	5000   603 603 3 5014 ));
  DATA(insert (	5000   603 603 4 5015 ));
  DATA(insert (	5000   603 603 5 5016 ));
+ DATA(insert (	5008   604 604 1 5009 ));
+ DATA(insert (	5008   604 604 2 5013 ));
+ DATA(insert (	5008   604 604 3 5014 ));
+ DATA(insert (	5008   604 604 4 5015 ));
+ DATA(insert (	5008   604 604 5 5016 ));
+ DATA(insert (	5008   604 604 6 5011 ));
  
  /* BRIN opclasses */
  /* minmax bytea */
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
new file mode 100644
index 28dbc74..6aabc72
*** a/src/include/catalog/pg_opclass.h
--- b/src/include/catalog/pg_opclass.h
*************** DATA(insert (	4000	box_ops				PGNSP PGUI
*** 205,210 ****
--- 205,211 ----
  DATA(insert (	4000	quad_point_ops		PGNSP PGUID 4015  600 t 0 ));
  DATA(insert (	4000	kd_point_ops		PGNSP PGUID 4016  600 f 0 ));
  DATA(insert (	4000	text_ops			PGNSP PGUID 4017  25 t 0 ));
+ DATA(insert (	4000	poly_ops			PGNSP PGUID 5008  604 t 603 ));
  DATA(insert (	403		jsonb_ops			PGNSP PGUID 4033  3802 t 0 ));
  DATA(insert (	405		jsonb_ops			PGNSP PGUID 4034  3802 t 0 ));
  DATA(insert (	2742	jsonb_ops			PGNSP PGUID 4036  3802 t 25 ));
diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h
new file mode 100644
index 0d0ba7c..838812b
*** a/src/include/catalog/pg_opfamily.h
--- b/src/include/catalog/pg_opfamily.h
*************** DATA(insert OID = 4103 (	3580	range_incl
*** 186,190 ****
--- 186,191 ----
  DATA(insert OID = 4082 (	3580	pg_lsn_minmax_ops		PGNSP PGUID ));
  DATA(insert OID = 4104 (	3580	box_inclusion_ops		PGNSP PGUID ));
  DATA(insert OID = 5000 (	4000	box_ops		PGNSP PGUID ));
+ DATA(insert OID = 5008 (	4000	poly_ops				PGNSP PGUID ));
  
  #endif							/* PG_OPFAMILY_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
new file mode 100644
index 93c031a..376f33a
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
*************** DESCR("SP-GiST support for quad tree ove
*** 5335,5340 ****
--- 5335,5345 ----
  DATA(insert OID = 5016 (  spg_box_quad_leaf_consistent	PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_box_quad_leaf_consistent _null_ _null_ _null_ ));
  DESCR("SP-GiST support for quad tree over box");
  
+ DATA(insert OID = 5009 (  spg_bbox_quad_config PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2278 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_bbox_quad_config _null_ _null_ _null_ ));
+ DESCR("SP-GiST support for quad tree over 2-D types represented by their bounding boxes");
+ DATA(insert OID = 5011 (  spg_poly_quad_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 603 "604" _null_ _null_ _null_ _null_  _null_ spg_poly_quad_compress _null_ _null_ _null_ ));
+ DESCR("SP-GiST support for quad tree over polygons");
+ 
  /* replication slots */
  DATA(insert OID = 3779 (  pg_create_physical_replication_slot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 3 0 2249 "19 16 16" "{19,16,16,19,3220}" "{i,i,i,o,o}" "{slot_name,immediately_reserve,temporary,slot_name,lsn}" _null_ _null_ pg_create_physical_replication_slot _null_ _null_ _null_ ));
  DESCR("create a physical replication slot");
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
new file mode 100644
index 44c6381..c89e6c3
*** a/src/include/utils/geo_decls.h
--- b/src/include/utils/geo_decls.h
*************** typedef struct
*** 178,186 ****
   * in geo_ops.c
   */
  
! /* private point routines */
  extern double point_dt(Point *pt1, Point *pt2);
  extern double point_sl(Point *pt1, Point *pt2);
  extern double pg_hypot(double x, double y);
  
  #endif							/* GEO_DECLS_H */
--- 178,187 ----
   * in geo_ops.c
   */
  
! /* private routines */
  extern double point_dt(Point *pt1, Point *pt2);
  extern double point_sl(Point *pt1, Point *pt2);
  extern double pg_hypot(double x, double y);
+ extern BOX *box_copy(BOX *box);
  
  #endif							/* GEO_DECLS_H */
diff --git a/src/test/regress/expected/polygon.out b/src/test/regress/expected/polygon.out
new file mode 100644
index 2361274..a9e7752
*** a/src/test/regress/expected/polygon.out
--- b/src/test/regress/expected/polygon.out
*************** SELECT	'(0,0)'::point <-> '((0,0),(1,2),
*** 227,229 ****
--- 227,467 ----
           0 |          0 |      0 | 1.4142135623731 |          3.2
  (1 row)
  
+ --
+ -- Test the SP-GiST index
+ --
+ CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+ INSERT INTO quad_poly_tbl
+ 	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ INSERT INTO quad_poly_tbl
+ 	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+ 	FROM generate_series(10001, 11000) AS i;
+ INSERT INTO quad_poly_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, NULL);
+ CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl;
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   3890
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   7900
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+    977
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   7000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   2990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   1890
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   6900
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   9000
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                        QUERY PLAN                                       
+ ----------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+   3990
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+ ---------------------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+  count 
+ -------
+    831
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+                                  QUERY PLAN                                  
+ -----------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+  count 
+ -------
+      1
+ (1 row)
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+                                  QUERY PLAN                                  
+ -----------------------------------------------------------------------------
+  Aggregate
+    ->  Bitmap Heap Scan on quad_poly_tbl
+          Recheck Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+          ->  Bitmap Index Scan on quad_poly_tbl_idx
+                Index Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+ (5 rows)
+ 
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+  count 
+ -------
+   1000
+ (1 row)
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
diff --git a/src/test/regress/sql/polygon.sql b/src/test/regress/sql/polygon.sql
new file mode 100644
index 7ac8079..c58277b
*** a/src/test/regress/sql/polygon.sql
--- b/src/test/regress/sql/polygon.sql
*************** SELECT	'(0,0)'::point <-> '((0,0),(1,2),
*** 116,118 ****
--- 116,211 ----
  	'(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
  	'(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
  	'(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
+ 
+ --
+ -- Test the SP-GiST index
+ --
+ 
+ CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+ 
+ INSERT INTO quad_poly_tbl
+ 	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+ 	FROM generate_series(1, 100) x,
+ 		 generate_series(1, 100) y;
+ 
+ INSERT INTO quad_poly_tbl
+ 	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+ 	FROM generate_series(10001, 11000) AS i;
+ 
+ INSERT INTO quad_poly_tbl
+ 	VALUES
+ 		(11001, NULL),
+ 		(11002, NULL),
+ 		(11003, NULL);
+ 
+ CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+ 
+ -- get reference results for ORDER BY distance from seq scan
+ SET enable_seqscan = ON;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = OFF;
+ 
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl;
+ 
+ CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+ SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+ FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ -- check results results from index scan
+ SET enable_seqscan = OFF;
+ SET enable_indexscan = OFF;
+ SET enable_bitmapscan = ON;
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+ 
+ EXPLAIN (COSTS OFF)
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+ SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+ 
+ RESET enable_seqscan;
+ RESET enable_indexscan;
+ RESET enable_bitmapscan;
#19Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Alexander Korotkov (#18)

On 20.09.2017 23:19, Alexander Korotkov wrote:

On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Darafei Praliaskouski <me@komzpa.net> writes:

I have some questions about the circles example though.

  * What is the reason for isnan check and swap of box ordinates
    for circle? It wasn't in the code previously.

I hadn't paid any attention to this patch previously, but this comment
excited my curiosity, so I went and looked:

+       bbox->high.x = circle->center.x + circle->radius;
+       bbox->low.x = circle->center.x - circle->radius;
+       bbox->high.y = circle->center.y + circle->radius;
+       bbox->low.y = circle->center.y - circle->radius;
+
+       if (isnan(bbox->low.x))
+       {
+               double tmp = bbox->low.x;
+               bbox->low.x = bbox->high.x;
+               bbox->high.x = tmp;
+       }

Maybe I'm missing something, but it appears to me that it's impossible
for bbox->low.x to be NaN unless circle->center.x and/or circle->radius
is a NaN, in which case bbox->high.x would also have been computed as a
NaN, making the swap entirely useless.  Likewise for the Y case.  There
may be something useful to do about NaNs here, but this doesn't seem
like it.

Yeah, +1.

It is possible for bbox->low.x to be NaN when circle->center.x and
circle->radius are both +Infinity.  Without this float-order-preserving
swapping, one regression test for KNN with ORDER BY index will be totally
broken (you can try it: https://github.com/glukhovn/postgres/tree/knn).
Unfortunately, I do not remember exactly why, but most likely because of
the incorrect index structure.
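
The Infinity minus Infinity case can be reproduced outside the backend.
What follows is only a minimal sketch, assuming IEEE 754 doubles;
Circle1D, Range1D, circle_bounds_1d and only_low_is_nan are hypothetical
names for one axis of the computation, not code from the patch:

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical one-dimensional stand-ins for CIRCLE and BOX. */
typedef struct { double center; double radius; } Circle1D;
typedef struct { double low; double high; } Range1D;

/* Same arithmetic as the quoted bbox computation, for a single axis. */
static Range1D
circle_bounds_1d(Circle1D c)
{
	Range1D		r;

	r.high = c.center + c.radius;	/* +Inf + +Inf = +Inf */
	r.low = c.center - c.radius;	/* +Inf - +Inf = NaN */
	return r;
}

/* True when only the low bound is NaN: the case the isnan() swap targets. */
static bool
only_low_is_nan(Range1D r)
{
	return isnan(r.low) && !isnan(r.high);
}
```

Calling circle_bounds_1d() with center and radius both set to INFINITY
gives a NaN low bound and a +Infinity high bound, while a NaN center
makes both bounds NaN, which is the case Tom describes.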

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#20Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nikita Glukhov (#19)

Nikita Glukhov <n.gluhov@postgrespro.ru> writes:

On 20.09.2017 23:19, Alexander Korotkov wrote:

On Wed, Sep 20, 2017 at 11:07 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Maybe I'm missing something, but it appears to me that it's
impossible for bbox->low.x to be NaN unless circle->center.x and/or
circle->radius is a NaN, in which case bbox->high.x would also have
been computed as a NaN, making the swap entirely useless.

It is possible for bbox->low.x to be NaN when circle->center.x and
circle->radius are both +Infinity.  Without this float-order-preserving
swapping, one regression test for KNN with ORDER BY index will be totally
broken (you can try it: https://github.com/glukhovn/postgres/tree/knn).

If that's the reasoning, not having a comment explaining it is
inexcusable. Do you really think people will understand what
the code is doing?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Darafei "Komяpa" Praliaskouski
me@komzpa.net
In reply to: Nikita Glukhov (#19)

It is possible for bbox->low.x to be NaN when circle->center.x and
circle->radius are both +Infinity.

What is rationale behind this circle?
It seems to me that any circle with radius of any Infinity should become a
[-Infinity .. Infinity, -Infinity .. Infinity] box. Then you won't have
NaNs, and index structure shouldn't be broken.

If it happens because NaN > Infinity, then the check should be not for
isnan, but for if (low>high){swap(high, low)}. It probably should be a part
of box, not a part of circle, maths.
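
One caveat with the low > high test: in plain C every ordered comparison
involving a NaN is false, so "low > high" never fires when either bound
is NaN. A minimal sketch of the swap, assuming an ordering in which NaN
sorts above every other value (the convention PostgreSQL uses when
sorting float8 values); gt_nan_last and order_bounds are hypothetical
names, not functions from the patch:

```c
#include <math.h>

/*
 * Hypothetical helper: "a > b" under an ordering where NaN sorts above
 * every non-NaN value and two NaNs compare equal.
 */
static int
gt_nan_last(double a, double b)
{
	if (isnan(a))
		return !isnan(b);	/* NaN > anything that is not NaN */
	if (isnan(b))
		return 0;			/* nothing non-NaN is greater than NaN */
	return a > b;
}

/* Swap the bounds when they are out of order under that ordering. */
static void
order_bounds(double *low, double *high)
{
	if (gt_nan_last(*low, *high))
	{
		double		tmp = *low;

		*low = *high;
		*high = tmp;
	}
}
```

With low = NaN and high = +Infinity this ends up with low = +Infinity
and high = NaN, the same outcome as the isnan() swap, while ordinary
finite bounds are only swapped when they are reversed.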

#22Tom Lane
tgl@sss.pgh.pa.us
In reply to: Darafei "Komяpa" Praliaskouski (#21)

Darafei "Komяpa" Praliaskouski <me@komzpa.net> writes:

If it happens because NaN > Infinity, then the check should be not for
isnan, but for if (low>high){swap(high, low)}.

Yeah, the same idea had occurred to me. It'd still need a comment, but
at least it's slightly more apparent what we're trying to ensure.

regards, tom lane

#23Alexander Korotkov
aekorotkov@gmail.com
In reply to: Darafei "Komяpa" Praliaskouski (#21)

On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <
me@komzpa.net> wrote:

It is possible for bbox->low.x to be NaN when circle->center.x and
circle->radius are both +Infinity.

What is rationale behind this circle?

I would rather forbid any geometries with infs and nans.  However, the
upgrade process would then suffer: users with such geometries would get
errors during dump/restore, and pg_upgraded instances would still
contain invalid values...

It seems to me that any circle with radius of any Infinity should become a
[-Infinity .. Infinity, -Infinity .. Infinity] box. Then you won't have
NaNs, and index structure shouldn't be broken.

We probably should produce a [-Infinity .. Infinity, -Infinity .. Infinity]
box for any geometry containing inf or nan.  That MBR would be found by
any query, which amounts to saying: "the index can't help you for this
kind of value, only recheck can deal with that".  Therefore, we would at
least guarantee that the results of sequential scan and index scan are
the same.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#23)

Alexander Korotkov <aekorotkov@gmail.com> writes:

On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <
me@komzpa.net> wrote:

What is rationale behind this circle?

I would prefer to rather forbid any geometries with infs and nans.
However, then upgrade process will suffer. User with such geometries would
get errors during dump/restore, pg_upgraded instances would still contain
invalid values...

Yeah, that ship has sailed unfortunately.

It seems to me that any circle with radius of any Infinity should become a
[-Infinity .. Infinity, -Infinity .. Infinity] box. Then you won't have
NaNs, and index structure shouldn't be broken.

We probably should produce [-Infinity .. Infinity, -Infinity .. Infinity]
box for any geometry containing inf or nan.

Hm, we can do better in at least some cases, eg for a box ((0,1),(1,inf))
there's no reason to give up our knowledge of finite bounds for the
other three boundaries. But certainly for a NaN circle radius
what you suggest seems the most sensible thing to do.
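
The two cases can be put side by side in a short sketch. It assumes a
per-bound policy in which only NaN bounds are widened to the matching
infinity; Pt2, Bbox2, widen_nan_bounds and circle_bbox are hypothetical
names used for illustration only:

```c
#include <math.h>

/* Hypothetical stand-ins for Point and BOX. */
typedef struct { double x, y; } Pt2;
typedef struct { Pt2 low, high; } Bbox2;

/* Widen only the bounds that are NaN; every other bound is kept as is. */
static Bbox2
widen_nan_bounds(Bbox2 b)
{
	if (isnan(b.low.x))
		b.low.x = -INFINITY;
	if (isnan(b.low.y))
		b.low.y = -INFINITY;
	if (isnan(b.high.x))
		b.high.x = INFINITY;
	if (isnan(b.high.y))
		b.high.y = INFINITY;
	return b;
}

/* Bounding box of a circle, with unusable (NaN) bounds widened. */
static Bbox2
circle_bbox(double cx, double cy, double radius)
{
	Bbox2		b;

	b.low.x = cx - radius;
	b.low.y = cy - radius;
	b.high.x = cx + radius;
	b.high.y = cy + radius;
	return widen_nan_bounds(b);
}
```

Under these assumptions the box ((0,1),(1,inf)) passes through
unchanged, keeping its three finite bounds, whereas a circle with a NaN
radius turns into the all-infinite box.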

regards, tom lane

#25Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#24)

On Thu, Sep 21, 2017 at 3:14 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alexander Korotkov <aekorotkov@gmail.com> writes:

On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski <

It seems to me that any circle with radius of any Infinity should become a
[-Infinity .. Infinity, -Infinity .. Infinity] box. Then you won't have
NaNs, and index structure shouldn't be broken.

We probably should produce [-Infinity .. Infinity, -Infinity .. Infinity]
box for any geometry containing inf or nan.

Hm, we can do better in at least some cases, eg for a box ((0,1),(1,inf))
there's no reason to give up our knowledge of finite bounds for the
other three boundaries. But certainly for a NaN circle radius
what you suggest seems the most sensible thing to do.

OK. I'll try to implement this for circles.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#26Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Alexander Korotkov (#23)
2 attachment(s)

On 21.09.2017 02:27, Alexander Korotkov wrote:

On Thu, Sep 21, 2017 at 2:06 AM, Darafei "Komяpa" Praliaskouski
<me@komzpa.net> wrote:

It is possible for bbox->low.x to be NaN when circle->center.x and
circle->radius are both +Infinity.

What is rationale behind this circle?

I would prefer to rather forbid any geometries with infs and nans. 
However, then upgrade process will suffer.  User with such geometries
would get errors during dump/restore, pg_upgraded instances would
still contain invalid values...

It seems to me that any circle with radius of any Infinity should
become a [-Infinity .. Infinity, -Infinity .. Infinity] box. Then
you won't have NaNs, and index structure shouldn't be broken.

We probably should produce [-Infinity .. Infinity, -Infinity ..
Infinity] box for any geometry containing inf or nan.  That MBR would
be founded for any query, saying: "index can't help you for this kind
value, only recheck can deal with that".  Therefore, we would at least
guarantee that results of sequential scan and index scan are the same.

I have looked at the GiST KNN code and found the same errors for NaNs,
infinities and NULLs as in my SP-GiST KNN patch.

Attached are two patches:

1. Fix NaN inconsistencies in circle bounding boxes computed in the GiST
compress method, which lead to a failed Assert(box->low.x <= box->high.x)
in computeDistance() from src/backend/access/gist/gistproc.c, by
transforming NaNs into infinities (a corresponding test case is provided
in the second patch).

2. Fix ordering of NULL, NaN and infinite distances by GiST.  These
distance values could be mixed up because NULL distances were transformed
into infinities, and there was no special processing for NaNs in the KNN
queue's comparison function.  At first I tried just setting the recheck
flag for NULL distances, but that did not work for index-only scans
because they do not support rechecking.  So I had to add a special flag
for NULL distances.
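
The intended order (ordinary numbers first, then NaN, then NULL) can be
sketched with a plain ascending qsort() comparator. This is an
illustration only: NullableDistance, distance_cmp and sort_distances are
hypothetical names, and the comparator in the second patch uses the
opposite sign convention because it feeds a pairing heap rather than a
sort:

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical nullable distance, modelled on the idea in the patch. */
typedef struct
{
	double		value;
	bool		isnull;
} NullableDistance;

/* Ascending order: numbers (including +Inf) < NaN < NULL. */
static int
distance_cmp(const void *pa, const void *pb)
{
	const NullableDistance *a = pa;
	const NullableDistance *b = pb;

	if (a->isnull || b->isnull)
		return (int) a->isnull - (int) b->isnull;	/* NULL sorts last */

	if (isnan(a->value) || isnan(b->value))
		return (isnan(a->value) != 0) - (isnan(b->value) != 0);

	return (a->value > b->value) - (a->value < b->value);
}

static void
sort_distances(NullableDistance *d, size_t n)
{
	qsort(d, n, sizeof(d[0]), distance_cmp);
}
```

Keeping a separate isnull flag is what lets a NULL distance be told
apart from a genuinely infinite or NaN one, which a single double value
cannot express.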

Should I start a separate thread for this issue and add the patches to
the commitfest?

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-Fix-circle-bounding-box-inconsistency-in-GiST-compress-method-v01.patch (text/x-patch)
From 52ab493cfe1ec6260578054b71f2c48e77d4850a Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.gluhov@postgrespro.ru>
Date: Fri, 22 Sep 2017 02:06:48 +0300
Subject: [PATCH 1/2] Fix circle bounding box inconsistency in GiST compress
 method

---
 src/backend/access/gist/gistproc.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/backend/access/gist/gistproc.c b/src/backend/access/gist/gistproc.c
index 08990f5..2699304 100644
--- a/src/backend/access/gist/gistproc.c
+++ b/src/backend/access/gist/gistproc.c
@@ -1149,6 +1149,16 @@ gist_circle_compress(PG_FUNCTION_ARGS)
 		r->high.y = in->center.y + in->radius;
 		r->low.y = in->center.y - in->radius;
 
+		/* avoid box inconsistency by transforming NaNs into infinities */
+		if (isnan(r->high.x))
+			r->high.x = get_float8_infinity();
+		if (isnan(r->high.y))
+			r->high.y = get_float8_infinity();
+		if (isnan(r->low.x))
+			r->low.x = -get_float8_infinity();
+		if (isnan(r->low.y))
+			r->low.y = -get_float8_infinity();
+
 		retval = (GISTENTRY *) palloc(sizeof(GISTENTRY));
 		gistentryinit(*retval, PointerGetDatum(r),
 					  entry->rel, entry->page,
-- 
2.7.4

0002-Fix-GiST-ordering-by-distance-for-NULLs-and-NaNs-v01.patch (text/x-patch)
From 066ad9104ec0e967e20e820977286f001e4055a4 Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.gluhov@postgrespro.ru>
Date: Thu, 21 Sep 2017 19:09:02 +0300
Subject: [PATCH 2/2] Fix GiST ordering by distance for NULLs and NaNs

---
 src/backend/access/gist/gistget.c  | 29 ++++++++++-------
 src/backend/access/gist/gistscan.c | 36 +++++++++++++++++++--
 src/include/access/gist_private.h  | 13 ++++++--
 src/test/regress/expected/gist.out | 64 ++++++++++++++++++++++++++++++++++++++
 src/test/regress/sql/gist.sql      | 60 +++++++++++++++++++++++++++++++++++
 5 files changed, 184 insertions(+), 18 deletions(-)

diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index 760ea0c..7fed700 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -132,7 +132,7 @@ gistindex_keytest(IndexScanDesc scan,
 	GISTSTATE  *giststate = so->giststate;
 	ScanKey		key = scan->keyData;
 	int			keySize = scan->numberOfKeys;
-	double	   *distance_p;
+	GISTDistance *distance_p;
 	Relation	r = scan->indexRelation;
 
 	*recheck_p = false;
@@ -150,7 +150,10 @@ gistindex_keytest(IndexScanDesc scan,
 		if (GistPageIsLeaf(page))	/* shouldn't happen */
 			elog(ERROR, "invalid GiST tuple found on leaf page");
 		for (i = 0; i < scan->numberOfOrderBys; i++)
-			so->distances[i] = -get_float8_infinity();
+		{
+			so->distances[i].value = -get_float8_infinity();
+			so->distances[i].isnull = false;
+		}
 		return true;
 	}
 
@@ -248,7 +251,8 @@ gistindex_keytest(IndexScanDesc scan,
 		if ((key->sk_flags & SK_ISNULL) || isNull)
 		{
 			/* Assume distance computes as null and sorts to the end */
-			*distance_p = get_float8_infinity();
+			distance_p->value = get_float8_nan();
+			distance_p->isnull = true;
 		}
 		else
 		{
@@ -285,7 +289,8 @@ gistindex_keytest(IndexScanDesc scan,
 									 ObjectIdGetDatum(key->sk_subtype),
 									 PointerGetDatum(&recheck));
 			*recheck_distances_p |= recheck;
-			*distance_p = DatumGetFloat8(dist);
+			distance_p->value = DatumGetFloat8(dist);
+			distance_p->isnull = false;
 		}
 
 		key++;
@@ -319,8 +324,8 @@ gistindex_keytest(IndexScanDesc scan,
  * sibling will be processed next.
  */
 static void
-gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
-			 TIDBitmap *tbm, int64 *ntids)
+gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem,
+			 GISTDistance *myDistances, TIDBitmap *tbm, int64 *ntids)
 {
 	GISTScanOpaque so = (GISTScanOpaque) scan->opaque;
 	GISTSTATE  *giststate = so->giststate;
@@ -367,7 +372,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 
 		/* Insert it into the queue using same distances as for this page */
 		memcpy(item->distances, myDistances,
-			   sizeof(double) * scan->numberOfOrderBys);
+			   sizeof(item->distances[0]) * scan->numberOfOrderBys);
 
 		pairingheap_add(so->queue, &item->phNode);
 
@@ -497,7 +502,7 @@ gistScanPage(IndexScanDesc scan, GISTSearchItem *pageItem, double *myDistances,
 
 			/* Insert it into the queue using new distance data */
 			memcpy(item->distances, so->distances,
-				   sizeof(double) * scan->numberOfOrderBys);
+				   sizeof(item->distances[0]) * scan->numberOfOrderBys);
 
 			pairingheap_add(so->queue, &item->phNode);
 
@@ -571,8 +576,8 @@ getNextNearest(IndexScanDesc scan)
 					if (!scan->xs_orderbynulls[i])
 						pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
 #endif
-					scan->xs_orderbyvals[i] = Float8GetDatum(item->distances[i]);
-					scan->xs_orderbynulls[i] = false;
+					scan->xs_orderbyvals[i] = Float8GetDatum(item->distances[i].value);
+					scan->xs_orderbynulls[i] = item->distances[i].isnull;
 				}
 				else if (so->orderByTypes[i] == FLOAT4OID)
 				{
@@ -582,8 +587,8 @@ getNextNearest(IndexScanDesc scan)
 					if (!scan->xs_orderbynulls[i])
 						pfree(DatumGetPointer(scan->xs_orderbyvals[i]));
 #endif
-					scan->xs_orderbyvals[i] = Float4GetDatum((float4) item->distances[i]);
-					scan->xs_orderbynulls[i] = false;
+					scan->xs_orderbyvals[i] = Float4GetDatum((float4) item->distances[i].value);
+					scan->xs_orderbynulls[i] = item->distances[i].isnull;
 				}
 				else
 				{
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index 058544e..95cb554 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -14,6 +14,8 @@
  */
 #include "postgres.h"
 
+#include <math.h>
+
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "access/relscan.h"
@@ -36,8 +38,36 @@ pairingheap_GISTSearchItem_cmp(const pairingheap_node *a, const pairingheap_node
 	/* Order according to distance comparison */
 	for (i = 0; i < scan->numberOfOrderBys; i++)
 	{
-		if (sa->distances[i] != sb->distances[i])
-			return (sa->distances[i] < sb->distances[i]) ? 1 : -1;
+		double		da;
+		double		db;
+
+		if (sa->distances[i].isnull)
+		{
+			if (sb->distances[i].isnull)
+				continue;	/* NULL = NULL */
+
+			return -1;	/* NULL > non-NULL */
+		}
+
+		if (sb->distances[i].isnull)
+			return 1;	/* non-NULL < NULL */
+
+		da = sa->distances[i].value;
+		db = sb->distances[i].value;
+
+		if (isnan(da))
+		{
+			if (isnan(db))
+				continue;	/* NaN = NaN */
+
+			return -1;	/* NaN > non-NaN */
+		}
+
+		if (isnan(db))
+			return 1;	/* non-NaN < NaN */
+
+		if (da != db)
+			return (da < db) ? 1 : -1;
 	}
 
 	/* Heap items go before inner pages, to ensure a depth-first search */
@@ -81,7 +111,7 @@ gistbeginscan(Relation r, int nkeys, int norderbys)
 	so->queueCxt = giststate->scanCxt;	/* see gistrescan */
 
 	/* workspaces with size dependent on numberOfOrderBys: */
-	so->distances = palloc(sizeof(double) * scan->numberOfOrderBys);
+	so->distances = palloc(sizeof(so->distances[0]) * scan->numberOfOrderBys);
 	so->qual_ok = true;			/* in case there are zero keys */
 	if (scan->numberOfOrderBys > 0)
 	{
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index bfef2df..2ae3162 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -125,6 +125,13 @@ typedef struct GISTSearchHeapItem
 								 * LP_DEAD */
 } GISTSearchHeapItem;
 
+/* Nullable distance */
+typedef struct GISTDistance
+{
+	double		value;
+	bool		isnull;
+} GISTDistance;
+
 /* Unvisited item, either index page or heap tuple */
 typedef struct GISTSearchItem
 {
@@ -136,13 +143,13 @@ typedef struct GISTSearchItem
 		/* we must store parentlsn to detect whether a split occurred */
 		GISTSearchHeapItem heap;	/* heap info, if heap tuple */
 	}			data;
-	double		distances[FLEXIBLE_ARRAY_MEMBER];	/* numberOfOrderBys
+	GISTDistance distances[FLEXIBLE_ARRAY_MEMBER];	/* numberOfOrderBys
 													 * entries */
 } GISTSearchItem;
 
 #define GISTSearchItemIsHeap(item)	((item).blkno == InvalidBlockNumber)
 
-#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(double) * (n_distances))
+#define SizeOfGISTSearchItem(n_distances) (offsetof(GISTSearchItem, distances) + sizeof(GISTDistance) * (n_distances))
 
 /*
  * GISTScanOpaqueData: private state for a scan of a GiST index
@@ -158,7 +165,7 @@ typedef struct GISTScanOpaqueData
 	bool		firstCall;		/* true until first gistgettuple call */
 
 	/* pre-allocated workspace arrays */
-	double	   *distances;		/* output area for gistindex_keytest */
+	GISTDistance *distances;	/* output area for gistindex_keytest */
 
 	/* info about killed items if any (killedItems is NULL if never used) */
 	OffsetNumber *killedItems;	/* offset numbers of killed items */
diff --git a/src/test/regress/expected/gist.out b/src/test/regress/expected/gist.out
index 91f9998..6f165e6 100644
--- a/src/test/regress/expected/gist.out
+++ b/src/test/regress/expected/gist.out
@@ -225,3 +225,67 @@ reset enable_seqscan;
 reset enable_bitmapscan;
 reset enable_indexonlyscan;
 drop table gist_tbl;
+-- Test ORDER BY circle <-> point with Inifinities and NaNs
+CREATE TABLE gist_circle_tbl (id int, c circle);
+INSERT INTO gist_circle_tbl
+  SELECT (x - 1) * 100 + y, circle(point(x * 10, y * 10), 1 + (x + y) % 10)
+  FROM generate_series(1, 100) x,
+       generate_series(1, 100) y;
+INSERT INTO gist_circle_tbl
+  SELECT i, '<(200, 300), 5>'
+  FROM generate_series(10001, 11000) AS i;
+INSERT INTO gist_circle_tbl
+  VALUES
+    (11001, NULL),
+    (11002, NULL),
+    (11003, '<(0,100), infinity>'),
+    (11004, '<(-infinity,0),1000>'),
+    (11005, '<(infinity,-infinity),infinity>');
+CREATE INDEX gist_circle_tbl_idx ON gist_circle_tbl USING gist(c);
+-- get ordering results from seq scan
+SET enable_seqscan TO on;
+SET enable_indexscan TO off;
+EXPLAIN (COSTS OFF)
+SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+FROM gist_circle_tbl;
+                   QUERY PLAN                   
+------------------------------------------------
+ WindowAgg
+   ->  Sort
+         Sort Key: ((c <-> '(123,456)'::point))
+         ->  Seq Scan on gist_circle_tbl
+(4 rows)
+
+CREATE TABLE gist_circle_tbl_ord_seq AS
+  SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+  FROM gist_circle_tbl;
+-- get ordering results from index scan
+SET enable_seqscan TO off;
+SET enable_indexscan TO on;
+EXPLAIN (COSTS OFF)
+SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+FROM gist_circle_tbl;
+                          QUERY PLAN                           
+---------------------------------------------------------------
+ WindowAgg
+   ->  Index Scan using gist_circle_tbl_idx on gist_circle_tbl
+         Order By: (c <-> '(123,456)'::point)
+(3 rows)
+
+CREATE TABLE gist_circle_tbl_ord_idx AS
+  SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+  FROM gist_circle_tbl;
+-- compare results
+SELECT *
+FROM gist_circle_tbl_ord_seq seq FULL JOIN gist_circle_tbl_ord_idx idx USING (n, id)
+WHERE seq.id IS NULL OR idx.id IS NULL;
+ n | id | dist | dist 
+---+----+------+------
+(0 rows)
+
+-- clean up
+RESET enable_seqscan;
+RESET enable_indexscan;
+DROP TABLE gist_circle_tbl;
+DROP TABLE gist_circle_tbl_ord_seq;
+DROP TABLE gist_circle_tbl_ord_idx;
diff --git a/src/test/regress/sql/gist.sql b/src/test/regress/sql/gist.sql
index 49126fd..a603e72 100644
--- a/src/test/regress/sql/gist.sql
+++ b/src/test/regress/sql/gist.sql
@@ -120,3 +120,63 @@ reset enable_bitmapscan;
 reset enable_indexonlyscan;
 
 drop table gist_tbl;
+
+
+-- Test ORDER BY circle <-> point with Inifinities and NaNs
+CREATE TABLE gist_circle_tbl (id int, c circle);
+
+INSERT INTO gist_circle_tbl
+  SELECT (x - 1) * 100 + y, circle(point(x * 10, y * 10), 1 + (x + y) % 10)
+  FROM generate_series(1, 100) x,
+       generate_series(1, 100) y;
+
+INSERT INTO gist_circle_tbl
+  SELECT i, '<(200, 300), 5>'
+  FROM generate_series(10001, 11000) AS i;
+
+INSERT INTO gist_circle_tbl
+  VALUES
+    (11001, NULL),
+    (11002, NULL),
+    (11003, '<(0,100), infinity>'),
+    (11004, '<(-infinity,0),1000>'),
+    (11005, '<(infinity,-infinity),infinity>');
+
+CREATE INDEX gist_circle_tbl_idx ON gist_circle_tbl USING gist(c);
+
+-- get ordering results from seq scan
+SET enable_seqscan TO on;
+SET enable_indexscan TO off;
+
+EXPLAIN (COSTS OFF)
+SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+FROM gist_circle_tbl;
+
+CREATE TABLE gist_circle_tbl_ord_seq AS
+  SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+  FROM gist_circle_tbl;
+
+-- get ordering results from index scan
+SET enable_seqscan TO off;
+SET enable_indexscan TO on;
+
+EXPLAIN (COSTS OFF)
+SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+FROM gist_circle_tbl;
+
+CREATE TABLE gist_circle_tbl_ord_idx AS
+  SELECT rank() OVER (ORDER BY c <-> point '123,456') n, c <-> point '123,456' dist, id
+  FROM gist_circle_tbl;
+
+-- compare results
+SELECT *
+FROM gist_circle_tbl_ord_seq seq FULL JOIN gist_circle_tbl_ord_idx idx USING (n, id)
+WHERE seq.id IS NULL OR idx.id IS NULL;
+
+-- clean up
+RESET enable_seqscan;
+RESET enable_indexscan;
+
+DROP TABLE gist_circle_tbl;
+DROP TABLE gist_circle_tbl_ord_seq;
+DROP TABLE gist_circle_tbl_ord_idx;
-- 
2.7.4
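
The comparator change in gistscan.c above makes NULL distances sort after everything else and NaN sort after ordinary numbers. Below is a minimal standalone sketch of that ordering. It assumes a plain ascending comparator (negative means "a sorts first") rather than the inverted sign convention used by the pairing heap in the patch. `Distance` and `distance_cmp` are illustrative names, not PostgreSQL symbols.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative stand-in for the patch's GISTDistance; not PostgreSQL code. */
typedef struct Distance
{
	double		value;
	bool		isnull;
} Distance;

/*
 * Compare two arrays of n distances key by key, ascending:
 * ordinary numbers < NaN < NULL.  Returns <0, 0 or >0.
 */
static int
distance_cmp(const Distance *a, const Distance *b, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		/* NULLs are checked first, so NULL sorts after NaN as well */
		if (a[i].isnull || b[i].isnull)
		{
			if (a[i].isnull && b[i].isnull)
				continue;		/* NULL = NULL, try the next key */
			return a[i].isnull ? 1 : -1;
		}

		/* NaN never compares equal with ==, so it needs its own branch */
		if (isnan(a[i].value) || isnan(b[i].value))
		{
			if (isnan(a[i].value) && isnan(b[i].value))
				continue;		/* NaN = NaN, try the next key */
			return isnan(a[i].value) ? 1 : -1;
		}

		if (a[i].value != b[i].value)
			return (a[i].value < b[i].value) ? -1 : 1;
	}

	return 0;					/* all keys equal */
}
```

Without the explicit NaN branch, `da != db` would be true for two NaNs while `da < db` would be false, so the result would depend on argument order and the ordering would not be consistent.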

#27Michael Paquier
michael.paquier@gmail.com
In reply to: Nikita Glukhov (#26)
Re: [HACKERS] compress method for spgist - 2

On Fri, Sep 22, 2017 at 9:03 AM, Nikita Glukhov <n.gluhov@postgrespro.ru> wrote:

Should I start a separate thread for this issue and add patches to
commitfest?

Yes, please. It would be nice if you could spawn a separate thread for
what looks like a bug, and separate topics should have their own
thread. This will attract more attention from other hackers as this is
unrelated to this thread. Adding an entry in the CF app under the
category "Bug Fix" also avoids losing any items worth fixing.

I can see as well that the patches posted at the beginning of the
thread got reviews but that those did not get answered. The set of
patches also have conflicts with HEAD so they need a rebase. For those
reasons I am marking this entry as returned with feedback.
--
Michael

#28Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Michael Paquier (#27)
2 attachment(s)
Re: [HACKERS] compress method for spgist - 2

On 30.11.2017 05:31, Michael Paquier wrote:

I can see as well that the patches posted at the beginning of the
thread got reviews but that those did not get answered. The set of
patches also have conflicts with HEAD so they need a rebase. For those
reasons I am marking this entry as returned with feedback.

Rebased patches are attached.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-8.patch (text/x-patch)
From 1c251ff129f8e9f33d14df547d97ba549b109648 Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.gluhov@postgrespro.ru>
Date: Tue, 5 Dec 2017 02:38:50 +0300
Subject: [PATCH 1/2] spgist-compress-method-8

---
 doc/src/sgml/spgist.sgml                | 54 ++++++++++++++++++++++++++-------
 src/backend/access/spgist/spgdoinsert.c | 44 +++++++++++++++++++--------
 src/backend/access/spgist/spgutils.c    | 23 ++++++++++++--
 src/backend/access/spgist/spgvalidate.c | 24 ++++++++++++++-
 src/include/access/spgist.h             |  5 ++-
 src/include/access/spgist_private.h     |  8 +++--
 6 files changed, 127 insertions(+), 31 deletions(-)

diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 139c8ed..55c1b06 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -240,20 +240,21 @@
 
  <para>
   There are five user-defined methods that an index operator class for
-  <acronym>SP-GiST</acronym> must provide.  All five follow the convention
-  of accepting two <type>internal</type> arguments, the first of which is a
-  pointer to a C struct containing input values for the support method,
-  while the second argument is a pointer to a C struct where output values
-  must be placed.  Four of the methods just return <type>void</type>, since
-  all their results appear in the output struct; but
+  <acronym>SP-GiST</acronym> must provide, and one is optional.  All five
+  mandatory methods follow the convention of accepting two <type>internal</type>
+  arguments, the first of which is a pointer to a C struct containing input
+  values for the support method, while the second argument is a pointer to a
+  C struct where output values must be placed.  Four of the methods just return
+  <type>void</type>, since all their results appear in the output struct; but
   <function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
   The methods must not modify any fields of their input structs.  In all
   cases, the output struct is initialized to zeroes before calling the
-  user-defined method.
+  user-defined method.  The optional <function>compress</function> method
+  accepts the datum to be indexed and returns the value that is actually stored.
  </para>
 
  <para>
-  The five user-defined methods are:
+  The five mandatory user-defined methods are:
  </para>
 
  <variablelist>
@@ -283,6 +284,7 @@ typedef struct spgConfigOut
 {
     Oid         prefixType;     /* Data type of inner-tuple prefixes */
     Oid         labelType;      /* Data type of inner-tuple node labels */
+    Oid         leafType;       /* Data type of leaf */
     bool        canReturnData;  /* Opclass can reconstruct original data */
     bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
 } spgConfigOut;
@@ -303,7 +305,15 @@ typedef struct spgConfigOut
       <structfield>longValuesOK</structfield> should be set true only when the
       <structfield>attType</structfield> is of variable length and the operator
       class is capable of segmenting long values by repeated suffixing
-      (see <xref linkend="spgist-limits"/>).
+      (see <xref linkend="spgist-limits"/>).  <structfield>leafType</structfield>
+      usually has the same value as <structfield>attType</structfield>; if
+      it differs, the optional <function>compress</function> method
+      must be provided.  The <function>compress</function> method is responsible
+      for transforming values from <structfield>attType</structfield> to
+      <structfield>leafType</structfield>.  In this case all other functions
+      should accept <structfield>leafType</structfield> values.  Note: both
+      consistent functions receive <structfield>scankeys</structfield>
+      unchanged, without the <function>compress</function> transformation.
      </para>
      </listitem>
     </varlistentry>
@@ -624,7 +634,8 @@ typedef struct spgInnerConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> should always be of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -730,7 +741,8 @@ typedef struct spgLeafConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> should always be of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -757,6 +769,26 @@ typedef struct spgLeafConsistentOut
     </varlistentry>
    </variablelist>
 
+ <para>
+  The optional user-defined method is:
+ </para>
+
+ <variablelist>
+    <varlistentry>
+     <term><function>Datum compress(Datum in)</function></term>
+     <listitem>
+      <para>
+       Converts the data item into a format suitable for physical storage in
+       an index page. It accepts a
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+       value and returns a
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       value. The output value should not be toasted.
+      </para>
+     </listitem>
+    </varlistentry>
+  </variablelist>
+
   <para>
    All the SP-GiST support methods are normally called in a short-lived
    memory context; that is, <varname>CurrentMemoryContext</varname> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index a5f4c40..edf86f1 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1899,21 +1899,41 @@ spgdoinsert(Relation index, SpGistState *state,
 	FmgrInfo   *procinfo = NULL;
 
 	/*
-	 * Look up FmgrInfo of the user-defined choose function once, to save
-	 * cycles in the loop below.
+	 * Prepare the leaf datum to insert.
+	 *
+	 * If there is an optional "compress" method, call it to form the leaf
+	 * datum from the input datum. Otherwise we will store the input datum as
+	 * is. (We have to detoast it, though. We assume the "compress" method to
+	 * return an untoasted value.)
 	 */
 	if (!isnull)
-		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
+	{
+		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+		{
+			procinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
+			leafDatum = FunctionCall1Coll(procinfo,
+										  index->rd_indcollation[0],
+										  datum);
+		}
+		else
+		{
+			Assert(state->attLeafType.type == state->attType.type);
+
+			if (state->attType.attlen == -1)
+				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+			else
+				leafDatum = datum;
+		}
+	}
+	else
+		leafDatum = (Datum) 0;
 
 	/*
-	 * Since we don't use index_form_tuple in this AM, we have to make sure
-	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
-	 * that.
+	 * Look up FmgrInfo of the user-defined choose function once, to save
+	 * cycles in the loop below.
 	 */
-	if (!isnull && state->attType.attlen == -1)
-		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
-
-	leafDatum = datum;
+	if (!isnull)
+		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
 
 	/*
 	 * Compute space needed for a leaf tuple containing the given datum.
@@ -1923,7 +1943,7 @@ spgdoinsert(Relation index, SpGistState *state,
 	 */
 	if (!isnull)
 		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-			SpGistGetTypeSize(&state->attType, leafDatum);
+			SpGistGetTypeSize(&state->attLeafType, leafDatum);
 	else
 		leafSize = SGDTSIZE + sizeof(ItemIdData);
 
@@ -2138,7 +2158,7 @@ spgdoinsert(Relation index, SpGistState *state,
 					{
 						leafDatum = out.result.matchNode.restDatum;
 						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-							SpGistGetTypeSize(&state->attType, leafDatum);
+							SpGistGetTypeSize(&state->attLeafType, leafDatum);
 					}
 
 					/*
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index bd5301f..668e3c4 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -124,7 +124,23 @@ spgGetCache(Relation index)
 						  PointerGetDatum(&cache->config));
 
 		/* Get the information we need about each relevant datatype */
-		fillTypeDesc(&cache->attType, atttype);
+		if (OidIsValid(cache->config.leafType) &&
+			cache->config.leafType != atttype)
+		{
+			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("compress method must be defined when leaf type is different from input type")));
+
+			fillTypeDesc(&cache->attType, atttype);
+			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
+		}
+		else
+		{
+			fillTypeDesc(&cache->attType, atttype);
+			cache->attLeafType = cache->attType;
+		}
+
 		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
 		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
 
@@ -164,6 +180,7 @@ initSpGistState(SpGistState *state, Relation index)
 
 	state->config = cache->config;
 	state->attType = cache->attType;
+	state->attLeafType = cache->attLeafType;
 	state->attPrefixType = cache->attPrefixType;
 	state->attLabelType = cache->attLabelType;
 
@@ -618,7 +635,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	/* compute space needed (note result is already maxaligned) */
 	size = SGLTHDRSZ;
 	if (!isnull)
-		size += SpGistGetTypeSize(&state->attType, datum);
+		size += SpGistGetTypeSize(&state->attLeafType, datum);
 
 	/*
 	 * Ensure that we can replace the tuple with a dead tuple later.  This
@@ -634,7 +651,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	tup->nextOffset = InvalidOffsetNumber;
 	tup->heapPtr = *heapPtr;
 	if (!isnull)
-		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
+		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
 
 	return tup;
 }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
index 157cf2a..514da47 100644
--- a/src/backend/access/spgist/spgvalidate.c
+++ b/src/backend/access/spgist/spgvalidate.c
@@ -52,6 +52,10 @@ spgvalidate(Oid opclassoid)
 	OpFamilyOpFuncGroup *opclassgroup;
 	int			i;
 	ListCell   *lc;
+	spgConfigIn	configIn;
+	spgConfigOut configOut;
+	Oid			configOutLefttype = InvalidOid;
+	Oid			configOutRighttype = InvalidOid;
 
 	/* Fetch opclass information */
 	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
@@ -100,6 +104,15 @@ spgvalidate(Oid opclassoid)
 		switch (procform->amprocnum)
 		{
 			case SPGIST_CONFIG_PROC:
+				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+											2, 2, INTERNALOID, INTERNALOID);
+				configIn.attType = procform->amproclefttype;
+				OidFunctionCall2(procform->amproc,
+								 PointerGetDatum(&configIn),
+								 PointerGetDatum(&configOut));
+				configOutLefttype = procform->amproclefttype;
+				configOutRighttype = procform->amprocrighttype;
+				break;
 			case SPGIST_CHOOSE_PROC:
 			case SPGIST_PICKSPLIT_PROC:
 			case SPGIST_INNER_CONSISTENT_PROC:
@@ -110,6 +123,15 @@ spgvalidate(Oid opclassoid)
 				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
 											2, 2, INTERNALOID, INTERNALOID);
 				break;
+			case SPGIST_COMPRESS_PROC:
+				if (configOutLefttype != procform->amproclefttype ||
+					configOutRighttype != procform->amprocrighttype)
+					ok = false;
+				else
+					ok = check_amproc_signature(procform->amproc,
+												configOut.leafType, true,
+												1, 1, procform->amproclefttype);
+				break;
 			default:
 				ereport(INFO,
 						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -212,7 +234,7 @@ spgvalidate(Oid opclassoid)
 		if (thisgroup->lefttype != thisgroup->righttype)
 			continue;
 
-		for (i = 1; i <= SPGISTNProc; i++)
+		for (i = 1; i <= SPGISTNRequiredProc; i++)
 		{
 			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
 				continue;		/* got it */
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
index d1bc396..a477278 100644
--- a/src/include/access/spgist.h
+++ b/src/include/access/spgist.h
@@ -30,7 +30,9 @@
 #define SPGIST_PICKSPLIT_PROC			3
 #define SPGIST_INNER_CONSISTENT_PROC	4
 #define SPGIST_LEAF_CONSISTENT_PROC		5
-#define SPGISTNProc						5
+#define SPGIST_COMPRESS_PROC			6
+#define SPGISTNRequiredProc				5
+#define SPGISTNProc						6
 
 /*
  * Argument structs for spg_config method
@@ -44,6 +46,7 @@ typedef struct spgConfigOut
 {
 	Oid			prefixType;		/* Data type of inner-tuple prefixes */
 	Oid			labelType;		/* Data type of inner-tuple node labels */
+	Oid			leafType;		/* Data type of leaf (type of SPGIST_COMPRESS_PROC output) */
 	bool		canReturnData;	/* Opclass can reconstruct original data */
 	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
 } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 1c4b321..69dc2ba 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -119,7 +119,8 @@ typedef struct SpGistState
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -178,7 +179,8 @@ typedef struct SpGistCache
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -300,7 +302,7 @@ typedef SpGistLeafTupleData *SpGistLeafTuple;
 
 #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
 #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
-#define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
+#define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
 							 *(Datum *) SGLTDATAPTR(x) : \
 							 PointerGetDatum(SGLTDATAPTR(x)))
 
-- 
2.7.4
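
The documentation added by this patch describes compress as a function from the indexed type (attType) to the stored leaf type (leafType). Below is a minimal standalone sketch of that idea for a polygon stored as its bounding box. The types `Pt` and `BBox` and the functions `poly_compress` and `bbox_overlap` are hypothetical simplifications for illustration only; PostgreSQL's own Point, BOX and POLYGON structs and the fmgr calling convention are not used here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified geometry types, not PostgreSQL's definitions. */
typedef struct Pt
{
	double		x;
	double		y;
} Pt;

typedef struct BBox
{
	Pt			low;			/* lower-left corner */
	Pt			high;			/* upper-right corner */
} BBox;

/*
 * "compress": reduce a polygon given as npts >= 1 vertices to the
 * bounding box that would be stored in the leaf tuple.
 */
static BBox
poly_compress(const Pt *pts, size_t npts)
{
	BBox		b = {pts[0], pts[0]};

	for (size_t i = 1; i < npts; i++)
	{
		if (pts[i].x < b.low.x)
			b.low.x = pts[i].x;
		if (pts[i].y < b.low.y)
			b.low.y = pts[i].y;
		if (pts[i].x > b.high.x)
			b.high.x = pts[i].x;
		if (pts[i].y > b.high.y)
			b.high.y = pts[i].y;
	}

	return b;
}

/* Overlap test on the stored leaf values (closed intervals). */
static bool
bbox_overlap(const BBox *a, const BBox *b)
{
	return a->low.x <= b->high.x && b->low.x <= a->high.x &&
		a->low.y <= b->high.y && b->low.y <= a->high.y;
}
```

Because the stored box is only an approximation of the polygon, a positive overlap on boxes does not prove that the polygons themselves overlap, so such a match would still have to be rechecked against the original value.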

0002-spgist-polygon-8.patch (text/x-patch)
From f50670601756ca5ea495179627bb3166959b6481 Mon Sep 17 00:00:00 2001
From: Nikita Glukhov <n.gluhov@postgrespro.ru>
Date: Tue, 5 Dec 2017 02:50:09 +0300
Subject: [PATCH 2/2] spgist-polygon-8

---
 doc/src/sgml/spgist.sgml              |  18 +++
 src/backend/utils/adt/geo_ops.c       |   3 +-
 src/backend/utils/adt/geo_spgist.c    |  92 ++++++++++++-
 src/include/catalog/pg_amop.h         |  16 +++
 src/include/catalog/pg_amproc.h       |   6 +
 src/include/catalog/pg_opclass.h      |   1 +
 src/include/catalog/pg_opfamily.h     |   1 +
 src/include/catalog/pg_proc.h         |   5 +
 src/include/utils/geo_decls.h         |   3 +-
 src/test/regress/expected/polygon.out | 238 ++++++++++++++++++++++++++++++++++
 src/test/regress/sql/polygon.sql      |  93 +++++++++++++
 11 files changed, 469 insertions(+), 7 deletions(-)

diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 55c1b06..4f48d2b 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -131,6 +131,24 @@
       </entry>
      </row>
      <row>
+      <entry><literal>poly_ops</literal></entry>
+      <entry><type>polygon</type></entry>
+      <entry>
+       <literal>&lt;&lt;</literal>
+       <literal>&amp;&lt;</literal>
+       <literal>&amp;&amp;</literal>
+       <literal>&amp;&gt;</literal>
+       <literal>&gt;&gt;</literal>
+       <literal>~=</literal>
+       <literal>@&gt;</literal>
+       <literal>&lt;@</literal>
+       <literal>&amp;&lt;|</literal>
+       <literal>&lt;&lt;|</literal>
+       <literal>|&gt;&gt;</literal>
+       <literal>|&amp;&gt;</literal>
+      </entry>
+     </row>
+     <row>
       <entry><literal>text_ops</literal></entry>
       <entry><type>text</type></entry>
       <entry>
diff --git a/src/backend/utils/adt/geo_ops.c b/src/backend/utils/adt/geo_ops.c
index 9dbe5db..f00ea54 100644
--- a/src/backend/utils/adt/geo_ops.c
+++ b/src/backend/utils/adt/geo_ops.c
@@ -41,7 +41,6 @@ enum path_delim
 static int	point_inside(Point *p, int npts, Point *plist);
 static int	lseg_crossing(double x, double y, double px, double py);
 static BOX *box_construct(double x1, double x2, double y1, double y2);
-static BOX *box_copy(BOX *box);
 static BOX *box_fill(BOX *result, double x1, double x2, double y1, double y2);
 static bool box_ov(BOX *box1, BOX *box2);
 static double box_ht(BOX *box);
@@ -482,7 +481,7 @@ box_fill(BOX *result, double x1, double x2, double y1, double y2)
 
 /*		box_copy		-		copy a box
  */
-static BOX *
+BOX *
 box_copy(BOX *box)
 {
 	BOX		   *result = (BOX *) palloc(sizeof(BOX));
diff --git a/src/backend/utils/adt/geo_spgist.c b/src/backend/utils/adt/geo_spgist.c
index f6334ba..a105436 100644
--- a/src/backend/utils/adt/geo_spgist.c
+++ b/src/backend/utils/adt/geo_spgist.c
@@ -391,7 +391,7 @@ spg_box_quad_choose(PG_FUNCTION_ARGS)
 	spgChooseIn *in = (spgChooseIn *) PG_GETARG_POINTER(0);
 	spgChooseOut *out = (spgChooseOut *) PG_GETARG_POINTER(1);
 	BOX		   *centroid = DatumGetBoxP(in->prefixDatum),
-			   *box = DatumGetBoxP(in->datum);
+			   *box = DatumGetBoxP(in->leafDatum);
 
 	out->resultType = spgMatchNode;
 	out->result.matchNode.restDatum = BoxPGetDatum(box);
@@ -474,6 +474,51 @@ spg_box_quad_picksplit(PG_FUNCTION_ARGS)
 }
 
 /*
+ * Check if result of consistent method based on bounding box is exact.
+ */
+static bool
+is_bounding_box_test_exact(StrategyNumber strategy)
+{
+	switch (strategy)
+	{
+		case RTLeftStrategyNumber:
+		case RTOverLeftStrategyNumber:
+		case RTOverRightStrategyNumber:
+		case RTRightStrategyNumber:
+		case RTOverBelowStrategyNumber:
+		case RTBelowStrategyNumber:
+		case RTAboveStrategyNumber:
+		case RTOverAboveStrategyNumber:
+			return true;
+
+		default:
+			return false;
+	}
+}
+
+/*
+ * Get bounding box for ScanKey.
+ */
+static BOX *
+spg_box_quad_get_scankey_bbox(ScanKey sk, bool *recheck)
+{
+	switch (sk->sk_subtype)
+	{
+		case BOXOID:
+			return DatumGetBoxP(sk->sk_argument);
+
+		case POLYGONOID:
+			if (recheck && !is_bounding_box_test_exact(sk->sk_strategy))
+				*recheck = true;
+			return &DatumGetPolygonP(sk->sk_argument)->boundbox;
+
+		default:
+			elog(ERROR, "unrecognized scankey subtype: %d", sk->sk_subtype);
+			return NULL;
+	}
+}
+
+/*
  * SP-GiST inner consistent function
  */
 Datum
@@ -515,7 +560,11 @@ spg_box_quad_inner_consistent(PG_FUNCTION_ARGS)
 	centroid = getRangeBox(DatumGetBoxP(in->prefixDatum));
 	queries = (RangeBox **) palloc(in->nkeys * sizeof(RangeBox *));
 	for (i = 0; i < in->nkeys; i++)
-		queries[i] = getRangeBox(DatumGetBoxP(in->scankeys[i].sk_argument));
+	{
+		BOX		   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i], NULL);
+
+		queries[i] = getRangeBox(box);
+	}
 
 	/* Allocate enough memory for nodes */
 	out->nNodes = 0;
@@ -637,8 +686,10 @@ spg_box_quad_leaf_consistent(PG_FUNCTION_ARGS)
 	/* Perform the required comparison(s) */
 	for (i = 0; i < in->nkeys; i++)
 	{
-		StrategyNumber strategy = in->scankeys[i].sk_strategy;
-		Datum		query = in->scankeys[i].sk_argument;
+		StrategyNumber	strategy = in->scankeys[i].sk_strategy;
+		BOX			   *box = spg_box_quad_get_scankey_bbox(&in->scankeys[i],
+															&out->recheck);
+		Datum			query = BoxPGetDatum(box);
 
 		switch (strategy)
 		{
@@ -713,3 +764,36 @@ spg_box_quad_leaf_consistent(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(flag);
 }
+
+
+/*
+ * SP-GiST config function for 2-D types that are lossy represented by their
+ * bounding boxes
+ */
+Datum
+spg_bbox_quad_config(PG_FUNCTION_ARGS)
+{
+	spgConfigOut *cfg = (spgConfigOut *) PG_GETARG_POINTER(1);
+
+	cfg->prefixType = BOXOID;	/* A type represented by its bounding box */
+	cfg->labelType = VOIDOID;	/* We don't need node labels. */
+	cfg->leafType = BOXOID;
+	cfg->canReturnData = false;
+	cfg->longValuesOK = false;
+
+	PG_RETURN_VOID();
+}
+
+/*
+ * SP-GiST compress function for polygons
+ */
+Datum
+spg_poly_quad_compress(PG_FUNCTION_ARGS)
+{
+	POLYGON	   *polygon = PG_GETARG_POLYGON_P(0);
+	BOX		   *box;
+
+	box = box_copy(&polygon->boundbox);
+
+	PG_RETURN_BOX_P(box);
+}
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index f850be4..d877079 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -858,6 +858,22 @@ DATA(insert (	5000	603  603 11 s	2573	4000 0 ));
 DATA(insert (	5000	603  603 12 s	2572	4000 0 ));
 
 /*
+ * SP-GiST poly_ops (supports polygons)
+ */
+DATA(insert (	5008   604	604  1 s	 485	4000 0 ));
+DATA(insert (	5008   604	604  2 s	 486	4000 0 ));
+DATA(insert (	5008   604	604  3 s	 492	4000 0 ));
+DATA(insert (	5008   604	604  4 s	 487	4000 0 ));
+DATA(insert (	5008   604	604  5 s	 488	4000 0 ));
+DATA(insert (	5008   604	604  6 s	 491	4000 0 ));
+DATA(insert (	5008   604	604  7 s	 490	4000 0 ));
+DATA(insert (	5008   604	604  8 s	 489	4000 0 ));
+DATA(insert (	5008   604	604  9 s	2575	4000 0 ));
+DATA(insert (	5008   604	604 10 s	2574	4000 0 ));
+DATA(insert (	5008   604	604 11 s	2577	4000 0 ));
+DATA(insert (	5008   604	604 12 s	2576	4000 0 ));
+
+/*
  * GiST inet_ops
  */
 DATA(insert (	3550	869 869 3 s		3552 783 0 ));
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index 1c95846..8b0c26b 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -334,6 +334,12 @@ DATA(insert (	5000   603 603 2 5013 ));
 DATA(insert (	5000   603 603 3 5014 ));
 DATA(insert (	5000   603 603 4 5015 ));
 DATA(insert (	5000   603 603 5 5016 ));
+DATA(insert (	5008   604 604 1 5009 ));
+DATA(insert (	5008   604 604 2 5013 ));
+DATA(insert (	5008   604 604 3 5014 ));
+DATA(insert (	5008   604 604 4 5015 ));
+DATA(insert (	5008   604 604 5 5016 ));
+DATA(insert (	5008   604 604 6 5011 ));
 
 /* BRIN opclasses */
 /* minmax bytea */
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
index 28dbc74..6aabc72 100644
--- a/src/include/catalog/pg_opclass.h
+++ b/src/include/catalog/pg_opclass.h
@@ -205,6 +205,7 @@ DATA(insert (	4000	box_ops				PGNSP PGUID 5000  603  t 0 ));
 DATA(insert (	4000	quad_point_ops		PGNSP PGUID 4015  600 t 0 ));
 DATA(insert (	4000	kd_point_ops		PGNSP PGUID 4016  600 f 0 ));
 DATA(insert (	4000	text_ops			PGNSP PGUID 4017  25 t 0 ));
+DATA(insert (	4000	poly_ops			PGNSP PGUID 5008  604 t 603 ));
 DATA(insert (	403		jsonb_ops			PGNSP PGUID 4033  3802 t 0 ));
 DATA(insert (	405		jsonb_ops			PGNSP PGUID 4034  3802 t 0 ));
 DATA(insert (	2742	jsonb_ops			PGNSP PGUID 4036  3802 t 25 ));
diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h
index 0d0ba7c..838812b 100644
--- a/src/include/catalog/pg_opfamily.h
+++ b/src/include/catalog/pg_opfamily.h
@@ -186,5 +186,6 @@ DATA(insert OID = 4103 (	3580	range_inclusion_ops		PGNSP PGUID ));
 DATA(insert OID = 4082 (	3580	pg_lsn_minmax_ops		PGNSP PGUID ));
 DATA(insert OID = 4104 (	3580	box_inclusion_ops		PGNSP PGUID ));
 DATA(insert OID = 5000 (	4000	box_ops		PGNSP PGUID ));
+DATA(insert OID = 5008 (	4000	poly_ops				PGNSP PGUID ));
 
 #endif							/* PG_OPFAMILY_H */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index c969375..a87c2f2 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -5335,6 +5335,11 @@ DESCR("SP-GiST support for quad tree over box");
 DATA(insert OID = 5016 (  spg_box_quad_leaf_consistent	PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_box_quad_leaf_consistent _null_ _null_ _null_ ));
 DESCR("SP-GiST support for quad tree over box");
 
+DATA(insert OID = 5009 (  spg_bbox_quad_config PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 2278 "2281 2281" _null_ _null_ _null_ _null_  _null_ spg_bbox_quad_config _null_ _null_ _null_ ));
+DESCR("SP-GiST support for quad tree over 2-D types represented by their bounding boxes");
+DATA(insert OID = 5011 (  spg_poly_quad_compress PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 603 "604" _null_ _null_ _null_ _null_  _null_ spg_poly_quad_compress _null_ _null_ _null_ ));
+DESCR("SP-GiST support for quad tree over polygons");
+
 /* replication slots */
 DATA(insert OID = 3779 (  pg_create_physical_replication_slot PGNSP PGUID 12 1 0 0 0 f f f f t f v u 3 0 2249 "19 16 16" "{19,16,16,19,3220}" "{i,i,i,o,o}" "{slot_name,immediately_reserve,temporary,slot_name,lsn}" _null_ _null_ pg_create_physical_replication_slot _null_ _null_ _null_ ));
 DESCR("create a physical replication slot");
diff --git a/src/include/utils/geo_decls.h b/src/include/utils/geo_decls.h
index 44c6381..c89e6c3 100644
--- a/src/include/utils/geo_decls.h
+++ b/src/include/utils/geo_decls.h
@@ -178,9 +178,10 @@ typedef struct
  * in geo_ops.c
  */
 
-/* private point routines */
+/* private routines */
 extern double point_dt(Point *pt1, Point *pt2);
 extern double point_sl(Point *pt1, Point *pt2);
 extern double pg_hypot(double x, double y);
+extern BOX *box_copy(BOX *box);
 
 #endif							/* GEO_DECLS_H */
diff --git a/src/test/regress/expected/polygon.out b/src/test/regress/expected/polygon.out
index 2361274..a9e7752 100644
--- a/src/test/regress/expected/polygon.out
+++ b/src/test/regress/expected/polygon.out
@@ -227,3 +227,241 @@ SELECT	'(0,0)'::point <-> '((0,0),(1,2),(2,1))'::polygon as on_corner,
          0 |          0 |      0 | 1.4142135623731 |          3.2
 (1 row)
 
+--
+-- Test the SP-GiST index
+--
+CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+INSERT INTO quad_poly_tbl
+	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+	FROM generate_series(1, 100) x,
+		 generate_series(1, 100) y;
+INSERT INTO quad_poly_tbl
+	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+	FROM generate_series(10001, 11000) AS i;
+INSERT INTO quad_poly_tbl
+	VALUES
+		(11001, NULL),
+		(11002, NULL),
+		(11003, NULL);
+CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+-- get reference results for ORDER BY distance from seq scan
+SET enable_seqscan = ON;
+SET enable_indexscan = OFF;
+SET enable_bitmapscan = OFF;
+CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+FROM quad_poly_tbl;
+CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+-- check results results from index scan
+SET enable_seqscan = OFF;
+SET enable_indexscan = OFF;
+SET enable_bitmapscan = ON;
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p << '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  3890
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p &< '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  7900
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p && '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+   977
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p &> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  7000
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p >> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  2990
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p <<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  1890
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p &<| '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  6900
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p |&> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  9000
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p |>> '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+  3990
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p <@ '((300,300),(400,600),(600,500),(700,200))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+ count 
+-------
+   831
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p @> '((340,550),(343,552),(341,553))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+ count 
+-------
+     1
+(1 row)
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+                                 QUERY PLAN                                  
+-----------------------------------------------------------------------------
+ Aggregate
+   ->  Bitmap Heap Scan on quad_poly_tbl
+         Recheck Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+         ->  Bitmap Index Scan on quad_poly_tbl_idx
+               Index Cond: (p ~= '((200,300),(210,310),(230,290))'::polygon)
+(5 rows)
+
+SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+ count 
+-------
+  1000
+(1 row)
+
+RESET enable_seqscan;
+RESET enable_indexscan;
+RESET enable_bitmapscan;
diff --git a/src/test/regress/sql/polygon.sql b/src/test/regress/sql/polygon.sql
index 7ac8079..c58277b 100644
--- a/src/test/regress/sql/polygon.sql
+++ b/src/test/regress/sql/polygon.sql
@@ -116,3 +116,96 @@ SELECT	'(0,0)'::point <-> '((0,0),(1,2),(2,1))'::polygon as on_corner,
 	'(2,2)'::point <-> '((0,0),(1,4),(3,1))'::polygon as inside,
 	'(3,3)'::point <-> '((0,2),(2,0),(2,2))'::polygon as near_corner,
 	'(4,4)'::point <-> '((0,0),(0,3),(4,0))'::polygon as near_segment;
+
+--
+-- Test the SP-GiST index
+--
+
+CREATE TEMPORARY TABLE quad_poly_tbl (id int, p polygon);
+
+INSERT INTO quad_poly_tbl
+	SELECT (x - 1) * 100 + y, polygon(circle(point(x * 10, y * 10), 1 + (x + y) % 10))
+	FROM generate_series(1, 100) x,
+		 generate_series(1, 100) y;
+
+INSERT INTO quad_poly_tbl
+	SELECT i, polygon '((200, 300),(210, 310),(230, 290))'
+	FROM generate_series(10001, 11000) AS i;
+
+INSERT INTO quad_poly_tbl
+	VALUES
+		(11001, NULL),
+		(11002, NULL),
+		(11003, NULL);
+
+CREATE INDEX quad_poly_tbl_idx ON quad_poly_tbl USING spgist(p);
+
+-- get reference results for ORDER BY distance from seq scan
+SET enable_seqscan = ON;
+SET enable_indexscan = OFF;
+SET enable_bitmapscan = OFF;
+
+CREATE TEMP TABLE quad_poly_tbl_ord_seq1 AS
+SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+FROM quad_poly_tbl;
+
+CREATE TEMP TABLE quad_poly_tbl_ord_seq2 AS
+SELECT rank() OVER (ORDER BY p <-> point '123,456') n, p <-> point '123,456' dist, id
+FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+
+-- check results results from index scan
+SET enable_seqscan = OFF;
+SET enable_indexscan = OFF;
+SET enable_bitmapscan = ON;
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p << polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p &< polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p && polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p &> polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p >> polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p <<| polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p &<| polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p |&> polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p |>> polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+SELECT count(*) FROM quad_poly_tbl WHERE p <@ polygon '((300,300),(400,600),(600,500),(700,200))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+SELECT count(*) FROM quad_poly_tbl WHERE p @> polygon '((340,550),(343,552),(341,553))';
+
+EXPLAIN (COSTS OFF)
+SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+SELECT count(*) FROM quad_poly_tbl WHERE p ~= polygon '((200, 300),(210, 310),(230, 290))';
+
+RESET enable_seqscan;
+RESET enable_indexscan;
+RESET enable_bitmapscan;
-- 
2.7.4

In reply to: Nikita Glukhov (#28)

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will allow usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

#30Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Darafei Praliaskouski (#29)
1 attachment(s)

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me@komzpa.net> wrote:

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will allow
usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

I went through this patch. I found the documentation changes insufficient,
so I've made some improvements.

In particular, I didn't understand why reconstructedValue is claimed to be
of spgConfigOut.leafType when, judging by both the general logic and the
code, it should be of spgConfigIn.attType. I've fixed that. Nikita, correct
me if I'm wrong.
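
To make the attType/leafType distinction concrete, here is a rough
standalone sketch of what compress does for the polygon opclass: it takes
the attType value (a polygon) and produces the leafType value (a box) that
is stored in leaf tuples, while reconstructedValue stays in attType terms.
The Point, Box and Polygon structs and sketch_poly_compress() are
simplified stand-ins invented for illustration, not the server's types;
the patch's spg_poly_quad_compress() copies the bounding box already cached
in the polygon, whereas this sketch recomputes it from the vertices.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the geometric types. */
typedef struct Point { double x, y; } Point;
typedef struct Box { Point high, low; } Box;
typedef struct Polygon { size_t npts; const Point *p; } Polygon;

/*
 * Sketch of a compress step: attType (polygon) in, leafType (box) out.
 * Assumption: the bounding box is recomputed here from the vertices.
 */
static Box
sketch_poly_compress(const Polygon *poly)
{
    Box     b;
    size_t  i;

    assert(poly != NULL && poly->npts > 0);

    /* Start with a degenerate box at the first vertex, then widen it. */
    b.low = poly->p[0];
    b.high = poly->p[0];
    for (i = 1; i < poly->npts; i++)
    {
        if (poly->p[i].x < b.low.x)
            b.low.x = poly->p[i].x;
        if (poly->p[i].y < b.low.y)
            b.low.y = poly->p[i].y;
        if (poly->p[i].x > b.high.x)
            b.high.x = poly->p[i].x;
        if (poly->p[i].y > b.high.y)
            b.high.y = poly->p[i].y;
    }
    return b;
}
```

For the triangle used in the regression test, ((200,300),(210,310),(230,290)),
this would yield a box with low corner (200,290) and high corner (230,310).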

Also, I wonder whether spgvalidate() should check that a compress method
exists when attType and leafType are not the same. We don't do this for
now.
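
The rule I have in mind could be sketched roughly as follows. This is a
standalone illustration only, not code from the patch: OpclassInfo and
compress_requirement_ok() are hypothetical names, and the real
spgvalidate() works from the catalogs rather than from a struct like this.
It assumes that an unset leafType (InvalidOid) means "same as attType", as
the documentation change describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical summary of what a validator would know about an opclass. */
typedef struct OpclassInfo
{
    Oid     attType;        /* type of the indexed column */
    Oid     leafType;       /* set by config; InvalidOid if left unset */
    bool    hasCompress;    /* is the optional compress method registered? */
} OpclassInfo;

/*
 * Returns true if the opclass satisfies the proposed rule: a compress
 * method must be present whenever leafType differs from attType.
 */
static bool
compress_requirement_ok(const OpclassInfo *oc)
{
    Oid     leaf;

    assert(oc != NULL);

    /* An unset leafType is treated as equal to attType. */
    leaf = (oc->leafType == InvalidOid) ? oc->attType : oc->leafType;

    return leaf == oc->attType || oc->hasCompress;
}
```

With the OIDs from the catalog changes, poly_ops (attType 604, leafType 603)
passes only because it registers compress; an opclass such as text_ops that
leaves leafType unset passes without one.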

0002-spgist-polygon-8.patch looks OK to me so far.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-9.patch (application/octet-stream)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
new file mode 100644
index 139c8ed..bae5714
*** a/doc/src/sgml/spgist.sgml
--- b/doc/src/sgml/spgist.sgml
***************
*** 240,259 ****
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide.  All five follow the convention
!   of accepting two <type>internal</type> arguments, the first of which is a
!   pointer to a C struct containing input values for the support method,
!   while the second argument is a pointer to a C struct where output values
!   must be placed.  Four of the methods just return <type>void</type>, since
!   all their results appear in the output struct; but
    <function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method.
   </para>
  
   <para>
!   The five user-defined methods are:
   </para>
  
   <variablelist>
--- 240,261 ----
  
   <para>
    There are five user-defined methods that an index operator class for
!   <acronym>SP-GiST</acronym> must provide, and one is optional.  All five
!   mandatory methods follow the convention of accepting two <type>internal</type>
!   arguments, the first of which is a pointer to a C struct containing input
!   values for the support method, while the second argument is a pointer to a
!   C struct where output values must be placed.  Four of the mandatory methods just
!   return <type>void</type>, since all their results appear in the output struct; but
    <function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
    The methods must not modify any fields of their input structs.  In all
    cases, the output struct is initialized to zeroes before calling the
!   user-defined method.  Optional sixth method <function>compress</function>
!   accepts datum to be indexed as the only argument and returns value suitable
!   for physical storage in leaf tuple.
   </para>
  
   <para>
!   The five mandatory user-defined methods are:
   </para>
  
   <variablelist>
*************** typedef struct spgConfigOut
*** 283,288 ****
--- 285,291 ----
  {
      Oid         prefixType;     /* Data type of inner-tuple prefixes */
      Oid         labelType;      /* Data type of inner-tuple node labels */
+     Oid         leafType;       /* Data type of leaf-tuple values */
      bool        canReturnData;  /* Opclass can reconstruct original data */
      bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
  } spgConfigOut;
*************** typedef struct spgConfigOut
*** 305,310 ****
--- 308,329 ----
        class is capable of segmenting long values by repeated suffixing
        (see <xref linkend="spgist-limits"/>).
       </para>
+ 
+      <para>
+       <structfield>leafType</structfield> is typically the same as
+       <structfield>attType</structfield>.  For the reasons of backward
+       compatibility, method <function>config</function> can
+       leave <structfield>leafType</structfield> uninitialized; that would
+       give the same effect as setting <structfield>leafType</structfield> equal
+       to <structfield>attType</structfield>.  When <structfield>attType</structfield>
+       and <structfield>leafType</structfield> are different, then optional
+       method <function>compress</function> must be provided.
+       Method <function>compress</function> is responsible
+       for transformation of datums to be indexed from <structfield>attType</structfield>
+       to <structfield>leafType</structfield>.
+       Note: both consistent functions will get <structfield>scankeys</structfield>
+       unchanged, without transformation using <function>compress</function>.
+      </para>
       </listitem>
      </varlistentry>
  
*************** typedef struct spgChooseOut
*** 380,389 ****
  } spgChooseOut;
  </programlisting>
  
!        <structfield>datum</structfield> is the original datum that was to be inserted
!        into the index.
!        <structfield>leafDatum</structfield> is initially the same as
!        <structfield>datum</structfield>, but can change at lower levels of the tree
         if the <function>choose</function> or <function>picksplit</function>
         methods change it.  When the insertion search reaches a leaf page,
         the current value of <structfield>leafDatum</structfield> is what will be stored
--- 399,414 ----
  } spgChooseOut;
  </programlisting>
  
!        <structfield>datum</structfield> is the original datum of
!        <structname>spgConfigIn</structname>.<structfield>attType</structfield>
!        type that was to be inserted into the index.
!        <structfield>leafDatum</structfield> is a value of
!        <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
!        type which is initially a result of method
!        <function>compress</function> applied to <structfield>datum</structfield>
!        when method <function>compress</function> is provided, or same value as
!        <structfield>datum</structfield> otherwise.
!        <structfield>leafDatum</structfield> can change at lower levels of the tree
         if the <function>choose</function> or <function>picksplit</function>
         methods change it.  When the insertion search reaches a leaf page,
         the current value of <structfield>leafDatum</structfield> is what will be stored
*************** typedef struct spgChooseOut
*** 418,424 ****
         Set <structfield>levelAdd</structfield> to the increment in
         <structfield>level</structfield> caused by descending through that node,
         or leave it as zero if the operator class does not use levels.
!        Set <structfield>restDatum</structfield> to equal <structfield>datum</structfield>
         if the operator class does not modify datums from one level to the
         next, or otherwise set it to the modified value to be used as
         <structfield>leafDatum</structfield> at the next level.
--- 443,449 ----
         Set <structfield>levelAdd</structfield> to the increment in
         <structfield>level</structfield> caused by descending through that node,
         or leave it as zero if the operator class does not use levels.
!        Set <structfield>restDatum</structfield> to equal <structfield>leafDatum</structfield>
         if the operator class does not modify datums from one level to the
         next, or otherwise set it to the modified value to be used as
         <structfield>leafDatum</structfield> at the next level.
*************** typedef struct spgPickSplitOut
*** 509,515 ****
  </programlisting>
  
         <structfield>nTuples</structfield> is the number of leaf tuples provided.
!        <structfield>datums</structfield> is an array of their datum values.
         <structfield>level</structfield> is the current level that all the leaf tuples
         share, which will become the level of the new inner tuple.
        </para>
--- 534,542 ----
  </programlisting>
  
         <structfield>nTuples</structfield> is the number of leaf tuples provided.
!        <structfield>datums</structfield> is an array of their datum values of
!        <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
!        type.
         <structfield>level</structfield> is the current level that all the leaf tuples
         share, which will become the level of the new inner tuple.
        </para>
*************** typedef struct spgInnerConsistentOut
*** 624,630 ****
         <structfield>reconstructedValue</structfield> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
         <function>inner_consistent</function> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</structfield> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</function>
         on the parent index tuple, or NULL at the root level.
--- 651,658 ----
         <structfield>reconstructedValue</structfield> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
         <function>inner_consistent</function> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</structfield> is always of
!        <structname>spgConfigIn</structname>.<structfield>attType</structfield> type.
         <structfield>traversalValue</structfield> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</function>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgInnerConsistentOut
*** 659,664 ****
--- 687,693 ----
         necessarily so, so an array is used.)
         If value reconstruction is needed, set
         <structfield>reconstructedValues</structfield> to an array of the values
+        of <structname>spgConfigIn</structname>.<structfield>attType</structfield> type
         reconstructed for each child node to be visited; otherwise, leave
         <structfield>reconstructedValues</structfield> as NULL.
         If it is desired to pass down additional out-of-band information
*************** typedef struct spgLeafConsistentOut
*** 730,736 ****
         <structfield>reconstructedValue</structfield> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
         <function>inner_consistent</function> function did not provide a value at the
!        parent level.
         <structfield>traversalValue</structfield> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</function>
         on the parent index tuple, or NULL at the root level.
--- 759,766 ----
         <structfield>reconstructedValue</structfield> is the value reconstructed for the
         parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
         <function>inner_consistent</function> function did not provide a value at the
!        parent level. <structfield>reconstructedValue</structfield> is always of
!        <structname>spgConfigIn</structname>.<structfield>attType</structfield> type. 
         <structfield>traversalValue</structfield> is a pointer to any traverse data
         passed down from the previous call of <function>inner_consistent</function>
         on the parent index tuple, or NULL at the root level.
*************** typedef struct spgLeafConsistentOut
*** 739,754 ****
         <structfield>returnData</structfield> is <literal>true</literal> if reconstructed data is
         required for this query; this will only be so if the
         <function>config</function> function asserted <structfield>canReturnData</structfield>.
!        <structfield>leafDatum</structfield> is the key value stored in the current
!        leaf tuple.
        </para>
  
        <para>
         The function must return <literal>true</literal> if the leaf tuple matches the
         query, or <literal>false</literal> if not.  In the <literal>true</literal> case,
         if <structfield>returnData</structfield> is <literal>true</literal> then
!        <structfield>leafValue</structfield> must be set to the value originally supplied
!        to be indexed for this leaf tuple.  Also,
         <structfield>recheck</structfield> may be set to <literal>true</literal> if the match
         is uncertain and so the operator(s) must be re-applied to the actual
         heap tuple to verify the match.
--- 769,786 ----
         <structfield>returnData</structfield> is <literal>true</literal> if reconstructed data is
         required for this query; this will only be so if the
         <function>config</function> function asserted <structfield>canReturnData</structfield>.
!        <structfield>leafDatum</structfield> is the key value of
!        <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type
!        stored in the current leaf tuple.
        </para>
  
        <para>
         The function must return <literal>true</literal> if the leaf tuple matches the
         query, or <literal>false</literal> if not.  In the <literal>true</literal> case,
         if <structfield>returnData</structfield> is <literal>true</literal> then
!        <structfield>leafValue</structfield> must be set to the value of
!        <structname>spgConfigIn</structname>.<structfield>attType</structfield> type
!        originally supplied to be indexed for this leaf tuple.  Also,
         <structfield>recheck</structfield> may be set to <literal>true</literal> if the match
         is uncertain and so the operator(s) must be re-applied to the actual
         heap tuple to verify the match.
*************** typedef struct spgLeafConsistentOut
*** 757,762 ****
--- 789,814 ----
      </varlistentry>
     </variablelist>
  
+  <para>
+   The optional user-defined method is:
+  </para>
+ 
+  <variablelist>
+     <varlistentry>
+      <term><function>Datum compress(Datum in)</function></term>
+      <listitem>
+       <para>
+        Converts a data item into a format suitable for physical storage in
+        a leaf tuple of an index page.  It accepts a value of type
+        <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+        and returns a value of type
+        <structname>spgConfigOut</structname>.<structfield>leafType</structfield>.
+        The output value should not be toasted.
+       </para>
+      </listitem>
+     </varlistentry>
+   </variablelist>
+ 
    <para>
     All the SP-GiST support methods are normally called in a short-lived
     memory context; that is, <varname>CurrentMemoryContext</varname> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
new file mode 100644
index a5f4c40..a8cb8c7
*** a/src/backend/access/spgist/spgdoinsert.c
--- b/src/backend/access/spgist/spgdoinsert.c
*************** spgdoinsert(Relation index, SpGistState 
*** 1906,1919 ****
  		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
! 	 * Since we don't use index_form_tuple in this AM, we have to make sure
  	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
! 	 * that.
  	 */
! 	if (!isnull && state->attType.attlen == -1)
! 		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
  
! 	leafDatum = datum;
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
--- 1906,1942 ----
  		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
  
  	/*
! 	 * Prepare the leaf datum to insert.
! 	 *
! 	 * If an optional "compress" method is provided, then call it to form
! 	 * the leaf datum from the input datum.  Otherwise store the input datum as
! 	 * is.  Since we don't use index_form_tuple in this AM, we have to make sure
  	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
! 	 * that.  But we assume the "compress" method to return an untoasted value.
  	 */
! 	if (!isnull)
! 	{
! 		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 		{
! 			FmgrInfo   *compressProcinfo = NULL;
  
! 			compressProcinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
! 			leafDatum = FunctionCall1Coll(compressProcinfo,
! 										  index->rd_indcollation[0],
! 										  datum);
! 		}
! 		else
! 		{
! 			Assert(state->attLeafType.type == state->attType.type);
! 
! 			if (state->attType.attlen == -1)
! 				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
! 			else
! 				leafDatum = datum;
! 		}
! 	}
! 	else
! 		leafDatum = (Datum) 0;
  
  	/*
  	 * Compute space needed for a leaf tuple containing the given datum.
*************** spgdoinsert(Relation index, SpGistState 
*** 1923,1929 ****
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
--- 1946,1952 ----
  	 */
  	if (!isnull)
  		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 			SpGistGetTypeSize(&state->attLeafType, leafDatum);
  	else
  		leafSize = SGDTSIZE + sizeof(ItemIdData);
  
*************** spgdoinsert(Relation index, SpGistState 
*** 2138,2144 ****
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attType, leafDatum);
  					}
  
  					/*
--- 2161,2167 ----
  					{
  						leafDatum = out.result.matchNode.restDatum;
  						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
! 							SpGistGetTypeSize(&state->attLeafType, leafDatum);
  					}
  
  					/*
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
new file mode 100644
index bd5301f..668e3c4
*** a/src/backend/access/spgist/spgutils.c
--- b/src/backend/access/spgist/spgutils.c
*************** spgGetCache(Relation index)
*** 124,130 ****
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		fillTypeDesc(&cache->attType, atttype);
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
--- 124,146 ----
  						  PointerGetDatum(&cache->config));
  
  		/* Get the information we need about each relevant datatype */
! 		if (OidIsValid(cache->config.leafType) &&
! 			cache->config.leafType != atttype)
! 		{
! 			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
! 				ereport(ERROR,
! 						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
! 						 errmsg("compress method must be defined when leaf type is different from input type")));
! 
! 			fillTypeDesc(&cache->attType, atttype);
! 			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
! 		}
! 		else
! 		{
! 			fillTypeDesc(&cache->attType, atttype);
! 			cache->attLeafType = cache->attType;
! 		}
! 
  		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
  		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
  
*************** initSpGistState(SpGistState *state, Rela
*** 164,169 ****
--- 180,186 ----
  
  	state->config = cache->config;
  	state->attType = cache->attType;
+ 	state->attLeafType = cache->attLeafType;
  	state->attPrefixType = cache->attPrefixType;
  	state->attLabelType = cache->attLabelType;
  
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 618,624 ****
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
--- 635,641 ----
  	/* compute space needed (note result is already maxaligned) */
  	size = SGLTHDRSZ;
  	if (!isnull)
! 		size += SpGistGetTypeSize(&state->attLeafType, datum);
  
  	/*
  	 * Ensure that we can replace the tuple with a dead tuple later.  This
*************** spgFormLeafTuple(SpGistState *state, Ite
*** 634,640 ****
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
  
  	return tup;
  }
--- 651,657 ----
  	tup->nextOffset = InvalidOffsetNumber;
  	tup->heapPtr = *heapPtr;
  	if (!isnull)
! 		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
  
  	return tup;
  }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
new file mode 100644
index 157cf2a..514da47
*** a/src/backend/access/spgist/spgvalidate.c
--- b/src/backend/access/spgist/spgvalidate.c
*************** spgvalidate(Oid opclassoid)
*** 52,57 ****
--- 52,61 ----
  	OpFamilyOpFuncGroup *opclassgroup;
  	int			i;
  	ListCell   *lc;
+ 	spgConfigIn	configIn;
+ 	spgConfigOut configOut;
+ 	Oid			configOutLefttype = InvalidOid;
+ 	Oid			configOutRighttype = InvalidOid;
  
  	/* Fetch opclass information */
  	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
*************** spgvalidate(Oid opclassoid)
*** 100,105 ****
--- 104,118 ----
  		switch (procform->amprocnum)
  		{
  			case SPGIST_CONFIG_PROC:
+ 				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+ 											2, 2, INTERNALOID, INTERNALOID);
+ 				configIn.attType = procform->amproclefttype;
+ 				OidFunctionCall2(procform->amproc,
+ 								 PointerGetDatum(&configIn),
+ 								 PointerGetDatum(&configOut));
+ 				configOutLefttype = procform->amproclefttype;
+ 				configOutRighttype = procform->amprocrighttype;
+ 				break;
  			case SPGIST_CHOOSE_PROC:
  			case SPGIST_PICKSPLIT_PROC:
  			case SPGIST_INNER_CONSISTENT_PROC:
*************** spgvalidate(Oid opclassoid)
*** 110,115 ****
--- 123,137 ----
  				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
  											2, 2, INTERNALOID, INTERNALOID);
  				break;
+ 			case SPGIST_COMPRESS_PROC:
+ 				if (configOutLefttype != procform->amproclefttype ||
+ 					configOutRighttype != procform->amprocrighttype)
+ 					ok = false;
+ 				else
+ 					ok = check_amproc_signature(procform->amproc,
+ 												configOut.leafType, true,
+ 												1, 1, procform->amproclefttype);
+ 				break;
  			default:
  				ereport(INFO,
  						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
*************** spgvalidate(Oid opclassoid)
*** 212,218 ****
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
--- 234,240 ----
  		if (thisgroup->lefttype != thisgroup->righttype)
  			continue;
  
! 		for (i = 1; i <= SPGISTNRequiredProc; i++)
  		{
  			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
  				continue;		/* got it */
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
new file mode 100644
index d1bc396..06b1d88
*** a/src/include/access/spgist.h
--- b/src/include/access/spgist.h
***************
*** 30,36 ****
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGISTNProc						5
  
  /*
   * Argument structs for spg_config method
--- 30,38 ----
  #define SPGIST_PICKSPLIT_PROC			3
  #define SPGIST_INNER_CONSISTENT_PROC	4
  #define SPGIST_LEAF_CONSISTENT_PROC		5
! #define SPGIST_COMPRESS_PROC			6
! #define SPGISTNRequiredProc				5
! #define SPGISTNProc						6
  
  /*
   * Argument structs for spg_config method
*************** typedef struct spgConfigOut
*** 44,49 ****
--- 46,52 ----
  {
  	Oid			prefixType;		/* Data type of inner-tuple prefixes */
  	Oid			labelType;		/* Data type of inner-tuple node labels */
+ 	Oid			leafType;		/* Data type of leaf-tuple values */
  	bool		canReturnData;	/* Opclass can reconstruct original data */
  	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
  } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
new file mode 100644
index 1c4b321..e55de9d
*** a/src/include/access/spgist_private.h
--- b/src/include/access/spgist_private.h
*************** typedef struct SpGistState
*** 119,125 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 119,126 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef struct SpGistCache
*** 178,184 ****
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of input data and leaf values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
--- 179,186 ----
  {
  	spgConfigOut config;		/* filled in by opclass config method */
  
! 	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
! 	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
  	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
  	SpGistTypeDesc attLabelType;	/* type of node label values */
  
*************** typedef SpGistLeafTupleData *SpGistLeafT
*** 300,306 ****
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
--- 302,308 ----
  
  #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
  #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
! #define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
  							 *(Datum *) SGLTDATAPTR(x) : \
  							 PointerGetDatum(SGLTDATAPTR(x)))
  
#31Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Alexander Korotkov (#30)
1 attachment(s)

On 05.12.2017 23:59, Alexander Korotkov wrote:

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me@komzpa.net> wrote:

The following review has been posted through the commitfest
application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will
allow usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

I went through this patch.  I found the documentation changes to be
insufficient, so I've made some improvements.

In particular, I didn't understand why reconstructedValue is claimed
to be of spgConfigOut.leafType, while both general logic and the code
say it should be of spgConfigIn.attType.  I've fixed that.  Nikita,
correct me if I'm wrong.

I think we are reconstructing a leaf datum, so the documentation was correct,
but the code in spgWalk() and freeScanStackEntry() wrongly used attType
instead of attLeafType. A fixed patch is attached.
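
To illustrate why the descriptor matters here, below is a minimal standalone
sketch, not the PostgreSQL code itself. TypeDesc, copy_value() and
free_value() are hypothetical stand-ins for SpGistTypeDesc, datumCopy() and
the pfree() logic in freeScanStackEntry(). If the input type were
pass-by-value and the leaf type pass-by-reference, consulting the wrong
descriptor would skip both the copy and the free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a Datum: wide enough to hold a pointer. */
typedef uintptr_t Datum;

/* Hypothetical stand-in for SpGistTypeDesc: only the fields used here. */
typedef struct TypeDesc
{
	bool		byval;			/* value is stored in the datum itself */
	int			len;			/* fixed length in bytes (by-reference case) */
} TypeDesc;

/*
 * Copy a value as its descriptor dictates: by-value datums are returned
 * as is, by-reference datums get a freshly allocated copy of len bytes.
 */
static Datum
copy_value(Datum value, const TypeDesc *desc)
{
	void	   *p;

	if (desc->byval)
		return value;

	p = malloc((size_t) desc->len);
	if (p == NULL)
		abort();
	memcpy(p, (const void *) value, (size_t) desc->len);
	return (Datum) p;
}

/* Release a copy; a by-value datum owns no memory, so nothing is freed. */
static void
free_value(Datum value, const TypeDesc *desc)
{
	if (!desc->byval && value != 0)
		free((void *) value);
}
```

The point of the sketch is only that the copy and the free must consult the
same descriptor, and that it must describe the type the value really has.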

Also, I wonder whether we should check for the existence of a compress
method in spgvalidate() when attType and leafType are not the same.  We
don't do this for now.

I've added a compress method existence check to spgvalidate().
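
The rule being checked can be sketched in isolation as follows. This is a
simplified model under stated assumptions, not the patch code:
compress_is_required() and effective_leaf_type() are hypothetical helper
names, and Oid/InvalidOid/OidIsValid are redefined locally so the fragment
compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the PostgreSQL definitions of the same names. */
typedef unsigned int Oid;
#define InvalidOid		((Oid) 0)
#define OidIsValid(oid) ((oid) != InvalidOid)

/*
 * Hypothetical helper: a compress method is needed only when config set
 * leafType and it differs from the indexed column type.  A leafType left
 * as zero means "same as attType".
 */
static bool
compress_is_required(Oid attType, Oid leafType)
{
	return OidIsValid(leafType) && leafType != attType;
}

/*
 * Hypothetical helper: the type actually stored in leaf tuples, or
 * InvalidOid if the opclass is inconsistent (compress needed but absent).
 */
static Oid
effective_leaf_type(Oid attType, Oid leafType, bool haveCompress)
{
	if (compress_is_required(attType, leafType))
		return haveCompress ? leafType : InvalidOid;
	return attType;
}
```

For example, with two arbitrary distinct type OIDs, effective_leaf_type(604,
603, false) yields InvalidOid, which corresponds to the case where the patch
reports a missing support function.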

0002-spgist-polygon-8.patch is OK for me so far.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-10.patch (text/x-patch)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 139c8ed..b4a8be4 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -240,20 +240,22 @@
 
  <para>
   There are five user-defined methods that an index operator class for
-  <acronym>SP-GiST</acronym> must provide.  All five follow the convention
-  of accepting two <type>internal</type> arguments, the first of which is a
-  pointer to a C struct containing input values for the support method,
-  while the second argument is a pointer to a C struct where output values
-  must be placed.  Four of the methods just return <type>void</type>, since
-  all their results appear in the output struct; but
+  <acronym>SP-GiST</acronym> must provide, and one is optional.  All five
+  mandatory methods follow the convention of accepting two <type>internal</type>
+  arguments, the first of which is a pointer to a C struct containing input
+  values for the support method, while the second argument is a pointer to a
+  C struct where output values must be placed.  Four of the mandatory methods just
+  return <type>void</type>, since all their results appear in the output struct; but
   <function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
   The methods must not modify any fields of their input structs.  In all
   cases, the output struct is initialized to zeroes before calling the
-  user-defined method.
+  user-defined method.  The optional sixth method <function>compress</function>
+  accepts a datum to be indexed as its only argument and returns a value suitable
+  for physical storage in a leaf tuple.
  </para>
 
  <para>
-  The five user-defined methods are:
+  The five mandatory user-defined methods are:
  </para>
 
  <variablelist>
@@ -283,6 +285,7 @@ typedef struct spgConfigOut
 {
     Oid         prefixType;     /* Data type of inner-tuple prefixes */
     Oid         labelType;      /* Data type of inner-tuple node labels */
+    Oid         leafType;       /* Data type of leaf-tuple values */
     bool        canReturnData;  /* Opclass can reconstruct original data */
     bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
 } spgConfigOut;
@@ -305,6 +308,22 @@ typedef struct spgConfigOut
       class is capable of segmenting long values by repeated suffixing
       (see <xref linkend="spgist-limits"/>).
      </para>
+
+     <para>
+      <structfield>leafType</structfield> is typically the same as
+      <structfield>attType</structfield>.  For reasons of backward
+      compatibility, method <function>config</function> can
+      leave <structfield>leafType</structfield> uninitialized; that has
+      the same effect as setting <structfield>leafType</structfield> equal
+      to <structfield>attType</structfield>.  When <structfield>attType</structfield>
+      and <structfield>leafType</structfield> are different, the optional
+      method <function>compress</function> must be provided.
+      Method <function>compress</function> is responsible
+      for transforming datums to be indexed from <structfield>attType</structfield>
+      to <structfield>leafType</structfield>.
+      Note: both consistent functions receive <structfield>scankeys</structfield>
+      unchanged, without transformation by <function>compress</function>.
+     </para>
      </listitem>
     </varlistentry>
 
@@ -380,10 +399,16 @@ typedef struct spgChooseOut
 } spgChooseOut;
 </programlisting>
 
-       <structfield>datum</structfield> is the original datum that was to be inserted
-       into the index.
-       <structfield>leafDatum</structfield> is initially the same as
-       <structfield>datum</structfield>, but can change at lower levels of the tree
+       <structfield>datum</structfield> is the original datum of
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+       type that was to be inserted into the index.
+       <structfield>leafDatum</structfield> is a value of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       type, which is initially the result of method
+       <function>compress</function> applied to <structfield>datum</structfield>
+       when method <function>compress</function> is provided, or the same value as
+       <structfield>datum</structfield> otherwise.
+       <structfield>leafDatum</structfield> can change at lower levels of the tree
        if the <function>choose</function> or <function>picksplit</function>
        methods change it.  When the insertion search reaches a leaf page,
        the current value of <structfield>leafDatum</structfield> is what will be stored
@@ -418,7 +443,7 @@ typedef struct spgChooseOut
        Set <structfield>levelAdd</structfield> to the increment in
        <structfield>level</structfield> caused by descending through that node,
        or leave it as zero if the operator class does not use levels.
-       Set <structfield>restDatum</structfield> to equal <structfield>datum</structfield>
+       Set <structfield>restDatum</structfield> to equal <structfield>leafDatum</structfield>
        if the operator class does not modify datums from one level to the
        next, or otherwise set it to the modified value to be used as
        <structfield>leafDatum</structfield> at the next level.
@@ -509,7 +534,9 @@ typedef struct spgPickSplitOut
 </programlisting>
 
        <structfield>nTuples</structfield> is the number of leaf tuples provided.
-       <structfield>datums</structfield> is an array of their datum values.
+       <structfield>datums</structfield> is an array of their datum values of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       type.
        <structfield>level</structfield> is the current level that all the leaf tuples
        share, which will become the level of the new inner tuple.
       </para>
@@ -624,7 +651,8 @@ typedef struct spgInnerConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> is always of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -659,6 +687,7 @@ typedef struct spgInnerConsistentOut
        necessarily so, so an array is used.)
        If value reconstruction is needed, set
        <structfield>reconstructedValues</structfield> to an array of the values
+       of <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type
        reconstructed for each child node to be visited; otherwise, leave
        <structfield>reconstructedValues</structfield> as NULL.
        If it is desired to pass down additional out-of-band information
@@ -730,7 +759,8 @@ typedef struct spgLeafConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> is always of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type. 
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -739,16 +769,18 @@ typedef struct spgLeafConsistentOut
        <structfield>returnData</structfield> is <literal>true</literal> if reconstructed data is
        required for this query; this will only be so if the
        <function>config</function> function asserted <structfield>canReturnData</structfield>.
-       <structfield>leafDatum</structfield> is the key value stored in the current
-       leaf tuple.
+       <structfield>leafDatum</structfield> is the key value of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type
+       stored in the current leaf tuple.
       </para>
 
       <para>
        The function must return <literal>true</literal> if the leaf tuple matches the
        query, or <literal>false</literal> if not.  In the <literal>true</literal> case,
        if <structfield>returnData</structfield> is <literal>true</literal> then
-       <structfield>leafValue</structfield> must be set to the value originally supplied
-       to be indexed for this leaf tuple.  Also,
+       <structfield>leafValue</structfield> must be set to the value of
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield> type
+       originally supplied to be indexed for this leaf tuple.  Also,
        <structfield>recheck</structfield> may be set to <literal>true</literal> if the match
        is uncertain and so the operator(s) must be re-applied to the actual
        heap tuple to verify the match.
@@ -757,6 +789,26 @@ typedef struct spgLeafConsistentOut
     </varlistentry>
    </variablelist>
 
+ <para>
+  The optional user-defined method is:
+ </para>
+
+ <variablelist>
+    <varlistentry>
+     <term><function>Datum compress(Datum in)</function></term>
+     <listitem>
+      <para>
+       Converts a data item into a format suitable for physical storage in
+       a leaf tuple of an index page.  It accepts a value of type
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+       and returns a value of type
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>.
+       The output value should not be toasted.
+      </para>
+     </listitem>
+    </varlistentry>
+  </variablelist>
+
   <para>
    All the SP-GiST support methods are normally called in a short-lived
    memory context; that is, <varname>CurrentMemoryContext</varname> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index a5f4c40..a8cb8c7 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1906,14 +1906,37 @@ spgdoinsert(Relation index, SpGistState *state,
 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
 
 	/*
-	 * Since we don't use index_form_tuple in this AM, we have to make sure
+	 * Prepare the leaf datum to insert.
+	 *
+	 * If an optional "compress" method is provided, then call it to form
+	 * the leaf datum from the input datum.  Otherwise store the input datum as
+	 * is.  Since we don't use index_form_tuple in this AM, we have to make sure
 	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
-	 * that.
+	 * that.  But we assume the "compress" method to return an untoasted value.
 	 */
-	if (!isnull && state->attType.attlen == -1)
-		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+	if (!isnull)
+	{
+		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+		{
+			FmgrInfo   *compressProcinfo = NULL;
+
+			compressProcinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
+			leafDatum = FunctionCall1Coll(compressProcinfo,
+										  index->rd_indcollation[0],
+										  datum);
+		}
+		else
+		{
+			Assert(state->attLeafType.type == state->attType.type);
 
-	leafDatum = datum;
+			if (state->attType.attlen == -1)
+				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+			else
+				leafDatum = datum;
+		}
+	}
+	else
+		leafDatum = (Datum) 0;
 
 	/*
 	 * Compute space needed for a leaf tuple containing the given datum.
@@ -1923,7 +1946,7 @@ spgdoinsert(Relation index, SpGistState *state,
 	 */
 	if (!isnull)
 		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-			SpGistGetTypeSize(&state->attType, leafDatum);
+			SpGistGetTypeSize(&state->attLeafType, leafDatum);
 	else
 		leafSize = SGDTSIZE + sizeof(ItemIdData);
 
@@ -2138,7 +2161,7 @@ spgdoinsert(Relation index, SpGistState *state,
 					{
 						leafDatum = out.result.matchNode.restDatum;
 						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-							SpGistGetTypeSize(&state->attType, leafDatum);
+							SpGistGetTypeSize(&state->attLeafType, leafDatum);
 					}
 
 					/*
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 7965b58..c64a174 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -40,7 +40,7 @@ typedef struct ScanStackEntry
 static void
 freeScanStackEntry(SpGistScanOpaque so, ScanStackEntry *stackEntry)
 {
-	if (!so->state.attType.attbyval &&
+	if (!so->state.attLeafType.attbyval &&
 		DatumGetPointer(stackEntry->reconstructedValue) != NULL)
 		pfree(DatumGetPointer(stackEntry->reconstructedValue));
 	if (stackEntry->traversalValue)
@@ -527,8 +527,8 @@ redirect:
 					if (out.reconstructedValues)
 						newEntry->reconstructedValue =
 							datumCopy(out.reconstructedValues[i],
-									  so->state.attType.attbyval,
-									  so->state.attType.attlen);
+									  so->state.attLeafType.attbyval,
+									  so->state.attLeafType.attlen);
 					else
 						newEntry->reconstructedValue = (Datum) 0;
 
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index bd5301f..e571f0c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -125,6 +125,22 @@ spgGetCache(Relation index)
 
 		/* Get the information we need about each relevant datatype */
 		fillTypeDesc(&cache->attType, atttype);
+
+		if (OidIsValid(cache->config.leafType) &&
+			cache->config.leafType != atttype)
+		{
+			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("compress method must be defined when leaf type is different from input type")));
+
+			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
+		}
+		else
+		{
+			cache->attLeafType = cache->attType;
+		}
+
 		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
 		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
 
@@ -164,6 +180,7 @@ initSpGistState(SpGistState *state, Relation index)
 
 	state->config = cache->config;
 	state->attType = cache->attType;
+	state->attLeafType = cache->attLeafType;
 	state->attPrefixType = cache->attPrefixType;
 	state->attLabelType = cache->attLabelType;
 
@@ -618,7 +635,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	/* compute space needed (note result is already maxaligned) */
 	size = SGLTHDRSZ;
 	if (!isnull)
-		size += SpGistGetTypeSize(&state->attType, datum);
+		size += SpGistGetTypeSize(&state->attLeafType, datum);
 
 	/*
 	 * Ensure that we can replace the tuple with a dead tuple later.  This
@@ -634,7 +651,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	tup->nextOffset = InvalidOffsetNumber;
 	tup->heapPtr = *heapPtr;
 	if (!isnull)
-		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
+		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
 
 	return tup;
 }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
index 157cf2a..410675a 100644
--- a/src/backend/access/spgist/spgvalidate.c
+++ b/src/backend/access/spgist/spgvalidate.c
@@ -22,6 +22,7 @@
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_type.h"
 #include "utils/builtins.h"
+#include "utils/lsyscache.h"
 #include "utils/regproc.h"
 #include "utils/syscache.h"
 
@@ -52,6 +53,10 @@ spgvalidate(Oid opclassoid)
 	OpFamilyOpFuncGroup *opclassgroup;
 	int			i;
 	ListCell   *lc;
+	spgConfigIn	configIn;
+	spgConfigOut configOut;
+	Oid			configOutLefttype = InvalidOid;
+	Oid			configOutRighttype = InvalidOid;
 
 	/* Fetch opclass information */
 	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
@@ -100,6 +105,15 @@ spgvalidate(Oid opclassoid)
 		switch (procform->amprocnum)
 		{
 			case SPGIST_CONFIG_PROC:
+				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+											2, 2, INTERNALOID, INTERNALOID);
+				configIn.attType = procform->amproclefttype;
+				OidFunctionCall2(procform->amproc,
+								 PointerGetDatum(&configIn),
+								 PointerGetDatum(&configOut));
+				configOutLefttype = procform->amproclefttype;
+				configOutRighttype = procform->amprocrighttype;
+				break;
 			case SPGIST_CHOOSE_PROC:
 			case SPGIST_PICKSPLIT_PROC:
 			case SPGIST_INNER_CONSISTENT_PROC:
@@ -110,6 +124,15 @@ spgvalidate(Oid opclassoid)
 				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
 											2, 2, INTERNALOID, INTERNALOID);
 				break;
+			case SPGIST_COMPRESS_PROC:
+				if (configOutLefttype != procform->amproclefttype ||
+					configOutRighttype != procform->amprocrighttype)
+					ok = false;
+				else
+					ok = check_amproc_signature(procform->amproc,
+												configOut.leafType, true,
+												1, 1, procform->amproclefttype);
+				break;
 			default:
 				ereport(INFO,
 						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -216,6 +239,39 @@ spgvalidate(Oid opclassoid)
 		{
 			if ((thisgroup->functionset & (((uint64) 1) << i)) != 0)
 				continue;		/* got it */
+
+			/*
+			 * Compress function is not required when the leaf and attribute
+			 * types are equal.
+			 */
+			if (i == SPGIST_COMPRESS_PROC)
+			{
+				Oid			configProc;
+
+				if (!(thisgroup->functionset &
+					  (((uint64) 1) << SPGIST_CONFIG_PROC)))
+					continue;	/* we are not able to get the leaf type */
+
+				configProc = get_opfamily_proc(opfamilyoid,
+											   thisgroup->lefttype,
+											   thisgroup->righttype,
+											   SPGIST_CONFIG_PROC);
+
+				if (!OidIsValid(configProc))
+					continue;	/* we are not able to get the leaf type */
+
+				configIn.attType = thisgroup->lefttype;
+				memset(&configOut, 0, sizeof(configOut));
+
+				OidFunctionCall2(configProc,
+								 PointerGetDatum(&configIn),
+								 PointerGetDatum(&configOut));
+
+				if (!OidIsValid(configOut.leafType) ||
+					configOut.leafType == configIn.attType)
+					continue;	/* compress function is not required */
+			}
+
 			ereport(INFO,
 					(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
 					 errmsg("operator family \"%s\" of access method %s is missing support function %d for type %s",
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
index d1bc396..06b1d88 100644
--- a/src/include/access/spgist.h
+++ b/src/include/access/spgist.h
@@ -30,7 +30,9 @@
 #define SPGIST_PICKSPLIT_PROC			3
 #define SPGIST_INNER_CONSISTENT_PROC	4
 #define SPGIST_LEAF_CONSISTENT_PROC		5
-#define SPGISTNProc						5
+#define SPGIST_COMPRESS_PROC			6
+#define SPGISTNRequiredProc				5
+#define SPGISTNProc						6
 
 /*
  * Argument structs for spg_config method
@@ -44,6 +46,7 @@ typedef struct spgConfigOut
 {
 	Oid			prefixType;		/* Data type of inner-tuple prefixes */
 	Oid			labelType;		/* Data type of inner-tuple node labels */
+	Oid			leafType;		/* Data type of leaf-tuple values */
 	bool		canReturnData;	/* Opclass can reconstruct original data */
 	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
 } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 1c4b321..e55de9d 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -119,7 +119,8 @@ typedef struct SpGistState
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -178,7 +179,8 @@ typedef struct SpGistCache
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -300,7 +302,7 @@ typedef SpGistLeafTupleData *SpGistLeafTuple;
 
 #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
 #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
-#define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
+#define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
 							 *(Datum *) SGLTDATAPTR(x) : \
 							 PointerGetDatum(SGLTDATAPTR(x)))
 
#32Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Nikita Glukhov (#31)

On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov <n.gluhov@postgrespro.ru>
wrote:

On 05.12.2017 23:59, Alexander Korotkov wrote:

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me@komzpa.net>
wrote:

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will allow
usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

I went through this patch. I found the documentation changes insufficient,
and I've made some improvements.

In particular, I didn't understand why reconstructedValue is claimed to be
of spgConfigOut.leafType while it should be of spgConfigIn.attType both
from general logic and code. I've fixed that. Nikita, correct me if I'm
wrong.

I think we are reconstructing a leaf datum, so documentation was correct
but the code in spgWalk() and freeScanStackEntry() wrongly used attType
instead of attLeafType. Fixed patch is attached.

The reconstructed datum is used for index-only scans. Thus, it should be
the original indexed datum of attType, unless we have a decompress method and
pass the reconstructed datum through it.

Also, I wonder should we check for existence of compress method when
attType and leafType are not the same in spgvalidate() function? We don't
do this for now.

I've added compress method existence check to spgvalidate().

It would be nice to avoid calling the config method twice. One option
would be to record the difference between attribute type and leaf type in
the high bit of functionset, which is guaranteed to be free.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#33Nikita Glukhov
n.gluhov@postgrespro.ru
In reply to: Alexander Korotkov (#32)
1 attachment(s)

On 06.12.2017 21:49, Alexander Korotkov wrote:

On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov
<n.gluhov@postgrespro.ru> wrote:

On 05.12.2017 23:59, Alexander Korotkov wrote:

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski
<me@komzpa.net> wrote:

The following review has been posted through the commitfest
application:
make installcheck-world:  not tested
Implements feature:       not tested
Spec compliant:           not tested
Documentation:            tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it
will allow usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

I went through this patch.  I found the documentation changes
insufficient, and I've made some improvements.

In particular, I didn't understand why is reconstructedValue
claimed to be of spgConfigOut.leafType while it should be of
spgConfigIn.attType both from general logic and code.  I've fixed
that.  Nikita, correct me if I'm wrong.

I think we are reconstructing a leaf datum, so documentation was
correct but the code in spgWalk() and freeScanStackEntry() wrongly
used attType instead of attLeafType. Fixed patch is attached.

Reconstructed datum is used for index-only scan.  Thus, it should be
original indexed datum of attType, unless we have decompress method
and pass reconstructed datum through it.

But if we have a compress method and do not have a decompress method, then
an index-only scan seems to be impossible.

Also, I wonder should we check for existence of compress method
when attType and leafType are not the same in spgvalidate()
function?  We don't do this for now.

I've added compress method existence check to spgvalidate().

It would be nice to evade double calling of config method.  Possible
option could be to memorize difference between attribute type and leaf
type in high bit of functionset, which is guaranteed to be free.

I decided to simply set the compress method's bit in functionset when this
method is not required.
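
A standalone sketch of that idea, for readers who don't want to dig through
the diff. This is not the patch code: Oid is a stand-in typedef, and
proc_bit(), compress_required(), note_config_result() and
first_missing_proc() are hypothetical helper names. Only the SPGIST_*
numbers are taken from spgist.h as changed by the patch. The assumption
modelled here is that config has already been called once and has reported
its leafType.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Support-function numbers as defined in spgist.h after the patch. */
#define SPGIST_CONFIG_PROC            1
#define SPGIST_CHOOSE_PROC            2
#define SPGIST_PICKSPLIT_PROC         3
#define SPGIST_INNER_CONSISTENT_PROC  4
#define SPGIST_LEAF_CONSISTENT_PROC   5
#define SPGIST_COMPRESS_PROC          6
#define SPGISTNProc                   6

/* Stand-ins for the PostgreSQL definitions (assumption, not the real headers). */
typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical helper: the functionset bit for one support function. */
static uint64_t
proc_bit(int procnum)
{
    assert(procnum >= 1 && procnum <= SPGISTNProc);
    return ((uint64_t) 1) << procnum;
}

/* compress is needed only if config reports a leaf type distinct from attType. */
static bool
compress_required(Oid attType, Oid leafType)
{
    return leafType != InvalidOid && leafType != attType;
}

/*
 * After the single config call: if compress is optional for this group,
 * pretend it is present so the later completeness check does not complain.
 */
static uint64_t
note_config_result(uint64_t functionset, Oid attType, Oid leafType)
{
    if (!compress_required(attType, leafType))
        functionset |= proc_bit(SPGIST_COMPRESS_PROC);
    return functionset;
}

/* Returns the first missing support function number, or 0 if none is missing. */
static int
first_missing_proc(uint64_t functionset)
{
    for (int i = 1; i <= SPGISTNProc; i++)
    {
        if ((functionset & proc_bit(i)) == 0)
            return i;
    }
    return 0;
}
```

With this, the group check stays a plain loop over 1..SPGISTNProc and config
is not called a second time there.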

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-spgist-compress-method-11.patch (text/x-patch)
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 139c8ed..b4a8be4 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -240,20 +240,22 @@
 
  <para>
   There are five user-defined methods that an index operator class for
-  <acronym>SP-GiST</acronym> must provide.  All five follow the convention
-  of accepting two <type>internal</type> arguments, the first of which is a
-  pointer to a C struct containing input values for the support method,
-  while the second argument is a pointer to a C struct where output values
-  must be placed.  Four of the methods just return <type>void</type>, since
-  all their results appear in the output struct; but
+  <acronym>SP-GiST</acronym> must provide, and one is optional.  All five
+  mandatory methods follow the convention of accepting two <type>internal</type>
+  arguments, the first of which is a pointer to a C struct containing input
+  values for the support method, while the second argument is a pointer to a
+  C struct where output values must be placed.  Four of the mandatory methods just
+  return <type>void</type>, since all their results appear in the output struct; but
   <function>leaf_consistent</function> additionally returns a <type>boolean</type> result.
   The methods must not modify any fields of their input structs.  In all
   cases, the output struct is initialized to zeroes before calling the
-  user-defined method.
+  user-defined method.  Optional sixth method <function>compress</function>
+  accepts datum to be indexed as the only argument and returns value suitable
+  for physical storage in leaf tuple.
  </para>
 
  <para>
-  The five user-defined methods are:
+  The five mandatory user-defined methods are:
  </para>
 
  <variablelist>
@@ -283,6 +285,7 @@ typedef struct spgConfigOut
 {
     Oid         prefixType;     /* Data type of inner-tuple prefixes */
     Oid         labelType;      /* Data type of inner-tuple node labels */
+    Oid         leafType;       /* Data type of leaf-tuple values */
     bool        canReturnData;  /* Opclass can reconstruct original data */
     bool        longValuesOK;   /* Opclass can cope with values &gt; 1 page */
 } spgConfigOut;
@@ -305,6 +308,22 @@ typedef struct spgConfigOut
       class is capable of segmenting long values by repeated suffixing
       (see <xref linkend="spgist-limits"/>).
      </para>
+
+     <para>
+      <structfield>leafType</structfield> is typically the same as
+      <structfield>attType</structfield>.  For the reasons of backward
+      compatibility, method <function>config</function> can
+      leave <structfield>leafType</structfield> uninitialized; that would
+      give the same effect as setting <structfield>leafType</structfield> equal
+      to <structfield>attType</structfield>.  When <structfield>attType</structfield>
+      and <structfield>leafType</structfield> are different, then optional
+      method <function>compress</function> must be provided.
+      Method <function>compress</function> is responsible
+      for transformation of datums to be indexed from <structfield>attType</structfield>
+      to <structfield>leafType</structfield>.
+      Note: both consistent functions will get <structfield>scankeys</structfield>
+      unchanged, without transformation using <function>compress</function>.
+     </para>
      </listitem>
     </varlistentry>
 
@@ -380,10 +399,16 @@ typedef struct spgChooseOut
 } spgChooseOut;
 </programlisting>
 
-       <structfield>datum</structfield> is the original datum that was to be inserted
-       into the index.
-       <structfield>leafDatum</structfield> is initially the same as
-       <structfield>datum</structfield>, but can change at lower levels of the tree
+       <structfield>datum</structfield> is the original datum of
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+       type that was to be inserted into the index.
+       <structfield>leafDatum</structfield> is a value of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       type which is initially a result of method
+       <function>compress</function> applied to <structfield>datum</structfield>
+       when method <function>compress</function> is provided, or same value as
+       <structfield>datum</structfield> otherwise.
+       <structfield>leafDatum</structfield> can change at lower levels of the tree
        if the <function>choose</function> or <function>picksplit</function>
        methods change it.  When the insertion search reaches a leaf page,
        the current value of <structfield>leafDatum</structfield> is what will be stored
@@ -418,7 +443,7 @@ typedef struct spgChooseOut
        Set <structfield>levelAdd</structfield> to the increment in
        <structfield>level</structfield> caused by descending through that node,
        or leave it as zero if the operator class does not use levels.
-       Set <structfield>restDatum</structfield> to equal <structfield>datum</structfield>
+       Set <structfield>restDatum</structfield> to equal <structfield>leafDatum</structfield>
        if the operator class does not modify datums from one level to the
        next, or otherwise set it to the modified value to be used as
        <structfield>leafDatum</structfield> at the next level.
@@ -509,7 +534,9 @@ typedef struct spgPickSplitOut
 </programlisting>
 
        <structfield>nTuples</structfield> is the number of leaf tuples provided.
-       <structfield>datums</structfield> is an array of their datum values.
+       <structfield>datums</structfield> is an array of their datum values of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       type.
        <structfield>level</structfield> is the current level that all the leaf tuples
        share, which will become the level of the new inner tuple.
       </para>
@@ -624,7 +651,8 @@ typedef struct spgInnerConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> is always of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type.
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -659,6 +687,7 @@ typedef struct spgInnerConsistentOut
        necessarily so, so an array is used.)
        If value reconstruction is needed, set
        <structfield>reconstructedValues</structfield> to an array of the values
+       of <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type
        reconstructed for each child node to be visited; otherwise, leave
        <structfield>reconstructedValues</structfield> as NULL.
        If it is desired to pass down additional out-of-band information
@@ -730,7 +759,8 @@ typedef struct spgLeafConsistentOut
        <structfield>reconstructedValue</structfield> is the value reconstructed for the
        parent tuple; it is <literal>(Datum) 0</literal> at the root level or if the
        <function>inner_consistent</function> function did not provide a value at the
-       parent level.
+       parent level. <structfield>reconstructedValue</structfield> is always of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield> type. 
        <structfield>traversalValue</structfield> is a pointer to any traverse data
        passed down from the previous call of <function>inner_consistent</function>
        on the parent index tuple, or NULL at the root level.
@@ -739,16 +769,18 @@ typedef struct spgLeafConsistentOut
        <structfield>returnData</structfield> is <literal>true</literal> if reconstructed data is
        required for this query; this will only be so if the
        <function>config</function> function asserted <structfield>canReturnData</structfield>.
-       <structfield>leafDatum</structfield> is the key value stored in the current
-       leaf tuple.
+       <structfield>leafDatum</structfield> is the key value of
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       stored in the current leaf tuple.
       </para>
 
       <para>
        The function must return <literal>true</literal> if the leaf tuple matches the
        query, or <literal>false</literal> if not.  In the <literal>true</literal> case,
        if <structfield>returnData</structfield> is <literal>true</literal> then
-       <structfield>leafValue</structfield> must be set to the value originally supplied
-       to be indexed for this leaf tuple.  Also,
+       <structfield>leafValue</structfield> must be set to the value of
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield> type
+       originally supplied to be indexed for this leaf tuple.  Also,
        <structfield>recheck</structfield> may be set to <literal>true</literal> if the match
        is uncertain and so the operator(s) must be re-applied to the actual
        heap tuple to verify the match.
@@ -757,6 +789,26 @@ typedef struct spgLeafConsistentOut
     </varlistentry>
    </variablelist>
 
+ <para>
+  The optional user-defined method is:
+ </para>
+
+ <variablelist>
+    <varlistentry>
+     <term><function>Datum compress(Datum in)</function></term>
+     <listitem>
+      <para>
+       Converts the data item into a format suitable for physical storage in 
+       a leaf tuple of index page.  It accepts
+       <structname>spgConfigIn</structname>.<structfield>attType</structfield>
+       value and returns
+       <structname>spgConfigOut</structname>.<structfield>leafType</structfield>
+       value.  Output value should not be toasted.
+      </para>
+     </listitem>
+    </varlistentry>
+  </variablelist>
+
   <para>
    All the SP-GiST support methods are normally called in a short-lived
    memory context; that is, <varname>CurrentMemoryContext</varname> will be reset
diff --git a/src/backend/access/spgist/spgdoinsert.c b/src/backend/access/spgist/spgdoinsert.c
index a5f4c40..a8cb8c7 100644
--- a/src/backend/access/spgist/spgdoinsert.c
+++ b/src/backend/access/spgist/spgdoinsert.c
@@ -1906,14 +1906,37 @@ spgdoinsert(Relation index, SpGistState *state,
 		procinfo = index_getprocinfo(index, 1, SPGIST_CHOOSE_PROC);
 
 	/*
-	 * Since we don't use index_form_tuple in this AM, we have to make sure
+	 * Prepare the leaf datum to insert.
+	 *
+	 * If an optional "compress" method is provided, then call it to form
+	 * the leaf datum from the input datum.  Otherwise store the input datum as
+	 * is.  Since we don't use index_form_tuple in this AM, we have to make sure
 	 * value to be inserted is not toasted; FormIndexDatum doesn't guarantee
-	 * that.
+	 * that.  But we assume the "compress" method to return an untoasted value.
 	 */
-	if (!isnull && state->attType.attlen == -1)
-		datum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+	if (!isnull)
+	{
+		if (OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+		{
+			FmgrInfo   *compressProcinfo = NULL;
+
+			compressProcinfo = index_getprocinfo(index, 1, SPGIST_COMPRESS_PROC);
+			leafDatum = FunctionCall1Coll(compressProcinfo,
+										  index->rd_indcollation[0],
+										  datum);
+		}
+		else
+		{
+			Assert(state->attLeafType.type == state->attType.type);
 
-	leafDatum = datum;
+			if (state->attType.attlen == -1)
+				leafDatum = PointerGetDatum(PG_DETOAST_DATUM(datum));
+			else
+				leafDatum = datum;
+		}
+	}
+	else
+		leafDatum = (Datum) 0;
 
 	/*
 	 * Compute space needed for a leaf tuple containing the given datum.
@@ -1923,7 +1946,7 @@ spgdoinsert(Relation index, SpGistState *state,
 	 */
 	if (!isnull)
 		leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-			SpGistGetTypeSize(&state->attType, leafDatum);
+			SpGistGetTypeSize(&state->attLeafType, leafDatum);
 	else
 		leafSize = SGDTSIZE + sizeof(ItemIdData);
 
@@ -2138,7 +2161,7 @@ spgdoinsert(Relation index, SpGistState *state,
 					{
 						leafDatum = out.result.matchNode.restDatum;
 						leafSize = SGLTHDRSZ + sizeof(ItemIdData) +
-							SpGistGetTypeSize(&state->attType, leafDatum);
+							SpGistGetTypeSize(&state->attLeafType, leafDatum);
 					}
 
 					/*
diff --git a/src/backend/access/spgist/spgscan.c b/src/backend/access/spgist/spgscan.c
index 7965b58..c64a174 100644
--- a/src/backend/access/spgist/spgscan.c
+++ b/src/backend/access/spgist/spgscan.c
@@ -40,7 +40,7 @@ typedef struct ScanStackEntry
 static void
 freeScanStackEntry(SpGistScanOpaque so, ScanStackEntry *stackEntry)
 {
-	if (!so->state.attType.attbyval &&
+	if (!so->state.attLeafType.attbyval &&
 		DatumGetPointer(stackEntry->reconstructedValue) != NULL)
 		pfree(DatumGetPointer(stackEntry->reconstructedValue));
 	if (stackEntry->traversalValue)
@@ -527,8 +527,8 @@ redirect:
 					if (out.reconstructedValues)
 						newEntry->reconstructedValue =
 							datumCopy(out.reconstructedValues[i],
-									  so->state.attType.attbyval,
-									  so->state.attType.attlen);
+									  so->state.attLeafType.attbyval,
+									  so->state.attLeafType.attlen);
 					else
 						newEntry->reconstructedValue = (Datum) 0;
 
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index bd5301f..e571f0c 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -125,6 +125,22 @@ spgGetCache(Relation index)
 
 		/* Get the information we need about each relevant datatype */
 		fillTypeDesc(&cache->attType, atttype);
+
+		if (OidIsValid(cache->config.leafType) &&
+			cache->config.leafType != atttype)
+		{
+			if (!OidIsValid(index_getprocid(index, 1, SPGIST_COMPRESS_PROC)))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("compress method must be defined when leaf type is different from input type")));
+
+			fillTypeDesc(&cache->attLeafType, cache->config.leafType);
+		}
+		else
+		{
+			cache->attLeafType = cache->attType;
+		}
+
 		fillTypeDesc(&cache->attPrefixType, cache->config.prefixType);
 		fillTypeDesc(&cache->attLabelType, cache->config.labelType);
 
@@ -164,6 +180,7 @@ initSpGistState(SpGistState *state, Relation index)
 
 	state->config = cache->config;
 	state->attType = cache->attType;
+	state->attLeafType = cache->attLeafType;
 	state->attPrefixType = cache->attPrefixType;
 	state->attLabelType = cache->attLabelType;
 
@@ -618,7 +635,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	/* compute space needed (note result is already maxaligned) */
 	size = SGLTHDRSZ;
 	if (!isnull)
-		size += SpGistGetTypeSize(&state->attType, datum);
+		size += SpGistGetTypeSize(&state->attLeafType, datum);
 
 	/*
 	 * Ensure that we can replace the tuple with a dead tuple later.  This
@@ -634,7 +651,7 @@ spgFormLeafTuple(SpGistState *state, ItemPointer heapPtr,
 	tup->nextOffset = InvalidOffsetNumber;
 	tup->heapPtr = *heapPtr;
 	if (!isnull)
-		memcpyDatum(SGLTDATAPTR(tup), &state->attType, datum);
+		memcpyDatum(SGLTDATAPTR(tup), &state->attLeafType, datum);
 
 	return tup;
 }
diff --git a/src/backend/access/spgist/spgvalidate.c b/src/backend/access/spgist/spgvalidate.c
index 157cf2a..440b3ce 100644
--- a/src/backend/access/spgist/spgvalidate.c
+++ b/src/backend/access/spgist/spgvalidate.c
@@ -22,6 +22,7 @@
 #include "catalog/pg_opfamily.h"
 #include "catalog/pg_type.h"
 #include "utils/builtins.h"
+#include "utils/lsyscache.h"
 #include "utils/regproc.h"
 #include "utils/syscache.h"
 
@@ -52,6 +53,10 @@ spgvalidate(Oid opclassoid)
 	OpFamilyOpFuncGroup *opclassgroup;
 	int			i;
 	ListCell   *lc;
+	spgConfigIn	configIn;
+	spgConfigOut configOut;
+	Oid			configOutLefttype = InvalidOid;
+	Oid			configOutRighttype = InvalidOid;
 
 	/* Fetch opclass information */
 	classtup = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclassoid));
@@ -74,6 +79,7 @@ spgvalidate(Oid opclassoid)
 	/* Fetch all operators and support functions of the opfamily */
 	oprlist = SearchSysCacheList1(AMOPSTRATEGY, ObjectIdGetDatum(opfamilyoid));
 	proclist = SearchSysCacheList1(AMPROCNUM, ObjectIdGetDatum(opfamilyoid));
+	grouplist = identify_opfamily_groups(oprlist, proclist);
 
 	/* Check individual support functions */
 	for (i = 0; i < proclist->n_members; i++)
@@ -100,6 +106,40 @@ spgvalidate(Oid opclassoid)
 		switch (procform->amprocnum)
 		{
 			case SPGIST_CONFIG_PROC:
+				ok = check_amproc_signature(procform->amproc, VOIDOID, true,
+											2, 2, INTERNALOID, INTERNALOID);
+				configIn.attType = procform->amproclefttype;
+				memset(&configOut, 0, sizeof(configOut));
+
+				OidFunctionCall2(procform->amproc,
+								 PointerGetDatum(&configIn),
+								 PointerGetDatum(&configOut));
+
+				configOutLefttype = procform->amproclefttype;
+				configOutRighttype = procform->amprocrighttype;
+
+				/*
+				 * When leaf and attribute types are the same, compress function
+				 * is not required and we set corresponding bit in functionset
+				 * for later group consistency check.
+				 */
+				if (!OidIsValid(configOut.leafType) ||
+					configOut.leafType == configIn.attType)
+				{
+					foreach(lc, grouplist)
+					{
+						OpFamilyOpFuncGroup *group = lfirst(lc);
+
+						if (group->lefttype == procform->amproclefttype &&
+							group->righttype == procform->amprocrighttype)
+						{
+							group->functionset |=
+								((uint64) 1) << SPGIST_COMPRESS_PROC;
+							break;
+						}
+					}
+				}
+				break;
 			case SPGIST_CHOOSE_PROC:
 			case SPGIST_PICKSPLIT_PROC:
 			case SPGIST_INNER_CONSISTENT_PROC:
@@ -110,6 +150,15 @@ spgvalidate(Oid opclassoid)
 				ok = check_amproc_signature(procform->amproc, BOOLOID, true,
 											2, 2, INTERNALOID, INTERNALOID);
 				break;
+			case SPGIST_COMPRESS_PROC:
+				if (configOutLefttype != procform->amproclefttype ||
+					configOutRighttype != procform->amprocrighttype)
+					ok = false;
+				else
+					ok = check_amproc_signature(procform->amproc,
+												configOut.leafType, true,
+												1, 1, procform->amproclefttype);
+				break;
 			default:
 				ereport(INFO,
 						(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
@@ -178,7 +227,6 @@ spgvalidate(Oid opclassoid)
 	}
 
 	/* Now check for inconsistent groups of operators/functions */
-	grouplist = identify_opfamily_groups(oprlist, proclist);
 	opclassgroup = NULL;
 	foreach(lc, grouplist)
 	{
diff --git a/src/include/access/spgist.h b/src/include/access/spgist.h
index d1bc396..06b1d88 100644
--- a/src/include/access/spgist.h
+++ b/src/include/access/spgist.h
@@ -30,7 +30,9 @@
 #define SPGIST_PICKSPLIT_PROC			3
 #define SPGIST_INNER_CONSISTENT_PROC	4
 #define SPGIST_LEAF_CONSISTENT_PROC		5
-#define SPGISTNProc						5
+#define SPGIST_COMPRESS_PROC			6
+#define SPGISTNRequiredProc				5
+#define SPGISTNProc						6
 
 /*
  * Argument structs for spg_config method
@@ -44,6 +46,7 @@ typedef struct spgConfigOut
 {
 	Oid			prefixType;		/* Data type of inner-tuple prefixes */
 	Oid			labelType;		/* Data type of inner-tuple node labels */
+	Oid			leafType;		/* Data type of leaf-tuple values */
 	bool		canReturnData;	/* Opclass can reconstruct original data */
 	bool		longValuesOK;	/* Opclass can cope with values > 1 page */
 } spgConfigOut;
diff --git a/src/include/access/spgist_private.h b/src/include/access/spgist_private.h
index 1c4b321..e55de9d 100644
--- a/src/include/access/spgist_private.h
+++ b/src/include/access/spgist_private.h
@@ -119,7 +119,8 @@ typedef struct SpGistState
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -178,7 +179,8 @@ typedef struct SpGistCache
 {
 	spgConfigOut config;		/* filled in by opclass config method */
 
-	SpGistTypeDesc attType;		/* type of input data and leaf values */
+	SpGistTypeDesc attType;		/* type of values to be indexed/restored */
+	SpGistTypeDesc attLeafType;		/* type of leaf-tuple values */
 	SpGistTypeDesc attPrefixType;	/* type of inner-tuple prefix values */
 	SpGistTypeDesc attLabelType;	/* type of node label values */
 
@@ -300,7 +302,7 @@ typedef SpGistLeafTupleData *SpGistLeafTuple;
 
 #define SGLTHDRSZ			MAXALIGN(sizeof(SpGistLeafTupleData))
 #define SGLTDATAPTR(x)		(((char *) (x)) + SGLTHDRSZ)
-#define SGLTDATUM(x, s)		((s)->attType.attbyval ? \
+#define SGLTDATUM(x, s)		((s)->attLeafType.attbyval ? \
 							 *(Datum *) SGLTDATAPTR(x) : \
 							 PointerGetDatum(SGLTDATAPTR(x)))
 
#34Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Nikita Glukhov (#33)

On Thu, Dec 7, 2017 at 3:17 AM, Nikita Glukhov <n.gluhov@postgrespro.ru>
wrote:

On 06.12.2017 21:49, Alexander Korotkov wrote:

On Wed, Dec 6, 2017 at 6:08 PM, Nikita Glukhov <n.gluhov@postgrespro.ru>
wrote:

On 05.12.2017 23:59, Alexander Korotkov wrote:

On Tue, Dec 5, 2017 at 1:14 PM, Darafei Praliaskouski <me@komzpa.net>
wrote:

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: tested, passed

I've read the updated patch and see my concerns addressed.

I'm looking forward to SP-GiST compress method support, as it will allow
usage of SP-GiST index infrastructure for PostGIS.

The new status of this patch is: Ready for Committer

I went through this patch. I found the documentation changes insufficient,
and I've made some improvements.

In particular, I didn't understand why is reconstructedValue claimed to
be of spgConfigOut.leafType while it should be of spgConfigIn.attType both
from general logic and code. I've fixed that. Nikita, correct me if I'm
wrong.

I think we are reconstructing a leaf datum, so documentation was correct
but the code in spgWalk() and freeScanStackEntry() wrongly used attType
instead of attLeafType. Fixed patch is attached.

Reconstructed datum is used for index-only scan. Thus, it should be
original indexed datum of attType, unless we have decompress method and
pass reconstructed datum through it.

But if we have compress method and do not have decompress method then
index-only scan seems to be impossible.

Sorry, I didn't realize that we don't return the reconstructed value
immediately, but return only the leafValue provided by leafConsistent. In this
case, leafConsistent is responsible for converting the value from
spgConfigOut.leafType to spgConfigIn.attType.

TBH, a practical example of an SP-GiST opclass with both a compress method and
index-only scan support doesn't come to my mind, because a compress method is
typically needed when we have a lossy representation of index keys. For
example, in GiST all the opclasses where the compress method does useful work use
a lossy representation of keys. Nevertheless, it's good not to cut off the
possibility of index-only scans when spgConfigOut.leafType !=
spgConfigIn.attType. And laying the responsibility for the conversion on
leafConsistent seems like an elegant way to do this. So, that's OK for me.
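
To make that division of labour concrete, here is a toy, self-contained
sketch. Everything in it is hypothetical (ToyPoint, ToyLeaf, toy_compress(),
toy_decompress(), toy_leaf_consistent() and the Toy* structs are not
PostgreSQL APIs), and it assumes a lossless compress, which as noted above
is not the typical case. It only illustrates leafConsistent receiving the
scan key in attType form, the stored key in leafType form, and handing back
leafValue in attType form when returnData is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy attType: a point with 16-bit coordinates (hypothetical). */
typedef struct ToyPoint
{
    uint16_t x;
    uint16_t y;
} ToyPoint;

/* Toy leafType: both coordinates packed into one 32-bit word. */
typedef uint32_t ToyLeaf;

/* Plays the role of compress: attType -> leafType (lossless here). */
static ToyLeaf
toy_compress(ToyPoint in)
{
    return ((uint32_t) in.x << 16) | in.y;
}

/* Inverse conversion, used only inside the leaf-consistent function. */
static ToyPoint
toy_decompress(ToyLeaf leaf)
{
    ToyPoint p;

    p.x = (uint16_t) (leaf >> 16);
    p.y = (uint16_t) (leaf & 0xFFFF);
    return p;
}

/* Cut-down analogues of spgLeafConsistentIn/Out (hypothetical layout). */
typedef struct ToyLeafConsistentIn
{
    ToyPoint query;      /* scan key, NOT passed through compress */
    bool     returnData; /* true for an index-only scan */
    ToyLeaf  leafDatum;  /* stored key, of leafType */
} ToyLeafConsistentIn;

typedef struct ToyLeafConsistentOut
{
    ToyPoint leafValue;  /* must be of attType */
    bool     recheck;
} ToyLeafConsistentOut;

/* Equality search: compare in the attType domain, return attType data. */
static bool
toy_leaf_consistent(const ToyLeafConsistentIn *in, ToyLeafConsistentOut *out)
{
    ToyPoint stored = toy_decompress(in->leafDatum);
    bool     match;

    /* Sanity check that the toy conversion round-trips. */
    assert(toy_compress(stored) == in->leafDatum);

    match = (stored.x == in->query.x && stored.y == in->query.y);
    out->recheck = false;
    if (match && in->returnData)
        out->leafValue = stored;
    return match;
}
```

If compress were lossy, toy_decompress() could not exist, and the opclass
would simply leave canReturnData false.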

Also, I wonder should we check for existence of compress method when
attType and leafType are not the same in spgvalidate() function? We don't
do this for now.

I've added compress method existence check to spgvalidate().

It would be nice to avoid calling the config method twice. A possible
option could be to record the difference between attribute type and leaf type
in the high bit of functionset, which is guaranteed to be free.

I decided to simply set the compress method's bit in functionset when this
method is not required.
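
The approach described above can be sketched as follows. This is a toy model in plain C, not the actual spgvalidate() code: the Toy* names and the function signature are hypothetical, and the support-function numbers 1 to 6 are an assumption of the sketch. The idea is that each support function found for the opclass sets one bit in functionset; when the leaf type equals the attribute type, the compress bit is set artificially, so the generic "is every required bit present?" loop needs no special case and the config result does not have to be consulted a second time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Support-function numbers; the values are an assumption of this sketch. */
enum
{
    TOY_CONFIG_PROC = 1,
    TOY_CHOOSE_PROC,
    TOY_PICKSPLIT_PROC,
    TOY_INNER_CONSISTENT_PROC,
    TOY_LEAF_CONSISTENT_PROC,
    TOY_COMPRESS_PROC,
    TOY_NPROC = TOY_COMPRESS_PROC
};

typedef uint32_t ToyOid;

/* Bit that represents one support function inside functionset. */
static uint64_t
toy_proc_bit(int procnum)
{
    assert(procnum >= 1 && procnum < 64);
    return UINT64_C(1) << procnum;
}

/*
 * functionset: bits of the support functions the opclass defines.
 * att_type / leaf_type: what the config method reported, remembered
 * from the single pass over the support functions.
 */
static bool
toy_validate(uint64_t functionset, ToyOid att_type, ToyOid leaf_type)
{
    /* Compress is not required when the types match: pretend it exists. */
    if (leaf_type == att_type)
        functionset |= toy_proc_bit(TOY_COMPRESS_PROC);

    /* Generic check: every support function bit must now be set. */
    for (int i = 1; i <= TOY_NPROC; i++)
    {
        if ((functionset & toy_proc_bit(i)) == 0)
            return false;
    }
    return true;
}
```

An opclass whose leaf type differs from its attribute type and that lacks a compress method fails this check, which matches the existence check added to spgvalidate() earlier in the thread.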

Looks good to me.

Now, this patch is ready for committer from my point of view.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#35Teodor Sigaev
teodor@sigaev.ru
In reply to: Alexander Korotkov (#34)

Now, this patch is ready for committer from my point of view.

Thank you, pushed

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

#36Noname
ilmari@ilmari.org
In reply to: Teodor Sigaev (#35)
1 attachment(s)

Teodor Sigaev <teodor@sigaev.ru> writes:

Now, this patch is ready for committer from my point of view.

Thank you, pushed

This patch added two copies of the poly_ops row to the "Built-in SP-GiST
Operator Classes" table in spgist.sgml. The attached patch removes
one of them.

- ilmari
--
- Twitter seems more influential [than blogs] in the 'gets reported in
the mainstream press' sense at least. - Matt McLeod
- That'd be because the content of a tweet is easier to condense down
to a mainstream media article. - Calle Dybedahl

Attachments:

0001-Remove-duplicate-poly_ops-row-from-SP-GiST-opclass-t.patch (text/x-diff)
From 481afc4476f6eb3ec357ed795ce382ff1cb432fa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= <ilmari@ilmari.org>
Date: Wed, 3 Jan 2018 22:04:09 +0000
Subject: [PATCH] Remove duplicate poly_ops row from SP-GiST opclass table

Commit ff963b393c added two identical copies of this row.
---
 doc/src/sgml/spgist.sgml | 18 ------------------
 1 file changed, 18 deletions(-)

diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index 51bb60c92a..e47f70be89 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -148,24 +148,6 @@
        <literal>|&amp;&gt;</literal>
       </entry>
      </row>
-     <row>
-      <entry><literal>poly_ops</literal></entry>
-      <entry><type>polygon</type></entry>
-      <entry>
-       <literal>&lt;&lt;</literal>
-       <literal>&amp;&lt;</literal>
-       <literal>&amp;&amp;</literal>
-       <literal>&amp;&gt;</literal>
-       <literal>&gt;&gt;</literal>
-       <literal>~=</literal>
-       <literal>@&gt;</literal>
-       <literal>&lt;@</literal>
-       <literal>&amp;&lt;|</literal>
-       <literal>&lt;&lt;|</literal>
-       <literal>|&gt;&gt;</literal>
-       <literal>|&amp;&gt;</literal>
-      </entry>
-     </row>
      <row>
       <entry><literal>text_ops</literal></entry>
       <entry><type>text</type></entry>
-- 
2.15.1

#37Alexander Korotkov
a.korotkov@postgrespro.ru
In reply to: Noname (#36)

On Thu, Jan 4, 2018 at 1:17 AM, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
wrote:

Teodor Sigaev <teodor@sigaev.ru> writes:

Now, this patch is ready for committer from my point of view.

Thank you, pushed

This patch added two copies of the poly_ops row to the "Built-in SP-GiST
Operator Classes" table in spgist.sgml.

Right.

The attached patch removes
one of them.

Thanks for fixing this! I'm sure that Teodor will push this after the end of
the New Year holidays in Russia.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#38Daniel Gustafsson
daniel@yesql.se
In reply to: Noname (#36)

On 04 Jan 2018, at 06:17, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> wrote:

Teodor Sigaev <teodor@sigaev.ru> writes:

Now, this patch is ready for committer from my point of view.

Thank you, pushed

This patch added two copies of the poly_ops row to the "Built-in SP-GiST
Operator Classes" table in spgist.sgml. The attached patch removes
one of them.

Patch looks good, marked as Ready for Committer in the CF app.

cheers ./daniel

#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#37)

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

On Thu, Jan 4, 2018 at 1:17 AM, Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
wrote:

This patch added two copies of the poly_ops row to the "Built-in SP-GiST
Operator Classes" table in spgist.sgml.

Thanks for fixing this! I'm sure that Teodor will push this after the end of
the New Year holidays in Russia.

He didn't ... so I did.

regards, tom lane