StrategyAM for IndexAMs

Started by Simon Riggsover 3 years ago3 messages

simon.riggs@enterprisedb.com

over 3 years ago

1 attachment(s)

I'm preparing the way for a later patch that would allow unique hash
indexes to be primary keys. There are various parts to the problem. I
was surprised at how many times we hardcode BTREE_AM_OID and
associated BT Strategy Numbers in many parts of the code, so have been
looking for ways to reconcile how Hash and Btree strategies work and
potentially remove hardcoding. There are various comments that say we
need a way to be able to define which Strategy Numbers are used by
indexams.

I came up with a rather simple way: the indexam just says "I'm like a
btree", which allows you to avoid adding hundreds of operator classes
for the new index, since that is cumbersome and insecure.

Specifically, we add a "strategyam" field to the IndexAmRoutine that
allows an indexam to declare whether it is like a btree, like a hash
index or another am. This then allows us to KEEP the hardcoded
BTREE_AM_OID tests, but point them at the strategyam rather than the
relam, which can be cached in various places as needed. No catalog
changes needed.

I've coded this up and it works fine.

The attached patch is still incomplete because we use this in a few
places and they all need to be checked. So before I do that, it seems
sensible to agree the approach.

(Obviously, there are hundreds of places where BTEqualStrategyNumber
is hardcoded, and this doesn't change that at all, in case that wasn't
clear).

Comments welcome on this still WIP patch.

--
Simon Riggs http://www.EnterpriseDB.com/

Attachments:

strategyam.v1.patchapplication/octet-stream; name=strategyam.v1.patchDownload

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index e88f7efa7e..d6be1955b7 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -91,6 +91,7 @@ brinhandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = BRIN_AM_OID;
 	amroutine->amstrategies = 0;
 	amroutine->amsupport = BRIN_LAST_OPTIONAL_PROCNUM;
 	amroutine->amoptsprocnum = BRIN_PROCNUM_OPTIONS;
diff --git a/src/backend/access/gin/ginutil.c b/src/backend/access/gin/ginutil.c
index 6df7f2eaeb..7f35109c76 100644
--- a/src/backend/access/gin/ginutil.c
+++ b/src/backend/access/gin/ginutil.c
@@ -39,6 +39,7 @@ ginhandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = GIN_AM_OID;
 	amroutine->amstrategies = 0;
 	amroutine->amsupport = GINNProcs;
 	amroutine->amoptsprocnum = GIN_OPTIONS_PROC;
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 5866c6aaaf..8f0144375b 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -17,6 +17,7 @@
 #include "access/gist_private.h"
 #include "access/gistscan.h"
 #include "access/xloginsert.h"
+#include "catalog/pg_am_d.h"
 #include "catalog/pg_collation.h"
 #include "commands/vacuum.h"
 #include "miscadmin.h"
@@ -61,6 +62,7 @@ gisthandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = GIST_AM_OID;
 	amroutine->amstrategies = 0;
 	amroutine->amsupport = GISTNProcs;
 	amroutine->amoptsprocnum = GIST_OPTIONS_PROC;
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index c361509d68..2de916dde8 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -58,6 +58,7 @@ hashhandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = HASH_AM_OID;
 	amroutine->amstrategies = HTMaxStrategyNumber;
 	amroutine->amsupport = HASHNProcs;
 	amroutine->amoptsprocnum = HASHOPTIONS_PROC;
diff --git a/src/backend/access/index/amapi.c b/src/backend/access/index/amapi.c
index 2b028e1950..0c4a10b8b7 100644
--- a/src/backend/access/index/amapi.c
+++ b/src/backend/access/index/amapi.c
@@ -141,3 +141,11 @@ amvalidate(PG_FUNCTION_ARGS)
 
 	PG_RETURN_BOOL(result);
 }
+
+Oid
+get_amstrategy_am(Oid indexam_oid)
+{
+	IndexAmRoutine *indexAm = GetIndexAmRoutineByAmId(indexam_oid, false);
+
+	return indexAm->amstrategy_am;
+}
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index b52eca8f38..be6b6dc9c3 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -97,6 +97,7 @@ bthandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = BTREE_AM_OID;
 	amroutine->amstrategies = BTMaxStrategyNumber;
 	amroutine->amsupport = BTNProcs;
 	amroutine->amoptsprocnum = BTOPTIONS_PROC;
diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 2c661fcf96..0b49099076 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -45,6 +45,7 @@ spghandler(PG_FUNCTION_ARGS)
 {
 	IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);
 
+	amroutine->amstrategy_am = SPGIST_AM_OID;
 	amroutine->amstrategies = 0;
 	amroutine->amsupport = SPGISTNProc;
 	amroutine->amoptsprocnum = SPGIST_OPTIONS_PROC;
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index dc35b02910..cb74a9cd01 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -937,7 +937,7 @@ copy_table_data(Oid OIDNewHeap, Oid OIDOldHeap, Oid OIDOldIndex, bool verbose,
 	 * tells us it's cheaper.  Otherwise, always indexscan if an index is
 	 * provided, else plain seqscan.
 	 */
-	if (OldIndex != NULL && OldIndex->rd_rel->relam == BTREE_AM_OID)
+	if (OldIndex != NULL && get_amstrategy_am(OldIndex->rd_rel->relam) == BTREE_AM_OID)
 		use_sort = plan_cluster_use_sort(OIDOldHeap, OIDOldIndex);
 	else
 		use_sort = false;
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 9ac0383459..b2dd5aad36 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -14,7 +14,7 @@
  */
 #include "postgres.h"
 
-#include "access/genam.h"
+#include "access/amapi.h"
 #include "access/heapam.h"
 #include "access/htup_details.h"
 #include "access/multixact.h"
@@ -869,7 +869,7 @@ is_usable_unique_index(Relation indexRel)
 
 	/*
 	 * Must be unique, valid, immediate, non-partial, and be defined over
-	 * plain user columns (not expressions).  We also require it to be a
+	 * plain user columns (not expressions).  We also require it to be like a
 	 * btree.  Even if we had any other unique index kinds, we'd not know how
 	 * to identify the corresponding equality operator, nor could we be sure
 	 * that the planner could implement the required FULL JOIN with non-btree
@@ -877,7 +877,7 @@ is_usable_unique_index(Relation indexRel)
 	 */
 	if (indexStruct->indisunique &&
 		indexStruct->indimmediate &&
-		indexRel->rd_rel->relam == BTREE_AM_OID &&
+		get_amstrategy_am(indexRel->rd_rel->relam) == BTREE_AM_OID &&
 		indexStruct->indisvalid &&
 		RelationGetIndexPredicate(indexRel) == NIL &&
 		indexStruct->indnatts > 0)
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index 7d176e7b00..9276cd039d 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -2761,8 +2761,8 @@ match_rowcompare_to_indexcol(PlannerInfo *root,
 	Oid			expr_op;
 	Oid			expr_coll;
 
-	/* Forget it if we're not dealing with a btree index */
-	if (index->relam != BTREE_AM_OID)
+	/* Forget it if we're not dealing with an index like a btree */
+	if (index->strategyam != BTREE_AM_OID)
 		return NULL;
 
 	index_relid = index->rel->relid;
@@ -3459,7 +3459,7 @@ ec_member_matches_indexcol(PlannerInfo *root, RelOptInfo *rel,
 	 * generate_implied_equalities_for_column; see
 	 * match_eclass_clauses_to_index.
 	 */
-	if (index->relam == BTREE_AM_OID &&
+	if (index->strategyam == BTREE_AM_OID &&
 		!list_member_oid(ec->ec_opfamilies, curFamily))
 		return false;
 
diff --git a/src/backend/optimizer/util/plancat.c b/src/backend/optimizer/util/plancat.c
index 5012bfe142..b6b350bd48 100644
--- a/src/backend/optimizer/util/plancat.c
+++ b/src/backend/optimizer/util/plancat.c
@@ -268,6 +268,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 
 			/* We copy just the fields we need, not all of rd_indam */
 			amroutine = indexRelation->rd_indam;
+			info->strategyam = amroutine->amstrategy_am;
 			info->amcanorderbyop = amroutine->amcanorderbyop;
 			info->amoptionalkey = amroutine->amoptionalkey;
 			info->amsearcharray = amroutine->amsearcharray;
@@ -287,10 +288,10 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 			/*
 			 * Fetch the ordering information for the index, if any.
 			 */
-			if (info->relam == BTREE_AM_OID)
+			if (info->strategyam == BTREE_AM_OID)
 			{
 				/*
-				 * If it's a btree index, we can use its opfamily OIDs
+				 * If it's like a btree index, we can use its opfamily OIDs
 				 * directly as the sort ordering opfamily OIDs.
 				 */
 				Assert(amroutine->amcanorder);
@@ -317,11 +318,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 				 *
 				 * XXX This method is rather slow and also requires the
 				 * undesirable assumption that the other index AM numbers its
-				 * strategies the same as btree.  It'd be better to have a way
-				 * to explicitly declare the corresponding btree opfamily for
-				 * each opfamily of the other index type.  But given the lack
-				 * of current or foreseeable amcanorder index types, it's not
-				 * worth expending more effort on now.
+				 * strategies the same as btree.
 				 */
 				info->sortopfamily = (Oid *) palloc(sizeof(Oid) * nkeycolumns);
 				info->reverse_sort = (bool *) palloc(sizeof(bool) * nkeycolumns);
@@ -414,7 +411,7 @@ get_relation_info(PlannerInfo *root, Oid relationObjectId, bool inhparent,
 					info->tuples = rel->tuples;
 			}
 
-			if (info->relam == BTREE_AM_OID)
+			if (info->strategyam == BTREE_AM_OID)
 			{
 				/* For btrees, get tree height while we have the index open */
 				info->tree_height = _bt_getrootheight(indexRelation);
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index fa1f589fad..2ddb5d5e15 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -5998,8 +5998,8 @@ get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
 		IndexOptInfo *index = (IndexOptInfo *) lfirst(lc);
 		ScanDirection indexscandir;
 
-		/* Ignore non-btree indexes */
-		if (index->relam != BTREE_AM_OID)
+		/* Ignore indexes that don't behave like btrees */
+		if (index->strategyam != BTREE_AM_OID)
 			continue;
 
 		/*
diff --git a/src/include/access/amapi.h b/src/include/access/amapi.h
index 1dc674d230..e798e2aa66 100644
--- a/src/include/access/amapi.h
+++ b/src/include/access/amapi.h
@@ -207,6 +207,14 @@ typedef struct IndexAmRoutine
 {
 	NodeTag		type;
 
+	/*
+	 * The Oid of the AM that defines the strategies. Typically, this would
+	 * be BTREE_AM_OID for any AM that chooses to use BT StrategyNumbers, etc..
+	 * Or put another way, is it like a btree or a hash index etc..
+	 * This is NOT the Oid of the AM itself.
+	 */
+	Oid			amstrategy_am;
+
 	/*
 	 * Total number of strategies (operators) by which we can traverse/search
 	 * this AM.  Zero if AM does not have a fixed set of strategy assignments.
@@ -286,5 +294,6 @@ typedef struct IndexAmRoutine
 /* Functions in access/index/amapi.c */
 extern IndexAmRoutine *GetIndexAmRoutine(Oid amhandler);
 extern IndexAmRoutine *GetIndexAmRoutineByAmId(Oid amoid, bool noerror);
+extern Oid get_amstrategy_am(Oid indexam_oid);
 
 #endif							/* AMAPI_H */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 69ba254372..a257841ca6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1068,6 +1068,8 @@ struct IndexOptInfo
 	bool	   *canreturn pg_node_attr(read_write_ignore);
 	/* OID of the access method (in pg_am) */
 	Oid			relam;
+	/* OID of the access method that defines strategies */
+	Oid			strategyam;
 
 	/*
 	 * expressions for non-simple index columns; redundant to print since we

Matthias van de Meent

boekewurm+postgres@gmail.com

over 3 years ago

In reply to: Simon Riggs (#1)

Re: StrategyAM for IndexAMs

On Tue, 19 Jul 2022 at 18:56, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

I'm preparing the way for a later patch that would allow unique hash
indexes to be primary keys. There are various parts to the problem. I
was surprised at how many times we hardcode BTREE_AM_OID and
associated BT Strategy Numbers in many parts of the code, so have been
looking for ways to reconcile how Hash and Btree strategies work and
potentially remove hardcoding. There are various comments that say we
need a way to be able to define which Strategy Numbers are used by
indexams.

I came up with a rather simple way: the indexam just says "I'm like a
btree", which allows you to avoid adding hundreds of operator classes
for the new index, since that is cumbersome and insecure.

I'm fairly certain that you can't (and don't want to) make a hash
index look like a btree index, considering that of the btree
operations only equality checks make sense in the hash context, and
that you can't do ordered retrieval (incl. no min/max), which are
major features of btree.

With that in mind, could you tell whether this patch is related to the
effort of hash-based unique primary keys (apart from inspiration
during development), and if so, how?

Specifically, we add a "strategyam" field to the IndexAmRoutine that
allows an indexam to declare whether it is like a btree, like a hash
index or another am. This then allows us to KEEP the hardcoded
BTREE_AM_OID tests, but point them at the strategyam rather than the
relam, which can be cached in various places as needed. No catalog
changes needed.

I've coded this up and it works fine.

The attached patch is still incomplete because we use this in a few
places and they all need to be checked. So before I do that, it seems
sensible to agree the approach.

(Obviously, there are hundreds of places where BTEqualStrategyNumber
is hardcoded, and this doesn't change that at all, in case that wasn't
clear).

Comments welcome on this still WIP patch.

I think this is a great step in the right direction, fixing one of the
issues with core index AMs, issues I also complained about earlier
[0]: /messages/by-id/CAEze2Wg8QhpOnHoqPNB-AaexGX4Zaij=4TT0kaMhF_6T5FXxmQ@mail.gmail.com

Thanks,

Matthias van de Meent

[0]: /messages/by-id/CAEze2Wg8QhpOnHoqPNB-AaexGX4Zaij=4TT0kaMhF_6T5FXxmQ@mail.gmail.com

Simon Riggs

simon.riggs@enterprisedb.com

over 3 years ago

In reply to: Matthias van de Meent (#2)

Re: StrategyAM for IndexAMs

On Fri, 22 Jul 2022 at 10:23, Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Tue, 19 Jul 2022 at 18:56, Simon Riggs <simon.riggs@enterprisedb.com> wrote:

I'm preparing the way for a later patch that would allow unique hash
indexes to be primary keys. There are various parts to the problem. I
was surprised at how many times we hardcode BTREE_AM_OID and
associated BT Strategy Numbers in many parts of the code, so have been
looking for ways to reconcile how Hash and Btree strategies work and
potentially remove hardcoding. There are various comments that say we
need a way to be able to define which Strategy Numbers are used by
indexams.

I came up with a rather simple way: the indexam just says "I'm like a
btree", which allows you to avoid adding hundreds of operator classes
for the new index, since that is cumbersome and insecure.

I'm fairly certain that you can't (and don't want to) make a hash
index look like a btree index, considering that of the btree
operations only equality checks make sense in the hash context, and
that you can't do ordered retrieval (incl. no min/max), which are
major features of btree.

"like a $INDEX_TYPE" is wrong. What I really mean is "use the operator
strategy numbering same as $INDEX_TYPE".

With that in mind, could you tell whether this patch is related to the
effort of hash-based unique primary keys (apart from inspiration
during development), and if so, how?

There are lots of places that are hardcoded BTREE_AM_OID, with a mix
of purposes. It's hard to tackle one without getting drawn in to fix
the others.

Specifically, we add a "strategyam" field to the IndexAmRoutine that
allows an indexam to declare whether it is like a btree, like a hash
index or another am. This then allows us to KEEP the hardcoded
BTREE_AM_OID tests, but point them at the strategyam rather than the
relam, which can be cached in various places as needed. No catalog
changes needed.

I've coded this up and it works fine.

The attached patch is still incomplete because we use this in a few
places and they all need to be checked. So before I do that, it seems
sensible to agree the approach.

(Obviously, there are hundreds of places where BTEqualStrategyNumber
is hardcoded, and this doesn't change that at all, in case that wasn't
clear).

Comments welcome on this still WIP patch.

I think this is a great step in the right direction, fixing one of the
issues with core index AMs, issues I also complained about earlier
[0].

[0] /messages/by-id/CAEze2Wg8QhpOnHoqPNB-AaexGX4Zaij=4TT0kaMhF_6T5FXxmQ@mail.gmail.com

Guess we're thinking along similar lines. I was unaware of your recent
post; these days I don't read Hackers apart from what I'm working on.

--
Simon Riggs http://www.EnterpriseDB.com/