int2vector and btree indexes

Started by Amit Langoteover 9 years ago5 messages
#1Amit Langote
Langote_Amit_f8@lab.ntt.co.jp

If I create btree index on a int2vector column, it does not get used for
queries because the query search always fails to match the index operator
(family).

During index creation, GetDefaultOpClass() returns array_ops for a
int2vector index column, because type int2vector is binary-coercible with
anyarray (which is array_ops's input type). Whereas queries involving
int2vector columns would use a int2vector_ops operator.

I wonder if the index creation command should rather fail because an index
thus created will never get used? Or is_indexable_operator() should
somehow consider the fact that such indexes could in fact exist?

I might be missing something though.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Amit Langote (#1)
Re: int2vector and btree indexes

On 2016/10/11 15:58, Amit Langote wrote:

If I create btree index on a int2vector column, it does not get used for
queries because the query search always fails to match the index operator
(family).

During index creation, GetDefaultOpClass() returns array_ops for a
int2vector index column, because type int2vector is binary-coercible with
anyarray (which is array_ops's input type). Whereas queries involving
int2vector columns would use a int2vector_ops operator.

I was wrong that the index *never* gets used. It does in fact get used if
the operator is an ordering search operator (<, <=, >, >=), in which case
the query would use an array_ops operator (which is a btree operator class
for type anyarray) and hence matches the index operator family. I failed
to mention in my original message that int2vector_ops is a hash operator
class. There is exactly one =(int2vector, int2vector) operator in the
system of which there is no btree equivalent.

I guess there is not much to complaint about here after all. Sorry about
the noise.

Thanks,
Amit

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#2)
Re: int2vector and btree indexes

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

I was wrong that the index *never* gets used. It does in fact get used if
the operator is an ordering search operator (<, <=, >, >=), in which case
the query would use an array_ops operator (which is a btree operator class
for type anyarray) and hence matches the index operator family. I failed
to mention in my original message that int2vector_ops is a hash operator
class. There is exactly one =(int2vector, int2vector) operator in the
system of which there is no btree equivalent.

Hmm ... I kind of wonder why we have int2vectoreq or hashint2vector at
all, likewise the hash opclass based on them. The code says that they
are needed to support catcache index columns, but the only columns of
this type are

regression=# select attrelid::regclass,attname from pg_attribute where atttypid = 'int2vector'::regtype;
attrelid | attname
------------+-----------
pg_index | indkey
pg_index | indoption
pg_trigger | tgattr
(3 rows)

and those don't have indexes at all, let alone catcaches based on them.
So it looks to me like we could remove this infrastructure. There is
value in being able to hash int2vectors during queries, for sure, but
we could let that be done by the anyarray hash opclass.

Having said that, int2vector is not meant as a user-facing type and so
I don't particularly care whether indexes built on it work conveniently.
But it looks to me like we've got some unnecessary code here.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Amit Langote
Langote_Amit_f8@lab.ntt.co.jp
In reply to: Tom Lane (#3)
1 attachment(s)
Re: int2vector and btree indexes

On 2016/10/11 21:40, Tom Lane wrote:

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

I was wrong that the index *never* gets used. It does in fact get used if
the operator is an ordering search operator (<, <=, >, >=), in which case
the query would use an array_ops operator (which is a btree operator class
for type anyarray) and hence matches the index operator family. I failed
to mention in my original message that int2vector_ops is a hash operator
class. There is exactly one =(int2vector, int2vector) operator in the
system of which there is no btree equivalent.

Hmm ... I kind of wonder why we have int2vectoreq or hashint2vector at
all, likewise the hash opclass based on them. The code says that they
are needed to support catcache index columns, but the only columns of
this type are

regression=# select attrelid::regclass,attname from pg_attribute where atttypid = 'int2vector'::regtype;
attrelid | attname
------------+-----------
pg_index | indkey
pg_index | indoption
pg_trigger | tgattr
(3 rows)

and those don't have indexes at all, let alone catcaches based on them.
So it looks to me like we could remove this infrastructure. There is
value in being able to hash int2vectors during queries, for sure, but
we could let that be done by the anyarray hash opclass.

Agreed. So how about the attached patch to remove the said infrastructure?

Having said that, int2vector is not meant as a user-facing type and so
I don't particularly care whether indexes built on it work conveniently.
But it looks to me like we've got some unnecessary code here.

Ah, I did wonder whether int2vector has been deprecated as a user-facing
type. Anyway after applying the patch, it seems that the original
complaint I raised is no longer an issue (or so I think) - operators
applied to int2vector are always resolved to those accepting anyarray and
matched with anyarray_ops of the correct index access method.

Thanks,
Amit

Attachments:

rm-int2vector-op-hash-infra-1.patchtext/x-diff; name=rm-int2vector-op-hash-infra-1.patchDownload
diff --git a/src/backend/access/hash/hashfunc.c b/src/backend/access/hash/hashfunc.c
index 614f4ff..12dce2e 100644
--- a/src/backend/access/hash/hashfunc.c
+++ b/src/backend/access/hash/hashfunc.c
@@ -131,14 +131,6 @@ hashoidvector(PG_FUNCTION_ARGS)
 }
 
 Datum
-hashint2vector(PG_FUNCTION_ARGS)
-{
-	int2vector *key = (int2vector *) PG_GETARG_POINTER(0);
-
-	return hash_any((unsigned char *) key->values, key->dim1 * sizeof(int16));
-}
-
-Datum
 hashname(PG_FUNCTION_ARGS)
 {
 	char	   *key = NameStr(*PG_GETARG_NAME(0));
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index 29d92a7..443ac5c 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -254,22 +254,6 @@ int2vectorsend(PG_FUNCTION_ARGS)
 	return array_send(fcinfo);
 }
 
-/*
- * We don't have a complete set of int2vector support routines,
- * but we need int2vectoreq for catcache indexing.
- */
-Datum
-int2vectoreq(PG_FUNCTION_ARGS)
-{
-	int2vector *a = (int2vector *) PG_GETARG_POINTER(0);
-	int2vector *b = (int2vector *) PG_GETARG_POINTER(1);
-
-	if (a->dim1 != b->dim1)
-		PG_RETURN_BOOL(false);
-	PG_RETURN_BOOL(memcmp(a->values, b->values, a->dim1 * sizeof(int16)) == 0);
-}
-
-
 /*****************************************************************************
  *	 PUBLIC ROUTINES														 *
  *****************************************************************************/
diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index db7099f..6016d19 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -126,11 +126,6 @@ GetCCHashEqFuncs(Oid keytype, PGFunction *hashfunc, RegProcedure *eqfunc)
 
 			*eqfunc = F_INT2EQ;
 			break;
-		case INT2VECTOROID:
-			*hashfunc = hashint2vector;
-
-			*eqfunc = F_INT2VECTOREQ;
-			break;
 		case INT4OID:
 			*hashfunc = hashint4;
 
diff --git a/src/include/access/hash.h b/src/include/access/hash.h
index 491d4c9..725e2f2 100644
--- a/src/include/access/hash.h
+++ b/src/include/access/hash.h
@@ -283,7 +283,6 @@ extern Datum hashenum(PG_FUNCTION_ARGS);
 extern Datum hashfloat4(PG_FUNCTION_ARGS);
 extern Datum hashfloat8(PG_FUNCTION_ARGS);
 extern Datum hashoidvector(PG_FUNCTION_ARGS);
-extern Datum hashint2vector(PG_FUNCTION_ARGS);
 extern Datum hashname(PG_FUNCTION_ARGS);
 extern Datum hashtext(PG_FUNCTION_ARGS);
 extern Datum hashvarlena(PG_FUNCTION_ARGS);
diff --git a/src/include/catalog/pg_amop.h b/src/include/catalog/pg_amop.h
index 15b6290..e4c3515 100644
--- a/src/include/catalog/pg_amop.h
+++ b/src/include/catalog/pg_amop.h
@@ -573,8 +573,6 @@ DATA(insert (	2040   1114 1114 1 s 2060 405 0 ));
 DATA(insert (	2222   16 16 1 s   91 405 0 ));
 /* bytea_ops */
 DATA(insert (	2223   17 17 1 s 1955 405 0 ));
-/* int2vector_ops */
-DATA(insert (	2224   22 22 1 s	386 405 0 ));
 /* xid_ops */
 DATA(insert (	2225   28 28 1 s	352 405 0 ));
 /* cid_ops */
diff --git a/src/include/catalog/pg_amproc.h b/src/include/catalog/pg_amproc.h
index 1b654d5..f01a5b4 100644
--- a/src/include/catalog/pg_amproc.h
+++ b/src/include/catalog/pg_amproc.h
@@ -173,7 +173,6 @@ DATA(insert (	2001   1266 1266 1 1696 ));
 DATA(insert (	2040   1114 1114 1 2039 ));
 DATA(insert (	2222   16 16 1 454 ));
 DATA(insert (	2223   17 17 1 456 ));
-DATA(insert (	2224   22 22 1 398 ));
 DATA(insert (	2225   28 28 1 450 ));
 DATA(insert (	2226   29 29 1 450 ));
 DATA(insert (	2227   702 702 1 450 ));
diff --git a/src/include/catalog/pg_opclass.h b/src/include/catalog/pg_opclass.h
index 5900cdc..ade8da3 100644
--- a/src/include/catalog/pg_opclass.h
+++ b/src/include/catalog/pg_opclass.h
@@ -168,7 +168,6 @@ DATA(insert (	403		bpchar_pattern_ops	PGNSP PGUID 2097 1042 f 0 ));
 DATA(insert (	403		money_ops			PGNSP PGUID 2099  790 t 0 ));
 DATA(insert (	405		bool_ops			PGNSP PGUID 2222   16 t 0 ));
 DATA(insert (	405		bytea_ops			PGNSP PGUID 2223   17 t 0 ));
-DATA(insert (	405		int2vector_ops		PGNSP PGUID 2224   22 t 0 ));
 DATA(insert (	403		tid_ops				PGNSP PGUID 2789   27 t 0 ));
 DATA(insert (	405		xid_ops				PGNSP PGUID 2225   28 t 0 ));
 DATA(insert (	405		cid_ops				PGNSP PGUID 2226   29 t 0 ));
diff --git a/src/include/catalog/pg_operator.h b/src/include/catalog/pg_operator.h
index b8f06b3..26fa618 100644
--- a/src/include/catalog/pg_operator.h
+++ b/src/include/catalog/pg_operator.h
@@ -156,8 +156,6 @@ DATA(insert OID = 389 (  "!!"	   PGNSP PGUID l f f	 0	20	1700  0  0 numeric_fac
 DESCR("deprecated, use ! instead");
 DATA(insert OID = 385 (  "="	   PGNSP PGUID b f t	29	29	16 385	 0 cideq eqsel eqjoinsel ));
 DESCR("equal");
-DATA(insert OID = 386 (  "="	   PGNSP PGUID b f t	22	22	16 386	 0 int2vectoreq eqsel eqjoinsel ));
-DESCR("equal");
 
 DATA(insert OID = 387 (  "="	   PGNSP PGUID b t f	27	27	16 387 402 tideq eqsel eqjoinsel ));
 DESCR("equal");
diff --git a/src/include/catalog/pg_opfamily.h b/src/include/catalog/pg_opfamily.h
index ac6b304..7ba23e5 100644
--- a/src/include/catalog/pg_opfamily.h
+++ b/src/include/catalog/pg_opfamily.h
@@ -117,7 +117,6 @@ DATA(insert OID = 2099 (	403		money_ops		PGNSP PGUID ));
 DATA(insert OID = 2222 (	405		bool_ops		PGNSP PGUID ));
 #define BOOL_HASH_FAM_OID 2222
 DATA(insert OID = 2223 (	405		bytea_ops		PGNSP PGUID ));
-DATA(insert OID = 2224 (	405		int2vector_ops	PGNSP PGUID ));
 DATA(insert OID = 2789 (	403		tid_ops			PGNSP PGUID ));
 DATA(insert OID = 2225 (	405		xid_ops			PGNSP PGUID ));
 DATA(insert OID = 2226 (	405		cid_ops			PGNSP PGUID ));
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index e2d08ba..17ec71d 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -540,7 +540,6 @@ DATA(insert OID = 313 (  int4			   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23
 DESCR("convert int2 to int4");
 DATA(insert OID = 314 (  int2			   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 21	"23" _null_ _null_ _null_ _null_ _null_ i4toi2 _null_ _null_ _null_ ));
 DESCR("convert int4 to int2");
-DATA(insert OID = 315 (  int2vectoreq	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 2 0 16 "22 22" _null_ _null_ _null_ _null_ _null_ int2vectoreq _null_ _null_ _null_ ));
 DATA(insert OID = 316 (  float8			   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 701  "23" _null_ _null_ _null_ _null_ _null_	i4tod _null_ _null_ _null_ ));
 DESCR("convert int4 to float8");
 DATA(insert OID = 317 (  int4			   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "701" _null_ _null_ _null_ _null_ _null_ dtoi4 _null_ _null_ _null_ ));
@@ -687,8 +686,6 @@ DATA(insert OID = 457 (  hashoidvector	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s
 DESCR("hash");
 DATA(insert OID = 329 (  hash_aclitem	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "1033" _null_ _null_ _null_ _null_ _null_	hash_aclitem _null_ _null_ _null_ ));
 DESCR("hash");
-DATA(insert OID = 398 (  hashint2vector    PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "22" _null_ _null_ _null_ _null_ _null_ hashint2vector _null_ _null_ _null_ ));
-DESCR("hash");
 DATA(insert OID = 399 (  hashmacaddr	   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "829" _null_ _null_ _null_ _null_ _null_ hashmacaddr _null_ _null_ _null_ ));
 DESCR("hash");
 DATA(insert OID = 422 (  hashinet		   PGNSP PGUID 12 1 0 0 0 f f f f t f i s 1 0 23 "869" _null_ _null_ _null_ _null_ _null_ hashinet _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 2ae212a..90f5132 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -192,7 +192,6 @@ extern Datum int2vectorin(PG_FUNCTION_ARGS);
 extern Datum int2vectorout(PG_FUNCTION_ARGS);
 extern Datum int2vectorrecv(PG_FUNCTION_ARGS);
 extern Datum int2vectorsend(PG_FUNCTION_ARGS);
-extern Datum int2vectoreq(PG_FUNCTION_ARGS);
 extern Datum int4in(PG_FUNCTION_ARGS);
 extern Datum int4out(PG_FUNCTION_ARGS);
 extern Datum int4recv(PG_FUNCTION_ARGS);
#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Amit Langote (#4)
Re: int2vector and btree indexes

Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> writes:

On 2016/10/11 21:40, Tom Lane wrote:

Hmm ... I kind of wonder why we have int2vectoreq or hashint2vector at
all, likewise the hash opclass based on them.

Agreed. So how about the attached patch to remove the said infrastructure?

Looks good, pushed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers