[Patch] Use *other* indexes on the subscriber when REPLICA IDENTITY is FULL
Dear hackers,
(CC: Önder because he owned the related thread)
This is a follow-up thread of [1]https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=89e46da5e511a6970e26a020f265c9fb4b72b1d2. The commit allowed subscribers to use indexes
other than PK and REPLICA IDENTITY when REPLICA IDENTITY is FULL on publisher,
but the index must be a B-tree. In this proposal, I aim to extend this functionality to allow
for hash indexes and potentially other types.
I would like to share an initial patch to activate discussions.
# Current problem
The current limitation comes from the function build_replindex_scan_key(), specifically these lines:
```
/*
* Load the operator info. We need this to get the equality operator
* function for the scan key.
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
BTEqualStrategyNumber);
```
These lines retrieve an operator for equality comparisons. The "strategy number"
[2]: https://www.postgresql.org/docs/devel/xindex.html#XINDEX-STRATEGIES
is always associated with a fixed number (BTEqualStrategyNumber, 3). However,
this approach fails for other types of indexes because the strategy number is
determined by the operator class, not fixed.
# Proposed solution
I propose a patch that chooses the correct strategy number based on the index
access method. We would extract the access method from the pg_opclass system
catalog, similar to the approach used for data types and operator families.
Also, this patch change the condition for using the index on the subscriber
(see IsIndexUsableForReplicaIdentityFull()).
However, this solution does not yet extend to GiST, SP-GiST, GIN, BRIN due to
implementation constraints.
## Current difficulties
The challenge with supporting other indexes is that they lack a fixed set of strategies,
making it difficult to choose the correct strategy number based on the index
access method. Even within the same index type, different operator classes can
use different strategy numbers for the same operation.
E.g. [2]https://www.postgresql.org/docs/devel/xindex.html#XINDEX-STRATEGIES shows that number 6 can be used for the purpose, but other operator classes
added by btree_gist [3]https://www.postgresql.org/docs/devel/btree-gist.html seem to use number 3 for the euqlaity comparison.
I've looked into using ExecIndexBuildScanKeys(), which is used for normal index scans.
However, in this case, the operator and its family seem to be determined by the planner.
Based on that, the associated strategy number is extracted. This is the opposite
of what I am trying to achieve, so it doesn't seem helpful in this context.
The current patch only includes tests for hash indexes. These are separated into
the file 032_subscribe_use_index.pl for convenience, but will be integrated in a
later version.
How do you think? I want to know your opinion.
[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=89e46da5e511a6970e26a020f265c9fb4b72b1d2
[2]: https://www.postgresql.org/docs/devel/xindex.html#XINDEX-STRATEGIES
[3]: https://www.postgresql.org/docs/devel/btree-gist.html
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
0001-Allow-to-use-Hash-index-on-subscriber.patchapplication/octet-stream; name=0001-Allow-to-use-Hash-index-on-subscriber.patchDownload
From 473f9086b81260ba1751d1343d7fe0ef8615e943 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Mon, 19 Jun 2023 08:28:00 +0000
Subject: [PATCH] Allow to use Hash index on subscriber
---
src/backend/executor/execReplication.c | 52 +++++++++++--
src/backend/replication/logical/relation.c | 9 ++-
src/backend/utils/cache/lsyscache.c | 22 ++++++
src/include/utils/lsyscache.h | 1 +
src/test/subscription/meson.build | 1 +
src/test/subscription/t/034_hash.pl | 91 ++++++++++++++++++++++
6 files changed, 166 insertions(+), 10 deletions(-)
create mode 100644 src/test/subscription/t/034_hash.pl
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..0200017ae7 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,8 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
+#include "commands/defrem.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,15 +43,50 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * comparisons.
+ *
+ * TODO: support other indexes: GiST, GIN, SP-GiST, BRIN
+ */
+static int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am_method = get_opclass_method(opclass);
+ int ret;
+
+ switch (am_method)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ case GIST_AM_OID:
+ case GIN_AM_OID:
+ case SPGIST_AM_OID:
+ case BRIN_AM_OID:
+ /* TODO */
+ default:
+ /* XXX: Do we have to support extended indexes? */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
*
* Returns how many columns to use for the index scan.
*
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
*
* By definition, replication identity of a rel meets all limitations associated
* with that. Note that any other index could also meet these limitations.
@@ -77,6 +114,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ int strategy_number;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +131,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ strategy_number = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ strategy_number);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ strategy_number, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ strategy_number,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..38c8cebfdf 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,8 +779,8 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
*
* Note that the limitations of index scans for replica identity full only
@@ -841,11 +841,12 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
+ bool is_suitable_type = (indexInfo->ii_Am == BTREE_AM_OID) ||
+ (indexInfo->ii_Am == HASH_AM_OID);
bool is_partial = (indexInfo->ii_Predicate != NIL);
bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
- return is_btree && !is_partial && !is_only_on_expression;
+ return is_suitable_type && !is_partial && !is_only_on_expression;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..9a44e7ad54 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method operator family is for.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..ea8fbe42cd 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index bd673a9d68..5fa5f0a790 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -40,6 +40,7 @@ tests += {
't/031_column_list.pl',
't/032_subscribe_use_index.pl',
't/033_run_as_table_owner.pl',
+ 't/034_hash.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/034_hash.pl b/src/test/subscription/t/034_hash.pl
new file mode 100644
index 0000000000..e7babcafa7
--- /dev/null
+++ b/src/test/subscription/t/034_hash.pl
@@ -0,0 +1,91 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test logical replication behavior with subscriber using available index
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+my $result = '';
+
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data within the range 0-9 for y
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full updates 4 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use index
+# =============================================================================
--
2.27.0
Hi Hayato, all
This is a follow-up thread of [1]. The commit allowed subscribers to use
indexes
other than PK and REPLICA IDENTITY when REPLICA IDENTITY is FULL on
publisher,
but the index must be a B-tree. In this proposal, I aim to extend this
functionality to allow
for hash indexes and potentially other types.
Cool, thanks for taking the time to work on this.
# Current problem
The current limitation comes from the function build_replindex_scan_key(),
specifically these lines:
When I last dealt with the same issue, I was examining it from a slightly
broader perspective. I think
my conclusion was that RelationFindReplTupleByIndex() is designed for the
constraints of UNIQUE INDEX
and Primary Key. Hence, btree limitation was given.
So, my main point is that it might be useful to check
RelationFindReplTupleByIndex() once more in detail
to see if there is anything else that is specific to btree indexes.
build_replindex_scan_key() is definitely one of the major culprits but see
below as well.
I think we should also be mindful about tuples_equal() function. When an
index returns more than
one tuple, we rely on tuples_equal() function to make sure non-relevant
tuples are skipped.
For btree indexes, it was safe to rely on that function as the columns that
are indexed using btree
always have equality operator. I think we can safely assume the same for
hash indexes.
However, say we indexed "point" type using "gist" index. Then, if we let
this logic to kick in,
I think tuples_equal() would fail saying that there is no equality operator
exists.
One might argue that it is already the case for RelationFindReplTupleSeq()
or when you
have index but the index on a different column. But still, it seems useful
to make sure
you are aware of this limitation as well.
## Current difficulties
The challenge with supporting other indexes is that they lack a fixed set
of strategies,
making it difficult to choose the correct strategy number based on the
index
access method. Even within the same index type, different operator classes
can
use different strategy numbers for the same operation.
E.g. [2] shows that number 6 can be used for the purpose, but other
operator classes
added by btree_gist [3] seem to use number 3 for the euqlaity comparison.
Also, build_replindex_scan_key() seems like a too late place to check this?
I mean, what
if there is no equality operator, how should code react to that? It
effectively becomes
RelationFindReplTupleSeq(), so maybe better follow that route upfront?
In other words, that decision should maybe
happen IsIndexUsableForReplicaIdentityFull()?
For the specific notes you raised about strategy numbers / operator
classes, I need to
study a bit :) Though, I'll be available to do that early next week.
Thanks,
Onder
Dear Önder,
Thank you for giving comments! The author's comment is quite helpful for us.
When I last dealt with the same issue, I was examining it from a slightly broader perspective. I think
my conclusion was that RelationFindReplTupleByIndex() is designed for the constraints of UNIQUE INDEX
and Primary Key.
I see. IIUC that's why you have added tuples_equal() in the RelationFindReplTupleByIndex().
I think we should also be mindful about tuples_equal() function. When an index returns more than
one tuple, we rely on tuples_equal() function to make sure non-relevant tuples are skipped.
For btree indexes, it was safe to rely on that function as the columns that are indexed using btree
always have equality operator. I think we can safely assume the same for hash indexes.
However, say we indexed "point" type using "gist" index. Then, if we let this logic to kick in,
I think tuples_equal() would fail saying that there is no equality operator exists.
Good point. Actually I have tested for "point" datatype but it have not worked well.
Now I understand the reason.
It seemed that when TYPECACHE_EQ_OPR_FINFO is reuqesed to lookup_type_cache(),
it could return valid value only if the datatype has operator class for Btree or Hash.
So tuples_equal() might not be able to use if tuples have point box circle types.
BTW, I have doubt that the restriction is not related with your commit.
In other words, if the table has attributes which the datatype is not for operator
class of Btree, we could not use REPLICA IDENTITY FULL. IIUC it is not documented.
Please see attched script to reproduce that. The final DELETE statement cannot be
replicated at the subscriber on my env.
For the specific notes you raised about strategy numbers / operator classes, I need to
study a bit :) Though, I'll be available to do that early next week.
Thanks! I'm looking forward to see your opinions...
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Thu, Jun 22, 2023 at 11:37 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Dear hackers,
(CC: Önder because he owned the related thread)
...
The current patch only includes tests for hash indexes. These are separated into
the file 032_subscribe_use_index.pl for convenience, but will be integrated in a
later version.
Hi Kuroda-san.
I am still studying the docs references given in your initial post.
Meanwhile, FWIW, here are some minor review comments for the patch you gave.
======
src/backend/executor/execReplication.c
1. get_equal_strategy_number
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * comparisons.
+ *
+ * TODO: support other indexes: GiST, GIN, SP-GiST, BRIN
+ */
+static int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am_method = get_opclass_method(opclass);
+ int ret;
+
+ switch (am_method)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ case GIST_AM_OID:
+ case GIN_AM_OID:
+ case SPGIST_AM_OID:
+ case BRIN_AM_OID:
+ /* TODO */
+ default:
+ /* XXX: Do we have to support extended indexes? */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
1a.
In the file syscache.c there are already some other functions like
get_op_opfamily_strategy so I am wondering if this new function also
really belongs in that file.
~
1b.
Should that say "operator" instead of "comparisons"?
~
1c.
"am" stands for "access method" so "am_method" is like "access method
method" – is it correct?
~~~
2. build_replindex_scan_key
int table_attno = indkey->values[index_attoff];
+ int strategy_number;
Ought to say this is a strategy for "equality", so a better varname
might be "equality_strategy_number" or "eq_strategy" or similar.
======
src/backend/replication/logical/relation.c
3. IsIndexUsableForReplicaIdentityFull
It seems there is some overlap between this code which hardwired 2
valid AMs and the switch statement in your other
get_equal_strategy_number function which returns a strategy number for
those 2 AMs.
Would it be better to make another common function like
get_equality_strategy_for_am(), and then you don’t have to hardwire
anything? Instead, you can say:
is_usable_type = get_equality_strategy_for_am(indexInfo->ii_Am) !=
InvalidStrategy;
======
src/backend/utils/cache/lsyscache.c
4. get_opclass_method
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method operator family is for.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
Is the comment correct? IIUC, this function is not doing anything for
"families".
======
src/test/subscription/t/034_hash.pl
5.
+# insert some initial data within the range 0-9 for y
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM
generate_series(0,10) i"
+);
Why does the comment only say "for y"?
~~~
6.
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname
= 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full
updates 4 rows via index";
+
AFAIK this is OK but it was slightly misleading because it says
"updates 4 rows" whereas the previous UPDATE was only for 2 rows. So
here I think the 4 also counts the 2 DELETED rows. The comment can be
expanded slightly to clarify this.
~~~
7.
FYI, when I ran the TAP test the result was this:
t/034_hash.pl ...................... 1/? # Tests were run but no plan
was declared and done_testing() was not seen.
t/034_hash.pl ...................... All 2 subtests passed
t/100_bugs.pl ...................... ok
Test Summary Report
-------------------
t/034_hash.pl (Wstat: 0 Tests: 2 Failed: 0)
Parse errors: No plan found in TAP output
Files=36, Tests=457, 338 wallclock secs ( 0.29 usr 0.07 sys + 206.73
cusr 47.94 csys = 255.03 CPU)
Result: FAIL
~
Just adding the missing done_testing() seemed to fix this.
------
Kind Regards,
Peter Smith.
Fujitsu Australia
On Mon, Jun 26, 2023 at 11:44 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
...
BTW, I have doubt that the restriction is not related with your commit.
In other words, if the table has attributes which the datatype is not for operator
class of Btree, we could not use REPLICA IDENTITY FULL. IIUC it is not documented.
Please see attched script to reproduce that. The final DELETE statement cannot be
replicated at the subscriber on my env.
Missing attachment?
------
Kind Regards,
Peter Smith.
Fujitsu Australia
Please see attched script to reproduce that. The final DELETE statement cannot
be
replicated at the subscriber on my env.
Sorry, I forgot to attach...
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
Hi Hayato,
BTW, I have doubt that the restriction is not related with your commit.
In other words, if the table has attributes which the datatype is not for
operator
class of Btree, we could not use REPLICA IDENTITY FULL. IIUC it is not
documented.
Please see attched script to reproduce that. The final DELETE statement
cannot be
replicated at the subscriber on my env.
Yes, I agree, it is (and was before my patch as well)
un-documented limitation of REPLICA IDENTITY FULL.
And, as far as I can see, my patch actually didn't have any impact on the
limitation. The unsupported
cases are still unsupported, but now the same error is thrown in a slightly
different place.
I think that is a minor limitation, but maybe should be listed [1]https://www.postgresql.org/docs/current/logical-replication-restrictions.html?
For the specific notes you raised about strategy numbers / operator
classes, I need to
study a bit :) Though, I'll be available to do that early next week.Thanks! I'm looking forward to see your opinions...
For this one, I did some research in the code, but I'm not very
comfortable with the answer. Still, I wanted to share my observations so
that it might be useful for the discussion.
First, I checked if the function get_op_btree_interpretation() could be
used here. But, I think that is again btree-only and I couldn't find
anything generic that does something similar.
Then, I tried to come up with a SQL query, actually based on the link [2]https://www.postgresql.org/docs/devel/xindex.html#XINDEX-STRATEGIES
you shared. I think we should always have an "equality" strategy (e.g.,
"same", "overlaps", "contains" etc sounds wrong to me).
And, it seems btree, hash and brin supports "equal". So, a query like the
following seems to provide the list of (index type, strategy_number,
data_type) that we might be allowed to use.
SELECT
am.amname AS index_type,
amop.amoplefttype::regtype,amop.amoprighttype::regtype,
op.oprname AS operator,
amop.amopstrategy AS strategy_number
FROM
pg_amop amop
JOIN
pg_am am ON am.oid = amop.amopmethod
JOIN
pg_operator op ON op.oid = amop.amopopr
WHERE
(am.amname = 'btree' and amop.amopstrategy = 3) OR
(am.amname = 'hash' and amop.amopstrategy = 1) OR
(am.amname = 'brin' and amop.amopstrategy = 3)
ORDER BY
index_type,
strategy_number;
What do you think?
[1]: https://www.postgresql.org/docs/current/logical-replication-restrictions.html
https://www.postgresql.org/docs/current/logical-replication-restrictions.html
[2]: https://www.postgresql.org/docs/devel/xindex.html#XINDEX-STRATEGIES
Dear Önder,
Thank you for your analysis!
Yes, I agree, it is (and was before my patch as well) un-documented limitation of REPLICA IDENTITY FULL.
And, as far as I can see, my patch actually didn't have any impact on the limitation. The unsupported
cases are still unsupported, but now the same error is thrown in a slightly different place.
I think that is a minor limitation, but maybe should be listed [1]/messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com?
Yes, your modification did not touch the restriction. It has existed before the
commit. I (or my colleague) will post the patch to add the description, maybe
after [1]/messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com is committed.
For this one, I did some research in the code, but I'm not very
comfortable with the answer. Still, I wanted to share my observations so
that it might be useful for the discussion.
First, I checked if the function get_op_btree_interpretation() could be
used here. But, I think that is again btree-only and I couldn't find
anything generic that does something similar.
Thanks for checking. The function seems to return the list of operator family and
its strategy number when the oid of the operator is given. But what we want to do
here is get the operator oid. I think that the input and output of the function
seems opposite. And as you mentioned, the index must be btree.
Then, I tried to come up with a SQL query, actually based on the link [2]https://www.postgresql.org/docs/current/index-functions.html
you shared. I think we should always have an "equality" strategy (e.g.,
"same", "overlaps", "contains" etc sounds wrong to me).
I could agree that "overlaps", "contains", are not "equal", but not sure about
the "same". Around here we must discuss, but not now.
And, it seems btree, hash and brin supports "equal". So, a query like the
following seems to provide the list of (index type, strategy_number,
data_type) that we might be allowed to use.
Note that strategy numbers listed in the doc are just an example - Other than BTree
and Hash do not have a fixed set of strategies at all.
E.g., operator classes for Btree, Hash and BRIN (Minmax) has "equal" and the
strategy number is documented. But other user-defined operator classes for BRIN
may have another number, or it does not have equality comparison.
SELECT
am.amname AS index_type,
amop.amoplefttype::regtype,amop.amoprighttype::regtype,
op.oprname AS operator,
amop.amopstrategy AS strategy_number
FROM
pg_amop amop
JOIN
pg_am am ON am.oid = amop.amopmethod
JOIN
pg_operator op ON op.oid = amop.amopopr
WHERE
(am.amname = 'btree' and amop.amopstrategy = 3) OR
(am.amname = 'hash' and amop.amopstrategy = 1) OR
(am.amname = 'brin' and amop.amopstrategy = 3)
ORDER BY
index_type,
strategy_number;
What do you think?
Good SQL. You have listed the equality operator and related strategy number for given
operator classes.
While analyzing more, however, I found that it might be difficult to support GIN, BRIN,
and bloom indexes in the first version. These indexes does not implement the
"amgettuple" function, which is called in RelationFindReplTupleByIndex()->index_getnext_slot()->index_getnext_tid().
For example, in the brinhandler():
```
/*
* BRIN handler function: return IndexAmRoutine with access method parameters
* and callbacks.
*/
Datum
brinhandler(PG_FUNCTION_ARGS)
{
...
amroutine->amgettuple = NULL;
amroutine->amgetbitmap = bringetbitmap;
...
```
According to the document [2]https://www.postgresql.org/docs/current/index-functions.html, all of index access methods must implement either
of amgettuple or amgetbitmap API. "amgettuple" is used when the table is scaned
and tuples are fetched one by one, RelationFindReplTupleByIndex() do that.
"amgetbitmap" is used when all tuples are fetched at once and RelationFindReplTupleByIndex()
does not support such indexes. To do that the implemented API must be checked and
the function must be changed depends on that. It may be difficult to add them in
the first step so that I want not to support them. Fortunately, Hash, GiST, and
SP-GiST has implemented then so we can focus on them.
In the next patch I will add the mechanism for rejecting such indexes.
Anyway, thank you for keeping the interest to the patch, nevertheless it is difficult theme.
[1]: /messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com
[2]: https://www.postgresql.org/docs/current/index-functions.html
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Dear Peter,
Thanks for reviewing. PSA new version.
I planned to post new version after supporting more indexes, but it may take more time.
So I want to address comments from you once.
======
src/backend/executor/execReplication.c1. get_equal_strategy_number
+/* + * Return the appropriate strategy number which corresponds to the equality + * comparisons. + * + * TODO: support other indexes: GiST, GIN, SP-GiST, BRIN + */ +static int +get_equal_strategy_number(Oid opclass) +{ + Oid am_method = get_opclass_method(opclass); + int ret; + + switch (am_method) + { + case BTREE_AM_OID: + ret = BTEqualStrategyNumber; + break; + case HASH_AM_OID: + ret = HTEqualStrategyNumber; + break; + case GIST_AM_OID: + case GIN_AM_OID: + case SPGIST_AM_OID: + case BRIN_AM_OID: + /* TODO */ + default: + /* XXX: Do we have to support extended indexes? */ + ret = InvalidStrategy; + break; + } + + return ret; +}1a.
In the file syscache.c there are already some other functions like
get_op_opfamily_strategy so I am wondering if this new function also
really belongs in that file.
According to atop comment in the syscache.c, it contains routines which access
system catalog cache. get_equal_strategy_number() does not check it, so I don't
think it should be at the file.
1b.
Should that say "operator" instead of "comparisons"?
Changed.
1c.
"am" stands for "access method" so "am_method" is like "access method
method" – is it correct?
Changed to "am".
2. build_replindex_scan_key
int table_attno = indkey->values[index_attoff];
+ int strategy_number;Ought to say this is a strategy for "equality", so a better varname
might be "equality_strategy_number" or "eq_strategy" or similar.
Changed to "eq_strategy".
src/backend/replication/logical/relation.c
3. IsIndexUsableForReplicaIdentityFull
It seems there is some overlap between this code which hardwired 2
valid AMs and the switch statement in your other
get_equal_strategy_number function which returns a strategy number for
those 2 AMs.Would it be better to make another common function like
get_equality_strategy_for_am(), and then you don’t have to hardwire
anything? Instead, you can say:is_usable_type = get_equality_strategy_for_am(indexInfo->ii_Am) !=
InvalidStrategy;
Added. How do you think?
src/backend/utils/cache/lsyscache.c
4. get_opclass_method
+/* + * get_opclass_method + * + * Returns the OID of the index access method operator family is for. + */ +Oid +get_opclass_method(Oid opclass) +{ + HeapTuple tp; + Form_pg_opclass cla_tup; + Oid result; + + tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass)); + if (!HeapTupleIsValid(tp)) + elog(ERROR, "cache lookup failed for opclass %u", opclass); + cla_tup = (Form_pg_opclass) GETSTRUCT(tp); + + result = cla_tup->opcmethod; + ReleaseSysCache(tp); + return result; +}Is the comment correct? IIUC, this function is not doing anything for
"families".
Modified to "class".
src/test/subscription/t/034_hash.pl
5. +# insert some initial data within the range 0-9 for y +$node_publisher->safe_psql('postgres', + "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i" +);Why does the comment only say "for y"?
After considering more, I thought we do not have to mention data.
So removed the part " within the range 0-9 for y".
6. +# wait until the index is used on the subscriber +# XXX: the test will be suspended here +$node_publisher->wait_for_catchup($appname); +$node_subscriber->poll_query_until('postgres', + q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';} + ) + or die + "Timed out while waiting for check subscriber tap_sub_rep_full updates 4 rows via index"; +AFAIK this is OK but it was slightly misleading because it says
"updates 4 rows" whereas the previous UPDATE was only for 2 rows. So
here I think the 4 also counts the 2 DELETED rows. The comment can be
expanded slightly to clarify this.
Clarified two rows were deleted.
7.
FYI, when I ran the TAP test the result was this:t/034_hash.pl ...................... 1/? # Tests were run but no plan
was declared and done_testing() was not seen.
t/034_hash.pl ...................... All 2 subtests passed
t/100_bugs.pl ...................... okTest Summary Report
-------------------
t/034_hash.pl (Wstat: 0 Tests: 2 Failed: 0)
Parse errors: No plan found in TAP output
Files=36, Tests=457, 338 wallclock secs ( 0.29 usr 0.07 sys + 206.73
cusr 47.94 csys = 255.03 CPU)
Result: FAIL~
Just adding the missing done_testing() seemed to fix this.
Right. I used meson build system and it said OK. Added.
Furthermore, I added a check to reject indexes which do not implement "amgettuple" API.
More detail, please see [1]/messages/by-id/TYAPR01MB5866E02638D40C4D198334B4F52DA@TYAPR01MB5866.jpnprd01.prod.outlook.com.
[1]: /messages/by-id/TYAPR01MB5866E02638D40C4D198334B4F52DA@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v2-0001-Allow-to-use-Hash-index-on-subscriber.patchapplication/octet-stream; name=v2-0001-Allow-to-use-Hash-index-on-subscriber.patchDownload
From 9b102ac120313b4443e15a7d3d46a59a8b263ff3 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Mon, 19 Jun 2023 08:28:00 +0000
Subject: [PATCH v2] Allow to use Hash index on subscriber
---
src/backend/executor/execReplication.c | 69 ++++++++++++++--
src/backend/replication/logical/relation.c | 27 +++++--
src/backend/utils/cache/lsyscache.c | 22 +++++
src/include/executor/executor.h | 3 +
src/include/utils/lsyscache.h | 1 +
src/test/subscription/meson.build | 1 +
src/test/subscription/t/034_hash.pl | 93 ++++++++++++++++++++++
7 files changed, 204 insertions(+), 12 deletions(-)
create mode 100644 src/test/subscription/t/034_hash.pl
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..023585b619 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,8 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
+#include "commands/defrem.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,15 +43,67 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Return the strategy number which corresponds to the equality operator for
+ * given index access method.
+ *
+ * TODO: support other indexes: GiST, SP-GiST, other user-defined idexes.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
+{
+ int ret;
+
+ switch (am)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ case GIST_AM_OID:
+ case SPGIST_AM_OID:
+ /* TODO */
+ case GIN_AM_OID:
+ case BRIN_AM_OID:
+ /*
+ * XXX: GIN and BRIN do not support for the moment because they do not
+ * implement amgettuple API.
+ */
+ default:
+ /* XXX: Do we have to support extended indexes? */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * operator.
+ *
+ * TODO: support other indexes: GiST, SP-GiST, other user-defined idexes.
+ */
+int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am = get_opclass_method(opclass);
+
+ return get_equal_strategy_number_for_am(am);
+}
+
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
*
* Returns how many columns to use for the index scan.
*
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
*
* By definition, replication identity of a rel meets all limitations associated
* with that. Note that any other index could also meet these limitations.
@@ -77,6 +131,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ int eq_strategy;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +148,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ eq_strategy = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ eq_strategy);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ eq_strategy, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ eq_strategy,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..5f5da4c9d8 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -17,6 +17,7 @@
#include "postgres.h"
+#include "access/amapi.h"
#include "access/genam.h"
#include "access/table.h"
#include "catalog/namespace.h"
@@ -779,8 +780,8 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
*
* Note that the limitations of index scans for replica identity full only
@@ -841,11 +842,25 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ IndexAmRoutine *routine;
+ bool is_suitable_type;
+ bool is_partial;
+ bool is_only_on_expression;
- return is_btree && !is_partial && !is_only_on_expression;
+ /*
+ * XXX: Support only indexes which implement amgettuple API. This is
+ * because RelationFindReplTupleByIndex() assumes to be able to fetch
+ * tuples one by one by the API.
+ */
+ routine = GetIndexAmRoutineByAmId(indexInfo->ii_Am, false);
+ is_suitable_type = ((routine->amgettuple != NULL) &&
+ (get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..05813d98ae 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method operator class is for.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..3a23510478 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -666,6 +666,9 @@ extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
const char *relname);
+extern int get_equal_strategy_number(Oid opclass);
+extern int get_equal_strategy_number_for_am(Oid am);
+
/*
* prototypes from functions in nodeModifyTable.c
*/
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..ea8fbe42cd 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/meson.build b/src/test/subscription/meson.build
index bd673a9d68..5fa5f0a790 100644
--- a/src/test/subscription/meson.build
+++ b/src/test/subscription/meson.build
@@ -40,6 +40,7 @@ tests += {
't/031_column_list.pl',
't/032_subscribe_use_index.pl',
't/033_run_as_table_owner.pl',
+ 't/034_hash.pl',
't/100_bugs.pl',
],
},
diff --git a/src/test/subscription/t/034_hash.pl b/src/test/subscription/t/034_hash.pl
new file mode 100644
index 0000000000..3aead6958b
--- /dev/null
+++ b/src/test/subscription/t/034_hash.pl
@@ -0,0 +1,93 @@
+# Copyright (c) 2022-2023, PostgreSQL Global Development Group
+
+# Test logical replication behavior with subscriber using available index
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+# create publisher node
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');
+$node_publisher->start;
+
+# create subscriber node
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');
+$node_subscriber->init(allows_streaming => 'logical');
+$node_subscriber->start;
+
+my $publisher_connstr = $node_publisher->connstr . ' dbname=postgres';
+my $appname = 'tap_sub';
+my $result = '';
+
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full deletes 2 rows and updates 2 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use index
+# =============================================================================
+
+done_testing();
--
2.27.0
On Fri, Jul 7, 2023 at 1:31 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for your analysis!
Yes, I agree, it is (and was before my patch as well) un-documented limitation of REPLICA IDENTITY FULL.
And, as far as I can see, my patch actually didn't have any impact on the limitation. The unsupported
cases are still unsupported, but now the same error is thrown in a slightly different place.
I think that is a minor limitation, but maybe should be listed [1]?
+1.
Yes, your modification did not touch the restriction. It has existed before the
commit. I (or my colleague) will post the patch to add the description, maybe
after [1] is committed.
I don't think there is any dependency between the two. So, feel free
to post the patch separately.
--
With Regards,
Amit Kapila.
Dear Amit,
Yes, I agree, it is (and was before my patch as well) un-documented
limitation of REPLICA IDENTITY FULL.
And, as far as I can see, my patch actually didn't have any impact on the
limitation. The unsupported
cases are still unsupported, but now the same error is thrown in a slightly
different place.
I think that is a minor limitation, but maybe should be listed [1]?
+1.
Yes, your modification did not touch the restriction. It has existed before the
commit. I (or my colleague) will post the patch to add the description, maybe
after [1] is committed.I don't think there is any dependency between the two. So, feel free
to post the patch separately.
Okay, thank you for your confirmation. I have started the fork thread [1]/messages/by-id/TYAPR01MB58662174ED62648E0D611194F530A@TYAPR01MB5866.jpnprd01.prod.outlook.com.
[1]: /messages/by-id/TYAPR01MB58662174ED62648E0D611194F530A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
On Mon, Jun 26, 2023 at 7:14 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
Thank you for giving comments! The author's comment is quite helpful for us.
When I last dealt with the same issue, I was examining it from a slightly broader perspective. I think
my conclusion was that RelationFindReplTupleByIndex() is designed for the constraints of UNIQUE INDEX
and Primary Key.I see. IIUC that's why you have added tuples_equal() in the RelationFindReplTupleByIndex().
I think we should also be mindful about tuples_equal() function. When an index returns more than
one tuple, we rely on tuples_equal() function to make sure non-relevant tuples are skipped.For btree indexes, it was safe to rely on that function as the columns that are indexed using btree
always have equality operator. I think we can safely assume the same for hash indexes.However, say we indexed "point" type using "gist" index. Then, if we let this logic to kick in,
I think tuples_equal() would fail saying that there is no equality operator exists.Good point. Actually I have tested for "point" datatype but it have not worked well.
Now I understand the reason.
It seemed that when TYPECACHE_EQ_OPR_FINFO is reuqesed to lookup_type_cache(),
it could return valid value only if the datatype has operator class for Btree or Hash.
So tuples_equal() might not be able to use if tuples have point box circle types.
I also think so. If this is true, how can we think of supporting
indexes other than hash like GiST, and SP-GiST as mentioned by you in
your latest email? As per my understanding if we don't have PK or
replica identity then after the index scan, we do tuples_equal which
will fail for GIST or SP-GIST. Am, I missing something?
--
With Regards,
Amit Kapila.
Hi,
I also think so. If this is true, how can we think of supporting
indexes other than hash like GiST, and SP-GiST as mentioned by you in
your latest email? As per my understanding if we don't have PK or
replica identity then after the index scan, we do tuples_equal which
will fail for GIST or SP-GIST. Am, I missing something?
I also don't think we can support anything other than btree, hash and brin
as those lack equality operators.
And, for BRIN, Hayato brought up the amgettuple issue, which is fair. So,
overall, as far as I can see, we can
easily add hash indexes but all others are either very hard to add or not
possible.
I think if someone one day works on supporting primary keys using other
index types, we can use it here :p
Thanks,
Onder
On Mon, Jul 10, 2023 at 7:43 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
I also think so. If this is true, how can we think of supporting
indexes other than hash like GiST, and SP-GiST as mentioned by you in
your latest email? As per my understanding if we don't have PK or
replica identity then after the index scan, we do tuples_equal which
will fail for GIST or SP-GIST. Am, I missing something?I also don't think we can support anything other than btree, hash and brin as those lack equality operators.
And, for BRIN, Hayato brought up the amgettuple issue, which is fair. So, overall, as far as I can see, we can
easily add hash indexes but all others are either very hard to add or not possible.
Agreed. So, let's update the patch with comments indicating the
challenges for supporting the other index types than btree and hash.
--
With Regards,
Amit Kapila.
Dear Amit, Önder
Thanks for giving discussions. IIUC all have agreed that the patch focus on extending
to Hash index. PSA the patch for that.
The basic workflow was not so changed, some comments were updated.
Regarding the test code, I think it should be combined into 032_subscribe_use_index.pl
because they have tested same feature. I have just copied tests to latter
part of 032.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v3-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchapplication/octet-stream; name=v3-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchDownload
From 668d5589da6421687d95e0d5187077ebe66b9e78 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Mon, 19 Jun 2023 08:28:00 +0000
Subject: [PATCH v3] Allow the use Hash index when REPLICA IDENTITY FULL is
specified.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used.
1) Other indexes do not have a fixed set of strategy numbers at all. In
build_replindex_scan_key(), the operator which corresponds to "equality comparison"
is searched by using the strategy number, but it is difficult for such indexes.
For example, GiST index operator classes for two-dimensional geometric objects have
a comparison "same", but its strategy number is different from the gist_int4_ops,
which is a GiST index operator class that implements the B-tree equivalent.
2) Some other indexes do not implement "amgettuple". RelationFindReplTupleByIndex()
assumes that tuples could be fetched one by one via index_getnext_slot(), but such
indexes are not supported.
---
src/backend/executor/execReplication.c | 79 +++++++++++++++++--
src/backend/replication/logical/relation.c | 19 +++--
src/backend/utils/cache/lsyscache.c | 22 ++++++
src/include/executor/executor.h | 1 +
src/include/utils/lsyscache.h | 1 +
.../subscription/t/032_subscribe_use_index.pl | 69 ++++++++++++++++
6 files changed, 179 insertions(+), 12 deletions(-)
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..79ee7eca06 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,7 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,15 +42,78 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Return the strategy number which corresponds to the equality operator for
+ * given index access method.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. This is because
+ * mainly two reasons:
+ *
+ * 1) Other indexes do not have a fixed set of strategy numbers at all.
+ * In build_replindex_scan_key(), the operator which corresponds to "equality
+ * comparison" is searched by using the strategy number, but it is difficult
+ * for such indexes. For example, GiST index operator classes for
+ * two-dimensional geometric objects have a comparison "same", but its strategy
+ * number is different from the gist_int4_ops, which is a GiST index operator
+ * class that implements the B-tree equivalent.
+ *
+ * 2) Some other indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by
+ * one via index_getnext_slot(), but such indexes are not supported. To make it
+ * use index_getbitmap() must be used, but not done yet because of the above
+ * reason.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
+{
+ int ret;
+
+ switch (am)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ case GIST_AM_OID:
+ case SPGIST_AM_OID:
+ case GIN_AM_OID:
+ case BRIN_AM_OID:
+ default:
+ /* XXX: Only Btree and Hash indexes are supported */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * operator.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. See comments
+ * atop get_equal_strategy_number_for_am().
+ */
+static int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am = get_opclass_method(opclass);
+
+ return get_equal_strategy_number_for_am(am);
+}
+
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
*
* Returns how many columns to use for the index scan.
*
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
*
* By definition, replication identity of a rel meets all limitations associated
* with that. Note that any other index could also meet these limitations.
@@ -77,6 +141,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ int eq_strategy;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +158,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ eq_strategy = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ eq_strategy);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ eq_strategy, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ eq_strategy,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..3a4daf2669 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,8 +779,8 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
*
* Note that the limitations of index scans for replica identity full only
@@ -841,11 +841,18 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ bool is_suitable_type;
+ bool is_partial;
+ bool is_only_on_expression;
- return is_btree && !is_partial && !is_only_on_expression;
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..05813d98ae 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method operator class is for.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..faa629c9fe 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -665,6 +665,7 @@ extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
const char *relname);
+extern int get_equal_strategy_number_for_am(Oid am);
/*
* prototypes from functions in nodeModifyTable.c
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..f5fdbfe116 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl b/src/test/subscription/t/032_subscribe_use_index.pl
index 576eec6a57..0e4e9ca18e 100644
--- a/src/test/subscription/t/032_subscribe_use_index.pl
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -478,6 +478,75 @@ $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
# data
# =============================================================================
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full deletes 2 rows and updates 2 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use hash index
+# =============================================================================
+
$node_subscriber->stop('fast');
$node_publisher->stop('fast');
--
2.27.0
Hi, here are some review comments for the v3 patch
======
Commit message
1.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used.
~
Before giving details about the problems of the other index types it
might be good to summarize why these 2 types are OK.
SUGGESTION
These 2 types of indexes are the only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- implement the function amgettuple()
~~~
2.
I'm not sure why the next paragraphs are numbered 1) and 2). Is that
necessary? It seems maybe a cut/paste hangover from the similar code
comment.
~~~
3.
1) Other indexes do not have a fixed set of strategy numbers at all. In
build_replindex_scan_key(), the operator which corresponds to
"equality comparison"
is searched by using the strategy number, but it is difficult for such indexes.
For example, GiST index operator classes for two-dimensional geometric
objects have
a comparison "same", but its strategy number is different from the
gist_int4_ops,
which is a GiST index operator class that implements the B-tree equivalent.
~
Don't need to say "at all"
~~~
4.
2) Some other indexes do not implement "amgettuple".
RelationFindReplTupleByIndex()
assumes that tuples could be fetched one by one via
index_getnext_slot(), but such
indexes are not supported.
~
4a.
"Some other indexes..." --> Maybe give an example here (e.g. XXX, YYY)
of indexes that do not implement the function.
~
4b.
BEFORE
RelationFindReplTupleByIndex() assumes that tuples could be fetched
one by one via index_getnext_slot(), but such indexes are not
supported.
AFTER
RelationFindReplTupleByIndex() assumes that tuples will be fetched one
by one via index_getnext_slot(), but this currently requires the
"amgetuple" function.
======
src/backend/executor/execReplication.c
5.
+ * 2) Some other indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by
+ * one via index_getnext_slot(), but such indexes are not supported. To make it
+ * use index_getbitmap() must be used, but not done yet because of the above
+ * reason.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
~
Change this text to better match exactly in the commit message (see
previous review comments above). Also I am not sure it is necessary to
say how it *might* be implemented in the future if somebody wanted to
enhance it to work also for "amgetbitmap" function. E.g. do we need
that sentence "To make it..."
~~~
6. get_equal_strategy_number_for_am
+ case GIST_AM_OID:
+ case SPGIST_AM_OID:
+ case GIN_AM_OID:
+ case BRIN_AM_OID:
+ default:
I am not sure it is necessary to spell out all these not-supported
cases in the switch. If seems sufficient just to say 'default:'
doesn't it?
~~~
7. get_equal_strategy_number
Two blank lines are following this function
~~~
8. build_replindex_scan_key
- * This is not generic routine, it expects the idxrel to be a btree,
non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1]Sawada-san doc for RI restriction - /messages/by-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g@mail.gmail.com.
======
src/backend/replication/logical/relation.c
9. FindUsableIndexForReplicaIdentityFull
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
~
Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1]Sawada-san doc for RI restriction - /messages/by-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g@mail.gmail.com.
~
10.
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;
I am not sure if the function IsIndexOnlyExpression() even needed
anymore. Isn't it sufficient just to check up-front is the leftmost
INDEX field is a column and that covers this condition also? Actually,
this same question is also open in the Sawada-san thread [1]Sawada-san doc for RI restriction - /messages/by-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g@mail.gmail.com.
======
.../subscription/t/032_subscribe_use_index.pl
11.
AFAIK there is no test to verify that the leftmost index field must be
a column (e.g. cannot be an expression). Yes, there are some tests for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.
------
[1]: Sawada-san doc for RI restriction - /messages/by-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g@mail.gmail.com
/messages/by-id/CAD21AoBzp9H2WV4kDagat2WUOsiYJLo10zJ1E5dZYnRTx1pc_g@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Austalia
Dear Peter,
Thank you for reviewing! PSA new version.
1.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY
on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used.~
Before giving details about the problems of the other index types it
might be good to summarize why these 2 types are OK.SUGGESTION
These 2 types of indexes are the only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- implement the function amgettuple()
Added.
2.
I'm not sure why the next paragraphs are numbered 1) and 2). Is that
necessary? It seems maybe a cut/paste hangover from the similar code
comment.
Yeah, this was just copied from code comments. Numbers were removed.
3.
1) Other indexes do not have a fixed set of strategy numbers at all. In
build_replindex_scan_key(), the operator which corresponds to
"equality comparison"
is searched by using the strategy number, but it is difficult for such indexes.
For example, GiST index operator classes for two-dimensional geometric
objects have
a comparison "same", but its strategy number is different from the
gist_int4_ops,
which is a GiST index operator class that implements the B-tree equivalent.~
Don't need to say "at all"
Removed.
4.
2) Some other indexes do not implement "amgettuple".
RelationFindReplTupleByIndex()
assumes that tuples could be fetched one by one via
index_getnext_slot(), but such
indexes are not supported.~
4a.
"Some other indexes..." --> Maybe give an example here (e.g. XXX, YYY)
of indexes that do not implement the function.
I clarified like "BRIN and GIN indexes do not implement...", which are the built-in
indexes. Actually the bloom index cannot be supported due to the same reason, but
I did not mention because it is not in core.
4b.
BEFORE
RelationFindReplTupleByIndex() assumes that tuples could be fetched
one by one via index_getnext_slot(), but such indexes are not
supported.AFTER
RelationFindReplTupleByIndex() assumes that tuples will be fetched one
by one via index_getnext_slot(), but this currently requires the
"amgetuple" function.
Changed.
src/backend/executor/execReplication.c
5. + * 2) Some other indexes do not implement "amgettuple". + * RelationFindReplTupleByIndex() assumes that tuples could be fetched one by + * one via index_getnext_slot(), but such indexes are not supported. To make it + * use index_getbitmap() must be used, but not done yet because of the above + * reason. + */ +int +get_equal_strategy_number_for_am(Oid am)~
Change this text to better match exactly in the commit message (see
previous review comments above).
Copied from commit message.
Also I am not sure it is necessary to
say how it *might* be implemented in the future if somebody wanted to
enhance it to work also for "amgetbitmap" function. E.g. do we need
that sentence "To make it..."
Added, how do you think?
6. get_equal_strategy_number_for_am
+ case GIST_AM_OID: + case SPGIST_AM_OID: + case GIN_AM_OID: + case BRIN_AM_OID: + default:I am not sure it is necessary to spell out all these not-supported
cases in the switch. If seems sufficient just to say 'default:'
doesn't it?
Yeah, it's sufficient. This is a garbage for previous PoC.
7. get_equal_strategy_number
Two blank lines are following this function
Removed.
8. build_replindex_scan_key
- * This is not generic routine, it expects the idxrel to be a btree, non-partial - * and have at least one column reference (i.e. cannot consist of only - * expressions). + * This is not generic routine, it expects the idxrel to be a btree or a hash, + * non-partial and have at least one column reference (i.e. cannot consist of + * only expressions).Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1].
Thanks for reminder. I thought that this change seems not needed anymore if the patch
is pushed. But anyway I kept it because this may be pushed first.
src/backend/replication/logical/relation.c
9. FindUsableIndexForReplicaIdentityFull
* Returns the oid of an index that can be used by the apply worker to scan - * the relation. The index must be btree, non-partial, and have at least - * one column reference (i.e. cannot consist of only expressions). These + * the relation. The index must be btree or hash, non-partial, and have at + * least one column reference (i.e. cannot consist of only expressions). These * limitations help to keep the index scan similar to PK/RI index scans.~
Take care. AFAIK this change will clash with changes Sawawa-san is
making to the same code comment in another thread here [1].
Thanks for reminder. I thought that this change must be kept even if it will be
pushed. We must check the thread...
10. + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;I am not sure if the function IsIndexOnlyExpression() even needed
anymore. Isn't it sufficient just to check up-front is the leftmost
INDEX field is a column and that covers this condition also? Actually,
this same question is also open in the Sawada-san thread [1].======
.../subscription/t/032_subscribe_use_index.pl11.
AFAIK there is no test to verify that the leftmost index field must be
a column (e.g. cannot be an expression). Yes, there are some tests for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.
I'm not sure they should be fixed in the patch. You have raised these points
at the Sawada-san's thread[1]/messages/by-id/CAHut+Pv3AgAnP+JTsPseuU-CT+OrSGiqzxqw4oQmWeKLkAea4Q@mail.gmail.com and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch.
[1]: /messages/by-id/CAHut+Pv3AgAnP+JTsPseuU-CT+OrSGiqzxqw4oQmWeKLkAea4Q@mail.gmail.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v4-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchapplication/octet-stream; name=v4-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchDownload
From ac476d5417ffd75faa27b20e4d4ece4864c064f7 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Mon, 19 Jun 2023 08:28:00 +0000
Subject: [PATCH v4] Allow the use Hash index when REPLICA IDENTITY FULL is
specified.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used. These 2 types of indexes are the
only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- implement the function amgettuple()
Other indexes do not have a fixed set of strategy numbers. In
build_replindex_scan_key(), the operator which corresponds to "equality comparison"
is searched by using the strategy number, but it is difficult for such indexes.
For example, GiST index operator classes for two-dimensional geometric objects have
a comparison "same", but its strategy number is different from the gist_int4_ops,
which is a GiST index operator class that implements the B-tree equivalent.
Moreover, BRIN and GIN indexes do not implement "amgettuple". RelationFindReplTupleByIndex()
assumes that tuples will be fetched one by one via index_getnext_slot(), but this
currently requires the "amgetuple" function.
---
src/backend/executor/execReplication.c | 74 +++++++++++++++++--
src/backend/replication/logical/relation.c | 19 +++--
src/backend/utils/cache/lsyscache.c | 22 ++++++
src/include/executor/executor.h | 1 +
src/include/utils/lsyscache.h | 1 +
.../subscription/t/032_subscribe_use_index.pl | 69 +++++++++++++++++
6 files changed, 174 insertions(+), 12 deletions(-)
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..5f72133fca 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,7 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,15 +42,73 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Return the strategy number which corresponds to the equality operator for
+ * given index access method.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. This is because
+ * mainly two reasons:
+ *
+ * 1) Other indexes do not have a fixed set of strategy numbers.
+ * In build_replindex_scan_key(), the operator which corresponds to "equality
+ * comparison" is searched by using the strategy number, but it is difficult
+ * for such indexes. For example, GiST index operator classes for
+ * two-dimensional geometric objects have a comparison "same", but its strategy
+ * number is different from the gist_int4_ops, which is a GiST index operator
+ * class that implements the B-tree equivalent.
+ *
+ * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples will be fetched one by
+ * one via index_getnext_slot(), but this currently requires the "amgetuple"
+ * function. To make it use index_getbitmap() must be used, which fetches all
+ * the tuples at once.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
+{
+ int ret;
+
+ switch (am)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ default:
+ /* XXX: Only Btree and Hash indexes are supported */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * operator.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. See comments
+ * atop get_equal_strategy_number_for_am().
+ */
+static int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am = get_opclass_method(opclass);
+
+ return get_equal_strategy_number_for_am(am);
+}
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
*
* Returns how many columns to use for the index scan.
*
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
*
* By definition, replication identity of a rel meets all limitations associated
* with that. Note that any other index could also meet these limitations.
@@ -77,6 +136,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ int eq_strategy;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +153,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ eq_strategy = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ eq_strategy);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ eq_strategy, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ eq_strategy,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..3a4daf2669 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,8 +779,8 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
*
* Note that the limitations of index scans for replica identity full only
@@ -841,11 +841,18 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ bool is_suitable_type;
+ bool is_partial;
+ bool is_only_on_expression;
- return is_btree && !is_partial && !is_only_on_expression;
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..05813d98ae 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method operator class is for.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..faa629c9fe 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -665,6 +665,7 @@ extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
const char *relname);
+extern int get_equal_strategy_number_for_am(Oid am);
/*
* prototypes from functions in nodeModifyTable.c
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..f5fdbfe116 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl b/src/test/subscription/t/032_subscribe_use_index.pl
index 576eec6a57..0e4e9ca18e 100644
--- a/src/test/subscription/t/032_subscribe_use_index.pl
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -478,6 +478,75 @@ $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
# data
# =============================================================================
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full deletes 2 rows and updates 2 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use hash index
+# =============================================================================
+
$node_subscriber->stop('fast');
$node_publisher->stop('fast');
--
2.27.0
On Wed, Jul 12, 2023 at 8:37 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
10. + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;I am not sure if the function IsIndexOnlyExpression() even needed
anymore. Isn't it sufficient just to check up-front is the leftmost
INDEX field is a column and that covers this condition also? Actually,
this same question is also open in the Sawada-san thread [1].======
.../subscription/t/032_subscribe_use_index.pl11.
AFAIK there is no test to verify that the leftmost index field must be
a column (e.g. cannot be an expression). Yes, there are some tests for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.I'm not sure they should be fixed in the patch. You have raised these points
at the Sawada-san's thread[1] and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch.
I think it is reasonable to discuss the existing code-related
improvements in a separate thread rather than trying to change this
patch. I have some other comments about your patch:
1.
+ * 1) Other indexes do not have a fixed set of strategy numbers.
+ * In build_replindex_scan_key(), the operator which corresponds to "equality
+ * comparison" is searched by using the strategy number, but it is difficult
+ * for such indexes. For example, GiST index operator classes for
+ * two-dimensional geometric objects have a comparison "same", but its strategy
+ * number is different from the gist_int4_ops, which is a GiST index operator
+ * class that implements the B-tree equivalent.
+ *
...
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
...
I think this comment is slightly difficult to understand. Can we make
it a bit generic by saying something like: "The index access methods
other than BTREE and HASH don't have a fixed strategy for equality
operation. Note that in other index access methods, the support
routines of each operator class interpret the strategy numbers
according to the operator class's definition. So, we return
InvalidStrategy number for index access methods other than BTREE and
HASH."
2.
+ * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples will be fetched one by
+ * one via index_getnext_slot(), but this currently requires the "amgetuple"
+ * function. To make it use index_getbitmap() must be used, which fetches all
+ * the tuples at once.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
{
..
I don't think this is a good place for such a comment. We can probably
move this atop IsIndexUsableForReplicaIdentityFull(). I think you need
to mention two reasons in IsIndexUsableForReplicaIdentityFull() why we
support only BTREE and HASH index access methods (a) Refer to comments
atop get_equal_strategy_number_for_am(); (b) mention the reason
related to tuples_equal() as discussed in email [1]/messages/by-id/CAA4eK1Jv8+8rax-bejd3Ej+T2fG3tuqP8v89byKCBPVQj9C9pg@mail.gmail.com. Then you can say
that we also need to ensure that the index access methods that we
support must have an implementation "amgettuple" as later while
searching tuple via RelationFindReplTupleByIndex, we need the same. We
can probably have an assert for this as well.
[1]: /messages/by-id/CAA4eK1Jv8+8rax-bejd3Ej+T2fG3tuqP8v89byKCBPVQj9C9pg@mail.gmail.com
--
With Regards,
Amit Kapila.
Here are my review comments for the patch v4.
======
General
1.
FYI, this patch also needs some minor PG docs updates [1]https://www.postgresql.org/docs/devel/logical-replication-publication.html because
section "31.1 Publication" currently refers to only btree - e.g.
"Candidate indexes must be btree, non-partial, and have..."
(this may also clash with Sawada-san's other thread as previously mentioned)
======
Commit message
2.
Moreover, BRIN and GIN indexes do not implement "amgettuple".
RelationFindReplTupleByIndex()
assumes that tuples will be fetched one by one via
index_getnext_slot(), but this
currently requires the "amgetuple" function.
~
Typo:
/"amgetuple"/"amgettuple"/
======
src/backend/executor/execReplication.c
3.
+ * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex() assumes that tuples will be fetched one by
+ * one via index_getnext_slot(), but this currently requires the "amgetuple"
+ * function. To make it use index_getbitmap() must be used, which fetches all
+ * the tuples at once.
+ */
+int
3a.
Typo:
/"amgetuple"/"amgettuple"/
~
3b.
I think my previous comment about this comment was misunderstood. I
was questioning why that last sentence ("To make it...") about
"index_getbitmap()" is even needed in this patch. I thought it may be
overreach to describe details how to make some future enhancement that
might never happen.
======
src/backend/replication/logical/relation.c
4. IsIndexUsableForReplicaIdentityFull
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ bool is_suitable_type;
+ bool is_partial;
+ bool is_only_on_expression;
- return is_btree && !is_partial && !is_only_on_expression;
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ return is_suitable_type && !is_partial && !is_only_on_expression;
4a.
The code can be written more optimally, because if it is deemed
already that 'is_suitable_type' is false, then there is no point to
continue to do the other checks and function calls.
~~~
4b.
+ /* Check whether the index is supported or not */
+ is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am)
+ != InvalidStrategy));
The above statement has an extra set of nested parentheses for some reason.
======
src/backend/utils/cache/lsyscache.c
5.
/*
* get_opclass_method
*
* Returns the OID of the index access method operator class is for.
*/
Oid
get_opclass_method(Oid opclass)
IMO the comment should be worded more like the previous one in this source file.
SUGGESTION
Returns the OID of the index access method the opclass belongs to.
------
[1]: https://www.postgresql.org/docs/devel/logical-replication-publication.html
Kind Regards,
Peter Smith.
Fujitsu Australia
Dear Peter,
Thanks for giving comment.
1.
FYI, this patch also needs some minor PG docs updates [1] because
section "31.1 Publication" currently refers to only btree - e.g.
"Candidate indexes must be btree, non-partial, and have..."(this may also clash with Sawada-san's other thread as previously mentioned)
Fixed that, but I could not find any other points. Do you have in mind others?
I checked related commits like 89e46d and adedf5, but only the part was changed.
Commit message
2.
Moreover, BRIN and GIN indexes do not implement "amgettuple".
RelationFindReplTupleByIndex()
assumes that tuples will be fetched one by one via
index_getnext_slot(), but this
currently requires the "amgetuple" function.~
Typo:
/"amgetuple"/"amgettuple"/
Fixed.
src/backend/executor/execReplication.c
3. + * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple". + * RelationFindReplTupleByIndex() assumes that tuples will be fetched one by + * one via index_getnext_slot(), but this currently requires the "amgetuple" + * function. To make it use index_getbitmap() must be used, which fetches all + * the tuples at once. + */ +int3a.
Typo:
/"amgetuple"/"amgettuple"/
Per suggestion from Amit [1]/messages/by-id/CAA4eK1++R3WSfsGH0yFR9WEbkDfF__OccADR7qXDnHGTww+kvQ@mail.gmail.com, the paragraph was removed.
3b.
I think my previous comment about this comment was misunderstood. I
was questioning why that last sentence ("To make it...") about
"index_getbitmap()" is even needed in this patch. I thought it may be
overreach to describe details how to make some future enhancement that
might never happen.
Removed.
src/backend/replication/logical/relation.c
4. IsIndexUsableForReplicaIdentityFull
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo) { - bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID); - bool is_partial = (indexInfo->ii_Predicate != NIL); - bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + bool is_suitable_type; + bool is_partial; + bool is_only_on_expression;- return is_btree && !is_partial && !is_only_on_expression; + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;4a.
The code can be written more optimally, because if it is deemed
already that 'is_suitable_type' is false, then there is no point to
continue to do the other checks and function calls.
Right, is_suitable_type is now removed.
4b.
+ /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy));The above statement has an extra set of nested parentheses for some reason.
This part was removed per above comment.
src/backend/utils/cache/lsyscache.c
5.
/*
* get_opclass_method
*
* Returns the OID of the index access method operator class is for.
*/
Oid
get_opclass_method(Oid opclass)IMO the comment should be worded more like the previous one in this source file.
SUGGESTION
Returns the OID of the index access method the opclass belongs to.
Fixed.
[1]: /messages/by-id/CAA4eK1++R3WSfsGH0yFR9WEbkDfF__OccADR7qXDnHGTww+kvQ@mail.gmail.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Attachments:
v5-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchapplication/octet-stream; name=v5-0001-Allow-the-use-Hash-index-when-REPLICA-IDENTITY-FU.patchDownload
From 3ba7e4e1f876d92a913015cdd21a2de32807cfc4 Mon Sep 17 00:00:00 2001
From: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Date: Mon, 19 Jun 2023 08:28:00 +0000
Subject: [PATCH v5] Allow the use Hash index when REPLICA IDENTITY FULL is
specified.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used. These 2 types of indexes are the
only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- supported by tuples_equal()
- implement the function amgettuple()
Index access methods other than them don't have a fixed strategy for equality
operation. Note that in other index access methods, the support routines of each
operator class interpret the strategy numbers according to the operator class's
definition. In build_replindex_scan_key(), the operator which corresponds to
"equality comparison" is searched by using the strategy number, but it is
difficult for such indexes.
tuples_equal() cannot accept a datatype that does not have an operator class for
Btree or Hash. One motivation for other types of indexes to be used is to define
an index to attributes such as datatypes like point and box. But lookup_type_cache(),
which is called from tuples_equal(), cannot return the valid value if input datatypes
do not have such a opclass.
Moreover, BRIN and GIN indexes do not implement "amgettuple". RelationFindReplTupleByIndex()
assumes that tuples will be fetched one by one via index_getnext_slot(), but this
currently requires the "amgettuple" function.
---
doc/src/sgml/logical-replication.sgml | 2 +-
src/backend/executor/execReplication.c | 64 +++++++++++++++--
src/backend/replication/logical/relation.c | 47 +++++++++++--
src/backend/utils/cache/lsyscache.c | 22 ++++++
src/include/executor/executor.h | 1 +
src/include/utils/lsyscache.h | 1 +
.../subscription/t/032_subscribe_use_index.pl | 69 +++++++++++++++++++
7 files changed, 193 insertions(+), 13 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..d8339add49 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,7 +134,7 @@
to replica identity <literal>FULL</literal>, which means the entire row becomes
the key. When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows. Candidate
- indexes must be btree, non-partial, and have at least one column reference
+ indexes must be btree or hash, non-partial, and have at least one column reference
(i.e. cannot consist of only expressions). These restrictions
on the non-unique index properties adhere to some of the restrictions that
are enforced for primary keys. If there are no such suitable indexes,
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..6f8af59be1 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,7 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,15 +42,63 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Return the strategy number which corresponds to the equality operator for
+ * given index access method.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. This is because
+ * index access methods other than them don't have a fixed strategy for
+ * equality operation. Note that in other index access methods, the support
+ * routines of each operator class interpret the strategy numbers according to
+ * the operator class's definition. So, we return the InvalidStrategy number
+ * for index access methods other than BTREE and HASH.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
+{
+ int ret;
+
+ switch (am)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ default:
+ /* XXX: Only Btree and Hash indexes are supported */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * operator.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. See comments
+ * atop get_equal_strategy_number_for_am().
+ */
+static int
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am = get_opclass_method(opclass);
+
+ return get_equal_strategy_number_for_am(am);
+}
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
*
* Returns how many columns to use for the index scan.
*
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, it expects the idxrel to be a btree or a hash,
+ * non-partial and have at least one column reference (i.e. cannot consist of
+ * only expressions).
*
* By definition, replication identity of a rel meets all limitations associated
* with that. Note that any other index could also meet these limitations.
@@ -77,6 +126,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ int eq_strategy;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +143,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ eq_strategy = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ eq_strategy);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ eq_strategy, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ eq_strategy,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..5f00a0caa9 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -17,6 +17,9 @@
#include "postgres.h"
+#ifdef USE_ASSERT_CHECKING
+#include "access/amapi.h"
+#endif
#include "access/genam.h"
#include "access/table.h"
#include "catalog/namespace.h"
@@ -779,8 +782,8 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
+ * the relation. The index must be btree or hash, non-partial, and have at
+ * least one column reference (i.e. cannot consist of only expressions). These
* limitations help to keep the index scan similar to PK/RI index scans.
*
* Note that the limitations of index scans for replica identity full only
@@ -837,15 +840,47 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
/*
* Returns true if the index is usable for replica identity full. For details,
* see FindUsableIndexForReplicaIdentityFull.
+ *
+ * XXX: Currently, only Btree and Hash indexes can be returned as usable one.
+ * This is because mainly two reasons:
+ *
+ * 1) Other index access methods other than Btree and Hash don't have a fixed
+ * strategy for equality operation. Refer comments atop
+ * get_equal_strategy_number_for_am.
+ * 2) tuples_equal cannot accept a datatype that does not have an operator
+ * class for Btree or Hash. One motivation for other types of indexes to be
+ * used is to define an index to attributes such as datatypes like point and
+ * box. But lookup_type_cache, which is called from tuples_equal, cannot return
+ * the valid value if input datatypes do not have such a opclass.
+ *
+ * Furthermore, BRIN and GIN indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex assumes that tuples will be fetched one by
+ * one via index_getnext_slot, but this currently requires the "amgettuple"
+ * function.
*/
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ bool is_partial;
+ bool is_only_on_expression;
+
+ /* Check whether the index is supported or not */
+ if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
+ return false;
+
+#ifdef USE_ASSERT_CHECKING
+ {
+ /* Check that the given index access method has amgettuple routine */
+ IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am,
+ false);
+ Assert(amroutine->amgettuple != NULL);
+ }
+#endif
+
+ is_partial = (indexInfo->ii_Predicate != NIL);
+ is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
- return is_btree && !is_partial && !is_only_on_expression;
+ return !is_partial && !is_only_on_expression;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..2c01a86c29 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method the opclass belongs to.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..faa629c9fe 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -665,6 +665,7 @@ extern void CheckCmdReplicaIdentity(Relation rel, CmdType cmd);
extern void CheckSubscriptionRelkind(char relkind, const char *nspname,
const char *relname);
+extern int get_equal_strategy_number_for_am(Oid am);
/*
* prototypes from functions in nodeModifyTable.c
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..f5fdbfe116 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl b/src/test/subscription/t/032_subscribe_use_index.pl
index 576eec6a57..0e4e9ca18e 100644
--- a/src/test/subscription/t/032_subscribe_use_index.pl
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -478,6 +478,75 @@ $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
# data
# =============================================================================
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+# XXX: the test will be suspended here
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full deletes 2 rows and updates 2 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use hash index
+# =============================================================================
+
$node_subscriber->stop('fast');
$node_publisher->stop('fast');
--
2.27.0
Dear Amit,
Thanks for checking my patch! The patch can be available at [1]/messages/by-id/TYAPR01MB5866B4F938ADD7088379633CF536A@TYAPR01MB5866.jpnprd01.prod.outlook.com.
======
.../subscription/t/032_subscribe_use_index.pl11.
AFAIK there is no test to verify that the leftmost index field must be
a column (e.g. cannot be an expression). Yes, there are some tests for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.I'm not sure they should be fixed in the patch. You have raised these points
at the Sawada-san's thread[1] and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch.I think it is reasonable to discuss the existing code-related
improvements in a separate thread rather than trying to change this
patch.
OK, so I will not touch in this thread.
I have some other comments about your patch:
1. + * 1) Other indexes do not have a fixed set of strategy numbers. + * In build_replindex_scan_key(), the operator which corresponds to "equality + * comparison" is searched by using the strategy number, but it is difficult + * for such indexes. For example, GiST index operator classes for + * two-dimensional geometric objects have a comparison "same", but its strategy + * number is different from the gist_int4_ops, which is a GiST index operator + * class that implements the B-tree equivalent. + * ... + */ +int +get_equal_strategy_number_for_am(Oid am) ...I think this comment is slightly difficult to understand. Can we make
it a bit generic by saying something like: "The index access methods
other than BTREE and HASH don't have a fixed strategy for equality
operation. Note that in other index access methods, the support
routines of each operator class interpret the strategy numbers
according to the operator class's definition. So, we return
InvalidStrategy number for index access methods other than BTREE and
HASH."
Sounds better. Fixed with some adjustments.
2. + * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple". + * RelationFindReplTupleByIndex() assumes that tuples will be fetched one by + * one via index_getnext_slot(), but this currently requires the "amgetuple" + * function. To make it use index_getbitmap() must be used, which fetches all + * the tuples at once. + */ +int +get_equal_strategy_number_for_am(Oid am) { ..I don't think this is a good place for such a comment. We can probably
move this atop IsIndexUsableForReplicaIdentityFull(). I think you need
to mention two reasons in IsIndexUsableForReplicaIdentityFull() why we
support only BTREE and HASH index access methods (a) Refer to comments
atop get_equal_strategy_number_for_am(); (b) mention the reason
related to tuples_equal() as discussed in email [1]. Then you can say
that we also need to ensure that the index access methods that we
support must have an implementation "amgettuple" as later while
searching tuple via RelationFindReplTupleByIndex, we need the same.
Fixed, and based on that I modified the commit message accordingly.
How do you feel?
We can probably have an assert for this as well.
Added.
[1]: /messages/by-id/TYAPR01MB5866B4F938ADD7088379633CF536A@TYAPR01MB5866.jpnprd01.prod.outlook.com
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
Hi Hayato, all
- return is_btree && !is_partial && !is_only_on_expression; + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;
I don't want to repeat this too much, as it is a minor note. Just
sharing my perspective here.
As discussed in the other email [1]/messages/by-id/CACawEhUWH1qAZ8QNeCve737Qe1_ye=vTW9P22ePiFssT7+HaaQ@mail.gmail.com, I feel like keeping
IsIndexUsableForReplicaIdentityFull() function readable is useful
for documentation purposes as well.
So, I'm more inclined to see something like your old code, maybe with
a different variable name.
bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
to
bool has_equal_strategy = get_equal_strategy_number_for_am...
....
return has_equal_strategy && !is_partial && !is_only_on_expression;
4a.
The code can be written more optimally, because if it is deemed
already that 'is_suitable_type' is false, then there is no point to
continue to do the other checks and function calls.
Sure, there are maybe few cycles of CPU gains, but this code is executed
infrequently, and I don't see much value optimizing it. I think keeping it
slightly
more readable is nicer.
Other than that, I think the code/test looks good. For the
comments/documentation,
I think Amit and Peter have already given quite a bit of useful feedback,
so nothing
much to add from my end.
Thanks,
Onder
[1]: /messages/by-id/CACawEhUWH1qAZ8QNeCve737Qe1_ye=vTW9P22ePiFssT7+HaaQ@mail.gmail.com
/messages/by-id/CACawEhUWH1qAZ8QNeCve737Qe1_ye=vTW9P22ePiFssT7+HaaQ@mail.gmail.com
Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, 12 Tem 2023 Çar, 10:07
tarihinde şunu yazdı:
Show quoted text
Dear Amit,
Thanks for checking my patch! The patch can be available at [1].
======
.../subscription/t/032_subscribe_use_index.pl11.
AFAIK there is no test to verify that the leftmost index field mustbe
a column (e.g. cannot be an expression). Yes, there are some tests
for
*only* expressions like (expr, expr, expr), but IMO there should be
another test for an index like (expr, expr, column) which should also
fail because the column is not the leftmost field.I'm not sure they should be fixed in the patch. You have raised these
points
at the Sawada-san's thread[1] and not sure he has done.
Furthermore, these points are not related with our initial motivation.
Fork, or at least it should be done by another patch.I think it is reasonable to discuss the existing code-related
improvements in a separate thread rather than trying to change this
patch.OK, so I will not touch in this thread.
I have some other comments about your patch:
1. + * 1) Other indexes do not have a fixed set of strategy numbers. + * In build_replindex_scan_key(), the operator which corresponds to"equality
+ * comparison" is searched by using the strategy number, but it is
difficult
+ * for such indexes. For example, GiST index operator classes for + * two-dimensional geometric objects have a comparison "same", but its strategy + * number is different from the gist_int4_ops, which is a GiST indexoperator
+ * class that implements the B-tree equivalent. + * ... + */ +int +get_equal_strategy_number_for_am(Oid am) ...I think this comment is slightly difficult to understand. Can we make
it a bit generic by saying something like: "The index access methods
other than BTREE and HASH don't have a fixed strategy for equality
operation. Note that in other index access methods, the support
routines of each operator class interpret the strategy numbers
according to the operator class's definition. So, we return
InvalidStrategy number for index access methods other than BTREE and
HASH."Sounds better. Fixed with some adjustments.
2. + * 2) Moreover, BRIN and GIN indexes do not implement "amgettuple". + * RelationFindReplTupleByIndex() assumes that tuples will be fetchedone by
+ * one via index_getnext_slot(), but this currently requires the
"amgetuple"
+ * function. To make it use index_getbitmap() must be used, which
fetches all
+ * the tuples at once. + */ +int +get_equal_strategy_number_for_am(Oid am) { ..I don't think this is a good place for such a comment. We can probably
move this atop IsIndexUsableForReplicaIdentityFull(). I think you need
to mention two reasons in IsIndexUsableForReplicaIdentityFull() why we
support only BTREE and HASH index access methods (a) Refer to comments
atop get_equal_strategy_number_for_am(); (b) mention the reason
related to tuples_equal() as discussed in email [1]. Then you can say
that we also need to ensure that the index access methods that we
support must have an implementation "amgettuple" as later while
searching tuple via RelationFindReplTupleByIndex, we need the same.Fixed, and based on that I modified the commit message accordingly.
How do you feel?We can probably have an assert for this as well.
Added.
[1]:
/messages/by-id/TYAPR01MB5866B4F938ADD7088379633CF536A@TYAPR01MB5866.jpnprd01.prod.outlook.comBest Regards,
Hayato Kuroda
FUJITSU LIMITED
Here are some review comments for the v5 patch
======
Commit message.
1.
89e46d allowed using indexes other than PRIMARY KEY or REPLICA IDENTITY on the
subscriber, but only the BTree index could be used. This commit extends the
limitation, now the Hash index can be also used. These 2 types of
indexes are the
only ones supported because only these
- use fix strategy numbers
- implement the "equality" strategy
- supported by tuples_equal()
- implement the function amgettuple()
~
1a.
/fix/fixed/
~
1b.
/supported by tuples_equal()/are supported by tuples_equal()/
~~~
2.
Index access methods other than them don't have a fixed strategy for equality
operation. Note that in other index access methods, the support routines of each
operator class interpret the strategy numbers according to the operator class's
definition. In build_replindex_scan_key(), the operator which corresponds to
"equality comparison" is searched by using the strategy number, but it is
difficult for such indexes.
~
/Index access methods other than them/Other index access methods/
~~~
3.
tuples_equal() cannot accept a datatype that does not have an operator class for
Btree or Hash. One motivation for other types of indexes to be used is to define
an index to attributes such as datatypes like point and box. But
lookup_type_cache(),
which is called from tuples_equal(), cannot return the valid value if
input datatypes
do not have such a opclass.
~
That paragraph looks the same as the code comment in
IsIndexUsableForReplicaIdentityFull(). I wrote some review comments
about that (see #5d below) so the same maybe applies here too.
======
src/backend/executor/execReplication.c
4.
+/*
+ * Return the strategy number which corresponds to the equality operator for
+ * given index access method.
+ *
+ * XXX: Currently, only Btree and Hash indexes are supported. This is because
+ * index access methods other than them don't have a fixed strategy for
+ * equality operation. Note that in other index access methods, the support
+ * routines of each operator class interpret the strategy numbers according to
+ * the operator class's definition. So, we return the InvalidStrategy number
+ * for index access methods other than BTREE and HASH.
+ */
+int
+get_equal_strategy_number_for_am(Oid am)
The function comment seems a bit long. Maybe it can be simplified a bit:
SUGGESTION
Return the strategy number which corresponds to the equality operator
for the given index access method, otherwise, return InvalidStrategy.
XXX: Currently, only Btree and Hash indexes are supported. The other
index access methods don't have a fixed strategy for equality
operation - instead, the support routines of each operator class
interpret the strategy numbers according to the operator class's
definition.
======
src/backend/replication/logical/relation.c
5. FindUsableIndexForReplicaIdentityFull
/*
* Returns true if the index is usable for replica identity full. For details,
* see FindUsableIndexForReplicaIdentityFull.
+ *
+ * XXX: Currently, only Btree and Hash indexes can be returned as usable one.
+ * This is because mainly two reasons:
+ *
+ * 1) Other index access methods other than Btree and Hash don't have a fixed
+ * strategy for equality operation. Refer comments atop
+ * get_equal_strategy_number_for_am.
+ * 2) tuples_equal cannot accept a datatype that does not have an operator
+ * class for Btree or Hash. One motivation for other types of indexes to be
+ * used is to define an index to attributes such as datatypes like point and
+ * box. But lookup_type_cache, which is called from tuples_equal, cannot return
+ * the valid value if input datatypes do not have such a opclass.
+ *
+ * Furthermore, BRIN and GIN indexes do not implement "amgettuple".
+ * RelationFindReplTupleByIndex assumes that tuples will be fetched one by
+ * one via index_getnext_slot, but this currently requires the "amgettuple"
+ * function.
*/
5a.
/as usable one./as useable./
~
5b.
/Other index access methods other than Btree and Hash/Other index
access methods/
~
5c.
Maybe a blank line before 2) will help readability.
~
5d.
"One motivation for other types of indexes to be used is to define an
index to attributes such as datatypes like point and box."
Isn't it enough to omit that sentence and just say:
SUGGESTION
2) tuples_equal() cannot accept a datatype (e.g. point or box) that
does not have an operator class for Btree or Hash. This is because
lookup_type_cache(), which is called from tuples_equal(), cannot
return the valid value if input datatypes do not have such an opclass.
~~~
6. FindUsableIndexForReplicaIdentityFull
+ /* Check whether the index is supported or not */
+ if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
+ return false;
6a.
Really, this entire function is for checking "is supported or not" so
IMO this is not the correct comment just for this statement. BTW, I
noted Onder suggested keeping this as a variable assignment (called
'has_equal_strategy') [1]Onder suggestion - /messages/by-id/CACawEhXvVqxoaqj5aanaT02DHYUJwpkssS4RTZRSuqEOpT0zQg@mail.gmail.com. I also think having a variable is better
because then this extra comment would be unnecessary.
~
6b.
IMO the code is readable with the early exit, but it is fine also if
you want to revert it to how Onder suggested [1]Onder suggestion - /messages/by-id/CACawEhXvVqxoaqj5aanaT02DHYUJwpkssS4RTZRSuqEOpT0zQg@mail.gmail.com. I think it is not
worth worrying too much here because it seems the Sawada-san patch [2]Sawada-san other thread - /messages/by-id/CAD21AoAKx+FY4OPPj+MEF0gM-TAV0=fd3EfPoEsa+cMQLiWjyA@mail.gmail.com
might have intentions to refactor all this same function anyhow.
------
[1]: Onder suggestion - /messages/by-id/CACawEhXvVqxoaqj5aanaT02DHYUJwpkssS4RTZRSuqEOpT0zQg@mail.gmail.com
/messages/by-id/CACawEhXvVqxoaqj5aanaT02DHYUJwpkssS4RTZRSuqEOpT0zQg@mail.gmail.com
[2]: Sawada-san other thread - /messages/by-id/CAD21AoAKx+FY4OPPj+MEF0gM-TAV0=fd3EfPoEsa+cMQLiWjyA@mail.gmail.com
/messages/by-id/CAD21AoAKx+FY4OPPj+MEF0gM-TAV0=fd3EfPoEsa+cMQLiWjyA@mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
On Wed, Jul 12, 2023 at 8:14 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
- return is_btree && !is_partial && !is_only_on_expression; + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;I don't want to repeat this too much, as it is a minor note. Just
sharing my perspective here.As discussed in the other email [1], I feel like keeping
IsIndexUsableForReplicaIdentityFull() function readable is useful
for documentation purposes as well.So, I'm more inclined to see something like your old code, maybe with
a different variable name.bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
to
bool has_equal_strategy = get_equal_strategy_number_for_am...
....
return has_equal_strategy && !is_partial && !is_only_on_expression;
+1 for the readability. I think we can as you suggest or I feel it
would be better if we return false as soon as we found any condition
is false. The current patch has a mixed style such that for
InvalidStrategy, it returns immediately but for others, it does a
combined check. The other point we should consider in this regard is
the below assert check:
+#ifdef USE_ASSERT_CHECKING
+ {
+ /* Check that the given index access method has amgettuple routine */
+ IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am,
+ false);
+ Assert(amroutine->amgettuple != NULL);
+ }
+#endif
Apart from having an assert, we have the following two options (a)
check this condition as well and return false if am doesn't support
amgettuple (b) report elog(ERROR, ..) in this case.
I am of the opinion that we should either have an assert for this or
do (b) because if do (a) currently there is no case where it can
return false. What do you think?
--
With Regards,
Amit Kapila.
On Thu, Jul 13, 2023 at 8:51 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Wed, Jul 12, 2023 at 8:14 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
- return is_btree && !is_partial && !is_only_on_expression; + /* Check whether the index is supported or not */ + is_suitable_type = ((get_equal_strategy_number_for_am(indexInfo->ii_Am) + != InvalidStrategy)); + + is_partial = (indexInfo->ii_Predicate != NIL); + is_only_on_expression = IsIndexOnlyOnExpression(indexInfo); + + return is_suitable_type && !is_partial && !is_only_on_expression;I don't want to repeat this too much, as it is a minor note. Just
sharing my perspective here.As discussed in the other email [1], I feel like keeping
IsIndexUsableForReplicaIdentityFull() function readable is useful
for documentation purposes as well.So, I'm more inclined to see something like your old code, maybe with
a different variable name.bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
to
bool has_equal_strategy = get_equal_strategy_number_for_am...
....
return has_equal_strategy && !is_partial && !is_only_on_expression;+1 for the readability. I think we can as you suggest or I feel it
would be better if we return false as soon as we found any condition
is false. The current patch has a mixed style such that for
InvalidStrategy, it returns immediately but for others, it does a
combined check.
I have followed the later style in the attached.
The other point we should consider in this regard is
the below assert check:+#ifdef USE_ASSERT_CHECKING + { + /* Check that the given index access method has amgettuple routine */ + IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am, + false); + Assert(amroutine->amgettuple != NULL); + } +#endifApart from having an assert, we have the following two options (a)
check this condition as well and return false if am doesn't support
amgettuple (b) report elog(ERROR, ..) in this case.I am of the opinion that we should either have an assert for this or
do (b) because if do (a) currently there is no case where it can
return false. What do you think?
For now, I have kept the assert but moved it to the end of the function.
Apart from the above, I have made a number of minor changes (a)
changed the datatype for the strategy to StrategyNumber at various
places in the patch; (b) made a number of changes in comments based on
Peter's comments and otherwise; (c) ran pgindent and changed the
commit message; (d) few other cosmetic changes.
Let me know what you think of the attached.
--
With Regards,
Amit Kapila.
Attachments:
v6-0001-Allow-the-use-of-a-hash-index-on-the-subscriber-d.patchapplication/octet-stream; name=v6-0001-Allow-the-use-of-a-hash-index-on-the-subscriber-d.patchDownload
From 435c463cd7c30aecf51221c7ff9286b9a07328f6 Mon Sep 17 00:00:00 2001
From: Amit Kapila <akapila@postgresql.org>
Date: Thu, 13 Jul 2023 13:52:07 +0530
Subject: [PATCH v8] Allow the use of a hash index on the subscriber during
replication.
Commit 89e46da5e5 allowed using BTREE indexes that are neither
PRIMARY KEY nor REPLICA IDENTITY on the subscriber during apply of
update/delete. This patch extends that functionality to use HASH indexes
as well.
We explored supporting other index access methods as well but they don't
have a fixed strategy for equality operation which is required by the
current infrastructure in logical replication to scan the indexes.
Author: Kuroda Hayato
Reviewed-by: Peter Smith, Onder Kalaci, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB58669D7414E59664E17A5827F522A@TYAPR01MB5866.jpnprd01.prod.outlook.com
---
doc/src/sgml/logical-replication.sgml | 6 +-
src/backend/executor/execReplication.c | 53 ++++++++++++++-
src/backend/replication/logical/relation.c | 41 +++++++++--
src/backend/utils/cache/lsyscache.c | 22 ++++++
src/include/executor/executor.h | 1 +
src/include/utils/lsyscache.h | 1 +
.../subscription/t/032_subscribe_use_index.pl | 68 +++++++++++++++++++
7 files changed, 181 insertions(+), 11 deletions(-)
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c2a749d882..e71f4bac69 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,9 +134,9 @@
to replica identity <literal>FULL</literal>, which means the entire row becomes
the key. When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows. Candidate
- indexes must be btree, non-partial, and the leftmost index field must be a
- column (not an expression) that references the published table column. These
- restrictions on the non-unique index properties adhere to some of the
+ indexes must be btree or hash, non-partial, and the leftmost index field must
+ be a column (not an expression) that references the published table column.
+ These restrictions on the non-unique index properties adhere to some of the
restrictions that are enforced for primary keys. If there are no such
suitable indexes, the search on the subscriber side can be very inefficient,
therefore replica identity <literal>FULL</literal> should only be used as a
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index af09342881..e776524227 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -19,6 +19,7 @@
#include "access/tableam.h"
#include "access/transam.h"
#include "access/xact.h"
+#include "catalog/pg_am_d.h"
#include "commands/trigger.h"
#include "executor/executor.h"
#include "executor/nodeModifyTable.h"
@@ -41,6 +42,49 @@
static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
TypeCacheEntry **eq);
+/*
+ * Returns the fixed strategy number, if any, of the equality operator for the
+ * given index access method, otherwise, InvalidStrategy.
+ *
+ * Currently, only Btree and Hash indexes are supported. The other index access
+ * methods don't have a fixed strategy for equality operation - instead, the
+ * support routines of each operator class interpret the strategy numbers
+ * according to the operator class's definition.
+ */
+StrategyNumber
+get_equal_strategy_number_for_am(Oid am)
+{
+ int ret;
+
+ switch (am)
+ {
+ case BTREE_AM_OID:
+ ret = BTEqualStrategyNumber;
+ break;
+ case HASH_AM_OID:
+ ret = HTEqualStrategyNumber;
+ break;
+ default:
+ /* XXX: Only Btree and Hash indexes are supported */
+ ret = InvalidStrategy;
+ break;
+ }
+
+ return ret;
+}
+
+/*
+ * Return the appropriate strategy number which corresponds to the equality
+ * operator.
+ */
+static StrategyNumber
+get_equal_strategy_number(Oid opclass)
+{
+ Oid am = get_opclass_method(opclass);
+
+ return get_equal_strategy_number_for_am(am);
+}
+
/*
* Setup a ScanKey for a search in the relation 'rel' for a tuple 'key' that
* is setup to match 'rel' (*NOT* idxrel!).
@@ -77,6 +121,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
Oid opfamily;
RegProcedure regop;
int table_attno = indkey->values[index_attoff];
+ StrategyNumber eq_strategy;
if (!AttributeNumberIsValid(table_attno))
{
@@ -93,20 +138,22 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
*/
optype = get_opclass_input_type(opclass->values[index_attoff]);
opfamily = get_opclass_family(opclass->values[index_attoff]);
+ eq_strategy = get_equal_strategy_number(opclass->values[index_attoff]);
operator = get_opfamily_member(opfamily, optype,
optype,
- BTEqualStrategyNumber);
+ eq_strategy);
+
if (!OidIsValid(operator))
elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
- BTEqualStrategyNumber, optype, optype, opfamily);
+ eq_strategy, optype, optype, opfamily);
regop = get_opcode(operator);
/* Initialize the scankey. */
ScanKeyInit(&skey[skey_attoff],
index_attoff + 1,
- BTEqualStrategyNumber,
+ eq_strategy,
regop,
searchslot->tts_values[table_attno - 1]);
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index c545b90636..e231a61334 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -17,6 +17,9 @@
#include "postgres.h"
+#ifdef USE_ASSERT_CHECKING
+#include "access/amapi.h"
+#endif
#include "access/genam.h"
#include "access/table.h"
#include "catalog/namespace.h"
@@ -779,7 +782,7 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
/*
* Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and the leftmost
+ * the relation. The index must be btree or hash, non-partial, and the leftmost
* field must be a column (not an expression) that references the remote
* relation column. These limitations help to keep the index scan similar
* to PK/RI index scans.
@@ -834,15 +837,43 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
/*
* Returns true if the index is usable for replica identity full. For details,
* see FindUsableIndexForReplicaIdentityFull.
+ *
+ * Currently, only Btree and Hash indexes can be returned as usable. This
+ * is due to following reasons:
+ *
+ * 1) Other index access methods don't have a fixed strategy for equality
+ * operation. Refer get_equal_strategy_number_for_am().
+ *
+ * 2) For indexes other than PK and REPLICA IDENTITY, we need to match the
+ * local and remote tuples. The equality routine tuples_equal() cannot accept
+ * a datatype (e.g. point or box) that does not have an operator class for
+ * Btree or Hash.
+ *
+ * XXX: Note that BRIN and GIN indexes do not implement "amgettuple" which
+ * will be used later to fetch the tuples. See RelationFindReplTupleByIndex().
*/
bool
IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
{
- bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
- bool is_partial = (indexInfo->ii_Predicate != NIL);
- bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+ /* Ensure that the index access method has a valid equal strategy */
+ if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
+ return false;
+ if (indexInfo->ii_Predicate != NIL)
+ return false;
+ if (IsIndexOnlyOnExpression(indexInfo))
+ return false;
+
+#ifdef USE_ASSERT_CHECKING
+ {
+ IndexAmRoutine *amroutine;
- return is_btree && !is_partial && !is_only_on_expression;
+ /* The given index access method must implement amgettuple. */
+ amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am, false);
+ Assert(amroutine->amgettuple != NULL);
+ }
+#endif
+
+ return true;
}
/*
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 60978f9415..2c01a86c29 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -1255,6 +1255,28 @@ get_opclass_opfamily_and_input_type(Oid opclass, Oid *opfamily, Oid *opcintype)
return true;
}
+/*
+ * get_opclass_method
+ *
+ * Returns the OID of the index access method the opclass belongs to.
+ */
+Oid
+get_opclass_method(Oid opclass)
+{
+ HeapTuple tp;
+ Form_pg_opclass cla_tup;
+ Oid result;
+
+ tp = SearchSysCache1(CLAOID, ObjectIdGetDatum(opclass));
+ if (!HeapTupleIsValid(tp))
+ elog(ERROR, "cache lookup failed for opclass %u", opclass);
+ cla_tup = (Form_pg_opclass) GETSTRUCT(tp);
+
+ result = cla_tup->opcmethod;
+ ReleaseSysCache(tp);
+ return result;
+}
+
/* ---------- OPERATOR CACHE ---------- */
/*
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..c677e490d7 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -646,6 +646,7 @@ extern void check_exclusion_constraint(Relation heap, Relation index,
/*
* prototypes from functions in execReplication.c
*/
+extern StrategyNumber get_equal_strategy_number_for_am(Oid am);
extern bool RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
LockTupleMode lockmode,
TupleTableSlot *searchslot,
diff --git a/src/include/utils/lsyscache.h b/src/include/utils/lsyscache.h
index 4f5418b972..f5fdbfe116 100644
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -106,6 +106,7 @@ extern Oid get_opclass_family(Oid opclass);
extern Oid get_opclass_input_type(Oid opclass);
extern bool get_opclass_opfamily_and_input_type(Oid opclass,
Oid *opfamily, Oid *opcintype);
+extern Oid get_opclass_method(Oid opclass);
extern RegProcedure get_opcode(Oid opno);
extern char *get_opname(Oid opno);
extern Oid get_op_rettype(Oid opno);
diff --git a/src/test/subscription/t/032_subscribe_use_index.pl b/src/test/subscription/t/032_subscribe_use_index.pl
index 576eec6a57..880ef2d57a 100644
--- a/src/test/subscription/t/032_subscribe_use_index.pl
+++ b/src/test/subscription/t/032_subscribe_use_index.pl
@@ -478,6 +478,74 @@ $node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
# data
# =============================================================================
+# =============================================================================
+# Testcase start: Subscription can use hash index
+#
+
+# create tables on pub and sub
+$node_publisher->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_publisher->safe_psql('postgres',
+ "ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL");
+$node_subscriber->safe_psql('postgres',
+ "CREATE TABLE test_replica_id_full (x int, y text)");
+$node_subscriber->safe_psql('postgres',
+ "CREATE INDEX test_replica_id_full_idx ON test_replica_id_full USING HASH (x)");
+
+# insert some initial data
+$node_publisher->safe_psql('postgres',
+ "INSERT INTO test_replica_id_full SELECT i, (i%10)::text FROM generate_series(0,10) i"
+);
+
+# create pub/sub
+$node_publisher->safe_psql('postgres',
+ "CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full");
+$node_subscriber->safe_psql('postgres',
+ "CREATE SUBSCRIPTION tap_sub_rep_full CONNECTION '$publisher_connstr application_name=$appname' PUBLICATION tap_pub_rep_full"
+);
+
+# wait for initial table synchronization to finish
+$node_subscriber->wait_for_subscription_sync($node_publisher, $appname);
+
+# delete 2 rows
+$node_publisher->safe_psql('postgres',
+ "DELETE FROM test_replica_id_full WHERE x IN (5, 6)");
+
+# update 2 rows
+$node_publisher->safe_psql('postgres',
+ "UPDATE test_replica_id_full SET x = 100, y = '200' WHERE x IN (1, 2)");
+
+# wait until the index is used on the subscriber
+$node_publisher->wait_for_catchup($appname);
+$node_subscriber->poll_query_until('postgres',
+ q{select (idx_scan = 4) from pg_stat_all_indexes where indexrelname = 'test_replica_id_full_idx';}
+ )
+ or die
+ "Timed out while waiting for check subscriber tap_sub_rep_full deletes 2 rows and updates 2 rows via index";
+
+# make sure that the subscriber has the correct data after the UPDATE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full WHERE (x = 100 and y = '200')"
+);
+is($result, qq(2),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# make sure that the subscriber has the correct data after the first DELETE
+$result = $node_subscriber->safe_psql('postgres',
+ "select count(*) from test_replica_id_full where x in (5, 6)");
+is($result, qq(0),
+ 'ensure subscriber has the correct data at the end of the test');
+
+# cleanup pub
+$node_publisher->safe_psql('postgres', "DROP PUBLICATION tap_pub_rep_full");
+$node_publisher->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+# cleanup sub
+$node_subscriber->safe_psql('postgres', "DROP SUBSCRIPTION tap_sub_rep_full");
+$node_subscriber->safe_psql('postgres', "DROP TABLE test_replica_id_full");
+
+# Testcase end: Subscription can use hash index
+# =============================================================================
+
$node_subscriber->stop('fast');
$node_publisher->stop('fast');
--
2.39.1
Hi Amit, Hayato, all
+1 for the readability. I think we can as you suggest or I feel it
would be better if we return false as soon as we found any condition
is false. The current patch has a mixed style such that for
InvalidStrategy, it returns immediately but for others, it does a
combined check.I have followed the later style in the attached.
Looks good to me!
I agree with your note that the confusion was mostly
around two different styles (e,g., some checks return early others wait
until the final return). Now, the code is pretty easy to follow.
The other point we should consider in this regard is
the below assert check:+#ifdef USE_ASSERT_CHECKING + { + /* Check that the given index access method has amgettuple routine */ + IndexAmRoutine *amroutine = GetIndexAmRoutineByAmId(indexInfo->ii_Am, + false); + Assert(amroutine->amgettuple != NULL); + } +#endifApart from having an assert, we have the following two options (a)
check this condition as well and return false if am doesn't support
amgettuple (b) report elog(ERROR, ..) in this case.
I think with the current state of the patch (e.g., we only support btree
and hash),
Assert looks reasonable.
In the future, in case we have a future hypothetical index type that
fulfills the
"if" checks but fails on amgettuple, we could change the code to "return
false"
such that it gives a chance for the other indexes to satisfy the condition.
Let me know what you think of the attached.
Looks good to me. I have also done some testing, which works as
documented/expected.
Thanks,
Onder
On Thu, Jul 13, 2023 at 8:07 PM Önder Kalacı <onderkalaci@gmail.com> wrote:
Looks good to me. I have also done some testing, which works as documented/expected.
Thanks, I have pushed this after minor changes in the comments.
--
With Regards,
Amit Kapila.