doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Started by Masahiko Sawadaover 2 years ago37 messages
#1Masahiko Sawada
sawada.mshk@gmail.com
1 attachment(s)

Hi,

Commit 89e46da5e5 allowed us to use indexes for searching on REPLICA
IDENTITY FULL tables. The documentation explains:

When replica identity <quote>full</quote> is specified,
indexes can be used on the subscriber side for searching the rows. Candidate
indexes must be btree, non-partial, and have at least one column reference
(i.e. cannot consist of only expressions).

To be exact, IIUC the column reference must be on the leftmost column
of indexes. Does it make sense to mention that? I've attached the
patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

doc_update.patchapplication/octet-stream; name=doc_update.patchDownload
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..4f4d54758c 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -135,9 +135,9 @@
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
    indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only expressions).  These
+   restrictions on the non-unique index properties adhere to some of the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
    the search on the subscriber side can be very inefficient, therefore
    replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#1)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Mon, Jul 3, 2023 at 7:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

Commit 89e46da5e5 allowed us to use indexes for searching on REPLICA
IDENTITY FULL tables. The documentation explains:

When replica identity <quote>full</quote> is specified,
indexes can be used on the subscriber side for searching the rows. Candidate
indexes must be btree, non-partial, and have at least one column reference
(i.e. cannot consist of only expressions).

To be exact, IIUC the column reference must be on the leftmost column
of indexes. Does it make sense to mention that?

Yeah, I think it is good to mention that. Accordingly, the comments
atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted.

--
With Regards,
Amit Kapila.

#3Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#1)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]Amit suggestions - /messages/by-id/CAA4eK1LZ_A-UmC_P+_hLi+Pbwyqak+vRKemZ7imzk2puVTpHOA@mail.gmail.com:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

Actually, I thought the FindUsableIndexForReplicaIdentityFull()
function comment is *already* describing the limitation about the
leftmost column (see fragment below), so IIUC the Sawada-san patch is
only trying to expose that same information in the PG docs.

[FindUsableIndexForReplicaIdentityFull comment fragment]
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

~

OTOH, it may be better if these limitation rule details were not
scattered in the code. e.g. build_replindex_scan_key() function
comment can be simplified:

CURRENT:
* This is not generic routine, it expects the idxrel to be a btree, non-partial
* and have at least one column reference (i.e. cannot consist of only
* expressions).

SUGGESTION:
This is not a generic routine. It expects the 'idxrel' to be an index
deemed "usable" by the function
FindUsableIndexForReplicaIdentityFull().

------
doc/src/sgml/logical-replication.sgml

1.
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.
Candidate
    indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
    the search on the subscriber side can be very inefficient, therefore
    replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields. TBH, I am not sure the patch
wording is even describing the limitation in quite the same way as
what the code is actually doing.

HEAD (code comment):
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

HEAD (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions). These
restrictions on the non-unique index properties adhere to some of the
restrictions that are enforced for primary keys.

PATCHED (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference at the leftmost column indexes (i.e. cannot consist
of only expressions). These restrictions on the non-unique index
properties adhere to some of the restrictions that are enforced for
primary keys.

MY SUGGESTION:
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions).
Furthermore, the leftmost field of the candidate index must be a
column of the published table. These restrictions on the non-unique
index properties adhere to some of the restrictions that are enforced
for primary keys.

------
[1]: Amit suggestions - /messages/by-id/CAA4eK1LZ_A-UmC_P+_hLi+Pbwyqak+vRKemZ7imzk2puVTpHOA@mail.gmail.com
/messages/by-id/CAA4eK1LZ_A-UmC_P+_hLi+Pbwyqak+vRKemZ7imzk2puVTpHOA@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#3)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

Actually, I thought the FindUsableIndexForReplicaIdentityFull()
function comment is *already* describing the limitation about the
leftmost column (see fragment below), so IIUC the Sawada-san patch is
only trying to expose that same information in the PG docs.

[FindUsableIndexForReplicaIdentityFull comment fragment]
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

This implies that the leftmost column of the index must be
non-expression but I feel what the patch intends to say in docs is
more straightforward and it doesn't match what the proposed docs says.

~

OTOH, it may be better if these limitation rule details were not
scattered in the code. e.g. build_replindex_scan_key() function
comment can be simplified:

CURRENT:
* This is not generic routine, it expects the idxrel to be a btree, non-partial
* and have at least one column reference (i.e. cannot consist of only
* expressions).

SUGGESTION:
This is not a generic routine. It expects the 'idxrel' to be an index
deemed "usable" by the function
FindUsableIndexForReplicaIdentityFull().

Note that for PK/ReplicaIdentity, we don't even call
FindUsableIndexForReplicaIdentityFull() but build_replindex_scan_key()
would still be called for such index. So, I am not sure your proposed
wording is an improvement.

------
doc/src/sgml/logical-replication.sgml

1.
the key.  When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity <literal>FULL</literal> should only be used as a
fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields. TBH, I am not sure the patch
wording is even describing the limitation in quite the same way as
what the code is actually doing.

HEAD (code comment):
* We also skip indexes if the remote relation does not contain the leftmost
* column of the index. This is because in most such cases sequential scan is
* favorable over index scan.

HEAD (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions). These
restrictions on the non-unique index properties adhere to some of the
restrictions that are enforced for primary keys.

PATCHED (rendered docs)
Candidate indexes must be btree, non-partial, and have at least one
column reference at the leftmost column indexes (i.e. cannot consist
of only expressions). These restrictions on the non-unique index
properties adhere to some of the restrictions that are enforced for
primary keys.

MY SUGGESTION:
Candidate indexes must be btree, non-partial, and have at least one
column reference (i.e. cannot consist of only expressions).
Furthermore, the leftmost field of the candidate index must be a
column of the published table. These restrictions on the non-unique
index properties adhere to some of the restrictions that are enforced
for primary keys.

I don't know if this suggestion is what the code is actually doing. In
function RemoteRelContainsLeftMostColumnOnIdx(), we have the following
checks:
==========
keycol = indexInfo->ii_IndexAttrNumbers[0];
if (!AttributeNumberIsValid(keycol))
return false;

if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
return false;

return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
==========

The first of these checks indicates that the leftmost column of the
index should be non-expression, second and third indicates what you
suggest in your wording. We can also think that what you wrote in a
way is a superset of "leftmost index column is a non-expression" and
"leftmost index column should be present in remote rel" but I feel it
would be better to explicit about the first part as it is easy to
understand for users at least in docs.

--
With Regards,
Amit Kapila.

#5Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#4)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 5, 2023 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

As for IsIndexOnlyOnExpression(), what part do you think we need to
adjust? It says:

/*
* Returns true if the given index consists only of expressions such as:
* CREATE INDEX idx ON table(foo(col));
*
* Returns false even if there is one column reference:
* CREATE INDEX idx ON table(foo(col), col_2);
*/

and it seems to me that the function doesn't check if the leftmost
index column is a non-expression.

doc/src/sgml/logical-replication.sgml

1.
the key.  When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity <literal>FULL</literal> should only be used as a
fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields.

That was my mistake, it should be " at the leftmost column".

I don't know if this suggestion is what the code is actually doing. In
function RemoteRelContainsLeftMostColumnOnIdx(), we have the following
checks:
==========
keycol = indexInfo->ii_IndexAttrNumbers[0];
if (!AttributeNumberIsValid(keycol))
return false;

if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
return false;

return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
==========

The first of these checks indicates that the leftmost column of the
index should be non-expression, second and third indicates what you
suggest in your wording. We can also think that what you wrote in a
way is a superset of "leftmost index column is a non-expression" and
"leftmost index column should be present in remote rel" but I feel it
would be better to explicit about the first part as it is easy to
understand for users at least in docs.

+1

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#6Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#5)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 5, 2023 at 12:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 5, 2023 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

As for IsIndexOnlyOnExpression(), what part do you think we need to
adjust? It says:

/*
* Returns true if the given index consists only of expressions such as:
* CREATE INDEX idx ON table(foo(col));
*
* Returns false even if there is one column reference:
* CREATE INDEX idx ON table(foo(col), col_2);
*/

and it seems to me that the function doesn't check if the leftmost
index column is a non-expression.

Right, so, we can leave this as is.

--
With Regards,
Amit Kapila.

#7Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#5)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 5, 2023 at 4:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 5, 2023 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

As for IsIndexOnlyOnExpression(), what part do you think we need to
adjust? It says:

/*
* Returns true if the given index consists only of expressions such as:
* CREATE INDEX idx ON table(foo(col));
*
* Returns false even if there is one column reference:
* CREATE INDEX idx ON table(foo(col), col_2);
*/

and it seems to me that the function doesn't check if the leftmost
index column is a non-expression.

TBH, this IsIndexOnlyOnExpression() function seemed redundant to me,
otherwise, there can be some indexes that are firstly considered
"useable" but then fail the subsequent leftmost check. It does not
seem right.

doc/src/sgml/logical-replication.sgml

1.
the key.  When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity <literal>FULL</literal> should only be used as a
fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields.

That was my mistake, it should be " at the leftmost column".

IIUC the subscriber-side table can have more columns than the
publisher-side table, so just describing in the docs that the leftmost
INDEX field must be a column may not be quite enough; it also needs to
say that column has to exist on the publisher-table, doesn't it?

Also, after you document this 'leftmost field restriction' that
already implies there *must* be a non-expression in the INDEX. So I
thought we can just omit the "(i.e. cannot consist of only
expressions)" part.

Anyway, I will wait to see the wording of the updated patch before
commenting further.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#8Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Smith (#7)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 6, 2023 at 7:58 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Jul 5, 2023 at 4:32 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 5, 2023 at 2:46 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 5, 2023 at 9:01 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi. Here are some review comments for this patch.

+1 for the patch idea.

------

I wasn't sure about the code comment adjustments suggested by Amit [1]:
"Accordingly, the comments atop build_replindex_scan_key(),
FindUsableIndexForReplicaIdentityFull(), IsIndexOnlyOnExpression()
should also be adjusted."

As for IsIndexOnlyOnExpression(), what part do you think we need to
adjust? It says:

/*
* Returns true if the given index consists only of expressions such as:
* CREATE INDEX idx ON table(foo(col));
*
* Returns false even if there is one column reference:
* CREATE INDEX idx ON table(foo(col), col_2);
*/

and it seems to me that the function doesn't check if the leftmost
index column is a non-expression.

TBH, this IsIndexOnlyOnExpression() function seemed redundant to me,
otherwise, there can be some indexes that are firstly considered
"useable" but then fail the subsequent leftmost check. It does not
seem right.

I see your point. IsIndexUsableForReplicaIdentityFull(), the sole user
of IsIndexOnlyOnExpression(), is also called by
RelationFindReplTupleByIndex() in an assertion build. I thought this
is the reason why we have separate IsIndexOnlyOnExpression() ( and
IsIndexUsableForReplicaIdentityFull()). But this assertion doesn't
check if the leftmost index column exists on the remote relation. What
are we doing this check for? If it's not necessary, we can remove this
assertion and merge both IsIndexOnlyOnExpression() and
IsIndexUsableForReplicaIdentityFull() into
FindUsableIndexForReplicaIdentityFull().

doc/src/sgml/logical-replication.sgml

1.
the key.  When replica identity <literal>FULL</literal> is specified,
indexes can be used on the subscriber side for searching the rows.
Candidate
indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   at the leftmost column indexes (i.e. cannot consist of only
expressions).  These
+   restrictions on the non-unique index properties adhere to some of
the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
the search on the subscriber side can be very inefficient, therefore
replica identity <literal>FULL</literal> should only be used as a
fallback if no other solution is possible.  If a replica identity other

Isn't this using the word "indexes" with different meanings in the
same sentence? e.g. IIUC "leftmost column indexes" is referring to the
ordinal number of the index fields.

That was my mistake, it should be " at the leftmost column".

IIUC the subscriber-side table can have more columns than the
publisher-side table, so just describing in the docs that the leftmost
INDEX field must be a column may not be quite enough; it also needs to
say that column has to exist on the publisher-table, doesn't it?

Right. I've updated the patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

doc_update_v2.patchapplication/octet-stream; name=doc_update_v2.patchDownload
diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..1acf75be6f 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,12 +134,12 @@
    to replica identity <literal>FULL</literal>, which means the entire row becomes
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
-   indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
-   the search on the subscriber side can be very inefficient, therefore
-   replica identity <literal>FULL</literal> should only be used as a
+   indexes must be btree, non-partial, and have at least one column reference to
+   the published table column at the leftmost column (i.e. cannot consist of only
+   expressions).  These restrictions on the non-unique index properties adhere to
+   some of the restrictions that are enforced for primary keys.  If there are no
+   such suitable indexes, the search on the subscriber side can be very inefficient,
+   therefore replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
    than <literal>FULL</literal> is set on the publisher side, a replica identity
    comprising the same or fewer columns must also be set on the subscriber
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..6f2a465133 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -47,9 +47,9 @@ static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
  *
  * Returns how many columns to use for the index scan.
  *
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, idxrel must be PK, RI, or an index that can be
+ * used for REPLICA IDENTITY FULL table (see FindUsableIndexForReplicaIdentityFull()
+ * for details).
  *
  * By definition, replication identity of a rel meets all limitations associated
  * with that. Note that any other index could also meet these limitations.
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..ce66149b5e 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -780,8 +780,9 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
  * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * one column reference to the remote relation's column at the leftmost column
+ * (i.e. cannot consist of only expressions). These limitations help to keep the
+ * index scan similar to PK/RI index scans.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
#9Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#8)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Hi, Here are my review comments for patch v2.

======
1. doc/src/sgml/logical-replication.sgml

Candidate indexes must be btree, non-partial, and have at least one
column reference to the published table column at the leftmost column
(i.e. cannot consist of only expressions).

~

There is only one column which can be the leftmost one, so the wording
"At least one ... at the leftmost" seemed a bit strange to me.
Personally, I would phrase it something like below:

SUGGESTION #1
Candidate indexes must be btree, non-partial, and the leftmost index
column must reference a published table column (i.e. the index cannot
consist of only expressions).

SUGGESTION #2 (same as above, but omitting the "only expressions"
part, which I think is implied by the "leftmost" rule anyway)
Candidate indexes must be btree, non-partial, and the leftmost index
column must reference a published table column.

======
2. src/backend/replication/logical/relation.c

  * Returns the oid of an index that can be used by the apply worker to scan
  * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * one column reference to the remote relation's column at the leftmost column
+ * (i.e. cannot consist of only expressions). These limitations help
to keep the
+ * index scan similar to PK/RI index scans.

This comment text is similar to the docs change, so refer to the same
suggestions as #1 above.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#10Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Smith (#9)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Fri, Jul 7, 2023 at 10:55 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi, Here are my review comments for patch v2.

======
1. doc/src/sgml/logical-replication.sgml

Candidate indexes must be btree, non-partial, and have at least one
column reference to the published table column at the leftmost column
(i.e. cannot consist of only expressions).

~

There is only one column which can be the leftmost one, so the wording
"At least one ... at the leftmost" seemed a bit strange to me.
Personally, I would phrase it something like below:

SUGGESTION #1
Candidate indexes must be btree, non-partial, and the leftmost index
column must reference a published table column (i.e. the index cannot
consist of only expressions).

I prefer the first suggestion. I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchapplication/octet-stream; name=v3-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchDownload
From 2ff55c318971d34c68fb8a28b9acb97ab99bbf71 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 7 Jul 2023 16:49:39 +0900
Subject: [PATCH v3] Doc: clarify the conditions of usable indexes for REPLICA
 IDENTITY FULL tables.

Commit 89e46da5e allowed REPLICA IDENTIFY FULL tables to use an index
on the subscriber during apply of update/delete.

The source code comments are also updated accordingly.

Reviewed-by: Amit Kapila, Peter Smith
Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com
Backpatch-through: 16
---
 doc/src/sgml/logical-replication.sgml      | 8 ++++----
 src/backend/executor/execReplication.c     | 6 +++---
 src/backend/replication/logical/relation.c | 7 ++++---
 3 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..92c3043303 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,10 +134,10 @@
    to replica identity <literal>FULL</literal>, which means the entire row becomes
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
-   indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
+   indexes must be btree, non-partial, and the leftmost index column must reference
+   a published table column (i.e. cannot consist of only expressions).  These
+   restrictions on the non-unique index properties adhere to some of the restrictions
+   that are enforced for primary keys.  If there are no such suitable indexes,
    the search on the subscriber side can be very inefficient, therefore
    replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..6f2a465133 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -47,9 +47,9 @@ static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
  *
  * Returns how many columns to use for the index scan.
  *
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, idxrel must be PK, RI, or an index that can be
+ * used for REPLICA IDENTITY FULL table (see FindUsableIndexForReplicaIdentityFull()
+ * for details).
  *
  * By definition, replication identity of a rel meets all limitations associated
  * with that. Note that any other index could also meet these limitations.
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..8bc002cdfc 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,9 +779,10 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
 
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * the relation. The index must be btree, non-partial, and the leftmost column
+ * must reference a published table column (i.e. cannot consist of only
+ * expressions). These limitations help to keep the index scan similar to
+ * PK/RI index scans.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
-- 
2.31.1

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#10)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

--
With Regards,
Amit Kapila.

#12Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#11)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1]my review v2 - /messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com.

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2]create index - https://www.postgresql.org/docs/current/sql-createindex.html.

SUGGESTION
Candidate indexes must be btree, non-partial, and the leftmost index
field must be a column that references a published table column (i.e.
the index cannot consist of only expressions).

~~~~

What happened to the earlier idea of removing/merging the redundant
(?) function IsIndexOnlyOnExpression()?
- Something wrong with that?
- Chose not to do it?
- Will do it raised in another thread?

------
[1]: my review v2 - /messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com
/messages/by-id/CAHut+PsFdTZJ7DG1jyu7BpA_1d4hwEd-Q+mQAPWcj1ZLD_X5Dw@mail.gmail.com
[2]: create index - https://www.postgresql.org/docs/current/sql-createindex.html

Kind Regards,
Peter Smith.
Fujitsu Australia

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#12)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Mon, Jul 10, 2023 at 7:55 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1].

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2].

I thought it would be better to be explicit for this case but I am
fine if Sawada-San and you prefer some other way to document it.

--
With Regards,
Amit Kapila.

#14Peter Smith
smithpb2250@gmail.com
In reply to: Amit Kapila (#13)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Mon, Jul 10, 2023 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jul 10, 2023 at 7:55 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1].

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2].

I thought it would be better to be explicit for this case but I am
fine if Sawada-San and you prefer some other way to document it.

I see. How about just moving the parenthesized part to explicitly
refer only to the leftmost field?

SUGGESTION
Candidate indexes must be btree, non-partial, and the leftmost index
field must be a column (not an expression) that references a published
table column.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Smith (#14)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Tue, Jul 11, 2023 at 4:54 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Mon, Jul 10, 2023 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jul 10, 2023 at 7:55 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1].

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2].

I thought it would be better to be explicit for this case but I am
fine if Sawada-San and you prefer some other way to document it.

I see. How about just moving the parenthesized part to explicitly
refer only to the leftmost field?

SUGGESTION
Candidate indexes must be btree, non-partial, and the leftmost index
field must be a column (not an expression) that references a published
table column.

Yeah, something like that works for me.

--
With Regards,
Amit Kapila.

#16Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#15)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Tue, Jul 11, 2023 at 1:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, Jul 11, 2023 at 4:54 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Mon, Jul 10, 2023 at 2:21 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jul 10, 2023 at 7:55 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 8, 2023 at 1:49 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 7, 2023 at 1:36 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I prefer the first suggestion. I've attached the updated patch.

This looks mostly good to me but I think it would be better if we can
also add the information that the leftmost index column must be a
non-expression. So, how about: "Candidate indexes must be btree,
non-partial, and the leftmost index column must be a non-expression
and reference to a published table column (i.e. cannot consist of only
expressions)."?

That part in parentheses ought to say "the index ..." because it is
referring to the full INDEX, not to the leftmost column. I think this
was missed when Sawada-san took my previous suggestion [1].

IMO it doesn't sound right to say the "index column must be a
non-expression". It is already a non-expression because it is a
column. So I think it would be better to refer to this as an INDEX
"field" instead of an INDEX column. Note that "field" is the same
terminology used in the docs for CREATE INDEX [2].

I thought it would be better to be explicit for this case but I am
fine if Sawada-San and you prefer some other way to document it.

I see. How about just moving the parenthesized part to explicitly
refer only to the leftmost field?

SUGGESTION
Candidate indexes must be btree, non-partial, and the leftmost index
field must be a column (not an expression) that references a published
table column.

Yeah, something like that works for me.

Looks good to me. I've attached the updated patch. In the comment in
RemoteRelContainsLeftMostColumnOnIdx(), I used "remote relation"
instead of "published table" as it's more consistent with surrounding
comments. Also, I've removed the comment starting with "We also skip
indedes..." as the new comment now covers it. Please review it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchapplication/octet-stream; name=v4-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchDownload
From 5b55b9850b633c1b9767d3a95a49d0dfd98fecad Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 7 Jul 2023 16:49:39 +0900
Subject: [PATCH v4] Doc: clarify the conditions of usable indexes for REPLICA
 IDENTITY FULL tables.

Commit 89e46da5e allowed REPLICA IDENTIFY FULL tables to use an index
on the subscriber during apply of update/delete.

The source code comments are also updated accordingly.

Reviewed-by: Amit Kapila, Peter Smith
Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com
Backpatch-through: 16
---
 doc/src/sgml/logical-replication.sgml      | 12 ++++++------
 src/backend/executor/execReplication.c     |  6 +++---
 src/backend/replication/logical/relation.c | 11 ++++-------
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..e0996b84ea 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,12 +134,12 @@
    to replica identity <literal>FULL</literal>, which means the entire row becomes
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
-   indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
-   the search on the subscriber side can be very inefficient, therefore
-   replica identity <literal>FULL</literal> should only be used as a
+   indexes must be btree, non-partial, and the leftmost index field must be a
+   column (not an expression) that reference a published table column.  These
+   restrictions on the non-unique index properties adhere to some of the
+   restrictions that are enforced for primary keys.  If there are no such
+   suitable indexes, the search on the subscriber side can be very inefficient,
+   therefore replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
    than <literal>FULL</literal> is set on the publisher side, a replica identity
    comprising the same or fewer columns must also be set on the subscriber
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..6f2a465133 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -47,9 +47,9 @@ static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
  *
  * Returns how many columns to use for the index scan.
  *
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, idxrel must be PK, RI, or an index that can be
+ * used for REPLICA IDENTITY FULL table (see FindUsableIndexForReplicaIdentityFull()
+ * for details).
  *
  * By definition, replication identity of a rel meets all limitations associated
  * with that. Note that any other index could also meet these limitations.
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..fe940335a3 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,9 +779,10 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
 
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * the relation. The index must be btree, non-partial, and the leftmost
+ * field must be a column (not an expression) that reference the remote
+ * relation. These limitations help to keep the index scan similar to
+ * PK/RI index scans.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
@@ -796,10 +797,6 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
  * none of the tuples satisfy the expression for the index scan, we fall-back
  * to sequential execution, which might not be a good idea in some cases.
  *
- * We also skip indexes if the remote relation does not contain the leftmost
- * column of the index. This is because in most such cases sequential scan is
- * favorable over index scan.
- *
  * We expect to call this function when REPLICA IDENTITY FULL is defined for
  * the remote relation.
  *
-- 
2.31.1

#17Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#16)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Here are my comments for v4.

======

Docs/Comments:

All the docs and updated comments LTGM, except I felt one sentence
might be written differently to avoid nested parentheses.

BEFORE
...used for REPLICA IDENTITY FULL table (see
FindUsableIndexForReplicaIdentityFull() for details).

AFTER
...used for REPLICA IDENTITY FULL table. See
FindUsableIndexForReplicaIdentityFull() for details.

====

Logic:

What was the decision about the earlier question [1]/messages/by-id/CAHut+PuGhGHp9Uq8-Wk7uBiirAHF5quDY_1Z6WDoUKRZqkn+rg@mail.gmail.com of
removing/merging the function IsIndexOnlyOnExpression()?

------
[1]: /messages/by-id/CAHut+PuGhGHp9Uq8-Wk7uBiirAHF5quDY_1Z6WDoUKRZqkn+rg@mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia

#18Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Smith (#17)
2 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

Here are my comments for v4.

======

Docs/Comments:

All the docs and updated comments LTGM, except I felt one sentence
might be written differently to avoid nested parentheses.

BEFORE
...used for REPLICA IDENTITY FULL table (see
FindUsableIndexForReplicaIdentityFull() for details).

AFTER
...used for REPLICA IDENTITY FULL table. See
FindUsableIndexForReplicaIdentityFull() for details.

====

Agreed. I've attached the updated patch. I'll push it barring any objections.

Logic:

What was the decision about the earlier question [1] of
removing/merging the function IsIndexOnlyOnExpression()?

I don't think we have concluded any action for it. I agree that
IsIndexOnlyOnExpression() is redundant. We don't need to check *all*
index fields actually. I've attached a draft patch. It removes
IsIndexOnlyOnExpression() and merges
RemoteRelContainsLeftMostColumnOnIdx() to
FindUsableIndexForReplicaIdentityFull. One concern is that we no
longer do the assertion check with
IsIndexUsableForReplicaIdentityFull(). What do you think?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchapplication/octet-stream; name=v5-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchDownload
From 1afcd0a67b2d49840821d6e62fc38de9253121de Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Fri, 7 Jul 2023 16:49:39 +0900
Subject: [PATCH v5] Doc: clarify the conditions of usable indexes for REPLICA
 IDENTITY FULL tables.

Commit 89e46da5e allowed REPLICA IDENTIFY FULL tables to use an index
on the subscriber during apply of update/delete.

The source code comments are also updated accordingly.

Reviewed-by: Amit Kapila, Peter Smith
Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com
Backpatch-through: 16
---
 doc/src/sgml/logical-replication.sgml      | 12 ++++++------
 src/backend/executor/execReplication.c     |  6 +++---
 src/backend/replication/logical/relation.c | 11 ++++-------
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..e0996b84ea 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,12 +134,12 @@
    to replica identity <literal>FULL</literal>, which means the entire row becomes
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
-   indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
-   the search on the subscriber side can be very inefficient, therefore
-   replica identity <literal>FULL</literal> should only be used as a
+   indexes must be btree, non-partial, and the leftmost index field must be a
+   column (not an expression) that reference a published table column.  These
+   restrictions on the non-unique index properties adhere to some of the
+   restrictions that are enforced for primary keys.  If there are no such
+   suitable indexes, the search on the subscriber side can be very inefficient,
+   therefore replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
    than <literal>FULL</literal> is set on the publisher side, a replica identity
    comprising the same or fewer columns must also be set on the subscriber
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..af09342881 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -47,9 +47,9 @@ static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
  *
  * Returns how many columns to use for the index scan.
  *
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, idxrel must be PK, RI, or an index that can be
+ * used for REPLICA IDENTITY FULL table. See FindUsableIndexForReplicaIdentityFull()
+ * for details.
  *
  * By definition, replication identity of a rel meets all limitations associated
  * with that. Note that any other index could also meet these limitations.
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..fe940335a3 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,9 +779,10 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
 
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * the relation. The index must be btree, non-partial, and the leftmost
+ * field must be a column (not an expression) that reference the remote
+ * relation. These limitations help to keep the index scan similar to
+ * PK/RI index scans.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
@@ -796,10 +797,6 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
  * none of the tuples satisfy the expression for the index scan, we fall-back
  * to sequential execution, which might not be a good idea in some cases.
  *
- * We also skip indexes if the remote relation does not contain the leftmost
- * column of the index. This is because in most such cases sequential scan is
- * favorable over index scan.
- *
  * We expect to call this function when REPLICA IDENTITY FULL is defined for
  * the remote relation.
  *
-- 
2.31.1

remove_redundant_check.patchapplication/octet-stream; name=remove_redundant_check.patchDownload
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index af09342881..25d2868744 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -175,16 +175,7 @@ retry:
 		if (!isIdxSafeToSkipDuplicates)
 		{
 			if (eq == NULL)
-			{
-#ifdef USE_ASSERT_CHECKING
-				/* apply assertions only once for the input idxoid */
-				IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
-
-				Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
-#endif
-
 				eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
-			}
 
 			if (!tuples_equal(outslot, searchslot, eq))
 				continue;
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index fe940335a3..39aedf24c8 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -731,52 +731,6 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	return entry;
 }
 
-/*
- * Returns true if the given index consists only of expressions such as:
- * 	CREATE INDEX idx ON table(foo(col));
- *
- * Returns false even if there is one column reference:
- * 	 CREATE INDEX idx ON table(foo(col), col_2);
- */
-static bool
-IsIndexOnlyOnExpression(IndexInfo *indexInfo)
-{
-	for (int i = 0; i < indexInfo->ii_NumIndexKeyAttrs; i++)
-	{
-		AttrNumber	attnum = indexInfo->ii_IndexAttrNumbers[i];
-
-		if (AttributeNumberIsValid(attnum))
-			return false;
-	}
-
-	return true;
-}
-
-/*
- * Returns true if the attrmap contains the leftmost column of the index.
- * Otherwise returns false.
- *
- * attrmap is a map of local attributes to remote ones. We can consult this
- * map to check whether the local index attribute has a corresponding remote
- * attribute.
- */
-static bool
-RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
-{
-	AttrNumber	keycol;
-
-	Assert(indexInfo->ii_NumIndexAttrs >= 1);
-
-	keycol = indexInfo->ii_IndexAttrNumbers[0];
-	if (!AttributeNumberIsValid(keycol))
-		return false;
-
-	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
-		return false;
-
-	return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
-}
-
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
  * the relation. The index must be btree, non-partial, and the leftmost
@@ -791,6 +745,10 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
  * compare the tuples for non-PK/RI index scans. See
  * RelationFindReplTupleByIndex().
  *
+ * attrmap is a map of local attributes to remote ones. We can consult this
+ * map to check whether the local index attribute has a corresponding remote
+ * attribute.
+ *
  * XXX: There are no fundamental problems for supporting non-btree indexes.
  * We mostly need to relax the limitations in RelationFindReplTupleByIndex().
  * For partial indexes, the required changes are likely to be larger. If
@@ -811,40 +769,44 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 	foreach(lc, idxlist)
 	{
 		Oid			idxoid = lfirst_oid(lc);
-		bool		isUsableIdx;
-		bool		containsLeftMostCol;
 		Relation	idxRel;
 		IndexInfo  *idxInfo;
+		AttrNumber	keycol;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		idxInfo = BuildIndexInfo(idxRel);
-		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
-		containsLeftMostCol =
-			RemoteRelContainsLeftMostColumnOnIdx(idxInfo, attrmap);
 		index_close(idxRel, AccessShareLock);
 
+		/* Must be a btree index */
+		if (idxInfo->ii_Am != BTREE_AM_OID)
+			continue;
+
+		/* Must be non-partial index */
+		if (idxInfo->ii_Predicate != NIL)
+			continue;
+
+		Assert(idxInfo->ii_NumIndexAttrs >= 1);
+
+		/* The left most index field must not be an expression */
+		keycol = idxInfo->ii_IndexAttrNumbers[0];
+		if (!AttributeNumberIsValid(keycol))
+			continue;
+
+		/*
+		 * The lest most index field must have a reference to a remote
+		 * relation column.
+		 */
+		if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+			(attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0))
+			continue;
+
 		/* Return the first eligible index found */
-		if (isUsableIdx && containsLeftMostCol)
-			return idxoid;
+		return idxoid;
 	}
 
 	return InvalidOid;
 }
 
-/*
- * Returns true if the index is usable for replica identity full. For details,
- * see FindUsableIndexForReplicaIdentityFull.
- */
-bool
-IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
-{
-	bool		is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
-	bool		is_partial = (indexInfo->ii_Predicate != NIL);
-	bool		is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
-
-	return is_btree && !is_partial && !is_only_on_expression;
-}
-
 /*
  * Get replica identity index or if it is not defined a primary key.
  *
#19Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#18)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 12, 2023 at 12:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

I don't think we have concluded any action for it. I agree that
IsIndexOnlyOnExpression() is redundant. We don't need to check *all*
index fields actually. I've attached a draft patch. It removes
IsIndexOnlyOnExpression() and merges
RemoteRelContainsLeftMostColumnOnIdx() to
FindUsableIndexForReplicaIdentityFull. One concern is that we no
longer do the assertion check with
IsIndexUsableForReplicaIdentityFull(). What do you think?

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

--
With Regards,
Amit Kapila.

#20Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#19)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 12, 2023 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 12, 2023 at 12:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

I don't think we have concluded any action for it. I agree that
IsIndexOnlyOnExpression() is redundant. We don't need to check *all*
index fields actually. I've attached a draft patch. It removes
IsIndexOnlyOnExpression() and merges
RemoteRelContainsLeftMostColumnOnIdx() to
FindUsableIndexForReplicaIdentityFull. One concern is that we no
longer do the assertion check with
IsIndexUsableForReplicaIdentityFull(). What do you think?

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

You mean to pass ApplyExecutionData or attrmap down to
RelationFindReplTupleByIndex()? I think it would be better to call it
from FindReplTupleInLocalRel() instead.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#21Önder Kalacı
onderkalaci@gmail.com
In reply to: Amit Kapila (#19)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Hi Amit, all

Amit Kapila <amit.kapila16@gmail.com>, 12 Tem 2023 Çar, 13:09 tarihinde
şunu yazdı:

On Wed, Jul 12, 2023 at 12:31 PM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com>

wrote:

I don't think we have concluded any action for it. I agree that
IsIndexOnlyOnExpression() is redundant. We don't need to check *all*
index fields actually. I've attached a draft patch. It removes
IsIndexOnlyOnExpression() and merges
RemoteRelContainsLeftMostColumnOnIdx() to
FindUsableIndexForReplicaIdentityFull. One concern is that we no
longer do the assertion check with
IsIndexUsableForReplicaIdentityFull(). What do you think?

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

I think such an approach is slightly better than the proposed changes on
remove_redundant_check.patch

I think one reason we ended up with IsIndexUsableForReplicaIdentityFull()
is that it
is a nice way for documenting the requirements in the code.

However, as you also alluded to in the
thread, RemoteRelContainsLeftMostColumnOnIdx()
breaks this documentation.

I agree that it is nice to have all the logic to be in the same place. I
think remove_redundant_check.patch
does that by inlining IsIndexUsableForReplicaIdentityFull
and RemoteRelContainsLeftMostColumnOnIdx
into FindUsableIndexForReplicaIdentityFull().

As Amit noted, the other way around might be more interesting. We expand
IsIndexUsableForReplicaIdentityFull() such that it also includes
RemoteRelContainsLeftMostColumnOnIdx(). With that, readers of
IsIndexUsableForReplicaIdentityFull() can follow the requirements slightly
easier.

Though, not sure yet if we can get all the necessary information for the
Assert
via ApplyExecutionData in FindReplTupleInLocalRel. Perhaps yes.

Thanks,
Onder

#22Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#18)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 12, 2023 at 5:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

Here are my comments for v4.

======

Docs/Comments:

====

Agreed. I've attached the updated patch. I'll push it barring any objections.

I checked v5-0001 and noticed the following:

======
doc/src/sgml/logical-replication.sgml

BEFORE
... and the leftmost index field must be a column (not an expression)
that reference a published table column.

SUGGESTION ("references the", instead of "reference a")
... and the leftmost index field must be a column (not an expression)
that references the published table column.

(maybe that last word "column" is also unnecessary?)

======
src/backend/replication/logical/relation.c

BEFORE
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that reference the remote relation.

SUGGESTION ("references", instead of "reference")
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that references the remote relation.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#23Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Smith (#22)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 13, 2023 at 8:03 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Jul 12, 2023 at 5:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

Here are my comments for v4.

======

Docs/Comments:

====

Agreed. I've attached the updated patch. I'll push it barring any objections.

I checked v5-0001 and noticed the following:

======
doc/src/sgml/logical-replication.sgml

BEFORE
... and the leftmost index field must be a column (not an expression)
that reference a published table column.

SUGGESTION ("references the", instead of "reference a")
... and the leftmost index field must be a column (not an expression)
that references the published table column.

Thanks, will fix.

(maybe that last word "column" is also unnecessary?)

But an index column doesn't reference the published table, but the
published table's column, no?

======
src/backend/replication/logical/relation.c

BEFORE
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that reference the remote relation.

SUGGESTION ("references", instead of "reference")
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that references the remote relation.

Will fix.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#24Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#23)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 13, 2023 at 11:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jul 13, 2023 at 8:03 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Jul 12, 2023 at 5:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

...

I checked v5-0001 and noticed the following:

======
doc/src/sgml/logical-replication.sgml

BEFORE
... and the leftmost index field must be a column (not an expression)
that reference a published table column.

SUGGESTION ("references the", instead of "reference a")
... and the leftmost index field must be a column (not an expression)
that references the published table column.

Thanks, will fix.

(maybe that last word "column" is also unnecessary?)

But an index column doesn't reference the published table, but the
published table's column, no?

Yeah, but there is some inconsistency with the other code comment that
just says "... that references the remote relation.", so I thought one
of them needs to change. If not this one, then the other one.

======
src/backend/replication/logical/relation.c

BEFORE
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that reference the remote relation.

SUGGESTION ("references", instead of "reference")
The index must be btree, non-partial, and the leftmost field must be a
column (not an expression) that references the remote relation.

Will fix.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#25Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Peter Smith (#24)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 13, 2023 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Thu, Jul 13, 2023 at 11:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jul 13, 2023 at 8:03 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Jul 12, 2023 at 5:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

...

I checked v5-0001 and noticed the following:

======
doc/src/sgml/logical-replication.sgml

BEFORE
... and the leftmost index field must be a column (not an expression)
that reference a published table column.

SUGGESTION ("references the", instead of "reference a")
... and the leftmost index field must be a column (not an expression)
that references the published table column.

Thanks, will fix.

(maybe that last word "column" is also unnecessary?)

But an index column doesn't reference the published table, but the
published table's column, no?

Yeah, but there is some inconsistency with the other code comment that
just says "... that references the remote relation.", so I thought one
of them needs to change. If not this one, then the other one.

Right. So let's add "column" in both places. Attached the updated patch

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v6-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchapplication/octet-stream; name=v6-0001-Doc-clarify-the-conditions-of-usable-indexes-for-.patchDownload
From 267c54b68dccb966ad2ed82c22ece2a0e8a438d1 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <msawada@postgresql.org>
Date: Thu, 13 Jul 2023 10:23:52 +0900
Subject: [PATCH v6] Doc: clarify the conditions of usable indexes for REPLICA
 IDENTITY FULL tables.

Commit 89e46da5e allowed REPLICA IDENTITY FULL tables to use an index
on the subscriber during apply of update/delete. This commit clarifies
in the documentation that the leftmost field of candidate indexes must
be a column (not an expression) that references the published relation
column.

The source code comments are also updated accordingly.

Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAD21AoDJjffEvUFKXT27Q5U8-UU9JHv4rrJ9Ke8Zkc5UPWHLvA@mail.gmail.com
Backpatch-through: 16
---
 doc/src/sgml/logical-replication.sgml      | 12 ++++++------
 src/backend/executor/execReplication.c     |  6 +++---
 src/backend/replication/logical/relation.c | 11 ++++-------
 3 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index c5de2040f7..c2a749d882 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -134,12 +134,12 @@
    to replica identity <literal>FULL</literal>, which means the entire row becomes
    the key.  When replica identity <literal>FULL</literal> is specified,
    indexes can be used on the subscriber side for searching the rows.  Candidate
-   indexes must be btree, non-partial, and have at least one column reference
-   (i.e. cannot consist of only expressions).  These restrictions
-   on the non-unique index properties adhere to some of the restrictions that
-   are enforced for primary keys.  If there are no such suitable indexes,
-   the search on the subscriber side can be very inefficient, therefore
-   replica identity <literal>FULL</literal> should only be used as a
+   indexes must be btree, non-partial, and the leftmost index field must be a
+   column (not an expression) that references the published table column.  These
+   restrictions on the non-unique index properties adhere to some of the
+   restrictions that are enforced for primary keys.  If there are no such
+   suitable indexes, the search on the subscriber side can be very inefficient,
+   therefore replica identity <literal>FULL</literal> should only be used as a
    fallback if no other solution is possible.  If a replica identity other
    than <literal>FULL</literal> is set on the publisher side, a replica identity
    comprising the same or fewer columns must also be set on the subscriber
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..af09342881 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -47,9 +47,9 @@ static bool tuples_equal(TupleTableSlot *slot1, TupleTableSlot *slot2,
  *
  * Returns how many columns to use for the index scan.
  *
- * This is not generic routine, it expects the idxrel to be a btree, non-partial
- * and have at least one column reference (i.e. cannot consist of only
- * expressions).
+ * This is not generic routine, idxrel must be PK, RI, or an index that can be
+ * used for REPLICA IDENTITY FULL table. See FindUsableIndexForReplicaIdentityFull()
+ * for details.
  *
  * By definition, replication identity of a rel meets all limitations associated
  * with that. Note that any other index could also meet these limitations.
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..c545b90636 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -779,9 +779,10 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
 
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree, non-partial, and have at least
- * one column reference (i.e. cannot consist of only expressions). These
- * limitations help to keep the index scan similar to PK/RI index scans.
+ * the relation. The index must be btree, non-partial, and the leftmost
+ * field must be a column (not an expression) that references the remote
+ * relation column. These limitations help to keep the index scan similar
+ * to PK/RI index scans.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
@@ -796,10 +797,6 @@ RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
  * none of the tuples satisfy the expression for the index scan, we fall-back
  * to sequential execution, which might not be a good idea in some cases.
  *
- * We also skip indexes if the remote relation does not contain the leftmost
- * column of the index. This is because in most such cases sequential scan is
- * favorable over index scan.
- *
  * We expect to call this function when REPLICA IDENTITY FULL is defined for
  * the remote relation.
  *
-- 
2.31.1

#26Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#25)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 13, 2023 at 12:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jul 13, 2023 at 11:12 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Thu, Jul 13, 2023 at 11:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Thu, Jul 13, 2023 at 8:03 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, Jul 12, 2023 at 5:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

...

I checked v5-0001 and noticed the following:

======
doc/src/sgml/logical-replication.sgml

BEFORE
... and the leftmost index field must be a column (not an expression)
that reference a published table column.

SUGGESTION ("references the", instead of "reference a")
... and the leftmost index field must be a column (not an expression)
that references the published table column.

Thanks, will fix.

(maybe that last word "column" is also unnecessary?)

But an index column doesn't reference the published table, but the
published table's column, no?

Yeah, but there is some inconsistency with the other code comment that
just says "... that references the remote relation.", so I thought one
of them needs to change. If not this one, then the other one.

Right. So let's add "column" in both places. Attached the updated patch

v6-0001 LGTM.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#27Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#20)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 12, 2023 at 11:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 12, 2023 at 7:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Wed, Jul 12, 2023 at 12:31 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Tue, Jul 11, 2023 at 5:31 PM Peter Smith <smithpb2250@gmail.com> wrote:

I don't think we have concluded any action for it. I agree that
IsIndexOnlyOnExpression() is redundant. We don't need to check *all*
index fields actually. I've attached a draft patch. It removes
IsIndexOnlyOnExpression() and merges
RemoteRelContainsLeftMostColumnOnIdx() to
FindUsableIndexForReplicaIdentityFull. One concern is that we no
longer do the assertion check with
IsIndexUsableForReplicaIdentityFull(). What do you think?

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

You mean to pass ApplyExecutionData or attrmap down to
RelationFindReplTupleByIndex()? I think it would be better to call it
from FindReplTupleInLocalRel() instead.

I've attached a draft patch for this idea.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

remove_redundant_check_v2.patchapplication/octet-stream; name=remove_redundant_check_v2.patchDownload
diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index 9dd7168461..e2b29cd9bd 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -132,7 +132,7 @@ build_replindex_scan_key(ScanKey skey, Relation rel, Relation idxrel,
  * contents, and return true.  Return false otherwise.
  */
 bool
-RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
+RelationFindReplTupleByIndex(Relation rel, Relation idxrel,
 							 LockTupleMode lockmode,
 							 TupleTableSlot *searchslot,
 							 TupleTableSlot *outslot)
@@ -142,15 +142,12 @@ RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
 	IndexScanDesc scan;
 	SnapshotData snap;
 	TransactionId xwait;
-	Relation	idxrel;
 	bool		found;
 	TypeCacheEntry **eq = NULL;
 	bool		isIdxSafeToSkipDuplicates;
 
-	/* Open the index. */
-	idxrel = index_open(idxoid, RowExclusiveLock);
-
-	isIdxSafeToSkipDuplicates = (GetRelationIdentityOrPK(rel) == idxoid);
+	isIdxSafeToSkipDuplicates =
+		(GetRelationIdentityOrPK(rel) == RelationGetRelid(idxrel));
 
 	InitDirtySnapshot(snap);
 
@@ -175,16 +172,7 @@ retry:
 		if (!isIdxSafeToSkipDuplicates)
 		{
 			if (eq == NULL)
-			{
-#ifdef USE_ASSERT_CHECKING
-				/* apply assertions only once for the input idxoid */
-				IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
-
-				Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
-#endif
-
 				eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
-			}
 
 			if (!tuples_equal(outslot, searchslot, eq))
 				continue;
@@ -260,9 +248,6 @@ retry:
 
 	index_endscan(scan);
 
-	/* Don't release lock until commit. */
-	index_close(idxrel, NoLock);
-
 	return found;
 }
 
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index 57ad22b48a..57a5de44b1 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -731,52 +731,6 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	return entry;
 }
 
-/*
- * Returns true if the given index consists only of expressions such as:
- * 	CREATE INDEX idx ON table(foo(col));
- *
- * Returns false even if there is one column reference:
- * 	 CREATE INDEX idx ON table(foo(col), col_2);
- */
-static bool
-IsIndexOnlyOnExpression(IndexInfo *indexInfo)
-{
-	for (int i = 0; i < indexInfo->ii_NumIndexKeyAttrs; i++)
-	{
-		AttrNumber	attnum = indexInfo->ii_IndexAttrNumbers[i];
-
-		if (AttributeNumberIsValid(attnum))
-			return false;
-	}
-
-	return true;
-}
-
-/*
- * Returns true if the attrmap contains the leftmost column of the index.
- * Otherwise returns false.
- *
- * attrmap is a map of local attributes to remote ones. We can consult this
- * map to check whether the local index attribute has a corresponding remote
- * attribute.
- */
-static bool
-RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
-{
-	AttrNumber	keycol;
-
-	Assert(indexInfo->ii_NumIndexAttrs >= 1);
-
-	keycol = indexInfo->ii_IndexAttrNumbers[0];
-	if (!AttributeNumberIsValid(keycol))
-		return false;
-
-	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
-		return false;
-
-	return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
-}
-
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
  * the relation. The index must be btree, non-partial, and have at least
@@ -815,19 +769,16 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 	{
 		Oid			idxoid = lfirst_oid(lc);
 		bool		isUsableIdx;
-		bool		containsLeftMostCol;
 		Relation	idxRel;
 		IndexInfo  *idxInfo;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		idxInfo = BuildIndexInfo(idxRel);
-		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
-		containsLeftMostCol =
-			RemoteRelContainsLeftMostColumnOnIdx(idxInfo, attrmap);
+		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo, attrmap);
 		index_close(idxRel, AccessShareLock);
 
 		/* Return the first eligible index found */
-		if (isUsableIdx && containsLeftMostCol)
+		if (isUsableidx)
 			return idxoid;
 	}
 
@@ -837,15 +788,35 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 /*
  * Returns true if the index is usable for replica identity full. For details,
  * see FindUsableIndexForReplicaIdentityFull.
+ *
+ * attrmap is a map of local attributes to remote ones. We can consult this
+ * map to check whether the local index attribute has a corresponding remote
+ * attribute.
  */
 bool
-IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap)
 {
 	bool		is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
 	bool		is_partial = (indexInfo->ii_Predicate != NIL);
-	bool		is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+	AttrNumber	keycol;
 
-	return is_btree && !is_partial && !is_only_on_expression;
+	/* Must be btree and non-partial index */
+	if (!is_btree || is_partial)
+		return false;
+
+	Assert(indexInfo->ii_NumIndexAttrs >= 1);
+
+	/* The leftmost index field must not be an expression */
+	keycol = indexInfo->ii_IndexAttrNumbers[0];
+	if (!AttributeNumberIsValid(keycol))
+		return false;
+
+	/* And must reference the remote relation column */
+	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+		attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
+		return false;
+
+	return true;
 }
 
 /*
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 0ee764d68f..6c09334ffc 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -78,7 +78,7 @@
  *
  * To avoid this, and similar prepare confusions the subscription's two_phase
  * commit is enabled only after the initial sync is over. The two_phase option
- * has been implemented as a tri-state with values DISABLED, PENDING, and
+n * has been implemented as a tri-state with values DISABLED, PENDING, and
  * ENABLED.
  *
  * Even if the user specifies they want a subscription with two_phase = on,
@@ -140,6 +140,7 @@
 #include <sys/stat.h>
 #include <unistd.h>
 
+#include "access/genam.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/twophase.h"
@@ -410,7 +411,7 @@ static void apply_handle_delete_internal(ApplyExecutionData *edata,
 										 ResultRelInfo *relinfo,
 										 TupleTableSlot *remoteslot,
 										 Oid localindexoid);
-static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+static bool FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 									LogicalRepRelation *remoterel,
 									Oid localidxoid,
 									TupleTableSlot *remoteslot,
@@ -2677,7 +2678,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel,
+	found = FindReplTupleInLocalRel(edata, localrel,
 									&relmapentry->remoterel,
 									localindexoid,
 									remoteslot, &localslot);
@@ -2830,7 +2831,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel, remoterel, localindexoid,
+	found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid,
 									remoteslot, &localslot);
 
 	/* If found delete it. */
@@ -2869,12 +2870,13 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
  * Local tuple, if found, is returned in '*localslot'.
  */
 static bool
-FindReplTupleInLocalRel(EState *estate, Relation localrel,
+FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 						LogicalRepRelation *remoterel,
 						Oid localidxoid,
 						TupleTableSlot *remoteslot,
 						TupleTableSlot **localslot)
 {
+	EState	   *estate = edata->estate;
 	bool		found;
 
 	/*
@@ -2889,9 +2891,24 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(localidxoid))
-		found = RelationFindReplTupleByIndex(localrel, localidxoid,
+	{
+		Relation	idxrel;
+
+		/* Open the index. */
+		idxrel = index_open(localidxoid, RowExclusiveLock);
+
+		/* Index must be PK, RI, or usable for REPLICA IDENTITY FULL tables */
+		Assert(GetRelationIdentityOrPK(idxrel) == localidxoid ||
+			   IsIndexUsableForReplicaIdentityFull(BuildIndexInfo(idxrel),
+												   edata->targetRel->attrmap));
+
+		found = RelationFindReplTupleByIndex(localrel, idxrel,
 											 LockTupleExclusive,
 											 remoteslot, *localslot);
+
+		/* Don't release lock until commit. */
+		index_close(idxrel, NoLock);
+	}
 	else
 		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
 										 remoteslot, *localslot);
@@ -3009,7 +3026,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
 				bool		found;
 
 				/* Get the matching local tuple from the partition. */
-				found = FindReplTupleInLocalRel(estate, partrel,
+				found = FindReplTupleInLocalRel(edata, partrel,
 												&part_entry->remoterel,
 												part_entry->localindexoid,
 												remoteslot_part, &localslot);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index ac02247947..92153da1df 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -646,7 +646,7 @@ extern void check_exclusion_constraint(Relation heap, Relation index,
 /*
  * prototypes from functions in execReplication.c
  */
-extern bool RelationFindReplTupleByIndex(Relation rel, Oid idxoid,
+extern bool RelationFindReplTupleByIndex(Relation rel, Relation idxrel,
 										 LockTupleMode lockmode,
 										 TupleTableSlot *searchslot,
 										 TupleTableSlot *outslot);
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 921b9974db..3f4d906d74 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -48,7 +48,7 @@ extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *r
 														Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
-extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo);
+extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap);
 extern Oid	GetRelationIdentityOrPK(Relation rel);
 
 #endif							/* LOGICALRELATION_H */
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#27)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Thu, Jul 13, 2023 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 12, 2023 at 11:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

You mean to pass ApplyExecutionData or attrmap down to
RelationFindReplTupleByIndex()? I think it would be better to call it
from FindReplTupleInLocalRel() instead.

I've attached a draft patch for this idea.

Looks reasonable to me. However, I am not very sure if we need to
change the prototype of RelationFindReplTupleByIndex(). Few other
minor comments:

1.
- * has been implemented as a tri-state with values DISABLED, PENDING, and
+n * has been implemented as a tri-state with values DISABLED, PENDING, and
  * ENABLED.
  *
The above seems like a spurious change.
2.
+ /* And must reference the remote relation column */
+ if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+ attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
+ return false;
+

I think we should specify the reason for this. I see that in the
commit fd48a86c62 [1]- * We also skip indexes if the remote relation does not contain the leftmost - * column of the index. This is because in most such cases sequential scan is - * favorable over index scan., the reason for this is removed. Shouldn't we
retain that in some form?

[1]: - * We also skip indexes if the remote relation does not contain the leftmost - * column of the index. This is because in most such cases sequential scan is - * favorable over index scan.
- * We also skip indexes if the remote relation does not contain the leftmost
- * column of the index. This is because in most such cases sequential scan is
- * favorable over index scan.

--
With Regards,
Amit Kapila.

#29Önder Kalacı
onderkalaci@gmail.com
In reply to: Masahiko Sawada (#27)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Hi,

I've attached a draft patch for this idea.

I think it needs a rebase after edca3424342da323499a1998d18a888283e52ac7.

Also, as discussed in [1]/messages/by-id/CAA4eK1Jcyrxt_84wt2=QnOcwwJEC2et+tCLjAuTXzE6N3FXqQw@mail.gmail.com, I think one improvement we had was to
keep IsIndexUsableForReplicaIdentityFull() in a way that it is easier to
read & better documented in the code. So, it would be nice to stick to that.

Overall, the proposed changes make sense to me as well. Once the patch is
ready, I'm happy to review & test in detail.

Thanks,
Onder

[1]: /messages/by-id/CAA4eK1Jcyrxt_84wt2=QnOcwwJEC2et+tCLjAuTXzE6N3FXqQw@mail.gmail.com
/messages/by-id/CAA4eK1Jcyrxt_84wt2=QnOcwwJEC2et+tCLjAuTXzE6N3FXqQw@mail.gmail.com

#30Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#28)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Sat, Jul 15, 2023 at 2:11 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Thu, Jul 13, 2023 at 10:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Wed, Jul 12, 2023 at 11:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I think this is a valid concern. Can't we move all the checks
(including the remote attrs check) inside
IsIndexUsableForReplicaIdentityFull() and then call it from both
places? Won't we have attrmap information available in the callers of
FindReplTupleInLocalRel() via ApplyExecutionData?

You mean to pass ApplyExecutionData or attrmap down to
RelationFindReplTupleByIndex()? I think it would be better to call it
from FindReplTupleInLocalRel() instead.

I've attached a draft patch for this idea.

Looks reasonable to me. However, I am not very sure if we need to
change the prototype of RelationFindReplTupleByIndex(). Few other
minor comments:

Agreed. I reverted the change.

1.
- * has been implemented as a tri-state with values DISABLED, PENDING, and
+n * has been implemented as a tri-state with values DISABLED, PENDING, and
* ENABLED.
*
The above seems like a spurious change.

Fixed.

2.
+ /* And must reference the remote relation column */
+ if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+ attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
+ return false;
+

I think we should specify the reason for this. I see that in the
commit fd48a86c62 [1], the reason for this is removed. Shouldn't we
retain that in some form?

Agreed.

I've updated the patch. Regarding one change in the patch:

  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree or hash, non-partial, and the leftmost
- * field must be a column (not an expression) that references the remote
- * relation column. These limitations help to keep the index scan similar
- * to PK/RI index scans.
+ * the relation.

I moved the second sentence to IsIndexUsableForReplicaIdentityFull()
because this function is now responsible for checking if the given
index is usable for REPLICA IDENTITY FULL tables. I think it would be
better to mention all conditions for such usable indexes in one place.
Currently, the conditions are explained in
FindUsableIndexForReplicaIdentityFull() but the checks are performed
and the details are explained in
IsIndexUsableForReplicaIdentityFull().

BTW, IsIndexOnlyExpression() is not necessary but the current code
still works fine. So do we need to backpatch it to PG16? I'm thinking
we can apply it to only HEAD.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchapplication/octet-stream; name=v3-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchDownload
From da4eb0de3eda7792b1f8fd0de7ff19e4dfeadba5 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Jul 2023 15:11:27 +0900
Subject: [PATCH v3] Remove unnecessary checks for indexes for REPLICA IDENTITY
 FULL tables.

Previously, when selecting an usable index for update/delete for the
REPLICA IDENTITY FULL table, in IsIndexonlyExpression(), we used to
check if all index fields are not expressions. However, it was not
necessary actually, because it is enough to check if only the leftmost
index field is not an expression (and references the remote table
column) and this check has already been checked by
RemoteRelContainsLeftMostColumnOnIdx().

This commit removes IsIndexOnlyExpression() and
RemoteRelContainsLeftMostColumnOnIdx() and all checks for usable
indexes for REPLICA IDENTITY FULL tables are now performed by
IsIndexUsableForReplicaIdentityFull().

Reviewed-by:
Discussion: https://postgr.es/m/
---
 src/backend/executor/execReplication.c     |  9 ---
 src/backend/replication/logical/relation.c | 92 ++++++++--------------
 src/backend/replication/logical/worker.c   | 24 ++++--
 src/include/replication/logicalrelation.h  |  2 +-
 4 files changed, 51 insertions(+), 76 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index e776524227..81f27042bc 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -222,16 +222,7 @@ retry:
 		if (!isIdxSafeToSkipDuplicates)
 		{
 			if (eq == NULL)
-			{
-#ifdef USE_ASSERT_CHECKING
-				/* apply assertions only once for the input idxoid */
-				IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
-
-				Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
-#endif
-
 				eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
-			}
 
 			if (!tuples_equal(outslot, searchslot, eq))
 				continue;
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index ed57e5d2b6..ad960f7bad 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -734,58 +734,9 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	return entry;
 }
 
-/*
- * Returns true if the given index consists only of expressions such as:
- * 	CREATE INDEX idx ON table(foo(col));
- *
- * Returns false even if there is one column reference:
- * 	 CREATE INDEX idx ON table(foo(col), col_2);
- */
-static bool
-IsIndexOnlyOnExpression(IndexInfo *indexInfo)
-{
-	for (int i = 0; i < indexInfo->ii_NumIndexKeyAttrs; i++)
-	{
-		AttrNumber	attnum = indexInfo->ii_IndexAttrNumbers[i];
-
-		if (AttributeNumberIsValid(attnum))
-			return false;
-	}
-
-	return true;
-}
-
-/*
- * Returns true if the attrmap contains the leftmost column of the index.
- * Otherwise returns false.
- *
- * attrmap is a map of local attributes to remote ones. We can consult this
- * map to check whether the local index attribute has a corresponding remote
- * attribute.
- */
-static bool
-RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
-{
-	AttrNumber	keycol;
-
-	Assert(indexInfo->ii_NumIndexAttrs >= 1);
-
-	keycol = indexInfo->ii_IndexAttrNumbers[0];
-	if (!AttributeNumberIsValid(keycol))
-		return false;
-
-	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
-		return false;
-
-	return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
-}
-
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree or hash, non-partial, and the leftmost
- * field must be a column (not an expression) that references the remote
- * relation column. These limitations help to keep the index scan similar
- * to PK/RI index scans.
+ * the relation.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
@@ -815,19 +766,16 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 	{
 		Oid			idxoid = lfirst_oid(lc);
 		bool		isUsableIdx;
-		bool		containsLeftMostCol;
 		Relation	idxRel;
 		IndexInfo  *idxInfo;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		idxInfo = BuildIndexInfo(idxRel);
-		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
-		containsLeftMostCol =
-			RemoteRelContainsLeftMostColumnOnIdx(idxInfo, attrmap);
+		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo, attrmap);
 		index_close(idxRel, AccessShareLock);
 
 		/* Return the first eligible index found */
-		if (isUsableIdx && containsLeftMostCol)
+		if (isUsableIdx)
 			return idxoid;
 	}
 
@@ -835,11 +783,13 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 }
 
 /*
- * Returns true if the index is usable for replica identity full. For details,
- * see FindUsableIndexForReplicaIdentityFull.
+ * Returns true if the index is usable for replica identity full.
+ *
+ * The index must be btree or hash, non-partial, and the leftmost field must be
+ * a column (not an expression) that references the remote relation column. These
+ * limitations help to keep the index scan similar to PK/RI index scans.
  *
- * Currently, only Btree and Hash indexes can be returned as usable. This
- * is due to following reasons:
+ * The reasons why only Btree and Hash indexes can be considered as usable are:
  *
  * 1) Other index access methods don't have a fixed strategy for equality
  * operation. Refer get_equal_strategy_number_for_am().
@@ -851,16 +801,36 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
  *
  * XXX: Note that BRIN and GIN indexes do not implement "amgettuple" which
  * will be used later to fetch the tuples. See RelationFindReplTupleByIndex().
+ *
+ * attrmap is a map of local attributes to remote ones. We can consult this
+ * map to check whether the local index attribute has a corresponding remote
+ * attribute.
  */
 bool
-IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap)
 {
+	AttrNumber	keycol;
+
 	/* Ensure that the index access method has a valid equal strategy */
 	if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
 		return false;
 	if (indexInfo->ii_Predicate != NIL)
 		return false;
-	if (IsIndexOnlyOnExpression(indexInfo))
+
+	Assert(indexInfo->ii_NumIndexAttrs >= 1);
+
+	/* The leftmost index field must not be an expression */
+	keycol = indexInfo->ii_IndexAttrNumbers[0];
+	if (!AttributeNumberIsValid(keycol))
+		return false;
+
+	/*
+	 * And must reference the remote relation column. This is because if it
+	 * doesn't, the sequential scan is favorable over index scan in most
+	 * cases..
+	 */
+	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+		attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
 		return false;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dd353fd1cb..0dd69739fb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -140,6 +140,7 @@
 #include <sys/stat.h>
 #include <unistd.h>
 
+#include "access/genam.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/twophase.h"
@@ -410,7 +411,7 @@ static void apply_handle_delete_internal(ApplyExecutionData *edata,
 										 ResultRelInfo *relinfo,
 										 TupleTableSlot *remoteslot,
 										 Oid localindexoid);
-static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+static bool FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 									LogicalRepRelation *remoterel,
 									Oid localidxoid,
 									TupleTableSlot *remoteslot,
@@ -2663,7 +2664,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel,
+	found = FindReplTupleInLocalRel(edata, localrel,
 									&relmapentry->remoterel,
 									localindexoid,
 									remoteslot, &localslot);
@@ -2816,7 +2817,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel, remoterel, localindexoid,
+	found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid,
 									remoteslot, &localslot);
 
 	/* If found delete it. */
@@ -2855,12 +2856,13 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
  * Local tuple, if found, is returned in '*localslot'.
  */
 static bool
-FindReplTupleInLocalRel(EState *estate, Relation localrel,
+FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 						LogicalRepRelation *remoterel,
 						Oid localidxoid,
 						TupleTableSlot *remoteslot,
 						TupleTableSlot **localslot)
 {
+	EState	   *estate = edata->estate;
 	bool		found;
 
 	/*
@@ -2875,9 +2877,21 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(localidxoid))
+	{
+#ifdef USE_ASSERT_CHECKING
+		Relation	idxrel = index_open(localidxoid, AccessShareLock);
+
+		/* Index must be PK, RI, or usable for REPLICA IDENTITY FULL tables */
+		Assert(GetRelationIdentityOrPK(idxrel) == localidxoid ||
+			   IsIndexUsableForReplicaIdentityFull(BuildIndexInfo(idxrel),
+												   edata->targetRel->attrmap));
+		index_close(idxrel, AccessShareLock);
+#endif
+
 		found = RelationFindReplTupleByIndex(localrel, localidxoid,
 											 LockTupleExclusive,
 											 remoteslot, *localslot);
+	}
 	else
 		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
 										 remoteslot, *localslot);
@@ -2995,7 +3009,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
 				bool		found;
 
 				/* Get the matching local tuple from the partition. */
-				found = FindReplTupleInLocalRel(estate, partrel,
+				found = FindReplTupleInLocalRel(edata, partrel,
 												&part_entry->remoterel,
 												part_entry->localindexoid,
 												remoteslot_part, &localslot);
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 921b9974db..3f4d906d74 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -48,7 +48,7 @@ extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *r
 														Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
-extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo);
+extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap);
 extern Oid	GetRelationIdentityOrPK(Relation rel);
 
 #endif							/* LOGICALRELATION_H */
-- 
2.31.1

#31Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#30)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Tue, Jul 18, 2023 at 12:10 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

BTW, IsIndexOnlyExpression() is not necessary but the current code
still works fine. So do we need to backpatch it to PG16? I'm thinking
we can apply it to only HEAD.

Either way is fine but I think if we backpatch it then the code
remains consistent and the backpatching would be easier.

--
With Regards,
Amit Kapila.

#32Önder Kalacı
onderkalaci@gmail.com
In reply to: Masahiko Sawada (#30)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

Hi Masahiko, Amit, all

I've updated the patch.

I think the flow is much nicer now compared to the HEAD. I really don't
have any
comments regarding the accuracy of the code changes, all looks good to me.
Overall, I cannot see any behavioral changes as you already alluded to.

Maybe few minor notes regarding the comments:

/*

+ * And must reference the remote relation column. This is because if it
+ * doesn't, the sequential scan is favorable over index scan in most
+ * cases..
+ */

I think the reader might have lost the context (or say in the future due to
another refactor etc). So maybe start with:

/* And the leftmost index field must refer to the ...

Also, now in IsIndexUsableForReplicaIdentityFull() some of the conditions
have comments
some don't. Should we comment on the missing ones as well, maybe such as:

/* partial indexes are not support *

if (indexInfo->ii_Predicate != NIL)

and,

/* all indexes must have at least one attribute */
Assert(indexInfo->ii_NumIndexAttrs >= 1);

BTW, IsIndexOnlyExpression() is not necessary but the current code
still works fine. So do we need to backpatch it to PG16? I'm thinking
we can apply it to only HEAD.

Either way is fine but I think if we backpatch it then the code
remains consistent and the backpatching would be easier.

Yeah, I also have a slight preference for backporting. It could make it
easier to maintain the code
in the future in case of another backport(s). With the cost of making it
slightly harder for you now :)

Thanks,
Onder

#33Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Önder Kalacı (#32)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Wed, Jul 19, 2023 at 5:09 PM Önder Kalacı <onderkalaci@gmail.com> wrote:

Hi Masahiko, Amit, all

I've updated the patch.

I think the flow is much nicer now compared to the HEAD. I really don't have any
comments regarding the accuracy of the code changes, all looks good to me.
Overall, I cannot see any behavioral changes as you already alluded to.

Thank you for reviewing the patch.

Maybe few minor notes regarding the comments:

/*
+ * And must reference the remote relation column. This is because if it
+ * doesn't, the sequential scan is favorable over index scan in most
+ * cases..
+ */

I think the reader might have lost the context (or say in the future due to
another refactor etc). So maybe start with:

/* And the leftmost index field must refer to the ...

Fixed.

Also, now in IsIndexUsableForReplicaIdentityFull() some of the conditions have comments
some don't. Should we comment on the missing ones as well, maybe such as:

/* partial indexes are not support *
if (indexInfo->ii_Predicate != NIL)

and,

/* all indexes must have at least one attribute */
Assert(indexInfo->ii_NumIndexAttrs >= 1);

Agreed. But I don't think the latter comment is necessary as it's obvious.

BTW, IsIndexOnlyExpression() is not necessary but the current code
still works fine. So do we need to backpatch it to PG16? I'm thinking
we can apply it to only HEAD.

Either way is fine but I think if we backpatch it then the code
remains consistent and the backpatching would be easier.

Yeah, I also have a slight preference for backporting. It could make it easier to maintain the code
in the future in case of another backport(s). With the cost of making it slightly harder for you now :)

Agreed.

I've attached the updated patch. I'll push it early next week, barring
any objections.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchapplication/octet-stream; name=v4-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchDownload
From 1fb3cdddc8a73bdafaca804e0877d97185472019 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Jul 2023 15:11:27 +0900
Subject: [PATCH v4] Remove unnecessary checks for indexes for REPLICA IDENTITY
 FULL tables.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, when selecting an usable index for update/delete for the
REPLICA IDENTITY FULL table, in IsIndexOnlyExpression(), we used to
check if all index fields are not expressions. However, it was not
necessary, because it is enough to check if only the leftmost index
field is not an expression (and references the remote table column)
and this check has already been done by
RemoteRelContainsLeftMostColumnOnIdx().

This commit removes IsIndexOnlyExpression() and
RemoteRelContainsLeftMostColumnOnIdx() and all checks for usable
indexes for REPLICA IDENTITY FULL tables are now performed by
IsIndexUsableForReplicaIdentityFull().

Backpatch this to remain the code consistent.

Reported-by: Peter Smith
Reviewed-by: Amit Kapila, Önder Kalacı
Discussion: https://postgr.es/m/CAHut%2BPsGRE5WSsY0jcLHJEoA17MrbP9yy8FxdjC_ZOAACxbt%2BQ%40mail.gmail.com
Backpatch-through: 16
---
 src/backend/executor/execReplication.c     |  9 ---
 src/backend/replication/logical/relation.c | 94 ++++++++--------------
 src/backend/replication/logical/worker.c   | 24 ++++--
 src/include/replication/logicalrelation.h  |  2 +-
 4 files changed, 53 insertions(+), 76 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index e776524227..81f27042bc 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -222,16 +222,7 @@ retry:
 		if (!isIdxSafeToSkipDuplicates)
 		{
 			if (eq == NULL)
-			{
-#ifdef USE_ASSERT_CHECKING
-				/* apply assertions only once for the input idxoid */
-				IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
-
-				Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
-#endif
-
 				eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
-			}
 
 			if (!tuples_equal(outslot, searchslot, eq))
 				continue;
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index ed57e5d2b6..6ca9839fe0 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -734,58 +734,9 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	return entry;
 }
 
-/*
- * Returns true if the given index consists only of expressions such as:
- * 	CREATE INDEX idx ON table(foo(col));
- *
- * Returns false even if there is one column reference:
- * 	 CREATE INDEX idx ON table(foo(col), col_2);
- */
-static bool
-IsIndexOnlyOnExpression(IndexInfo *indexInfo)
-{
-	for (int i = 0; i < indexInfo->ii_NumIndexKeyAttrs; i++)
-	{
-		AttrNumber	attnum = indexInfo->ii_IndexAttrNumbers[i];
-
-		if (AttributeNumberIsValid(attnum))
-			return false;
-	}
-
-	return true;
-}
-
-/*
- * Returns true if the attrmap contains the leftmost column of the index.
- * Otherwise returns false.
- *
- * attrmap is a map of local attributes to remote ones. We can consult this
- * map to check whether the local index attribute has a corresponding remote
- * attribute.
- */
-static bool
-RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
-{
-	AttrNumber	keycol;
-
-	Assert(indexInfo->ii_NumIndexAttrs >= 1);
-
-	keycol = indexInfo->ii_IndexAttrNumbers[0];
-	if (!AttributeNumberIsValid(keycol))
-		return false;
-
-	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
-		return false;
-
-	return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
-}
-
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree or hash, non-partial, and the leftmost
- * field must be a column (not an expression) that references the remote
- * relation column. These limitations help to keep the index scan similar
- * to PK/RI index scans.
+ * the relation.
  *
  * Note that the limitations of index scans for replica identity full only
  * adheres to a subset of the limitations of PK/RI. For example, we support
@@ -815,19 +766,16 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 	{
 		Oid			idxoid = lfirst_oid(lc);
 		bool		isUsableIdx;
-		bool		containsLeftMostCol;
 		Relation	idxRel;
 		IndexInfo  *idxInfo;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		idxInfo = BuildIndexInfo(idxRel);
-		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
-		containsLeftMostCol =
-			RemoteRelContainsLeftMostColumnOnIdx(idxInfo, attrmap);
+		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo, attrmap);
 		index_close(idxRel, AccessShareLock);
 
 		/* Return the first eligible index found */
-		if (isUsableIdx && containsLeftMostCol)
+		if (isUsableIdx)
 			return idxoid;
 	}
 
@@ -835,11 +783,13 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 }
 
 /*
- * Returns true if the index is usable for replica identity full. For details,
- * see FindUsableIndexForReplicaIdentityFull.
+ * Returns true if the index is usable for replica identity full.
  *
- * Currently, only Btree and Hash indexes can be returned as usable. This
- * is due to following reasons:
+ * The index must be btree or hash, non-partial, and the leftmost field must be
+ * a column (not an expression) that references the remote relation column. These
+ * limitations help to keep the index scan similar to PK/RI index scans.
+ *
+ * The reasons why only Btree and Hash indexes can be considered as usable are:
  *
  * 1) Other index access methods don't have a fixed strategy for equality
  * operation. Refer get_equal_strategy_number_for_am().
@@ -851,16 +801,38 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
  *
  * XXX: Note that BRIN and GIN indexes do not implement "amgettuple" which
  * will be used later to fetch the tuples. See RelationFindReplTupleByIndex().
+ *
+ * attrmap is a map of local attributes to remote ones. We can consult this
+ * map to check whether the local index attribute has a corresponding remote
+ * attribute.
  */
 bool
-IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap)
 {
+	AttrNumber	keycol;
+
 	/* Ensure that the index access method has a valid equal strategy */
 	if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
 		return false;
+
+	/* The index must not be a partial index */
 	if (indexInfo->ii_Predicate != NIL)
 		return false;
-	if (IsIndexOnlyOnExpression(indexInfo))
+
+	Assert(indexInfo->ii_NumIndexAttrs >= 1);
+
+	/* The leftmost index field must not be an expression */
+	keycol = indexInfo->ii_IndexAttrNumbers[0];
+	if (!AttributeNumberIsValid(keycol))
+		return false;
+
+	/*
+	 * And the leftmost index field must reference the remote relation column.
+	 * This is because if it doesn't, the sequential scan is favorable over
+	 * index scan in most cases.
+	 */
+	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+		attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
 		return false;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dd353fd1cb..0dd69739fb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -140,6 +140,7 @@
 #include <sys/stat.h>
 #include <unistd.h>
 
+#include "access/genam.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/twophase.h"
@@ -410,7 +411,7 @@ static void apply_handle_delete_internal(ApplyExecutionData *edata,
 										 ResultRelInfo *relinfo,
 										 TupleTableSlot *remoteslot,
 										 Oid localindexoid);
-static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+static bool FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 									LogicalRepRelation *remoterel,
 									Oid localidxoid,
 									TupleTableSlot *remoteslot,
@@ -2663,7 +2664,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel,
+	found = FindReplTupleInLocalRel(edata, localrel,
 									&relmapentry->remoterel,
 									localindexoid,
 									remoteslot, &localslot);
@@ -2816,7 +2817,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel, remoterel, localindexoid,
+	found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid,
 									remoteslot, &localslot);
 
 	/* If found delete it. */
@@ -2855,12 +2856,13 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
  * Local tuple, if found, is returned in '*localslot'.
  */
 static bool
-FindReplTupleInLocalRel(EState *estate, Relation localrel,
+FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 						LogicalRepRelation *remoterel,
 						Oid localidxoid,
 						TupleTableSlot *remoteslot,
 						TupleTableSlot **localslot)
 {
+	EState	   *estate = edata->estate;
 	bool		found;
 
 	/*
@@ -2875,9 +2877,21 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(localidxoid))
+	{
+#ifdef USE_ASSERT_CHECKING
+		Relation	idxrel = index_open(localidxoid, AccessShareLock);
+
+		/* Index must be PK, RI, or usable for REPLICA IDENTITY FULL tables */
+		Assert(GetRelationIdentityOrPK(idxrel) == localidxoid ||
+			   IsIndexUsableForReplicaIdentityFull(BuildIndexInfo(idxrel),
+												   edata->targetRel->attrmap));
+		index_close(idxrel, AccessShareLock);
+#endif
+
 		found = RelationFindReplTupleByIndex(localrel, localidxoid,
 											 LockTupleExclusive,
 											 remoteslot, *localslot);
+	}
 	else
 		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
 										 remoteslot, *localslot);
@@ -2995,7 +3009,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
 				bool		found;
 
 				/* Get the matching local tuple from the partition. */
-				found = FindReplTupleInLocalRel(estate, partrel,
+				found = FindReplTupleInLocalRel(edata, partrel,
 												&part_entry->remoterel,
 												part_entry->localindexoid,
 												remoteslot_part, &localslot);
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 921b9974db..3f4d906d74 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -48,7 +48,7 @@ extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *r
 														Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
-extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo);
+extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap);
 extern Oid	GetRelationIdentityOrPK(Relation rel);
 
 #endif							/* LOGICALRELATION_H */
-- 
2.31.1

#34Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#33)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Fri, Jul 21, 2023 at 6:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached the updated patch. I'll push it early next week, barring
any objections.

You have moved most of the comments related to the restriction of
which index can be picked atop IsIndexUsableForReplicaIdentityFull().
Now, the comments related to limitation atop
FindUsableIndexForReplicaIdentityFull() look slightly odd as it refers
to limitations but those limitation were not stated. The comments I am
referring to are: "Note that the limitations of index scans for
replica identity full only .... might not be a good idea in some
cases". Shall we move these as well atop
IsIndexUsableForReplicaIdentityFull()?

--
With Regards,
Amit Kapila.

#35Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#34)
1 attachment(s)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Sat, Jul 22, 2023 at 7:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Jul 21, 2023 at 6:55 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

I've attached the updated patch. I'll push it early next week, barring
any objections.

You have moved most of the comments related to the restriction of
which index can be picked atop IsIndexUsableForReplicaIdentityFull().
Now, the comments related to limitation atop
FindUsableIndexForReplicaIdentityFull() look slightly odd as it refers
to limitations but those limitation were not stated. The comments I am
referring to are: "Note that the limitations of index scans for
replica identity full only .... might not be a good idea in some
cases". Shall we move these as well atop
IsIndexUsableForReplicaIdentityFull()?

Good point.

Looking at neighbor comments, the following comment looks slightly odd to me:

* XXX: See IsIndexUsableForReplicaIdentityFull() to know the challenges in
* supporting indexes other than btree and hash. For partial indexes, the
* required changes are likely to be larger. If none of the tuples satisfy
* the expression for the index scan, we fall-back to sequential execution,
* which might not be a good idea in some cases.

Are the first and second sentences related actually?

I think we can move it as well to
IsIndexUsableForReplicaIdentityFull() with some adjustments. I've
attached the updated patch that incorporated your comment and included
my idea to update the comment.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

v5-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchapplication/octet-stream; name=v5-0001-Remove-unnecessary-checks-for-indexes-for-REPLICA.patchDownload
From 47578d5fb5cc9ec7690bb3c71e75b8461f5e7553 Mon Sep 17 00:00:00 2001
From: Masahiko Sawada <sawada.mshk@gmail.com>
Date: Tue, 18 Jul 2023 15:11:27 +0900
Subject: [PATCH v5] Remove unnecessary checks for indexes for REPLICA IDENTITY
 FULL tables.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Previously, when selecting an usable index for update/delete for the
REPLICA IDENTITY FULL table, in IsIndexOnlyExpression(), we used to
check if all index fields are not expressions. However, it was not
necessary, because it is enough to check if only the leftmost index
field is not an expression (and references the remote table column)
and this check has already been done by
RemoteRelContainsLeftMostColumnOnIdx().

This commit removes IsIndexOnlyExpression() and
RemoteRelContainsLeftMostColumnOnIdx() and all checks for usable
indexes for REPLICA IDENTITY FULL tables are now performed by
IsIndexUsableForReplicaIdentityFull().

Backpatch this to remain the code consistent.

Reported-by: Peter Smith
Reviewed-by: Amit Kapila, Önder Kalacı
Discussion: https://postgr.es/m/CAHut%2BPsGRE5WSsY0jcLHJEoA17MrbP9yy8FxdjC_ZOAACxbt%2BQ%40mail.gmail.com
Backpatch-through: 16
---
 src/backend/executor/execReplication.c     |   9 --
 src/backend/replication/logical/relation.c | 118 ++++++++-------------
 src/backend/replication/logical/worker.c   |  24 ++++-
 src/include/replication/logicalrelation.h  |   2 +-
 4 files changed, 64 insertions(+), 89 deletions(-)

diff --git a/src/backend/executor/execReplication.c b/src/backend/executor/execReplication.c
index e776524227..81f27042bc 100644
--- a/src/backend/executor/execReplication.c
+++ b/src/backend/executor/execReplication.c
@@ -222,16 +222,7 @@ retry:
 		if (!isIdxSafeToSkipDuplicates)
 		{
 			if (eq == NULL)
-			{
-#ifdef USE_ASSERT_CHECKING
-				/* apply assertions only once for the input idxoid */
-				IndexInfo  *indexInfo = BuildIndexInfo(idxrel);
-
-				Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
-#endif
-
 				eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
-			}
 
 			if (!tuples_equal(outslot, searchslot, eq))
 				continue;
diff --git a/src/backend/replication/logical/relation.c b/src/backend/replication/logical/relation.c
index ed57e5d2b6..d62eefed13 100644
--- a/src/backend/replication/logical/relation.c
+++ b/src/backend/replication/logical/relation.c
@@ -734,71 +734,9 @@ logicalrep_partition_open(LogicalRepRelMapEntry *root,
 	return entry;
 }
 
-/*
- * Returns true if the given index consists only of expressions such as:
- * 	CREATE INDEX idx ON table(foo(col));
- *
- * Returns false even if there is one column reference:
- * 	 CREATE INDEX idx ON table(foo(col), col_2);
- */
-static bool
-IsIndexOnlyOnExpression(IndexInfo *indexInfo)
-{
-	for (int i = 0; i < indexInfo->ii_NumIndexKeyAttrs; i++)
-	{
-		AttrNumber	attnum = indexInfo->ii_IndexAttrNumbers[i];
-
-		if (AttributeNumberIsValid(attnum))
-			return false;
-	}
-
-	return true;
-}
-
-/*
- * Returns true if the attrmap contains the leftmost column of the index.
- * Otherwise returns false.
- *
- * attrmap is a map of local attributes to remote ones. We can consult this
- * map to check whether the local index attribute has a corresponding remote
- * attribute.
- */
-static bool
-RemoteRelContainsLeftMostColumnOnIdx(IndexInfo *indexInfo, AttrMap *attrmap)
-{
-	AttrNumber	keycol;
-
-	Assert(indexInfo->ii_NumIndexAttrs >= 1);
-
-	keycol = indexInfo->ii_IndexAttrNumbers[0];
-	if (!AttributeNumberIsValid(keycol))
-		return false;
-
-	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol))
-		return false;
-
-	return attrmap->attnums[AttrNumberGetAttrOffset(keycol)] >= 0;
-}
-
 /*
  * Returns the oid of an index that can be used by the apply worker to scan
- * the relation. The index must be btree or hash, non-partial, and the leftmost
- * field must be a column (not an expression) that references the remote
- * relation column. These limitations help to keep the index scan similar
- * to PK/RI index scans.
- *
- * Note that the limitations of index scans for replica identity full only
- * adheres to a subset of the limitations of PK/RI. For example, we support
- * columns that are marked as [NULL] or we are not interested in the [NOT
- * DEFERRABLE] aspect of constraints here. It works for us because we always
- * compare the tuples for non-PK/RI index scans. See
- * RelationFindReplTupleByIndex().
- *
- * XXX: See IsIndexUsableForReplicaIdentityFull() to know the challenges in
- * supporting indexes other than btree and hash. For partial indexes, the
- * required changes are likely to be larger. If none of the tuples satisfy
- * the expression for the index scan, we fall-back to sequential execution,
- * which might not be a good idea in some cases.
+ * the relation.
  *
  * We expect to call this function when REPLICA IDENTITY FULL is defined for
  * the remote relation.
@@ -815,19 +753,16 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 	{
 		Oid			idxoid = lfirst_oid(lc);
 		bool		isUsableIdx;
-		bool		containsLeftMostCol;
 		Relation	idxRel;
 		IndexInfo  *idxInfo;
 
 		idxRel = index_open(idxoid, AccessShareLock);
 		idxInfo = BuildIndexInfo(idxRel);
-		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo);
-		containsLeftMostCol =
-			RemoteRelContainsLeftMostColumnOnIdx(idxInfo, attrmap);
+		isUsableIdx = IsIndexUsableForReplicaIdentityFull(idxInfo, attrmap);
 		index_close(idxRel, AccessShareLock);
 
 		/* Return the first eligible index found */
-		if (isUsableIdx && containsLeftMostCol)
+		if (isUsableIdx)
 			return idxoid;
 	}
 
@@ -835,11 +770,24 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
 }
 
 /*
- * Returns true if the index is usable for replica identity full. For details,
- * see FindUsableIndexForReplicaIdentityFull.
+ * Returns true if the index is usable for replica identity full.
  *
- * Currently, only Btree and Hash indexes can be returned as usable. This
- * is due to following reasons:
+ * The index must be btree or hash, non-partial, and the leftmost field must be
+ * a column (not an expression) that references the remote relation column. These
+ * limitations help to keep the index scan similar to PK/RI index scans.
+ *
+ * attrmap is a map of local attributes to remote ones. We can consult this
+ * map to check whether the local index attribute has a corresponding remote
+ * attribute.
+ *
+ * Note that the limitations of index scans for replica identity full only
+ * adheres to a subset of the limitations of PK/RI. For example, we support
+ * columns that are marked as [NULL] or we are not interested in the [NOT
+ * DEFERRABLE] aspect of constraints here. It works for us because we always
+ * compare the tuples for non-PK/RI index scans. See
+ * RelationFindReplTupleByIndex().
+ *
+ * The reasons why only Btree and Hash indexes can be considered as usable are:
  *
  * 1) Other index access methods don't have a fixed strategy for equality
  * operation. Refer get_equal_strategy_number_for_am().
@@ -851,16 +799,38 @@ FindUsableIndexForReplicaIdentityFull(Relation localrel, AttrMap *attrmap)
  *
  * XXX: Note that BRIN and GIN indexes do not implement "amgettuple" which
  * will be used later to fetch the tuples. See RelationFindReplTupleByIndex().
+ *
+ * XXX: To support partial indexes, the required changes are likely to be larger.
+ * If none of the tuples satisfy the expression for the index scan, we fall-back
+ * to sequential execution, which might not be a good idea in some cases.
  */
 bool
-IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap)
 {
+	AttrNumber	keycol;
+
 	/* Ensure that the index access method has a valid equal strategy */
 	if (get_equal_strategy_number_for_am(indexInfo->ii_Am) == InvalidStrategy)
 		return false;
+
+	/* The index must not be a partial index */
 	if (indexInfo->ii_Predicate != NIL)
 		return false;
-	if (IsIndexOnlyOnExpression(indexInfo))
+
+	Assert(indexInfo->ii_NumIndexAttrs >= 1);
+
+	/* The leftmost index field must not be an expression */
+	keycol = indexInfo->ii_IndexAttrNumbers[0];
+	if (!AttributeNumberIsValid(keycol))
+		return false;
+
+	/*
+	 * And the leftmost index field must reference the remote relation column.
+	 * This is because if it doesn't, the sequential scan is favorable over
+	 * index scan in most cases.
+	 */
+	if (attrmap->maplen <= AttrNumberGetAttrOffset(keycol) ||
+		attrmap->attnums[AttrNumberGetAttrOffset(keycol)] < 0)
 		return false;
 
 #ifdef USE_ASSERT_CHECKING
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index dd353fd1cb..0dd69739fb 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -140,6 +140,7 @@
 #include <sys/stat.h>
 #include <unistd.h>
 
+#include "access/genam.h"
 #include "access/table.h"
 #include "access/tableam.h"
 #include "access/twophase.h"
@@ -410,7 +411,7 @@ static void apply_handle_delete_internal(ApplyExecutionData *edata,
 										 ResultRelInfo *relinfo,
 										 TupleTableSlot *remoteslot,
 										 Oid localindexoid);
-static bool FindReplTupleInLocalRel(EState *estate, Relation localrel,
+static bool FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 									LogicalRepRelation *remoterel,
 									Oid localidxoid,
 									TupleTableSlot *remoteslot,
@@ -2663,7 +2664,7 @@ apply_handle_update_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel,
+	found = FindReplTupleInLocalRel(edata, localrel,
 									&relmapentry->remoterel,
 									localindexoid,
 									remoteslot, &localslot);
@@ -2816,7 +2817,7 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
 	EvalPlanQualInit(&epqstate, estate, NULL, NIL, -1, NIL);
 	ExecOpenIndices(relinfo, false);
 
-	found = FindReplTupleInLocalRel(estate, localrel, remoterel, localindexoid,
+	found = FindReplTupleInLocalRel(edata, localrel, remoterel, localindexoid,
 									remoteslot, &localslot);
 
 	/* If found delete it. */
@@ -2855,12 +2856,13 @@ apply_handle_delete_internal(ApplyExecutionData *edata,
  * Local tuple, if found, is returned in '*localslot'.
  */
 static bool
-FindReplTupleInLocalRel(EState *estate, Relation localrel,
+FindReplTupleInLocalRel(ApplyExecutionData *edata, Relation localrel,
 						LogicalRepRelation *remoterel,
 						Oid localidxoid,
 						TupleTableSlot *remoteslot,
 						TupleTableSlot **localslot)
 {
+	EState	   *estate = edata->estate;
 	bool		found;
 
 	/*
@@ -2875,9 +2877,21 @@ FindReplTupleInLocalRel(EState *estate, Relation localrel,
 		   (remoterel->replident == REPLICA_IDENTITY_FULL));
 
 	if (OidIsValid(localidxoid))
+	{
+#ifdef USE_ASSERT_CHECKING
+		Relation	idxrel = index_open(localidxoid, AccessShareLock);
+
+		/* Index must be PK, RI, or usable for REPLICA IDENTITY FULL tables */
+		Assert(GetRelationIdentityOrPK(idxrel) == localidxoid ||
+			   IsIndexUsableForReplicaIdentityFull(BuildIndexInfo(idxrel),
+												   edata->targetRel->attrmap));
+		index_close(idxrel, AccessShareLock);
+#endif
+
 		found = RelationFindReplTupleByIndex(localrel, localidxoid,
 											 LockTupleExclusive,
 											 remoteslot, *localslot);
+	}
 	else
 		found = RelationFindReplTupleSeq(localrel, LockTupleExclusive,
 										 remoteslot, *localslot);
@@ -2995,7 +3009,7 @@ apply_handle_tuple_routing(ApplyExecutionData *edata,
 				bool		found;
 
 				/* Get the matching local tuple from the partition. */
-				found = FindReplTupleInLocalRel(estate, partrel,
+				found = FindReplTupleInLocalRel(edata, partrel,
 												&part_entry->remoterel,
 												part_entry->localindexoid,
 												remoteslot_part, &localslot);
diff --git a/src/include/replication/logicalrelation.h b/src/include/replication/logicalrelation.h
index 921b9974db..3f4d906d74 100644
--- a/src/include/replication/logicalrelation.h
+++ b/src/include/replication/logicalrelation.h
@@ -48,7 +48,7 @@ extern LogicalRepRelMapEntry *logicalrep_partition_open(LogicalRepRelMapEntry *r
 														Relation partrel, AttrMap *map);
 extern void logicalrep_rel_close(LogicalRepRelMapEntry *rel,
 								 LOCKMODE lockmode);
-extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo);
+extern bool IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo, AttrMap *attrmap);
 extern Oid	GetRelationIdentityOrPK(Relation rel);
 
 #endif							/* LOGICALRELATION_H */
-- 
2.31.1

#36Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#35)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Mon, Jul 24, 2023 at 6:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Jul 22, 2023 at 7:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

You have moved most of the comments related to the restriction of
which index can be picked atop IsIndexUsableForReplicaIdentityFull().
Now, the comments related to limitation atop
FindUsableIndexForReplicaIdentityFull() look slightly odd as it refers
to limitations but those limitation were not stated. The comments I am
referring to are: "Note that the limitations of index scans for
replica identity full only .... might not be a good idea in some
cases". Shall we move these as well atop
IsIndexUsableForReplicaIdentityFull()?

Good point.

Looking at neighbor comments, the following comment looks slightly odd to me:

* XXX: See IsIndexUsableForReplicaIdentityFull() to know the challenges in
* supporting indexes other than btree and hash. For partial indexes, the
* required changes are likely to be larger. If none of the tuples satisfy
* the expression for the index scan, we fall-back to sequential execution,
* which might not be a good idea in some cases.

Are the first and second sentences related actually?

Not really.

I think we can move it as well to
IsIndexUsableForReplicaIdentityFull() with some adjustments. I've
attached the updated patch that incorporated your comment and included
my idea to update the comment.

LGTM.

--
With Regards,
Amit Kapila.

#37Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Amit Kapila (#36)
Re: doc: improve the restriction description of using indexes on REPLICA IDENTITY FULL table.

On Mon, Jul 24, 2023 at 12:05 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jul 24, 2023 at 6:39 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:

On Sat, Jul 22, 2023 at 7:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

You have moved most of the comments related to the restriction of
which index can be picked atop IsIndexUsableForReplicaIdentityFull().
Now, the comments related to limitation atop
FindUsableIndexForReplicaIdentityFull() look slightly odd as it refers
to limitations but those limitation were not stated. The comments I am
referring to are: "Note that the limitations of index scans for
replica identity full only .... might not be a good idea in some
cases". Shall we move these as well atop
IsIndexUsableForReplicaIdentityFull()?

Good point.

Looking at neighbor comments, the following comment looks slightly odd to me:

* XXX: See IsIndexUsableForReplicaIdentityFull() to know the challenges in
* supporting indexes other than btree and hash. For partial indexes, the
* required changes are likely to be larger. If none of the tuples satisfy
* the expression for the index scan, we fall-back to sequential execution,
* which might not be a good idea in some cases.

Are the first and second sentences related actually?

Not really.

I think we can move it as well to
IsIndexUsableForReplicaIdentityFull() with some adjustments. I've
attached the updated patch that incorporated your comment and included
my idea to update the comment.

LGTM.

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com