Report replica identity in pg_publication_tables

Started by Ashutosh Bapat7 months ago7 messages

ashutosh.bapat.oss@gmail.com

7 months ago

Hi All,

The commit message in the patch says it all, but let me repeat it here.

When debugging issues with logical replication, replica identity
property of tables in publication is often useful, for example, to
determine the amount of data logged for an UPDATE or DELETE on a given
table.

Given a set of publications that a WAL sender is using,
pg_publication_tables can be used to get the list of tables whose
changes will be replicated including the columns of those tables and
row
filters. But the replica identity of those tables needs to be
separately found out by querying pg_class or joining pg_class with
pg_publication_tables on schemaname and relname. Adding the replica
identity column to pg_publication_tables avoids this extra step.

The replica identity of a given table is not a property of
publication, per say, so it's arguable whether it should be included
in pg_publication_tables or not. But the output seems to be useful.
E.g. from the tests

SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
pubname | schemaname | tablename | attnames | rowfilter |
replica_identity
----------+------------+---------------------+----------+-----------+------------------
testpub6 | public | rf_tbl_abcd_part_pk | {a,b} | (b > 99) | default
(1 row)

This line gives all the information related to logical replication of
table rf_tbl_abcd_part_pk together.

--
Best Wishes,
Ashutosh Bapat

Amit Kapila

amit.kapila16@gmail.com

7 months ago

In reply to: Ashutosh Bapat (#1)

Re: Report replica identity in pg_publication_tables

On Mon, Jun 30, 2025 at 3:44 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Hi All,

The commit message in the patch says it all, but let me repeat it here.

You forgot to attach the patch.

When debugging issues with logical replication, replica identity
property of tables in publication is often useful, for example, to
determine the amount of data logged for an UPDATE or DELETE on a given
table.

I think it can help to determine what is logged for the DELETE or
UPDATE operation, but not the exact amount of data. Can you please
explain with an example how such information can help with debugging?

Given a set of publications that a WAL sender is using,
pg_publication_tables can be used to get the list of tables whose
changes will be replicated including the columns of those tables and
row
filters. But the replica identity of those tables needs to be
separately found out by querying pg_class or joining pg_class with
pg_publication_tables on schemaname and relname. Adding the replica
identity column to pg_publication_tables avoids this extra step.

The replica identity of a given table is not a property of
publication, per say, so it's arguable whether it should be included
in pg_publication_tables or not.

Right, discussing the use case a bit more might help us to find if
this is the right place to add 'replica identity' information.

--
With Regards,
Amit Kapila.

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

7 months ago

In reply to: Amit Kapila (#2)

1 attachment(s)

Re: Report replica identity in pg_publication_tables

On Mon, Jun 30, 2025 at 5:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Jun 30, 2025 at 3:44 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Hi All,

The commit message in the patch says it all, but let me repeat it here.

You forgot to attach the patch.

Sorry. Here it is

When debugging issues with logical replication, replica identity
property of tables in publication is often useful, for example, to
determine the amount of data logged for an UPDATE or DELETE on a given
table.

I think it can help to determine what is logged for the DELETE or
UPDATE operation, but not the exact amount of data. Can you please
explain with an example how such information can help with debugging?

No. The change itself won't tell the amount of data that will be
logged. But given a publication it will tell what all tables being
published by that publication are using replica identity full - which
causes more columns to be logged compared to replica identity default
or index.

Given a set of publications that a WAL sender is using,
pg_publication_tables can be used to get the list of tables whose
changes will be replicated including the columns of those tables and
row
filters. But the replica identity of those tables needs to be
separately found out by querying pg_class or joining pg_class with
pg_publication_tables on schemaname and relname. Adding the replica
identity column to pg_publication_tables avoids this extra step.

The replica identity of a given table is not a property of
publication, per say, so it's arguable whether it should be included
in pg_publication_tables or not.

Right, discussing the use case a bit more might help us to find if
this is the right place to add 'replica identity' information.

In our case, we had multiple replication slots, each with a different
publication. One of the slots was lagging because of the amount of
data being sent through the slot. For that we wanted to know which
tables are being published through the corresponding publication and
what's the replica identity of each of the tables.
pg_publication_tables gave us the list of tables but in order to get
its replica identity we needed to join it with pg_class again.
pg_publication_tables already joins pg_class. Exposing replica
identity through pg_publication_tables makes it more convenient to get
all the information related to a tables replication through that
publication in a single line without much code change or run time
cost.

--
Best Wishes,
Ashutosh Bapat

Attachments:

0001-Report-replica-identity-property-of-tables--20250630.patchtext/x-patch; charset=US-ASCII; name=0001-Report-replica-identity-property-of-tables--20250630.patchDownload

From 534135e55ea228a42f735f9dd3cc3bead9b12f70 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Date: Fri, 27 Jun 2025 12:25:31 +0530
Subject: [PATCH] Report replica identity property of tables in
 pg_publication_tables

When debugging issues with logical replication, replica identity
property of tables in publication is often useful, for example, to
determine the amount of data logged for an UPDATE or DELETE on a table.

Given a set of publications that a WAL sender is using,
pg_publication_tables can be used to get the list of tables whose
changes will be replicated including the columns of those tables and row
filters. But the replica identity of those tables needs to be separately
found out by querying pg_class or joining pg_class with
pg_publication_tables. Adding replica identity column to
pg_publication_tables avoids this extra step.

The replica identity for a given table does not change with publication
so the information will be repeated as many times the number of
publications a given table is part of. But the repetition is worth the
convenience.

Ashutosh Bapat
---
 doc/src/sgml/system-views.sgml            |  9 +++
 src/backend/catalog/system_views.sql      |  9 ++-
 src/test/regress/expected/publication.out | 96 ++++++++++++++++++-----
 src/test/regress/expected/rules.out       |  9 ++-
 src/test/regress/sql/publication.sql      |  9 +++
 5 files changed, 109 insertions(+), 23 deletions(-)

diff --git a/doc/src/sgml/system-views.sgml b/doc/src/sgml/system-views.sgml
index 986ae1f543d..8ec1b7ba499 100644
--- a/doc/src/sgml/system-views.sgml
+++ b/doc/src/sgml/system-views.sgml
@@ -2558,6 +2558,15 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
        Expression for the table's publication qualifying condition
       </para></entry>
      </row>
+
+     <row>
+      <entry role="catalog_table_entry"><para role="column_definition">
+       <structfield>replica_identity</structfield> <type>text</type>
+      </para>
+      <para>
+       Replica identity setting of the table.
+      </para></entry>
+     </row>
     </tbody>
    </tgroup>
   </table>
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 08f780a2e63..34fca1bba54 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -388,7 +388,14 @@ CREATE VIEW pg_publication_tables AS
           WHERE a.attrelid = GPT.relid AND
                 a.attnum = ANY(GPT.attrs)
         ) AS attnames,
-        pg_get_expr(GPT.qual, GPT.relid) AS rowfilter
+        pg_get_expr(GPT.qual, GPT.relid) AS rowfilter,
+        case C.relreplident
+            when 'd' then 'default'
+            when 'n' then 'nothing'
+            when 'f' then 'full'
+            when 'i' then 'index'
+            else NULL
+        end as replica_identity
     FROM pg_publication P,
          LATERAL pg_get_publication_tables(P.pubname) GPT,
          pg_class C JOIN pg_namespace N ON (N.oid = C.relnamespace)
diff --git a/src/test/regress/expected/publication.out b/src/test/regress/expected/publication.out
index 3a2eacd793f..2ecad5cfacd 100644
--- a/src/test/regress/expected/publication.out
+++ b/src/test/regress/expected/publication.out
@@ -588,21 +588,45 @@ ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
 UPDATE rf_tbl_abcd_pk SET a = 1;
 ERROR:  cannot update table "rf_tbl_abcd_pk"
 DETAIL:  Column used in the publication WHERE expression is not part of the replica identity.
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |   tablename    | attnames  | rowfilter | replica_identity 
+----------+------------+----------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_pk | {a,b,c,d} | (d > 99)  | default
+(1 row)
+
 -- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not part of REPLICA IDENTITY
 UPDATE rf_tbl_abcd_nopk SET a = 1;
 ERROR:  cannot update table "rf_tbl_abcd_nopk"
 DETAIL:  Column used in the publication WHERE expression is not part of the replica identity.
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |    tablename     | attnames  | rowfilter | replica_identity 
+----------+------------+------------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_nopk | {a,b,c,d} | (a > 99)  | default
+(1 row)
+
 -- Case 2. REPLICA IDENTITY FULL
 ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
 ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 -- ok - "c" is in REPLICA IDENTITY now even though not in PK
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |   tablename    | attnames  | rowfilter | replica_identity 
+----------+------------+----------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_pk | {a,b,c,d} | (c > 99)  | full
+(1 row)
+
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- ok - "a" is in REPLICA IDENTITY now
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |    tablename     | attnames  | rowfilter | replica_identity 
+----------+------------+------------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_nopk | {a,b,c,d} | (a > 99)  | full
+(1 row)
+
 -- Case 3. REPLICA IDENTITY NOTHING
 ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
 ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY NOTHING;
@@ -616,11 +640,23 @@ ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 UPDATE rf_tbl_abcd_pk SET a = 1;
 ERROR:  cannot update table "rf_tbl_abcd_pk"
 DETAIL:  Column used in the publication WHERE expression is not part of the replica identity.
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |   tablename    | attnames  | rowfilter | replica_identity 
+----------+------------+----------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_pk | {a,b,c,d} | (c > 99)  | nothing
+(1 row)
+
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not in REPLICA IDENTITY NOTHING
 UPDATE rf_tbl_abcd_nopk SET a = 1;
 ERROR:  cannot update table "rf_tbl_abcd_nopk"
 DETAIL:  Column used in the publication WHERE expression is not part of the replica identity.
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |    tablename     | attnames  | rowfilter | replica_identity 
+----------+------------+------------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_nopk | {a,b,c,d} | (a > 99)  | nothing
+(1 row)
+
 -- Case 4. REPLICA IDENTITY INDEX
 ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
 CREATE UNIQUE INDEX idx_abcd_pk_c ON rf_tbl_abcd_pk(c);
@@ -636,6 +672,12 @@ DETAIL:  Column used in the publication WHERE expression is not part of the repl
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 -- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |   tablename    | attnames  | rowfilter | replica_identity 
+----------+------------+----------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_pk | {a,b,c,d} | (c > 99)  | index
+(1 row)
+
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not in REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_nopk SET a = 1;
@@ -644,6 +686,12 @@ DETAIL:  Column used in the publication WHERE expression is not part of the repl
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
 -- ok - "c" is part of REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |    tablename     | attnames  | rowfilter | replica_identity 
+----------+------------+------------------+-----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_nopk | {a,b,c,d} | (c > 99)  | index
+(1 row)
+
 -- Tests for partitioned table
 -- set PUBLISH_VIA_PARTITION_ROOT to false and test row filter for partitioned
 -- table
@@ -690,6 +738,12 @@ ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
 UPDATE rf_tbl_abcd_part_pk SET a = 1;
 ERROR:  cannot update table "rf_tbl_abcd_part_pk_1"
 DETAIL:  Column used in the publication WHERE expression is not part of the replica identity.
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
+ pubname  | schemaname |      tablename      | attnames | rowfilter | replica_identity 
+----------+------------+---------------------+----------+-----------+------------------
+ testpub6 | public     | rf_tbl_abcd_part_pk | {a,b}    | (b > 99)  | default
+(1 row)
+
 DROP PUBLICATION testpub6;
 DROP TABLE rf_tbl_abcd_pk;
 DROP TABLE rf_tbl_abcd_nopk;
@@ -1774,52 +1828,52 @@ CREATE TABLE sch2.tbl1_part1 PARTITION OF sch1.tbl1 FOR VALUES FROM (1) to (10);
 -- Schema publication that does not include the schema that has the parent table
 CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch2 WITH (PUBLISH_VIA_PARTITION_ROOT=1);
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename  | attnames | rowfilter 
----------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pubname | schemaname | tablename  | attnames | rowfilter | replica_identity 
+---------+------------+------------+----------+-----------+------------------
+ pub     | sch2       | tbl1_part1 | {a}      |           | default
 (1 row)
 
 DROP PUBLICATION pub;
 -- Table publication that does not include the parent table
 CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH (PUBLISH_VIA_PARTITION_ROOT=1);
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename  | attnames | rowfilter 
----------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pubname | schemaname | tablename  | attnames | rowfilter | replica_identity 
+---------+------------+------------+----------+-----------+------------------
+ pub     | sch2       | tbl1_part1 | {a}      |           | default
 (1 row)
 
 -- Table publication that includes both the parent table and the child table
 ALTER PUBLICATION pub ADD TABLE sch1.tbl1;
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename | attnames | rowfilter 
----------+------------+-----------+----------+-----------
- pub     | sch1       | tbl1      | {a}      | 
+ pubname | schemaname | tablename | attnames | rowfilter | replica_identity 
+---------+------------+-----------+----------+-----------+------------------
+ pub     | sch1       | tbl1      | {a}      |           | default
 (1 row)
 
 DROP PUBLICATION pub;
 -- Schema publication that does not include the schema that has the parent table
 CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch2 WITH (PUBLISH_VIA_PARTITION_ROOT=0);
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename  | attnames | rowfilter 
----------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pubname | schemaname | tablename  | attnames | rowfilter | replica_identity 
+---------+------------+------------+----------+-----------+------------------
+ pub     | sch2       | tbl1_part1 | {a}      |           | default
 (1 row)
 
 DROP PUBLICATION pub;
 -- Table publication that does not include the parent table
 CREATE PUBLICATION pub FOR TABLE sch2.tbl1_part1 WITH (PUBLISH_VIA_PARTITION_ROOT=0);
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename  | attnames | rowfilter 
----------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pubname | schemaname | tablename  | attnames | rowfilter | replica_identity 
+---------+------------+------------+----------+-----------+------------------
+ pub     | sch2       | tbl1_part1 | {a}      |           | default
 (1 row)
 
 -- Table publication that includes both the parent table and the child table
 ALTER PUBLICATION pub ADD TABLE sch1.tbl1;
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename  | attnames | rowfilter 
----------+------------+------------+----------+-----------
- pub     | sch2       | tbl1_part1 | {a}      | 
+ pubname | schemaname | tablename  | attnames | rowfilter | replica_identity 
+---------+------------+------------+----------+-----------+------------------
+ pub     | sch2       | tbl1_part1 | {a}      |           | default
 (1 row)
 
 DROP PUBLICATION pub;
@@ -1832,9 +1886,9 @@ CREATE TABLE sch1.tbl1_part3 (a int) PARTITION BY RANGE(a);
 ALTER TABLE sch1.tbl1 ATTACH PARTITION sch1.tbl1_part3 FOR VALUES FROM (20) to (30);
 CREATE PUBLICATION pub FOR TABLES IN SCHEMA sch1 WITH (PUBLISH_VIA_PARTITION_ROOT=1);
 SELECT * FROM pg_publication_tables;
- pubname | schemaname | tablename | attnames | rowfilter 
----------+------------+-----------+----------+-----------
- pub     | sch1       | tbl1      | {a}      | 
+ pubname | schemaname | tablename | attnames | rowfilter | replica_identity 
+---------+------------+-----------+----------+-----------+------------------
+ pub     | sch1       | tbl1      | {a}      |           | default
 (1 row)
 
 RESET client_min_messages;
diff --git a/src/test/regress/expected/rules.out b/src/test/regress/expected/rules.out
index 6cf828ca8d0..682408f3598 100644
--- a/src/test/regress/expected/rules.out
+++ b/src/test/regress/expected/rules.out
@@ -1464,7 +1464,14 @@ pg_publication_tables| SELECT p.pubname,
     ( SELECT array_agg(a.attname ORDER BY a.attnum) AS array_agg
            FROM pg_attribute a
           WHERE ((a.attrelid = gpt.relid) AND (a.attnum = ANY ((gpt.attrs)::smallint[])))) AS attnames,
-    pg_get_expr(gpt.qual, gpt.relid) AS rowfilter
+    pg_get_expr(gpt.qual, gpt.relid) AS rowfilter,
+        CASE c.relreplident
+            WHEN 'd'::"char" THEN 'default'::text
+            WHEN 'n'::"char" THEN 'nothing'::text
+            WHEN 'f'::"char" THEN 'full'::text
+            WHEN 'i'::"char" THEN 'index'::text
+            ELSE NULL::text
+        END AS replica_identity
    FROM pg_publication p,
     LATERAL pg_get_publication_tables(VARIADIC ARRAY[(p.pubname)::text]) gpt(pubid, relid, attrs, qual),
     (pg_class c
diff --git a/src/test/regress/sql/publication.sql b/src/test/regress/sql/publication.sql
index c9e309190df..d78a4b1cbf7 100644
--- a/src/test/regress/sql/publication.sql
+++ b/src/test/regress/sql/publication.sql
@@ -317,10 +317,12 @@ UPDATE rf_tbl_abcd_pk SET a = 1;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (d > 99);
 -- fail - "d" is not part of the PK
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 -- 1b. REPLICA IDENTITY is DEFAULT and table has no PK
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not part of REPLICA IDENTITY
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 
 -- Case 2. REPLICA IDENTITY FULL
 ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY FULL;
@@ -328,9 +330,11 @@ ALTER TABLE rf_tbl_abcd_nopk REPLICA IDENTITY FULL;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 -- ok - "c" is in REPLICA IDENTITY now even though not in PK
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- ok - "a" is in REPLICA IDENTITY now
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 
 -- Case 3. REPLICA IDENTITY NOTHING
 ALTER TABLE rf_tbl_abcd_pk REPLICA IDENTITY NOTHING;
@@ -341,9 +345,11 @@ UPDATE rf_tbl_abcd_pk SET a = 1;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 -- fail - "c" is not in PK and not in REPLICA IDENTITY NOTHING
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not in REPLICA IDENTITY NOTHING
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 
 -- Case 4. REPLICA IDENTITY INDEX
 ALTER TABLE rf_tbl_abcd_pk ALTER COLUMN c SET NOT NULL;
@@ -358,12 +364,14 @@ UPDATE rf_tbl_abcd_pk SET a = 1;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_pk WHERE (c > 99);
 -- ok - "c" is not in PK but it is part of REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (a > 99);
 -- fail - "a" is not in REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_nopk SET a = 1;
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_nopk WHERE (c > 99);
 -- ok - "c" is part of REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_nopk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 
 -- Tests for partitioned table
 
@@ -404,6 +412,7 @@ ALTER PUBLICATION testpub6 SET (PUBLISH_VIA_PARTITION_ROOT=1);
 ALTER PUBLICATION testpub6 SET TABLE rf_tbl_abcd_part_pk WHERE (b > 99);
 -- fail - "b" is not in REPLICA IDENTITY INDEX
 UPDATE rf_tbl_abcd_part_pk SET a = 1;
+SELECT * FROM pg_publication_tables WHERE pubname = 'testpub6';
 
 DROP PUBLICATION testpub6;
 DROP TABLE rf_tbl_abcd_pk;

base-commit: a6a4641252ed166ba187d7fbe0504ddb5a5f0e33
-- 
2.34.1

Melih Mutlu

m.melihmutlu@gmail.com

6 months ago

In reply to: Amit Kapila (#2)

Re: Report replica identity in pg_publication_tables

Hi Ashutosh,

Thanks for the patch.

Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>, 30 Haz 2025 Pzt, 13:14
tarihinde şunu yazdı:

The replica identity of a given table is not a property of
publication, per say, so it's arguable whether it should be included
in pg_publication_tables or not. But the output seems to be useful.
E.g. from the tests

To my understanding, pg_publication_tables doesn’t just contain publication
properties, but also information about the tables included in the
publication. As long as the exposed information about the published tables
is relevant in the context of publications in logical replication, I think
it’s reasonable to include it in pg_publication_tables. IMHO replica
identity seems like relevant information.

Regards,
--
Melih Mutlu

Zhijie Hou (Fujitsu)

houzj.fnst@fujitsu.com

6 months ago

In reply to: Ashutosh Bapat (#3)

RE: Report replica identity in pg_publication_tables

On Mon, Jun 30, 2025 at 8:09 PM Ashutosh Bapat wrote:

On Mon, Jun 30, 2025 at 5:17 PM Amit Kapila <amit.kapila16@gmail.com>
wrote:

On Mon, Jun 30, 2025 at 3:44 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

Given a set of publications that a WAL sender is using,
pg_publication_tables can be used to get the list of tables whose
changes will be replicated including the columns of those tables
and row filters. But the replica identity of those tables needs to
be separately found out by querying pg_class or joining pg_class
with pg_publication_tables on schemaname and relname. Adding the
replica identity column to pg_publication_tables avoids this extra step.

The replica identity of a given table is not a property of
publication, per say, so it's arguable whether it should be
included in pg_publication_tables or not.

Right, discussing the use case a bit more might help us to find if
this is the right place to add 'replica identity' information.

In our case, we had multiple replication slots, each with a different publication.
One of the slots was lagging because of the amount of data being sent
through the slot. For that we wanted to know which tables are being
published through the corresponding publication and what's the replica
identity of each of the tables.
pg_publication_tables gave us the list of tables but in order to get
its replica identity we needed to join it with pg_class again.
pg_publication_tables already joins pg_class. Exposing replica
identity through pg_publication_tables makes it more convenient to get
all the information related to a tables replication through that
publication in a single line without much code change or run time cost.

Thanks for explaining the use case.

I'm concerned about whether we can correctly display replica identity in the
view for partitioned tables.

In the case of partitioned tables, we display only the root table in the view
if publish_via_partition_root is enabled for the publication. However, the
logged information depends on the replica identity (RI) of each child
partition. Displaying only the root table's RI could be a bit confusing since
each partition might have different RI settings, making it challenging to
represent all of them in the entry for the root table.

Best Regards,
Hou zj

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

6 months ago

In reply to: Zhijie Hou (Fujitsu) (#5)

Re: Report replica identity in pg_publication_tables

On Wed, Jul 2, 2025 at 1:46 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

I'm concerned about whether we can correctly display replica identity in the
view for partitioned tables.

In the case of partitioned tables, we display only the root table in the view
if publish_via_partition_root is enabled for the publication. However, the
logged information depends on the replica identity (RI) of each child
partition. Displaying only the root table's RI could be a bit confusing since
each partition might have different RI settings, making it challenging to
represent all of them in the entry for the root table.

I see following possibilities:
1. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show "partitioned" or "partition root" in
replica identity column with a note in documentation that it means
that the replica identity of the actual leaf partition would be used
instead of replica identity of the partitioned table. And then it will
be for the user to find out all the replica identities.
2. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show a set of distinct replica identities of
the leaf partitions. That way, if all the leaf partitions have the
same replica identity setting, that will be reported. Otherwise, the
user will be able to know all the possible replica identity settings
that may be used. Again, documentation should clarify what this value
means for a partitioned table. But then somehow we need to indicate
whether a given table is partitioned or not.
3. We report "partitioned" or "partition root" followed by all the
distinct replica identities, thus letting the user know that the table
is partitioned and what replica identity settings its partitions have.
And document.

Since the targeted use of this column is to know the replica
identities of the tables published by a given publication so that they
can guess what might be WAL logged, these options seem ok along with
the clarifying documentation.

What do you think?
--
Best Wishes,
Ashutosh Bapat

Ashutosh Bapat

ashutosh.bapat.oss@gmail.com

about 2 months ago

In reply to: Ashutosh Bapat (#6)

Re: Report replica identity in pg_publication_tables

On Fri, Jul 4, 2025 at 8:16 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:

On Wed, Jul 2, 2025 at 1:46 PM Zhijie Hou (Fujitsu)
<houzj.fnst@fujitsu.com> wrote:

I'm concerned about whether we can correctly display replica identity in the
view for partitioned tables.

In the case of partitioned tables, we display only the root table in the view
if publish_via_partition_root is enabled for the publication. However, the
logged information depends on the replica identity (RI) of each child
partition. Displaying only the root table's RI could be a bit confusing since
each partition might have different RI settings, making it challenging to
represent all of them in the entry for the root table.

I see following possibilities:
1. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show "partitioned" or "partition root" in
replica identity column with a note in documentation that it means
that the replica identity of the actual leaf partition would be used
instead of replica identity of the partitioned table. And then it will
be for the user to find out all the replica identities.
2. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show a set of distinct replica identities of
the leaf partitions. That way, if all the leaf partitions have the
same replica identity setting, that will be reported. Otherwise, the
user will be able to know all the possible replica identity settings
that may be used. Again, documentation should clarify what this value
means for a partitioned table. But then somehow we need to indicate
whether a given table is partitioned or not.
3. We report "partitioned" or "partition root" followed by all the
distinct replica identities, thus letting the user know that the table
is partitioned and what replica identity settings its partitions have.
And document.

Since the targeted use of this column is to know the replica
identities of the tables published by a given publication so that they
can guess what might be WAL logged, these options seem ok along with
the clarifying documentation.

What do you think?

I researched this a bit more and then I found some inconsistency
between the code and the documentation

At https://www.postgresql.org/docs/current/sql-createpublication.html,
we mention that changes to a partition table will use identity of a
partititioned table when publish_via_partition_root is true.

--- quote documentation
This parameter determines whether changes in a partitioned table (or
on its partitions) contained in the publication will be published
using the identity and schema of the partitioned table rather than
that of the individual partitions that are actually changed; the
latter is the default. Enabling this allows the changes to be
replicated into a non-partitioned table or a partitioned table
consisting of a different set of partitions.

There can be a case where a subscription combines multiple
publications. If a partitioned table is published by any subscribed
publications which set publish_via_partition_root = true, changes on
this partitioned table (or on its partitions) will be published using
the identity and schema of this partitioned table rather than that of
the individual partitions.
--- unquote documentation

Which identity are we talking about here? If that's replica identity,
the documentation is wrong since code says otherwise
In pub_rf_contains_invalid_column()

/*
* For a partition, if pubviaroot is true, find the topmost ancestor that
* is published via this publication as we need to use its row filter
* expression to filter the partition's changes.
*
* Note that even though the row filter used is for an ancestor, the
* REPLICA IDENTITY used will be for the actual child table.
*/

Let me revise possible solutions so that this view can report replica
identity even in case of publish_via_partition_root

1. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show replica identity of the root partition
suffixed by (mixed). E.g. full (mixed). Add a note in documentation
explaining the meaning of suffix "mixed".

2. If a publication has publish_via_partition_root true, then for all
the partitioned tables we show a set of distinct replica identities of
its partitions and also the replica identity of the root. If all the
partitions have the same replica identity setting, only one replica
identity type will be reported. Otherwise, the user will be able to
know all the possible replica identity settings that may be used. Add
a note to the documentation explaining the meaning of this value for a
partitioned table.

3. Let the view report all the partitions as well as the partition
root. Against each relation, report its replica identity. Expand the
pg_publication_tables view to also report the relation via which the
changes are published (under column name "publish_via"). When
publish_via_partition_root is true, publish_via reports the top
partitioned relation. I think that gives full information about tables
being published through the publication.

What do you think?

--
Best Wishes,
Ashutosh Bapat